All Categories - Mr Bartlett Blogs

Who is your copilot?

7/14/2024

Copilot is now available on my tools/machine at work. Sitting here asking it to run down information and help me make plans to conquer the world.... I think back to my younger days watching Star Trek episodes on Television. Captain Kirk would ask the computer for information the 'ship' would spit it out in seconds.

Fast forward to the present. We have almost reached this reality, minus the space travel and aliens.. But that isn't too far off either. The next step will be how fast can you go from a question to answer to action. Once the machines are able to process commands directly from output from Copilot/ChatGpt/etc we will only be limited by our imagination. In my yesteryear I did product evaluations. One of the areas that was fascinating to me was the SOAR field (security orchestration automation and response (SOAR)). With a SOAR framework you could put different pieces of a workflow together to accomplish different use cases and/or tasks. Each piece a step in process to accomplish the overall goal. Pulling and pushing information from different tools/platforms. Using that information to decide on next steps in the adventure. The next few years in tech will be pretty amazing. Keep an open mind and don't limit your imagination.

Bartlett

You don't know what you don't know - Inventory

1/3/2022

With all the Log4J madness happening over the last few weeks it got me thinking ..... (https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance) Here are a few questions you should be kicking around with your OPS/Security teams when the dust settles. But don’t wait too long after an incident, you want the wound to be fresh so you get the details. People forget over time, especially things they do not want to remember.

How do we know if we are affected?
- Do we even run the product/library/application that is affected by the vulnerability? Where do you start? It all begins with INVENTORY. Over the many years of the cyber industry one thing still remains: we are horrible at inventory and tracking of assets. If you do not have a centralized asset list or inventory, start today! Open a spreadsheet and start filling in rows with asset details; name, ip, OS version, Patch info for OS, Applications running on Asset with Version Numbers. This is a small list and can be made as complicated/detailed as you want. I know, most folks are saying a spreadsheet? Why a spreadsheet? The point isn’t the spreadsheet, the point is you have a list of assets. Most of the time it isn’t the medium you are capturing the data, it is the process around the inventory that collapses. After creating the list, share the list in a centralized location where multiple ‘approved’ folks can update the list. Bake it into the company's psyche. Onboarding/offboarding of assets, update the inventory list. Patches to inventory, update the asset list. Most companies have that ‘running’ inventory file that when you look at it you find multiple assets that haven’t been living on the network for years..
- This is where a centralized CMDB or Asset Database might be helpful using a vendor tool but without the basics you will still need to have a basic list of assets. When the vendor comes in to plan the implementation one of the first questions they will ask is ‘ do you have a list of assets/CIs?
How do we know ‘what’ is affected?
- Most vulnerabilities relate to a version or library within a product/application/service it is important to capture the version numbers of applications running on your assets so you can narrow down what is affected. The building blocks to your inventory start with an asset/device from there start filling in the application list and version numbers. It does feel like a daunting task when starting from nothing but remember, every little piece of information you add to your inventory list will save you hours of pain later or during an actual incident when you are scrambling for the information.
- This is where a centralized CMDB with the version details is helpful or an Asset Inventory/Patching system. Most Vulnerability scanners today now create a list of Assets and associated application versions as well but might not be your central inventory list and you will need to integrate that platform with your inventory of record. Most vendor tools are providing integrations between these types of systems and updating/adding to your central inventory has gotten a lot easier then yesteryear. When doing a review of your current systems make sure to lay out/document the details around each platform, the use, data sets/information that it provides, and how this could be useful within other platforms when trying to answer a question. If all the data is collected and integrated together it becomes a lot easier to narrow down what asset might be affected by a vulnerability or in answering other questions like:
  - What system is not patched?
  - What system is running unapproved software/applications?
How do we remediate what is affected?
- Answering this question needs to happen far before an incident is happening. Your organization needs to have a process/SoP/Plan documented on the incident response and patching procedures with different use cases/scenarios and actions different teams in the organization will take to remediate the issue and how communications will happen. How will you remediate? Patch, apply workaround, network configuration change, pull device from the network. This scenario can play out like a bad ‘pick your own adventure’ book if you haven’t thought through some of these scenarios and understand what is needed for each. Who is involved? How will we track changes during the incident? How will we communicate during the incident? Does the severity of the vulnerability affected roll out? How does that affect the Change Management Process? What does an emergency Change Request encompass when dealing with the situation? When do we go ‘back’ to using “normal” changes for an asset? How will the company handle the post incident review and incorporate lessons learned? Evolution of process/SoPs is important and it all starts with having the basic block in place to add to it. Most of the time the block is a first cut at a document to capture the process. It doesn’t have to be perfect, it won’t be as long as you refer to it and improve upon it.
How do we protect against the vulnerability if we cannot remediate?
- Patch, work around, network config change, pull from network? Really depends on a lot of factors:
  - What is affected?
  - What systems? Are they mission critical? Are they internet facing? Do they have security controls around them that already loosen the vulnerability criticality
How do we check for this being exploited prior to knowledge of vulnerability?
- Need to have that historical look into your network and assets. What has happened to them? Who has talked/communicated with them? Did they take actions not normal to their behavior?
- Logging of assets will go a long way here in helping you understand if you were impacted by a vulnerability. Yes, logging sucks, setting up the data pipeline, collecting logs, setting up monitoring of logs (alerts/triggers), etc. It’s a lot of work but if you build a repeatable process that is implemented into other processes (onboarding/offboarding) the burden gets easier every time you do it. One major area that isn’t done enough is log analysis and learning the logs. There are TONS of log types out there and most products don’t stick to a universal log format so you’ll need to roll up your sleeves and learn the specifics of the log types, event types, fields associated with log/event types, what types of events are logged by the product/app/service/etc… Take some time here do high level queries against the data to start from the 50k foot view and drill into the data. Start with high level count queries against the log type(event type) fields.
  - COUNT, ACTION FIELD
    - (24987, Deny; 300009, Accept; 23421, Reject)
  - COUNT, DISINCT(CNT((SrcIP), ACTION FIELD, order by COUNT
  - Look for patterns in event type events. Are there event types that should never appear in logging? If so, might be a good alert to setup in case they do start showing up in the logs.
How do we shorten the above?
- Automation… Looking at each of the above, the process around it and breaking out a game plan on how you can take a manual process and automated it with a tool/script/platform.
  - Any piece of a manual process that can be shortened with the help of a computer should be looked at. Does it make sense to automate it? Where do you start?
  - Do you integrate this workflow into a SIEM, Ticketing platform, Automation Platform? Ideally some of the tools you are running in your environment already handle some of these tasks and you will just need to work on enabling plugins/add-ons to create the connections and communications between systems. Lean on your vendor for assistance in this area, they should have ample documentation to get you started and if there is a custom plugin you need to discuss it with them to get support in place. Last thing you want to be doing is managing a plugin and trying to keep up with all the API updates/changes done by a vendor.
How do we onboard devices/services/assets with auto tracking/inventory built in?
- This question builds off the Automation response above. Look at your basic SoPs that are done for a majority of the employees in your organization. From onboarding of the employee to hardware assignment to MFA and security controls. Anything you can automate within this process will save you time/effort 3 fold.
- There should be a repeatable process when it comes to server build out as well so identify where you can start automating. Gold image server build, call to inventory system to add asset to CMDB, update inventory with applications, etc.
How do we query/dashboard using the above data?
- If a new vulnerability is disclosed can you build reports or ‘close to’ real time dashboards showing your current exposure?
  - Do you have the data to do this?
  - Do the tools you use have the ability to do this?
  - How easy is it to create these reports/dashboards? Do you need a trained person to whip out these reports or is the tool easy enough to use that in a time of need someone could go in and create a report?

Once you start getting a handle on the above.. You will need to ask the same questions around your vendors, supply chain, and partners. It never ends…

Supply chain, vendors, partners:

How do we check vendors for vulnerabilities?
How do we check our supply chain for vulnerabilities?
How do we get the above details without the wait?
How do these parties notify you of the vulnerabilities? How do they communicate remediation/patching/etc?

That’s all for now.
Bartlett

I told you so...

1/2/2022

You Never Know What You Are Going to Get

Who doesn’t like a good meeting

As we gathered into the conference room to discuss transitioning of applications to the Development team I knew we were about to uncover some nasty secrets hidden in the OPS team….. Manager presents an application called CENTRON, it is basically an analyst task and incident tracking tool with scheduling capabilities. I was familiar with this app, 2 years ago during the initial discussions of the app i had recommended the code, requirements, and issue tracking be done using the Development teams dev studio. We got as far as importing the v1 code base into the repo and then never heard anything else from the application author/developer/team…. Now, with turnover and organization ‘realignment’ they were coming to the development team for help (and to take over ownership of the tool). I was about to say: “I told you so….” but when looking around the room there was NO ONE from the prior organization that i had worked with or worked on the tool. Acquisitions can have a huge impact on people and vision. So I focused my energy on cleanup and getting this application under a supportable maintenance cycle.

Laying the Foundation

First step, onboard the project into the Development Studio:

Create needed Confluence space to store documents, notes, requirements, etc.
Create needed Jira Project for issue management, resource planning, and prioritization
Create needed Bitbucket repo(s) for storing code and configuration items.
Create needed Bamboo plans for CI/CD

Our Development Studio has all the necessary components to on board any type of project and work it from the requirements/gathering phases all the way to the automated deployment of said tool. The studio doesn’t just have integrated tools that make for quick execution but the documentation and processes to keep the studio running. If folks are hired or the path of the team changes there should be an overall layout of how things run. In this day and age of all things Agile, sometimes we lose the fact that documentation needs to be written, ‘lived’, and reviewed to keep it up to date and relevant to the current day operation.

Collect ALL CODE!

Need to get ALL code and configurations under version control. A majority of the work done to the app was on the production server so it was a phased approach:
- Get all code into the repo
- Verify the files ‘match’ what is available in production
- Cleanup unneeded files
  - With many projects that do not have a repeatable release process you will find there are backup files and directories include. (ie: CENTRON.php_backup or CENTRON.php_old)
Removing the unnecessary files will allow the new maintainers to focus on the important files/parts of the tool. Any time saved is well spent. Think about the code review of the application by the ‘new’ development team. If they are reviewing copies of the originals it is still time they have used for ‘wasted work’ and ineffective cycles. Removing the files will guarantee that this time will be saved in the future over and over again.

Documentation Review/Creation:

Collect all related/relevant documentation and notes about the application. Storing them all in a Confluence Space for the Application.
Create a System Integration Diagram to visualize the other systems the application integrates/communicates with. Without some type of diagram or map you can refer to, it becomes very difficult to understand the other pieces of the puzzle.
ANY documentation about the system/application will help you in the long run!

System Review:

Review current application in production
Noting directories, configuration items, notes, security concerns, etc
Adding all notes to a central page in confluence
Maintaining an application in a production environment is more than just the CODE BASE! You need to understand the overall architecture of the application, how it integrates and communicates with other tools and services, operating system level configurations, asset details, user/account information, Outage and recovery procedures/documents are all important and cannot be overlooked.

Backups/Recover:

Do a complete backup of the system before ANYTHING is done on the server.
- In our case it is a VM Snapshot which makes life soooo much easier then yesteryear
- Along with the backup procedure there should be recovery steps documented somewhere for the greater good and to reduce SPF (Single Point of Failure) within the teams.
  - It’s hard to train without some documentation.
  - How do you know procedure is followed if it isn’t documented?

A New Day
Start a new:

Deploy a new server(vm) to replace the old production server. After years of a system living in production you can no longer fully understand who was on the box, what they did on the box, what changes were made on the box, and who has root access on the box. You are better off starting from a clean slate.
Determine how the code and components can be more easily managed. Does it make sense to switch to containers and have different parts/components updatable by piece or all at once.
Determine how testing will be done to the application. This project had no unit or integration tests. Any change of the code needs to be manually verified with the knowledge that other areas of the tool could break. We implemented a basic testing framework and focused on adding tests as we fixed bugs and/or implemented new features.
Determine how deployments will be done. The old application was updated by hand on the server. We moved it to a bamboo deployment plan which pulled docker images down from our private repo and started them with a docker-compose file. Now when deployments are done it is automated and done by the machine which greatly reduces the errors made by someone making the changes.
Determine when and how the "lift and shift" will happen from the old production server to the new pristine server. Keep in mind that with any active application communication is key. Users need to be aware when maintenance windows will happen and understand "how" to handle this time in their day to day jobs. Very much like a disaster incident, folks will need to know how to do work without this system for the short period of time. Have a shadow period with both systems up and running. This way you can always hit the o sh88 handle if things are found missing (or wrong) on the new system.
Determine where issues, complaints, and feedback will go during the cut over AND moving into the future. We have a designated email distro list setup just for this. Next step for us is to automate the case creation off of email submissions (ie: outage email submitted, case created and put in IT team queue for resolution)
Document and make known where your documentation is for the application, system, and environment. If your IT team has to respond to outages they better know where to go and what to do.
Make an effort to understand WHY and WHAT areas of the tool/application are used for operations. Times change, new technology/tools come along, don’t repeat the same steps when new and improved actions can happen. Management input will be a huge portion of this discussion, they will need to drive the change and what will happen with legacy applications in your workplace!

You don't know what you got.... Unless you know what you got

1/1/2022

With all the Log4J madness happening over the last few weeks it got me thinking ..... (https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidancewww.cisa.gov/uscert/apache-log4j-vulnerability-guidance)

Don't call that number!

6/15/2021

Story of my moms computer with a popup add that is saying it is a virus/trojan.

Cyber Presentation

2/17/2020

Here is a presentation i did for the Girls that Code group at ODU last year:

https://docs.google.com/presentation/d/1VWj0SeMFmxwFr0zQq2RC1eazuw8AxNhk5ZSdyqE_X0M/edit?usp=sharing

SNORTing in Docker

6/14/2019

RFUN18

10/24/2018

RFUN is Recorded Futures yearly conference. It's a 2 day event. First day was multiple key notes and breakout sessions and second day was talk and training sessions for the tool. This year it was held at the InterContinental (The Wharf) in Washington, DC.

Day 1:
Welcome and Featured Speakers all morning.

Chrisopher Ahlberg (CEO/Co-Founder of Recorded Future - Opening remarks
Christopher provided some insights into where Recorded Future is going and the current state of the company. One word "GROWTH", the company has grown 90% since RFUN17! Congrats to the team at RF!
A couple of points from his talk:
* In the near future (prior to 2020) your company/business will be judged not just on your earnings but also your on-line/corporate risk reputation. The corporate risk surface will be made up of many factors(data points) gleamed from incidents, attack surface, company RELATION to other companies with incident issues, company RELATION to supply chain incidents, and overall on-line persona(rep).
** Recorded future is in the 'beta' stage for a new offering which will help companies understand their Corporate Risk Surface. By analyzing data available to the RF platform they can provide details of where/what/who/how things are affecting this overall rating or score.

Geoff Brown from NYC Cyber Command:
Geoff is the Chief Information Security Officer for the City of New York. he discussed the NYC Cyber Command and it's role in the overall security of the city. He discussed how the CITY is making a strong effort to keep citizens information PRIVATE.
NYC provides a free app for to citizens to help them know if their mobile device has a security issue (ie: malware, connecting to suspicious wifi, etc) https://secure.nyc/

The city is also trying to provide 100% free wifi to all citizens: https://www.link.nyc/ One major reason for providing this 'service' is to give users a secure wifi/network to connect and to reduce the number of wifi hotspot attacks.

The NYC Cyber Command has multiple roles:
* Education - Cyber education for citizens AND to bolster the Cyber workforce
* Incubation - Research and Innovation
* VC/City Funding into new Security and Safety Measures in the city.

Priscilla Moriuchi from Recorded Future :
Priscilla discussed the importance of Attribution.
Key takeaways, with Attribution you need to know:
* How
* Who
* What was hit
* Risk of future attacks

Operations/Companies cannot be happy with just getting threat/attackers off their networks after an incident. They need to understand how it happened and fix the root cause of the problem. To do this you need to understand the environment, assets, and have the ability to constantly improve your detection, defenses, response capabilities, and have an understanding of who/why someone would want to attack.

Threat Intelligence Awards:
Shout out to my colleague Danny Chong for his nomination!

Alexander Schlager from Verizon:
Discussed the role of Corporate risk and how very soon companies will be evaluated by their Corporate Risk Score along with other metrics used today in deals/purchasing/and every day business activities. He mentioned the importance of Sector Analysis and understanding that every Sector(Industry) will be affected by threats in different ways and in different attack vectors. Key takeaway here is Corporate Reputation scores will be influenced by Risk Scores and security incidents. Supply chain attacks CAN and WILL have an affect on your Corporate Reputation so you need to be aware of what your partners are doing (and aren't doing).

Mind Hunter:
Presentation about different Threat Actors. Great discussion but not allowed to post about it :)

Key takeaway: Don't use/resuse passwords or passwords across platforms/applications.

Splunk Smarter: Security Operations with Threat Intelligence:
Rich Dube from the Recorded Future delivery team presented on integrating Recorded Future Threat Intel into Splunk. Utilizing the watch lists and correlation rules in the Recorded Future Splunk app allows users to have the information needed for better decision making and alerting. Threat Intel ingestion keeps getting more efficient into the platform and data enrichment is a necessity when doing IR!

Day 2
Good and Bad, Indicators Beget Indicators - Why Not All Indicators are Good IOCs
Adrian Porcescu
Adrian Porcescu from the Recorded Future Professional Services team discussed the use of IOCs and how one size doesn't fits all. An organization needs to understand their environment and assets to be able to apply good Threat Intel. An IOC against a company in another industry might be more severe then what you have in your environment. You need to have the ability to adjust the risk score (or severity) associated to an IOC against the asset it is trying to attack/hit. Just like IDS rules, if you have a UNIX rule in your rule base with NO unix servers in your environment you will be flooded with false flags and missing some of the important alerts.

Adrian discussed the importance of chaining IOCs together for a better understanding of what is happening. One example used was hosts calling out to the DNS server 8.8.8.8. Which is most cases would be a LOW severity because it is a google dns server, but if you are paying attention to traffic before an after the call you might see some activity related to malware or some other threat. The 8.8.8.8 IOC on it's own might not be helpful, but together with more log sources and visibility it could be an indicator to something bad.

Organization 'context' is important:
* What do they do?
* What do they have?
* What are they running?
* Who do they work with?

The ability to categorize assets and understand the traffic patterns and behaviors in your environment will be the determining factor in stopping threats. One example he used was traffic seen in a client environment 'talking' with a vendor service which the client uses as part of their delivery. It was originally marked as a false alarm but upon further review and after categorizing assets in the environment they could determine the server 'calling' out to the vendor service was not part of the systems that 'should' be using the service. IR process was followed and the host was removed and examined off line.

Adrian hit on Discovery and Detection.

Discovery using IOC data/context against historical data
Detection using IOC data against real time/now data

Both methods are very effective but have totally different uses. Also that organizations should have some level of maturity models around these methods. Just like most things in life, how do you know where you are going if you don't have a map... You need to gather metrics around these activities and study if they are helping your defensive capabilities.

It was nice to see ARGUS in his presentation. If you don't know argus check it out: https://qosient.com/argus/ . It can be used for all sorts of captures/replays of pcap type data. In Adrians example he was passing a list of IOCs into a filter in argus to see if there had been any traffic in the capture (historical search). I've used ARGUS a ton in the past and will be starting a new project which includes argus in the near future. keep an eye here: https://github.com/mabartle/bloodhound

$ ra -nnnr argus.log.1.gz - ‘indicator’
- Argus - next-generation network flow technology, processing packets, either on the
wire or in captures, into advanced network flow data.
- nnn - lookup any address names
- r - read from file
- You can add the logic of you logrotate

Threat Intelligence with Automation and Orchestration -
Randy Conner
Randy presented on automating some of their security operations in Service Now with Recorded Future data. He hit on the high points of Automation and Orchestration and the time/resource savings that can be made. He provided some great examples of using Threat Intelligence with asset data to drive IR and weed out false positives. Further driving home one of the important takeaways from the conference that YOU NEED TO KNOW YOUR ENVIRONMENT. No level of threat intel will protect you from the 'bad guys' if you don't understand what is in your environment. Randy discussed some great use cases and showed examples of where automation has saved his organization a ton of time and effort. By utilizing their CMDB to cross-reference some threats (with CVE numbers) they can quickly identify where/what assets are affected to roll out patches and/or protections quickly.

A word of caution for anyone entering the 'world' of automation, it isn't just building a playbook and calling it a day. A lot like software development you need to define what(and why) you are developing a playbook, how it will be tested, how it will be rolled out into production, how(and who) this automation will effect, and how changes will be communicated to the organization overall.

Intelligence, Vulnerabilities, and Patching
Ryan Miller

Ryan presented some great information vulnerabilities and exploits. He showed time lines of some of the heavy exploits used last year and the time it took between the vulnerability and delivery method. In a lot of cases the vulnerability to exploit time might have been shorter than it took for the vendor to come out with a patch or workaround. Highlighting the need to have a good knowledge of your environment/assets and the related products/services/applications they use so you can KNOW when a new vulnerability is out and if/when it will affect you. He stressed the need for dedicated resources within organizations to do vulnerability research and keep up with trends in this area. He discussed using the Recorded Future platform to gain an insight into new vulnerabilities which are coming out which may not be talked about in the 'normal' channels (dark web). using this data for tracking/alerting, proactive analysis using mentions/notes from hackers on what exploits might be used, and using this data along with internal data to get a better understanding of trends in your company (like which Vulnerabilities were used the most vs what you have in your environment).
Take aways:
* every company should have a dedicated resources to investigate/research vulnerabilities and how it relates to the company environment (hosts, services, applications, etc)
* Vulnerability management in an organization is paramount, you need to know what has a patch and what is exploitable when the stuff hits the fan. You don't want to waste a ton of time on an exploit which will have little to no damage within your org.
* The '30 day standard' for patching is too long in most cases. You need to have a good patch process which takes into account the priority/severity of new vulnerabilities with what is in the environment.
* A majority of the time if you patch for the most wide spread vulnerabilities it will protect you from a wide range of attacks. Bad guys are using readily available exploits not 0 day type attacks.
* You need to have the internal ability to create your own detection methods against some of these new exploits. If you have an ear to the web (with tools like recorded future) you can create rules/alerts to trigger in case someone starts hitting you with an exploit against a vulnerability with no patch.

Until Next Year!
The Recorded Future team puts on a great conference. Every year the venue, events, and talks get better! It was great seeing everyone!

The Drain

9/21/2018

While trying to implement an Agile development environment with a new team a few years ago one of the developers said to me; "I don't like Agile, it makes the whole team mediocre. Your bad developers will bring the team down and your good developers will have to slow down to wait for the rest."

It's a comment that has stayed with me ever since.

Like many things in the Agile world it is easy to point at "Agile" and state that IT is the issue when the underlying issue has nothing to do with Agile practices/ways.

When applying Agile practices to a team/group/project it is important to remember that it isn't a cure all for any problem you have. I've seen Agile work great, taking a great team and improving their execution, communication, and collaboration. I've also see the 'other' side of Agile when a group/team of individuals are brought together for a project and nothing can help the outcome.

It takes MORE than just AGILE to have a high performing team. Agile helps with recommending different frameworks or methodologies but how they are implemented, adjusted, and stream lined is really up to the people that make up the team. If the team is not ready to ask the hard questions... I'm not talking about the hard questions related to a project, but the hard questions for people:

Do they have the drive to go above and beyond?
Do they execute on what they say they will do?
Are they constantly trying to learn and better themselves?
Are they trying to achieve goals? Both personal and business?

Agile will NOT solve the above issues! But as a manager/project lead/team lead you need to have a way of righting the ship when team members do NOT have the drive necessary to deliver. This comes in a couple different ways at different times during a project/employee cycle.

1. Hire the RIGHT people!
A line that is easier said then done. Where do you start? You can always start with the skills necessary to fill a role. Or you could look at how the current team interacts and ask what type of traits make up a good team member. At times there are soft skills that are more important than the technical skills. If i can train you up on what is needed and you have the drive and determination it might suit me well to do that vs hiring someone with that type of experience and being stuck with an under performing employee that will do the bare minimum and 'punch the clock' day in and day out.

Hiring the RIGHT developer can be tricky, you can't take a Entry level Developer and push them into a Sr Dev role, this is where the experience is necessary. You need someone in the Sr Role who can help mentor and motivate the younger developers on the team. Ideally the Sr Dev is a 'teacher' and will help the team in software architecture, design, and best practices necessary to build great applications.

But when hiring a less experienced developer if they have the skill set and know how in a specific programming language you will have flexibility to train them in another language. With the pace of technology today new languages, techniques, and frameworks are available to make the development process easier and provide an equal playing field when switching languages.

I do recommend having some level of code challenge/scenario the interviewee should run through during the interview process so you can verify they are able to do what they 'say' on their resume.

Lastly TAKE THE TIME to have all team members and folks associated with the project meet the candidate during the interview process. Everyone will have their own perception of the candidate and this is valuable feedback when determining if they are the right fit for the team AND team members will be more invested in new hires if they are part of the process and have a 'vote' in the decision.

It's a LOT easier to NOT hire someone vs dealing with PIPs and/or performance issues after they are on board. Look for the red flags in the interview process and if anything feels 'off' examine why.

2. The Legacy Developer
Teams are made up of all types of people and personalities. What do you do if you take over a team/project with members who have been involved for years and under performing?

First off don't make any changes right away! Learn how the team operates (or doesn't) See how folks interact, do they work together or in silos? Do they share information and techniques? Do they help each other? Or does that only happen when the 'manager' dictates for folks to help each other?

Talk to the team members. Start up a 1v1 at least once a month to discuss the goals and aspirations of the team members. Great book on 1v1 conversations: Behind Closed Doors: Secrets of Great Management

Sometimes employees are NOT doing what they want in their work/career. Having these conversations are hard but needed. Knowing 'what' an employee would like to be working on can help YOU put them in the right place when the opportunity arrives.

If the developer is not producing have the talk, be honest and let them know what you expect.

If the developer does not start producing after a few of these talks then it is time for a Performance Plan. Don't wait on this step! It sucks having to put anyone on a PIP but what sucks more??? Months of wasted time and effort on a project!

3. Set expectations
Set clear expectations! Ideally in an Agile 'team' the team members will be defining what they expect from themselves and others on the team. Holding each other accountable when expectations are NOT met and offering helping when needed. Different teams handle this in different ways. On some of the Scrum teams i have been a part of this happens in different areas:
1. Sprint Planning - Setting the Sprint Goal. This is an expectation for the iteration. The Team is 'committing' to the goal so a precedent is set that this will be achieved.
2. Day to Day Process - Software Development has many steps and teams handle these steps in different variations. For example; many teams have some form of code review before code is merged into 'master'/the product.

The team defines what needs to happen to 'get' to a Code Review
- Passing Build
- Definition of code change
- Testing steps
What is expected by the submitter
- Best Practices are met: code style, formatting, functionality, etc
- Answering questions from team
- Presenting to the team (if doing group code review/learning session)
What is expected by the reviewer
- Pulling branch down and checking locally
- Commenting and questioning
Overall expectations that need to be met BEFORE the code merge
- Multiple approvers

Here is a great article (which leads to other articles) on ways to set expectations and have these discussions: letsgrowleaders.com/2018/08/21/how-to-motivate-your-team-stop-treating-them-like-family/

4. Provide a way to improve/learn

As part of the feedback and expectations you will need to provide a way for them to improve. Point them in the right direction. Could be:

google :)
An on-line learning site like Pluralsight or Udemy .
Your companies internal Learning system
A colleague/mentor with experience/knowledge in the area
A book

5. Provide the Vision

There are times in business where the day to day gets a little ho-hum and the excitement might not be there. YOU need to provide the team a vision of where you will be 'in the future'. Could be a vision for next month, 6 months, or a year. People do better when they know where they are headed. This might be a difficult task if vision and priorities are shifting within the company. If so, focus on the SHORT AND OBTAINABLE opportunities which the team can do! Things that will allow the team to brush up on existing technologies, learn new technologies, update those build/deployment plans which haven't been touched in a year. Small victories but victories none the less!

Dev is hard

8/31/2018

Software development work is hard. No way around it, it is one of the hardest things i have ever done. It's not so much the coding/programming work but the complex coding, testing, building, deploying.... and most importantly communication AND collaboration. New requirements for an application come in and from the drafting of the requirement document to the deployment of the feature it is like a game of telephone.

Every step of the software development process is crucial in feature delivery. If one step is not done or done incorrectly it will impact the outcome.

Requirement gathering.. I was going to start with this one but there is actually one step you need to complete before you get to this step.

Feedback gathering and documentation
There needs to be some understanding of HOW this will be done. A process if you will. Sometimes folks get all bent out of shape when a 'process' is mentioned but without process the team cannot be in sync. Everyone cannot use their own steps when trying to work together. This process would define who, what, where, and how requirements and documentation will happen:
Who:

Who will be collecting the feedback/requirements/needs?
Who will be providing the feedback/requirements/needs?
Who will be present for these discussions? Project Manager, Stake holder, Lead/Sr Developer, Users

What:

What will they capture?
What will be discussed?

Where:

Where will the notes/discussions/documents be stored? (Central Documentation repository/Idea Central):
Where will the discussions take place? Face to face, Virtual environment/Webex, Phone, ??

How:

How will the requirements get captured? Overall Goal of tool/application, High Level ideas, detailed requirement documents, word docs vs confluence pages
How will the PO and Team breakdown the requirements into working issues?

Requirement breakdown and Prioritization
Each requirement needs to be discussed, planned, and brain stormed, by the project manager and tech leads. Thinking about the design, implementation, deployment, and training of new features is critical here. Spend the time and document what it will look like, how it will operate, how the user will interact with it, etc. Visuals are important here; mockups, data flows, system diagrams, etc. Everyone will have a picture of what it will do and/or look like in their mind you NEED visuals so everyone can see what the collective sees. A lot of time and effort will be wasted with assumptions; product owners assuming the development team understands the requirement can lead to problems later. Along that line, if there is too much time between the requirement discussion and the actual work it will take the team time to understand the requirement all over again and where/how it fits into the application they are building. Just like multitasking it takes time to shift from one thing to another.

Breakdown the requirements into small issues to ensure the team does not overextend themselves when planning an iteration. What is 'small'? That is a whole other blog :) The team should be providing the level of effort on the issues so inexperienced/non-technical folks are not providing the wrong expectations on the work. The team doing the work should be estimating the work. Story points vs hour estimates.. Doesn't really matter what matter is the team has decided on the scale that will be used and what it means to them. And you have to be consistent with this scale from sprint to sprint and while doing estimations.

The "real" work
This section is what most folks think of when they think about a software development team. A group of developers that are skilled in writing code in a language or multiple languages and producing a tool/application. But it's more than that, much more. A typical software team will have a list of issues/stories/tasks prioritized and ready for work. Developers will take a task/issue/story to work and go 'head down' into coding the feature/bugfix. Along with the coding the dev should be creating new tests for new code and update existing tests for any code updates. So for any piece of work started there really is a lot of other work already attached to it; code, test, build, code review, merge(version build), deploy. Before any line of code is started the team needs to understand the work! Don't skimp on the Sprint Planning (if doing Scrum) the more discussion in the planning session the less questions should arise during the work cycle. If you have a question about the work ASK NOW there really is no dumb question (as they say) and others might have the same question or worse yet, no one has thought about your specific question and you just saved yourself and the team a lot of heartache later. Read the issues all the way through, sometimes teams get so used to pulling in work into the sprint during planning that they don't go through and review the issue, description, overall goal of work, and the acceptance criteria. Development work is a discipline once you start getting lazy, issues will start arising in other areas of your work. Cut corners and you will feel the pain later. My advise, take the time up front because once you are working on the code you have set the expectation that you(and the team) can accomplish it, if you can't deliver that code on time and in the amount of effort discussed it impacts the whole process.

Deploy
After the code has been written, tested (both with automated and manual testing), reviewed, merged, and version tested... it is time for a production deployment! My biggest advice in this section is if there is anything manual you are doing during your deployment take the TIME AND AUTOMATE. There are CI/CD servers for a reason and it is to make the building and deployment of code consistent. You might be thinking.. O the 2 minutes I take every deployment to run a DB backup script or do the VM snapshot doesn't take long.. It doesn't, but it does take time AND God forbid you forget these steps when deploying. Build them into your deployment plan, if you think they are too complicated to script discuss it with the team. Take those ideas and break them down into their own game-plan on how you can make small changes to improve the overall deployment process. (2 minutes x 52 releases/year = 104 minutes which doesn't sound like much but a lot of shops these days are doing multiple deployments a DAY!)

Another part of the deployment plan should include deployment verification. How is your team verifying the deployment was successful (above the CI/CD saying it was)? Are you doing health monitoring on the server? Are you running a list of tests against the production server to verify API and basic application functionality post deployment?

Feedback!
The job is NOT over after the deployment has been made. There needs to be interaction with the user base to see how the new features have been improving their work life. Did it save them time? Did the feature get delivered the way the user group expected? Or are there a lot of 'this is great but..' conversations happening after deployment. This could point to multiple problems in the overall process. The team needs to identify where the disconnect has happened and fix the problem.

Was it an issue with requirement documentation? not enough details? Too many details? missing mockups so the user group didn't understand the way a new feature would 'look' before it was delivered?
Was it an issue during development? Did the developer not understand the requirement?

My advice here for feedback is to bake in as many ways of getting feedback as possible. If using metrics generated by your application is possible use it! Put in counters and controls to tell you what pieces of the application are being used and which ones are sitting collecting dust after a release. If a user base is not using new features after a deployment it could be a matter of the prioritization of the requirements is wrong.. You are delivering things that aren't crucial to the user base. or worse yet, THEY DON"T CARE and you are working on a tool/application that would not be missed if the lights were turned off. The only way to know any of these things is to keep that open line of communication between all the teams, personnel, management, and users. (I also wrote a blog about picking a communication medium everyone can agree on.)