What is Good Enough?
Good enough. That catch-phrase has been around since teams first started taking their code into production and wondering whether it was good enough to release. It's a complex decision that, to this day, unfortunately has no definitive answer. It's something I grapple with regularly as I work with the various teams I support to determine when we are ready and satisfied to move code from test into production.
What makes this a tough challenge is the need to balance high quality with delivery. If we truly wanted to run through every possible test and code scenario for a complex system of over 100 microservices, serving several hundred thousand customers without any defects, we would probably release software only once a year, and even then only a very small change. Even with the majority of our testing automated and running in CI pipelines, there is the risk of missing something, so teams still need to make judgement calls on their testing every single day. After all, this is 2019 and we want to push enhancements through every day if we can.
So, having multiple teams with the ability and accountability to push code into production regularly requires some level of oversight to measure and determine when code is good to go. And while I think the freedom to decide this should ultimately rest with the team, guided by the simple question "am I happy for customers to use this?", I do think teams need to make educated decisions based on data rather than emotion. After all, if your team is filled with an equal mix of optimists and pessimists, as many of the teams I have been on are, the chances are you will never come to an agreement on this.
The guidelines below, while not exhaustive, are some of the important factors that have guided me in determining exactly when a piece of software is good enough to roll out to production.
Set a Coverage Goal:
Essentially, any code a team works on should have a predetermined coverage goal that must be met before you would even consider it ready for production. The coverage I am speaking of here is what can be measured through code coverage tools in your pipeline across your unit and component tests (including both decision and full line coverage), as well as the known functional coverage of your integration and end-to-end tests. I say "known" because it's difficult to know every possible scenario, and you can't really measure or test what you don't know.
Ideally, you want all known tests, along with your code coverage, to be as close to 100% as possible, with all tests passing. I understand that 100% is not always achievable given the complexities of certain code bases, but as long as the team has a predetermined goal that must be met, and specific critical functional areas that need to be hit, you have a data point to work with in determining whether something should be released to production.
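To make this concrete, here is a minimal sketch of such a coverage gate. The area names and threshold values are illustrative assumptions, not from any specific tool; in practice the measured numbers would come from your pipeline's coverage reports.

```python
# Hypothetical coverage gate: area names and goals are illustrative,
# not taken from any particular coverage tool.
def coverage_gate(measured: dict, required: dict) -> list:
    """Return the coverage areas that fall short of their predetermined
    goal; an empty list means the gate passes."""
    return [
        area
        for area, goal in required.items()
        if measured.get(area, 0.0) < goal
    ]

# Example: line and decision coverage from unit/component tests,
# plus known functional coverage from integration/end-to-end tests.
required = {"line": 90.0, "decision": 85.0, "functional": 95.0}
measured = {"line": 92.3, "decision": 81.0, "functional": 97.5}

failures = coverage_gate(measured, required)
if failures:
    print(f"Not ready for production: {failures} below goal")
```

The point of the gate is that the pass/fail decision becomes a data point agreed in advance, rather than a debate at release time.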
Decide on Allowable Defects:
It's difficult to prevent or catch every defect, even during development, and inevitably, especially on big pieces of functionality, some minor issues are bound to arise during the development cycle. While it should be clear that anything critical or major should not be allowed into production, deciding whether small issues should be fixed first or allowed through is more complicated. On the one hand, minor defects by nature don't affect core functionality, so any inconvenience to the end user should be cosmetic or edge-case related at worst. On the other hand, too many minor defects in one release makes the experience look unprofessional to the end user and creates an unnecessary amount of tech debt for the team.
Ideally, teams should set a predetermined measure of what they will allow into production, covering severity, likelihood of failure and, importantly, number, which I describe below:
Severity – What is the impact on the customer should the defect occur? If usage, security or any form of money is compromised as a result, it's a straight-up no-go for release. If it's cosmetic or performance related, it can be lived with.
Likelihood – How often is the defect likely to occur? Is it something that will surface every time for every customer, or only for a few customers or in extreme situations?
Number – The total number of known defects you will allow into production for fixing at a later time. All of these should fall into the lower rankings of the two criteria above, and they should not be too numerous either. A general guideline I follow is that this number should not exceed 1% of total development hours. This assumes you are tracking these values, of course; if not, come up with a measure that can be effectively tracked in your team and suits the requirements of your product. As someone in the financial services industry, I tend to err on the side of few to no defects, but some products or applications can get away with more.
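The three criteria above can be sketched as a simple release check. The severity and likelihood labels, and the 1%-of-development-hours cap, are illustrative choices from the guideline above, not a standard; adapt them to your own product.

```python
# Hedged sketch of the severity / likelihood / number criteria.
# Labels and the 1%-of-dev-hours cap are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Defect:
    severity: str    # "critical", "major", "minor" or "cosmetic"
    likelihood: str  # "every_time", "some_customers" or "edge_case"

def release_allowed(defects: list, dev_hours: float) -> bool:
    # Anything critical or major is a straight no-go.
    if any(d.severity in ("critical", "major") for d in defects):
        return False
    # Defects that surface every time should be fixed first.
    if any(d.likelihood == "every_time" for d in defects):
        return False
    # The number of known defects carried into production should
    # not exceed 1% of total development hours.
    return len(defects) <= dev_hours * 0.01
```

For a 500-hour piece of work, this would allow at most five minor, low-likelihood defects through to production.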
Manage the Tech Debt:
As above, it's not just defects that create tech debt, but other decisions too. Don't introduce too much tech debt in what you're releasing, and put rules in place to prevent this from happening. This is something you can read more on here.
Monitoring:
Critically, it's not just about the quality of the code going into production, but also how well you are able to monitor it once it's there. Clear guidelines should exist on what needs to be monitored, which teams should adhere to and use in determining whether something is ready for release. I won't go into detail here, as it's a topic on its own, but you can read some information on monitoring in the following article.
What is critical about monitoring, though, is not just the ability to see what is going on with the product, but also that it alerts the team when something major happens or when performance or load gets high, so that the team can proactively support the product. Don't wait for something to break before responding as a team; put mechanisms in place to be alerted before something goes wrong, or as close to the event as possible.
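A proactive alert check might look like the following sketch. The metric names and thresholds are assumptions for illustration; in a real system this comparison would typically live in your monitoring stack rather than application code.

```python
# Illustrative alert check: metric names and thresholds are assumptions.
def check_alerts(metrics: dict, thresholds: dict) -> list:
    """Compare live metrics against alert thresholds so the team is
    notified before (or as close as possible to) a failure."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value >= limit:
            alerts.append(f"{name} at {value} (limit {limit})")
    return alerts

thresholds = {"error_rate_pct": 1.0, "p95_latency_ms": 800, "cpu_pct": 85}
metrics = {"error_rate_pct": 0.2, "p95_latency_ms": 950, "cpu_pct": 60}
for alert in check_alerts(metrics, thresholds):
    print("ALERT:", alert)
```

Setting the limits below the point of actual failure is what makes the alerting proactive rather than reactive.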
Load and Performance Testing:
Another thing that is easily forgotten. Don't put code into production until you are certain it won't adversely affect performance. Performance and load testing also shouldn't be left until late in the development cycle; it should form part of your continuous development and be run on as regular a basis as possible. What level of performance is suitable, and how much any release may deviate from your proven benchmarks, is something your team should decide.
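A benchmark-deviation check can make that decision explicit. The 10% allowance below is an illustrative team choice, not a standard, and the latency numbers are made up for the example.

```python
# Sketch of a benchmark-deviation gate; the 10% default allowance
# is an illustrative team choice, not a standard.
def within_benchmark(measured_ms: float, benchmark_ms: float,
                     allowed_deviation: float = 0.10) -> bool:
    """True if the measured latency is within the allowed fractional
    deviation above the proven benchmark."""
    return measured_ms <= benchmark_ms * (1 + allowed_deviation)

print(within_benchmark(430.0, 400.0))  # 7.5% slower than benchmark
print(within_benchmark(480.0, 400.0))  # 20% slower than benchmark
```

Running this against every release, rather than only before major ones, is what turns performance testing into part of continuous development.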
The Importance of Data Driven Decision Making
The critical aspect to all of this, though, is providing your team with the data to make these critical decisions. People shouldn't base them on gut feel or confidence levels, but on the raw data in front of them. Will data lead to perfect decisions all the time? No, especially if the data itself is incorrect. But it will at least legitimise your decisions and provide further data points to learn from and to de-risk these decisions in future. It's about getting better.
Deploying code into production shouldn't be a hair-raising decision. It should be a calculated one, where the decision-making variables are clearly laid out before you need to make the call. So take the stress out of big deployments by managing your risk up front and having clear guidelines on whether something really is good enough.