Evolving Beyond the Code Coverage Percentage Debate: Part 2 of 2

In my previous post, Evolving Beyond the Code Coverage Percentage Debate, Part 1 of 2, I recounted the “hypothetical” but all too familiar horrors of the great code coverage debate. I proposed that, despite our best intentions in carrying on that debate, it has left us unfortunately fixated on code coverage instead of the true goal: writing quality tests. That led me to surmise that the code coverage percentage matters far less than test quality. I concluded that this realization left me unsatisfied, and that it prompted me to evolve the practices accumulated over my experience and pursue something called “code coverage accountability”.

So what exactly is “code coverage accountability”? I suppose it could be described in any number of ways, none of which matters much to me at all (including what one chooses to call it) so long as it encompasses the following two core principles.

  1. Deliberately decide for each bit of code if it should be tested and how
  2. Ensure that for each bit of code the test decision is upheld

Now if that sounds a bit too ideological for you, that’s good. As stated it is essentially only an ideology. By itself it isn’t particularly helpful until paired with practical application. So how exactly is this helpful and how can it be applied?

Most importantly, the first principle frees us from assuming that all lines of code, or any specific percentage of lines, must be tested. This is significant for a couple of reasons. First, knowing that not all lines of code must be tested encourages developers to consider which lines are most worth testing. Second, developers are encouraged to start discussions about difficult-to-test code, instead of plunging down the 100% code coverage rabbit hole or seeking out the path of least resistance to some arbitrary (e.g. 80%) code coverage goal.

Nearly as important, the second principle requires a system of accountability (hopefully one which is automated) that ensures every line is accounted for, either by being tested or by being granted an exclusion from testing. This is significant, perhaps for obvious reasons, because it is the mechanism which ensures the first principle is respected. In other words, if the first principle of code coverage accountability were considered law, the second principle would be its enforcement.
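To make the second principle a little more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not the system my teams use: the function names and the representation of lines as sets of line numbers are assumptions made for the example. A line is "accounted for" when it is either covered by a test or explicitly excluded, and anything left over is a decision nobody has made yet.

```python
def unaccounted_lines(executable, covered, excluded):
    """Return executable lines that are neither tested nor explicitly excluded.

    All three arguments are collections of line numbers (hypothetical inputs;
    a real coverage tool would supply them).
    """
    return sorted(set(executable) - set(covered) - set(excluded))


def is_accountable(executable, covered, excluded):
    """True when a test-or-exclude decision exists for every executable line."""
    return not unaccounted_lines(executable, covered, excluded)


# Example: line 4 has no test and no exclusion, so it still needs a decision.
print(unaccounted_lines({1, 2, 3, 4, 5}, covered={1, 2, 3}, excluded={5}))
```

Note that the percentage of covered lines never appears here. The only question asked is whether any line has been left without a decision.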

While this helps explain how the principles are helpful, if you’re like me, you find examples far more useful. To that end I’ll share my recent experience in applying these principles. As a disclaimer, I’d like to point out that while I hope sharing this experience is helpful, I do not present it as the only method, nor as the best method, but simply as a method. For whatever portions of this experience prove useful (if any), I would wholly expect any other application to be adapted to suit its own context.

With that out of the way, here is how I’ve been applying these principles with my teams. First, and perhaps most strange given what I’ve advocated thus far, we use a balance of 100% code coverage and coverage exclusions. Let me explain. Requiring 100% code coverage is what allows us to ensure we make a coverage decision for every bit of code. By default, we assume we want to test everything. However, because we know not all code coverage is equal, let alone valuable, we also assume we don’t have to test everything. This is where the coverage exclusions apply. As a part of deliberately deciding whether or not to test each bit of code, developers are encouraged to use code coverage exclusions to ignore appropriate portions of code where testing is too difficult or not beneficial. Of course, the obvious flaw with this approach is that these determinations can often be quite subjective. This leads us to the next part of the application, code reviews.
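As an illustration of what an exclusion can look like, here is a small Python sketch. The choice of Python and coverage.py is my assumption for the example, as are the function names; the same idea exists in most coverage tools under different syntax. In coverage.py the default exclusion marker is the comment `# pragma: no cover`, and placing it on a `def` line excludes the whole function. The 100% requirement itself can then be enforced with coverage.py's `fail_under` setting.

```python
def parse_port(value):
    """Convert a port string to an int, rejecting values outside 1-65535."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def main(argv):  # pragma: no cover
    # Thin command-line wrapper, deliberately excluded from coverage:
    # the logic worth testing lives in parse_port (hypothetical example).
    print(parse_port(argv[1]))
```

The exclusion is visible in the source, right next to the code it applies to, which is what makes it reviewable. A reviewer can see the decision and disagree with it.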

The second line of practical defense we use is tried and true code reviews. Specifically, we use a workflow which requires all changes to be reviewed and approved by a subset of the team members that includes, at minimum, one senior-level engineer. This ensures that other team members have ample opportunity to review the code, the tests and the level of quality they achieve. It also distributes the burden of responsibility across the team. If a critical test is later found to be missing, flawed or otherwise deficient, the responsibility is owned by the team and not by a single team member.

The final ingredient, which will be obvious to some and is quite critical (at least in my team’s context), is automation. The application of these steps is automated in a workflow that holds our team accountable by doing the one necessary thing at which technology excels and we humans are mostly inferior: disciplined and consistent accountability.

This system includes the ability, when desired, for developers to reproduce builds in their own environment exactly as they are performed on our build server, including unit test execution and code coverage evaluation. This allows developers to more easily achieve build server compatibility prior to code review. If a developer forgets or makes a mistake, the error is caught by the build server and the system prevents the code from being merged into the primary code base.

Further, the system enforces a required subset of reviewers, which helps to mitigate disagreements over the subjective determinations mentioned earlier. If a developer is behind schedule or is lazy and attempts to exclude a large portion of testable code, the exclusion should be caught by a reviewer and the situation can be collectively addressed by the team.
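Putting the automated pieces together, a merge gate along these lines might be sketched as follows. This is a hypothetical illustration rather than our actual tooling: the `Approval` type, the `merge_blockers` function and the default of two approvals are all assumptions made for the example. The gate only checks the things technology can check, namely that no line is unaccounted for and that the required reviewers, including a senior engineer, have approved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    reviewer: str
    is_senior: bool


def merge_blockers(unaccounted_line_count, approvals, min_approvals=2):
    """List the reasons a change may not be merged; an empty list means it may.

    min_approvals=2 is an assumed value for illustration only.
    """
    blockers = []
    if unaccounted_line_count > 0:
        blockers.append(
            f"{unaccounted_line_count} line(s) neither tested nor excluded"
        )
    if len(approvals) < min_approvals:
        blockers.append(f"needs at least {min_approvals} approvals")
    if not any(a.is_senior for a in approvals):
        blockers.append("needs approval from a senior engineer")
    return blockers
```

Whether a given exclusion is reasonable is deliberately absent from this check. That judgment stays with the human reviewers.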

Stepping back to look at the big picture, critics will undoubtedly be able to point out imperfections. On one hand this is welcome in the interest of continual improvement. On the other hand, it’s somewhat irrelevant, as the goal isn’t to create a perfect system but rather a practical one. The end result we’ve attempted to achieve is a system of checks and balances. Checks, such as code review confirmations and 100% code coverage, can be easily evaluated and enforced by technology. Balances, such as code review discussion and code coverage exclusions, allow common sense to rule.

Our system isn’t perfect and doesn’t need to be. Most importantly, it’s accomplishing its goal in evolving our methods, turning our focus away from arbitrary debate and back to writing quality tests and ultimately better quality software. This too will undoubtedly require further evolution. Nonetheless, I share it now in the hopes that it might help shape the next steps of others who share this same journey to escape the great code coverage debate. Godspeed.

About Aaron Osterwyk
