Software is a critical component of modern life. It allows us to drive cars, fly planes, launch rockets into space, do our banking online, and have a heartwarming video call with a loved one when we’re not able to be with them.
Given such significance, software must perform as expected, and if it fails, it should do so gracefully and predictably. A key method for creating software that does this is to ensure that it’s well tested, with modern test-driven development practices such as writing unit, functional, system, and integration tests.
However, how do you know if you’ve tested your code sufficiently? There are many approaches, but for this article, I’ll focus on code coverage. If you’re not too familiar with the term, code coverage:
“…provides a visual measurement of what source code is being executed by a test suite. This information indicates to the software developer where they should write new tests in an effort to achieve higher coverage. Testing source code helps to prevent bugs and syntax errors by executing each line with a known variable and cross-checking it with an expected output.”
Here’s an example of a code coverage report that PHPUnit generated. You can see that the level of code coverage is displayed both as a percentage and using a color legend.
Dashboard views, such as this one, make it relatively easy to determine the coverage level for a codebase as a whole, as well as for specific aspects, such as an individual line, function, statement, branch, or condition.
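To make the difference between these levels of detail concrete, here is a minimal Python sketch. The function `apply_discount` and its test are hypothetical, invented purely for illustration. The single test executes every line of the function, so a line-coverage report would show 100 percent, yet the path where the `if` condition is false is never taken, so a branch-coverage report would still show a gap.

```python
def apply_discount(price_cents: int, is_member: bool) -> int:
    """Hypothetical example: members get 10 percent off."""
    if is_member:
        price_cents -= price_cents // 10
    return price_cents


def test_member_discount() -> None:
    # Executes every line of apply_discount (full line coverage),
    # but never exercises the path where is_member is False.
    assert apply_discount(1000, True) == 900
```

Assuming you measure with coverage.py, branch measurement is turned on with the `--branch` option of `coverage run`; other tools expose similar switches under different names.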
That said, it’s important not only to implement code coverage but also to do so in a way that suits your organization. In this post, you’ll learn about eight factors that can influence how you implement code coverage in your organization or project.
Code Coverage Goals
What are you aiming to achieve by using code coverage? Specifically, what are your goals?
- Do you want to improve the quality of an old, legacy codebase?
- Do you want to replace manual testing, which can be inconsistent, time-consuming, and expensive, with cheaper, repeatable, automated tests?
- Do you want to reduce developers’ reluctance to change existing code and decrease the time required to implement new features?
- Do you want to improve the testing proficiency of your development team?
Whether it’s one or more of the above goals or something else entirely, be clear as to what those goals are. As much as you’re able, make informed choices, as different goals have different needs and priorities.
For example, if you’re starting with an established codebase with very little coverage, it’s unrealistic to shoot for 100% coverage in 6 months. More likely, you’ll want to aim to cover specific, high-risk portions of your application first and gradually increase coverage over time.
Programming Languages and Available Libraries
Which programming languages are used in your organization? Most software projects have a mix of at least two languages: JavaScript or TypeScript on the frontend and Python, Ruby, Java, or Golang on the backend. Depending on the age and diversity of the project, you may have pieces of the codebase written in several languages or even different versions of those languages.
That said, not all languages and frameworks make it easy to build solid code coverage. Before settling on a desired code coverage level, review the level of functionality in the testing frameworks available for the languages your project uses. Can they report to the level of depth and complexity that you require?
If you’re using a beta version of a language or framework, you might have to write your own test framework, whereas most well-established languages, such as Java, have many testing options. Even with good tooling, though, old code written in a procedural style can be very hard to test if it wasn’t well designed from the start.
Industry Regulations
In addition to setting a desired level of coverage, depending on your industry, you may be legally obligated to meet a minimum benchmark. The aviation, transportation, and electrical safety industries are all especially regulated in this regard. Here are three examples of national and international quality standards, collated by bullseye.com:
DO-178B
The aviation standard DO-178B requires 100 percent code coverage for safety-critical systems. This standard specifies progressively more sensitive code coverage metrics for more critical systems.
IEC 61508
The standard IEC 61508:2010 “Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems” recommends 100 percent code coverage of several metrics, but the strenuousness of the recommendation relates to the criticality.
ISO 26262
ISO 26262-1:2018 “Road vehicles—Functional safety” specifies that if the code coverage achieved “is considered insufficient,” then a rationale must be provided. The standard recommends different coverage metrics for unit testing than for integration testing. In both cases, the strenuousness of the recommendations relates to the criticality.
Organizational Maturity
Does your development team have the professional maturity and experience required to achieve your code-quality objectives? And I’m referring not only to the hands-on developers, but also to the team leads. Everyone is essential.
But focusing on the developers for a moment, be sure to ask where they’re at on their software development journey. How much practical experience do they have with testing, and with code coverage specifically? It’s important to have the right people guiding your developers and helping them get used to using code coverage as part of your standard software development workflow. It’s just as critical that team leads can communicate the importance of coverage to the rest of the business.
A blog article called “How much test coverage do you need? The Testivus Answer” bears this out nicely.
Team Size and Maturity
Your desire to meet or exceed a given level of coverage needs to be balanced with your team’s ability to meet it. For example, if you have a small team of mid-level developers responsible for many codebases without much test coverage, it’s unreasonable to expect a rapid increase in code coverage. On the other hand, if you have a larger, more experienced team that’s responsible for the same number of or fewer codebases, then the same goal is achievable in a shorter time frame.
Obviously, the level of code coverage you expect for your organization needs to be realistic. How can you balance the number of developers you have, their respective skill levels, the demands upon your codebase(s), and the state that those codebases are currently in with your desired coverage levels and target dates?
Instead of aiming to achieve a specific coverage level by a specific date, consider aiming for small, incremental improvements over a longer period, until the desired level is achieved.
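One way to picture this incremental approach is a "ratchet": each build must not fall meaningfully below the best coverage recorded so far, and the recorded baseline only ever moves up. The sketch below is an assumption about how such a check could look, not a feature of any particular tool; the function names and the 0.5-point tolerance are hypothetical.

```python
def coverage_gate(baseline: float, current: float, tolerance: float = 0.5) -> bool:
    """Pass if coverage has not fallen more than `tolerance`
    percentage points below the recorded baseline."""
    return current >= baseline - tolerance


def next_baseline(baseline: float, current: float) -> float:
    """Ratchet step: the baseline rises with improvements and never drops."""
    return max(baseline, current)


# Hypothetical history of measured coverage percentages, build by build.
baseline = 42.0
for measured in (43.0, 42.8, 40.0):
    status = "pass" if coverage_gate(baseline, measured) else "fail"
    print(f"measured={measured} baseline={baseline} -> {status}")
    baseline = next_baseline(baseline, measured)
```

The tolerance keeps small, harmless fluctuations from failing a build, while a genuine drop (the last value in the example) is still caught.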
Test Quality
There are two flawed beliefs around code coverage:
- Higher coverage automatically leads to a better codebase
- 100 percent coverage means code with zero bugs
It’s easy to write tests until 100 percent coverage is achieved. According to Martin Fowler:
“Test coverage is of little use as a numeric statement of how good your tests are.”
A test suite with 100 percent coverage may contain only superficial tests, ones that don’t cover many—if any—notable edge cases. So rather than aim for an arbitrarily high coverage percentage, use code coverage to determine what still needs to be tested, and review the quality of your existing tests.
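A small Python sketch shows how this can happen. The `average` function and both helpers are hypothetical examples. The one test below executes every line of `average`, so a coverage report would show 100 percent, yet the function still fails on an edge case that the test never tries.

```python
from typing import Sequence


def average(values: Sequence[float]) -> float:
    """Hypothetical function under test."""
    return sum(values) / len(values)


def test_average_superficial() -> None:
    # Runs every line of average(), so coverage reads 100 percent.
    assert average([2, 4]) == 3


def probe_empty_input() -> str:
    # The edge case that full coverage did not reveal:
    # an empty sequence makes len(values) zero.
    try:
        average([])
    except ZeroDivisionError:
        return "crashes on empty input"
    return "handles empty input"
```

The coverage figure cannot distinguish between the superficial test and a thorough one; only reviewing the tests themselves does.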
If you’re looking for a rough figure to aim for, a broad consensus, backed by empirical research, is 70 to 80 percent.
Types of Tests You’re Using
Many types of software testing are available, such as unit testing for methods and classes, and end-to-end testing for replicating user behavior in a typical environment. However, each type was designed around different goals, tests software in a different way, and affords a different level of control.
For example, unit tests are often quick and cheap to write; they test code at a very low level and, through tools such as mocking facilities, faked data, and debuggers, provide a high level of control over the testing environment.
Contrast this with end-to-end testing, which tests an entire flow of execution through an application. End-to-end tests can be expensive, as the environment may need to be put in a given state before the test suite can run. They may also require external services such as logging, email, and notification servers to be prepared. Moreover, they can be complex because of the sophistication of the user flow they’re testing.
Given that, it’s important to weigh the practicality of testing against aspects of your application. For example, in a previous role, I was developing e-commerce shops using a widely available e-commerce framework. While the framework came with a test suite, no one in the company I worked at could run it: PHP always exhausted the available memory before the suite could complete, even though we kept giving the PHP runtime more memory. Consequently, no one used the tests, simply because running them was impractical.
When you’re weighing code coverage levels for the different test types you use, here is some sage advice from bullseye.com:
“It makes sense to set progressively lower goals for unit testing, integration testing, and system testing. For example, 90 percent during unit testing, 80 percent during integration testing, and 70 percent during system testing.”
Be realistic about what a given type of testing is for, what it can achieve, and what it takes to run.
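If you adopt per-test-type goals like those quoted above, a check against them can be very small. The following Python sketch uses the 90/80/70 figures from the quote; the function name, the dictionary layout, and the measured values are hypothetical and would need to be adapted to whatever your coverage tool reports.

```python
from typing import Dict, Mapping

# Goals taken from the example figures quoted above (percent).
COVERAGE_GOALS: Dict[str, float] = {
    "unit": 90.0,
    "integration": 80.0,
    "system": 70.0,
}


def unmet_goals(
    measured: Mapping[str, float],
    goals: Mapping[str, float] = COVERAGE_GOALS,
) -> Dict[str, float]:
    """Return the shortfall, in percentage points, for each test type
    whose measured coverage is below its goal. Missing types count as 0."""
    shortfalls = {}
    for test_type, goal in goals.items():
        actual = measured.get(test_type, 0.0)
        if actual < goal:
            shortfalls[test_type] = goal - actual
    return shortfalls


# Hypothetical measurements for one project.
print(unmet_goals({"unit": 92.0, "integration": 75.0, "system": 70.0}))
```

Reporting the shortfall rather than a bare pass or fail makes it easier to see which kind of testing needs attention first.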
What Is Your Approach to Writing Code?
The final point to consider about determining code coverage is how your code is written. Specifically, when your developers are writing code, do they follow the three rules of Test-Driven Development (TDD):
- You are not allowed to write any production code unless it is to make a failing unit test pass.
- You are not allowed to write any more of a unit test than is sufficient to fail, and compilation failures are failures.
- You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
If code is written following these rules, then that code will always have a high, if not a complete, level of test coverage. If, however, tests are written after the fact, then code coverage levels will tend to vary.
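As a rough illustration of one pass through those rules, consider the Python sketch below. The `slugify` function and its test are hypothetical. The test is written first and fails while `slugify` does not exist; the production code that follows is only what that one test demands, which is why every line of it ends up covered.

```python
import unittest


# Step 1 (red): written first. Running it before slugify exists
# fails with a NameError, which counts as a failing test.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_hyphenates(self) -> None:
        self.assertEqual(slugify("Code Coverage"), "code-coverage")


# Step 2 (green): just enough production code to pass the one failing test.
def slugify(title: str) -> str:
    return title.lower().replace(" ", "-")
```

Saved as a module, this can be run with `python -m unittest`. Handling punctuation or repeated spaces would start with another failing test, not with more production code.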
While not all organizations develop using TDD, and some are reluctant to do so, the more an organization follows TDD, the higher the level of code coverage will be.
Conclusion
Different types of software need different levels of coverage, and different approaches to testing produce different levels of coverage. Rather than targeting a universal code coverage metric, ensure that your priority is proper testing. After that, factor in regulatory requirements, team size and maturity, test types, and test quality. Let these guide your decisions and the level of code coverage will follow.
Looking for help measuring code coverage in your codebase? Check out Codecov. We can help you improve your code review workflow and quality through highly integrated tools to group, merge, archive, and compare coverage reports.