[This is an updated version of a note that was first published to a Microsoft internal blog in November 2008]

 

There are many opportunities to misuse code coverage (see Marick 1999 or Lamey 2010).  Focusing solely on code coverage numbers can lead to some very bad behaviors, but that does not mean code coverage cannot add value.

 

Here is a short list of ways to make code coverage a useful part of the software development process.  Additional suggestions for how to get the most from code coverage can be found in Rollison (2009 and 2010). 

 

Identification of Testing Gaps

Code coverage measures whether a test (or suite of tests) exercises the application code.  It does not tell us, at least not directly, how well the application code is tested, only that the code is touched.

 

Code coverage can be used as a tool to help build a better suite of tests by identifying the gaps in the existing testing.  Because 0% code coverage is an indication that the test(s) fail to touch a unit of code, 0% code coverage can be used as a quick, coarse-grained way to “triage” areas that need additional testing effort.  The metric can be applied sequentially at multiple levels to identify assemblies, files, classes, or even methods that require additional testing focus.
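
The triage described above can be sketched in a few lines of Python.  This is only an illustration: the record layout, the field names (assembly, class, method, covered_lines), and the sample data are assumptions made for the example and do not correspond to the output of any particular coverage tool.

```python
from collections import defaultdict


def zero_coverage_units(records, level):
    """Return the units at `level` that no test touched at all.

    `records` is a hypothetical list of per-method coverage rows;
    `level` is one of "assembly", "class", or "method".
    """
    # Sum the covered line counts of every method belonging to each unit.
    covered = defaultdict(int)
    for rec in records:
        covered[rec[level]] += rec["covered_lines"]
    # A unit needs attention when nothing inside it was executed.
    return sorted(unit for unit, count in covered.items() if count == 0)


# Hypothetical coverage data, invented for this sketch.
records = [
    {"assembly": "Billing", "class": "Billing.Invoice",
     "method": "Billing.Invoice.total", "covered_lines": 12},
    {"assembly": "Billing", "class": "Billing.Invoice",
     "method": "Billing.Invoice.void", "covered_lines": 0},
    {"assembly": "Billing", "class": "Billing.Refund",
     "method": "Billing.Refund.issue", "covered_lines": 0},
    {"assembly": "Export", "class": "Export.Csv",
     "method": "Export.Csv.write", "covered_lines": 0},
]

# Apply the same check at successively finer levels.
for level in ("assembly", "class", "method"):
    print(level, zero_coverage_units(records, level))
```

Running the check from the coarsest level down first surfaces the wholly untested Export assembly, then the untested Billing.Refund class, and finally the individual untested method inside a class that is otherwise partly covered.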

 

A refinement of this methodology is to treat any code coverage value less than 100% as an indication of some weakness in the design of the test suite (Marick 1999).  The idea goes something like this.  For non-zero code coverage values, spend time analyzing the areas that are not covered.  Consider what categories of tests are missing or what modifications to existing tests could be included to improve the original test design.  Modify the test design, create the tests, and measure code coverage again.  Iterate between measuring code coverage and modifying the test design until the desired level of code coverage is achieved.  The important point is that the focus should be on improving the test design and not on improving code coverage (Marick 1999).  Good test design is correlated with high code coverage values, but high code coverage values do not necessarily imply you have a good set of tests (Lamey 2010).

 

Risk Mitigation

Assuming the tests do a good job of testing the code they cover, code coverage can help point out areas in the code base where there may still be substantial risk of a fault.  Code coverage can be a valuable tool when attempting to decide if a product is ready for release, provided there has been some analysis of the areas that were not covered during testing.  As long as the code coverage information is used to understand exactly what code is not covered by testing, then the actual code coverage value is of little significance.  Regardless of whether the coverage value is 58% or 92% it is important to identify the portions of the application that are not exercised and to come to some agreement that the lack of coverage in these areas is an acceptable risk.
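
A release review of this kind needs a list of what was not exercised rather than a percentage.  The following minimal sketch collapses uncovered line numbers into contiguous ranges that reviewers can inspect and sign off on.  The function name and inputs are hypothetical, and the sketch assumes, for simplicity, that every line in the file is executable.

```python
def uncovered_ranges(total_lines, covered_lines):
    """Return (first, last) line ranges that no test executed.

    Assumes lines are numbered 1..total_lines and that all of them
    are executable, which real coverage tools do not assume.
    """
    covered = set(covered_lines)
    ranges = []
    start = None  # first line of the uncovered run currently being built
    for line in range(1, total_lines + 1):
        if line not in covered:
            if start is None:
                start = line
        elif start is not None:
            # A covered line ends the current uncovered run.
            ranges.append((start, line - 1))
            start = None
    if start is not None:
        # The file ended while still inside an uncovered run.
        ranges.append((start, total_lines))
    return ranges


# Hypothetical file of 10 lines where tests executed lines 1-3 and 6-7.
for first, last in uncovered_ranges(10, [1, 2, 3, 6, 7]):
    print(f"lines {first}-{last}: not exercised; accept the risk or add tests")
```

The output is the same whether the file's coverage is 58% or 92%: a concrete set of regions for which someone must decide that the lack of coverage is an acceptable risk.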

 

Dead Code Removal

Analysis of code coverage values that are less than 100% can unearth cases where the application code is unreachable (i.e. dead code).  Once the dead code is identified, it can be removed to improve the hygiene of the code base.  This is another case where the benefit comes from striving to fully understand the reasons for incomplete code coverage and not from the actual code coverage number.

 

Testability

Product code that is difficult to reach with tests may represent a testability problem.  Systems that are testable are likely to be associated with high code coverage values, so code coverage can provide an indirect indicator of the level of testability in the product.  Again, it is the analysis of the code that is not covered that provides the insight into the testability or lack of testability of the application, not the actual code coverage number.

 

Test Case Selection

Code coverage information can be cross-referenced with code changes to the application to select the test cases that, because they exercise the affected code, are the best choice to run to detect regressions.  Using code coverage to select a set of tests maximizes the chances of detecting an error introduced by a particular product code change while reducing the need to run all possible tests.
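
As a rough sketch of this cross-referencing, assume per-test coverage has already been recorded as a mapping from each test name to the set of source files it exercises.  The mapping, the file names, and the test names below are all invented for illustration.

```python
def select_tests(coverage_map, changed_files):
    """Return the tests whose recorded coverage touches any changed file."""
    changed = set(changed_files)
    # Keep a test when the files it exercises overlap the changed files.
    return sorted(
        test for test, files in coverage_map.items()
        if changed & set(files)
    )


# Hypothetical per-test coverage gathered on an earlier run.
coverage_map = {
    "test_invoice_total": {"invoice.py", "tax.py"},
    "test_refund": {"refund.py", "invoice.py"},
    "test_csv_export": {"export.py"},
}

# Suppose a change modifies only tax.py.
print(select_tests(coverage_map, ["tax.py"]))
```

A change to tax.py selects only test_invoice_total, while a change to invoice.py would select both invoice-related tests.  A changed file that no test covers selects nothing, which is itself a testing gap worth investigating.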

 

Test Case Prioritization

In some cases, execution of a subset of tests may not be an option (e.g. perhaps because of legal or contractual requirements), but it may still be of value to know which tests are most likely to detect problems so they can be executed first.  Code coverage information can be used as a heuristic for prioritizing the order in which tests are run to improve the rate of fault detection early in a regression run (Elbaum et al. 2002).  These prioritization techniques improve the odds that regression testing will provide valuable early feedback even in cases where, for other reasons, it is still necessary to run all tests.
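
One simple coverage-based heuristic is a greedy "additional coverage" ordering: repeatedly run next the test that covers the most code not yet covered by the tests already scheduled.  The sketch below is a minimal illustration of that idea, not a reproduction of any specific technique from the cited study.  The coverage map and the unit names are hypothetical.

```python
def prioritize(coverage_map):
    """Order all tests so that each one adds as much new coverage as possible.

    Ties are broken alphabetically by test name.  Once no remaining test
    adds new coverage, the rest simply follow in name order.
    """
    remaining = {test: set(units) for test, units in coverage_map.items()}
    covered = set()
    order = []
    while remaining:
        # max() returns the first maximal item, so iterating over the
        # sorted names makes the tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical map from test name to the code units it exercises.
coverage_map = {
    "test_a": {"u1", "u2"},
    "test_b": {"u1", "u2", "u3", "u4"},
    "test_c": {"u5"},
    "test_d": {"u3", "u4"},
}

print(prioritize(coverage_map))
```

Every test still runs, but test_b and test_c are scheduled first because together they reach all five units.  test_a and test_d add nothing new and follow afterwards.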

 

In summary, code coverage numbers in and of themselves are of little value.  With the exception of using code coverage in test case selection or test case prioritization, code coverage information only adds value to the software development process when the organization is committed to analyzing the implications of code coverage values that are less than 100%.  Failure to reach 100% coverage may indicate one or more of the following:  poor test design, limited testability support, the presence of dead code, or exposure to undiscovered faults.  Understanding which issue underlies the observed code coverage number is only possible after spending time analyzing the portions of the product code that are not covered.

 

Recommended Reading

Elbaum, S.,  Malishevsky, A., and G. Rothermel.  2002.  Test Case Prioritization:  A Family of Empirical Studies. IEEE Transactions on Software Engineering, 28-2:159-182.  http://www.cse.unl.edu/~elbaum/papers/journals/tse01.pdf

Kaner, C.  1996.  Software negligence and testing coverage.  http://www.kaner.com/coverage.htm

Lamey, T.C.  2010.  Don’t be seduced by numbers.  Testing Ledger blog post:  http://blogs.msdn.com/tim_lamey/archive/2010/02/08/don-t-be-seduced-by-numbers.aspx

Marick, B.  1999.  How to misuse code coverage.  International Conference and Exposition on Testing Computer Software.  http://www.exampler.com/testing-com/writings/coverage.pdf

Rollison, Bj.  2009.  Reconsidering code coverage.  I.M. Testy blog post:  http://www.testingmentor.com/imtesty/2009/11/25/reconsidering-code-coverage/

Rollison, Bj.  2010.  Code coverage: more than just a number.  I.M. Testy blog post:  http://www.testingmentor.com/imtesty/2010/01/21/code-coverage-more-than-just-a-number/