To help drive test quality improvements, Dynamics CRM, Dynamics AX, and Dynamics NAV use a metric that reports the percentage of all tests that either pass or fail because of a product issue. This metric is called Automation Accuracy (although Dynamics NAV has used the term Test Reliability, at least to this point). Automation Accuracy has several desirable characteristics, but its limitations must be understood to avoid misuse or misinterpretation.
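As a rough illustration of the definition above, the metric can be computed from three counts. This is a minimal sketch, not the teams' actual tooling: the function name, parameter names, and the example counts are all hypothetical.

```python
def automation_accuracy(passed: int, product_failures: int, non_product_failures: int) -> float:
    """Fraction of all test results that are passes or failures caused by a product issue.

    All three parameter names are illustrative assumptions, not taken from any real tool.
    """
    total = passed + product_failures + non_product_failures
    if total == 0:
        raise ValueError("no test results to evaluate")
    return (passed + product_failures) / total


# Hypothetical run: 90 passes, 4 failures due to product issues,
# 6 failures due to non-product causes (for example, a broken test or environment).
example = automation_accuracy(passed=90, product_failures=4, non_product_failures=6)
print(f"Automation Accuracy: {example:.0%}")
```

Under this reading, only failures with non-product causes lower the value, which is why the metric draws attention to them.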
In software testing we want tests that are likely to detect product defects when they are present (high Sensitivity) and that do not produce a large number of false failures (high Specificity; Lamey 2009). It would be nice to measure Sensitivity and Specificity and drive towards high values for each, but we cannot assign values to these metrics because doing so requires separating True Pass (no issues) from False Pass (undetected product issues), which is difficult in practice.
It is easier to assign a value to Automation Accuracy because it does not differentiate between tests that pass when there is no product issue and tests that pass despite an undetected product issue. As a result, Automation Accuracy is not directly related to either Sensitivity or Specificity. For example, the following diagram depicts the relationship between Automation Accuracy, Sensitivity, Specificity, and Positive Predictive Value when only the percentages of True Pass and False Pass are allowed to vary. Tests with the highest quality should appear in the upper right corner of the diagram (where Sensitivity and Specificity are highest).
Notice that while Automation Accuracy and Positive Predictive Value do not vary, Sensitivity and Specificity vary based on the percentage of True Pass results. If high Sensitivity and Specificity are desirable test quality attributes, Automation Accuracy does not help much in the pursuit of these attributes.
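The same effect can be reproduced numerically. The sketch below assumes the standard confusion-matrix definitions with a failing test treated as the "positive" result (Sensitivity = True Fail / (True Fail + False Pass), Specificity = True Pass / (True Pass + False Fail), Positive Predictive Value = True Fail / (True Fail + False Fail)). The class name, function name, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcomes:
    """Counts of test results, treating a failing test as the 'positive' result."""
    true_pass: int   # passed, and there is no product issue
    false_pass: int  # passed, but a product issue went undetected
    true_fail: int   # failed because of a product issue
    false_fail: int  # failed because of a non-product cause


def metrics(o: Outcomes) -> dict[str, float]:
    """Compute the four metrics under the assumed standard definitions."""
    total = o.true_pass + o.false_pass + o.true_fail + o.false_fail
    return {
        "automation_accuracy": (o.true_pass + o.false_pass + o.true_fail) / total,
        "ppv": o.true_fail / (o.true_fail + o.false_fail),
        "sensitivity": o.true_fail / (o.true_fail + o.false_pass),
        "specificity": o.true_pass / (o.true_pass + o.false_fail),
    }


# Hypothetical run: 90 passes, 4 product failures, 6 non-product failures.
# Only the split of the 90 passes into True Pass and False Pass changes.
sweep = [
    metrics(Outcomes(true_pass=tp, false_pass=90 - tp, true_fail=4, false_fail=6))
    for tp in (60, 70, 80, 90)
]

for row in sweep:
    print({name: round(value, 3) for name, value in row.items()})
```

In this sweep, Automation Accuracy and Positive Predictive Value are identical in every row, while Sensitivity and Specificity both rise as more of the passes are True Passes. In a real test run the True Pass / False Pass split is unknown, so only the two constant columns can actually be measured.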
Automation Accuracy can be used as a rough measure of test quality if we assume that the rate of True Passes (a pass result correctly indicates that there are no product issues) is relatively high. Rigorous test design techniques might help to keep the rate of True Passes high, but we can only verify this assumption by looking at the product issues that are discovered in the field and determining whether tests should have detected them.
Automation Accuracy is a bit like Code Coverage: high values do not indicate high-quality tests, but low values are a sign of a serious problem. Despite its limitations, Automation Accuracy is a useful measure because it exposes poor test quality and drives several desirable behaviors, notably:
Focused attention on test failures that occur due to non-product causes
Lamey, T. 2009. Test Quality Metrics: Diagnosing the Problem. Testing Ledger, MSDN Blogs: http://blogs.msdn.com/b/tim_lamey/archive/2009/11/21/test-quality-metrics-diagnosing-the-problem.aspx