Performance Monitor (perfmon) is the preferred tool to measure the performance of Windows systems. The perfmon tool provides an analysis view with a chart and metrics of the Last, Average, Minimum, and Maximum values.
There are scenarios where the line in the chart is the most valuable piece of information, such as a memory leak. Other times we may not be looking for a trend, and the Last, Average, Minimum, and Maximum metrics are what matter. One example where the metrics are valuable is evaluating average disk latency over a period of time. In this article we are going to use disk latency counters to illustrate how metrics are calculated for performance counters. The concepts we will illustrate with disk latency apply to all performance counters. This article will not be a deep dive into understanding disk latency; there are already many sources of information on that topic.
Most performance counter metrics are straightforward. The minimum and maximum metrics are self-explanatory. The last metric is the last entry in the data. The metric that is confusing is the average. When calculating averages it is important to consider the cardinality of the data. This is especially important when working with data that is already an average, such as the Avg. Disk sec/Read counter, which displays the average time per read from a disk.
Perfmon logs are gathered at a specific time interval, such as every 15 seconds. At every interval the counters are read and an entry is written to the log. In this interval there may have been many reads from the disk, a few reads, or none at all. The number of reads performed is a critical aspect of the average calculation; this is the cardinality of the data.
Consider the following 10 entries in a perfmon log:
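1 read took 150 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms
0 reads took 0 ms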
Often, averages are calculated by adding a column of numbers and dividing by the number of entries. However, this calculation does not work for the above data. If we simply add and divide we get an average latency of 15ms (150 / 10) per read, but this is clearly incorrect. Only 1 read was performed and it took 150ms, therefore the average latency is 150ms per read. Depending on the system configuration, an average read latency of less than 20ms may be considered fast and more than 20ms may be considered slow. If we perform the calculation incorrectly we may believe the disk is performing adequately, while the correct calculation shows the disk is actually very slow.
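The difference between the two calculations can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not how perfmon is implemented. The `entries` list and both function names are made up for this example, and the data assumes one interval with a single 150 ms read followed by nine idle intervals.

```python
# Each log entry is (reads completed in the interval, total time for those reads in ms).
# Assumed data: one interval with a single 150 ms read, then nine idle intervals.
entries = [(1, 150.0)] + [(0, 0.0)] * 9


def naive_average_ms(entries):
    """Average the per-interval latencies and ignore how many reads each one represents."""
    per_interval = [total_ms / reads if reads else 0.0 for reads, total_ms in entries]
    return sum(per_interval) / len(per_interval)


def weighted_average_ms(entries):
    """Divide the total read time by the total number of reads (respects cardinality)."""
    total_reads = sum(reads for reads, _ in entries)
    total_ms = sum(total_ms for _, total_ms in entries)
    return total_ms / total_reads if total_reads else 0.0


print(naive_average_ms(entries))     # 15.0  (150 / 10 entries)
print(weighted_average_ms(entries))  # 150.0 (150 / 1 read)
```

The second function needs the read count for every interval, which is exactly the information that a column of already-averaged numbers no longer carries.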
What data is used to calculate averages?
Let’s take a look at the data perfmon is working with. Perfmon stores data in two different structures. Formatted values are stored as PDH_FMT_COUNTERVALUE. Raw values are stored as PDH_RAW_COUNTER.
Formatted values are just plain numbers. They contain only the result of calculating the average of one or more raw values, but not the raw data used to obtain that calculation. Data stored in a perfmon CSV or TSV file is already formatted, which means the file contains a column of floating point numbers. If our previous example was stored in a CSV or TSV we would have the following data:
0.15
0
0
0
0
0
0
0
0
0
The above numbers contain no information about how many reads were performed over the course of this log. Therefore it is impossible to calculate an accurate average from these numbers. That is not to say CSV and TSV files are worthless; there are many performance scenarios (such as memory leaks) where the average is not important.
Raw counters contain the raw performance information, as delivered by the performance counter to pdh.dll. In the case of Avg. Disk sec/Read the FirstValue contains the total time for all reads and the SecondValue contains the total number of reads performed. This information can be used to calculate the average while taking into consideration the cardinality of the data.
Again using the above example, the raw data would look like this:
At first glance the above raw data does not resemble our formatted data at all. In order to calculate the average we need to know what the correct algorithm is. The Avg. Disk sec/Read counter is of type PERF_AVERAGE_TIMER and the average calculation is ((Nx - N0) / F) / (Dx - D0). N refers to FirstValue in the raw counter data, F refers to the number of ticks per second, and D refers to SecondValue. The subscripts 0 and x denote the earlier and the later of the two samples being compared. Ticks per second can be obtained from the PerformanceFrequency parameter of KeQueryPerformanceCounter; in my example it is 14318180.
Using the algorithm for PERF_AVERAGE_TIMER the calculation for the formatted values would be:
((2147727 - 0) / 14318180) / (1 - 0) = 0.15
((2147727 - 2147727) / 14318180) / (1 - 1) = 0*
*If the denominator is 0 there is no new data and the result is 0.
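As a rough sketch, the same arithmetic can be written as a small Python function. The name `average_timer` and its parameters are hypothetical and are not part of the PDH API. The frequency is the value from my example, and the zero-denominator rule follows the note above.

```python
TICKS_PER_SECOND = 14318180  # F: the ticks-per-second value from the example above


def average_timer(n0, d0, nx, dx, frequency=TICKS_PER_SECOND):
    """Apply ((Nx - N0) / F) / (Dx - D0) to two consecutive raw samples.

    n0, nx: FirstValue (elapsed ticks) of the earlier and later sample.
    d0, dx: SecondValue (operation count) of the earlier and later sample.
    Returns seconds per operation, or 0.0 when no new operations completed.
    """
    operations = dx - d0
    if operations == 0:
        return 0.0
    return ((nx - n0) / frequency) / operations


first = average_timer(0, 0, 2147727, 1)         # one read finished in this interval
second = average_timer(2147727, 1, 2147727, 1)  # nothing new in this interval
print(round(first, 2), second)  # 0.15 0.0
```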
Because the raw counter contains both the number of reads performed during each interval and the time it took for these reads to complete, we can accurately calculate the average for many entries.
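One way to see this is to apply the same formula across the whole log instead of to each interval. The Python sketch below does this with hypothetical helper names. It assumes a baseline sample of (0, 0) taken before the first log entry, matching the first calculation above, and it is not a description of what pdh.dll does internally.

```python
TICKS_PER_SECOND = 14318180  # F from the example above

# (FirstValue, SecondValue) pairs. Assumption: a (0, 0) baseline sample,
# followed by the ten log entries, which all carry the same cumulative totals.
raw_samples = [(0, 0)] + [(2147727, 1)] * 10


def average_over_log(samples, frequency=TICKS_PER_SECOND):
    """Use the first and last raw samples: total time divided by total reads."""
    (n0, d0), (nx, dx) = samples[0], samples[-1]
    operations = dx - d0
    if operations == 0:
        return 0.0
    return ((nx - n0) / frequency) / operations


def mean_of_formatted(samples, frequency=TICKS_PER_SECOND):
    """Format every interval first, then sum and divide by the number of entries."""
    formatted = []
    for (n0, d0), (nx, dx) in zip(samples, samples[1:]):
        operations = dx - d0
        formatted.append(((nx - n0) / frequency) / operations if operations else 0.0)
    return sum(formatted) / len(formatted)


print(round(average_over_log(raw_samples), 3))   # 0.15  seconds per read
print(round(mean_of_formatted(raw_samples), 3))  # 0.015 seconds per read
```

The first result matches the 150ms per read we expect. The second is the misleading 15ms figure from earlier in the article.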
If you’ve taken the time to read this far you may be wondering why I have taken the time to explain such a mundane topic. It is important to explain how this works because many performance tools are not using the correct average calculation and many users are trying to calculate averages using data that is not appropriate for such calculations (such as CSV and TSV files). Programmers should use PdhComputeCounterStatistics to calculate averages and should not sum and divide by the count or duplicate the calculations described in MSDN.
Recently we have found that under some conditions perfmon will use the incorrect algorithm to calculate averages. When reading from log files perfmon has been formatting the values, summing them, and dividing by the number of entries. This issue has been corrected in perfmon for Windows 8/Server 2012 with KB2877211 and for Windows 8.1/Server 2012 R2 as part of KB2883200. We recommend using these fixes when analyzing perfmon logs to determine the average of a performance counter. Note that KB2877211/KB2883200 only change the behavior when analyzing logs; there is no change to how the data is collected. This means you can collect performance logs from any version of Windows and analyze them on a system with these fixes installed.
What about a backport of the fix to 2008 R1/R2?
[We are aware of requests to have this fix ported to other versions of Windows. The KB article and this page will be updated if the fix is ported to other versions.]
"If we simply add and divide we get an average latency of 15ms (150 / 10) per read"
Sorry, what? Why is the denominator 10? There was 1 read; summing 1+0+...+0 gives 1. 150ms/1 = 150ms. All is fine.
[Often averages are calculated using the number of entries as the divisor. In this example there are 10 entries.]
Interesting article, and it is useful to point out that the average or mean is not always the best method to view a trend. In terms of average as a generic term I will often use mean, mode and median from the raw data as well as standard deviation to indicate the variation (en.wikipedia.org/.../Standard_deviation). Having cut my teeth as an Exchange Engineer / Architect I find the whitepapers on Loadsim very useful (even though MS is phasing this tool out). Another useful measurement is percentile, as this will give you an indication of a large spike causing a disproportionate effect on the average; e.g. if the 95th percentile is showing <20ms but the average is above 20ms you are probably getting some very high but occasional spikes. I've used this in the past with PSS to demonstrate we don't have a disk latency issue per se, we have an intermittent problem that needs resolving.