Note: An updated version of this post can be found here.
Recently I was tasked with changing our daily and Continuous Integration builds so that they would also execute our unit tests. This seemed like a straightforward task, and indeed it was, except for one thing: every so often a build would fail because another process was locking the test results directory.
When we looked at the build server we saw a number of instances of a process called DW20 running on the server. This process was locking the test results directory. Terminating these processes allowed further builds that included unit tests to run.
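As a rough illustration of that manual clean-up, the sketch below looks for leftover DW20 instances and force-terminates them. It is only an assumption about how one might script it, not what we actually ran: it relies on the standard Windows `tasklist` and `taskkill` commands, the image name `dw20.exe` is assumed, and the function names are hypothetical.

```python
import csv
import io
import subprocess


def find_processes(tasklist_csv, image_name="dw20.exe"):
    """Return the PIDs of every process whose image name matches image_name.

    tasklist_csv is the text printed by `tasklist /FO CSV /NH`, where each
    row is: "Image Name","PID","Session Name","Session#","Mem Usage".
    The image name is compared case-insensitively.
    """
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Skip blank or malformed rows rather than failing on them.
        if len(row) >= 2 and row[0].lower() == image_name.lower():
            pids.append(int(row[1]))
    return pids


def kill_stuck_processes(image_name="dw20.exe"):
    """Windows only: list running processes and force-terminate the matches."""
    listing = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = find_processes(listing, image_name)
    for pid in pids:
        # /F forces termination, since DW20 is blocked on a prompt nobody can see.
        subprocess.run(["taskkill", "/PID", str(pid), "/F"], check=True)
    return pids
```

Calling `kill_stuck_processes()` as a pre-build step would clear the lock, but it only treats the symptom; the rest of this post deals with the causes.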
DW20 is the Windows Error Reporting program. Windows Error Reporting was asking for permission to send error reports to Microsoft and then waiting for a response from a user. On an unattended build server that response was never going to come, especially as the prompt could not be seen!
So, two questions needed to be addressed. The first was to find out what was causing the errors that were triggering Windows Error Reporting in the first place, and the second was to see if there was a way to stop DW20 waiting for a prompt when there was an error.
The easier of the two to solve was preventing DW20 from waiting and locking the test results directory. On Windows Server 2008 you can configure Windows Error Reporting to not wait for input from the user. To find the setting, run Server Manager and then click on Turn on Windows Error Reporting (you may need to scroll down to find it). On our build server it was configured to ask about sending reports every time there is an error. Choose either the first or the last option to be sure that DW20 will not wait and lock your test results directory.
The other problem was to work out why we were getting these errors in the first place. For this I used the Windows Sysinternals Process Explorer tool. Hovering the mouse over each DW20 process showed me the command line parameter that had been passed to it. This pointed me to a file in a temporary directory which contained the following:
EventLogSource=Team Test Error Reporting
Main_Intro_Bold=An unexpected condition has occurred.
Main_Intro_Reg=An unexpected condition has occurred in the test execution framework. Information about the condition has been gathered.
Main_Plea_Bold=Please tell Microsoft about this problem.
Main_Plea_Reg=We have created an error report that you can send to help us fix bugs. We will treat this report as confidential and anonymous.
Queued_EventDescription=An exception has occurred in the test execution framework component: Value cannot be null.
Parameter name: certificateFindKey
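If several DW20 processes are waiting, reading each of these files by eye gets tedious. The following is a minimal sketch of how the text could be turned into a dictionary, assuming the file is simple `Key=Value` lines in which a line without a key (such as the `Parameter name:` line above) continues the previous value. The function name `parse_report` is hypothetical, and reading the file from disk, including working out its encoding, is left to the caller.

```python
import re

# A field starts with an identifier-like key followed by "=".
KEY_VALUE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def parse_report(text):
    """Parse Key=Value lines; keyless lines are appended to the previous value."""
    fields = {}
    last_key = None
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            last_key = match.group(1)
            fields[last_key] = match.group(2)
        elif last_key is not None and line.strip():
            # Continuation of a multi-line value, e.g. an exception message.
            fields[last_key] += "\n" + line
    return fields


SAMPLE = """EventLogSource=Team Test Error Reporting
Main_Intro_Bold=An unexpected condition has occurred.
Queued_EventDescription=An exception has occurred in the test execution framework component: Value cannot be null.
Parameter name: certificateFindKey
"""

fields = parse_report(SAMPLE)
print(fields["EventLogSource"])
print(fields["Queued_EventDescription"])
```

The field of most interest is `Queued_EventDescription`, since it carries the exception message that triggered the report.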
The important thing to note in this file is that the process which crashed was QTAgent32.exe. This is the test runner used by the TestToolsTask task to run unit tests on the build server without requiring Visual Studio to be installed.
This immediately reminded me that two of the unit tests were failing with an “Error” state rather than the more usual “Failed” state. When the test runner reports an Error state for a test it means that the test caused an error in the test runner itself. The most common reason I have seen for this is in tests that use threads, and indeed in this case the problem tests were using threads. Removing those tests stopped the QTAgent32 errors from happening in the first place.
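So, the conclusion is simple: configure Windows Error Reporting on the build server so that it never waits for a user, and treat any test that finishes in an "Error" state as a sign that it is crashing the test runner.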
I hope this has been useful and helps you to avoid problems running unit tests as part of your daily and Continuous Integration builds.
Written by Rob Jarratt