If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

Automated .NET Hang Analysis

Automated .NET Hang Analysis

  • Comments 19

In my constant effort to make my job obsolete, I have created an "add-in" for windbg that automatically checks for hang conditions in a .NET memory dump.

Disclaimer:

It is very crude and doesn't have a lot in terms of exception handling so basically I don't follow my own examples here:) but on the other hand, this is supposed to be used by one person at a time and if it crashes it won't really hurt anyone so that is really why I omitted exception handling and code optimization.

This is written as a .NET console app that you run with the .shell command in windbg and pass in ~* kb 2000, !threads and ~* e !clrstack. The application then parses this output and look for a few conditions that are common for hangs.  I have written it so that it should be easy to add checks for other conditions as well and the solution for the application is attached to this blog post.

BTW, if anyone knows how to emit DML in the output sent from a console app, please let me know... I have tried a thousand ways but I haven't been able to do it.

Output:

The output when running this on a dump that experienced a deadlock with a lot of different hang causes looks like this: 

====================================================================================================================
                   Hang Analysis                          
====================================================================================================================

____________________________________________________________________________________________________________________

 GC RELATED INFORMATION 
____________________________________________________________________________________________________________________

The following threads are GC threads:
	15 16 

The following threads are waiting for the GC to finish: 
	11 13 21 23 24 25 27 28 32 33 34 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 54 55 56 57 59 60 61 62 63 64 65 66 67 68 69 70 71 74 75 

The GC was triggered by thread: 72 

The GC is working on suspending threads to continue with garbage collection
The following threads can't be suspended because preemptive GC is disabled:
	20 22 30 31 35 53 58 

The Finalizer (Thread 17) is not blocked


BLOG POSTS ABOUT GC RELATED ISSUES
===================================
GC-LoaderLock Deadlock 		 http://blogs.msdn.com/tess/archive/2007/03/12/net-hang-case-study-the-gc-loader-lock-deadlock-a-story-of-mixed-mode-dlls.aspx
High CPU in GC 			 http://blogs.msdn.com/tess/archive/2006/06/22/643309.aspx
High CPU in GC (Viewstate) 	 http://blogs.msdn.com/tess/archive/2006/11/24/asp-net-case-study-bad-perf-high-memory-usage-and-high-cpu-in-gc-death-by-viewstate.aspx
Blocked finalizer 		 http://blogs.msdn.com/tess/archive/2006/03/27/561715.aspx

____________________________________________________________________________________________________________________

 INFORMATION ABOUT LOCKS AND CRITICAL SECTIONS 
____________________________________________________________________________________________________________________

The following threads are spinning waiting to enter a .NET Lock: 
	20 22 30 31 35 53 58 

The following threads are waiting for a critical section:	(TIP! Run !sieextPub.critlist to find out who the owner is)
	12 19 

The following threads are waiting in a WaitMultiple: 
	21 25 


BLOG POSTS ABOUT LOCKS AND CRITICAL SECTIONS
==============================================
WaitOne and WebService calls 	 http://blogs.msdn.com/tess/archive/2006/02/23/537681.aspx
Locks and Critical sections 	 http://blogs.msdn.com/tess/archive/2006/01/09/510773.aspx

____________________________________________________________________________________________________________________

 INFORMATION ABOUT THREADS WAITING ON EXTERNAL PROCESSES 
____________________________________________________________________________________________________________________

The following threads are waiting in a Socket.Receive: 
	13 34 43 44 45 46 47 49 50 51 56 63 66 68 69 


BLOG POSTS ABOUT DEBUGGING HANGS
==============================================
Hang debugging walkthrough		 http://blogs.msdn.com/tess/archive/2006/10/16/net-hang-debugging-walkthrough.aspx
Things to ignore when you're debugging a hang	 http://blogs.msdn.com/tess/archive/2007/04/02/things-to-ignore-when-debugging-an-asp-net-hang-update-for-net-2-0.aspx

In this particular case we can see that the GC is running since we have a lot of threads (pretty much all .net threads) waiting for the GC to finish.  The GC however can not finish because it is waiting to suspend threads 20, 22, 30, 31, 35, 53 and 58, which can't be suspended because they have preemptive GC disabled.  This is in turn caused by these threads being stuck in a spin-loop trying to enter a .net Lock and for some reason they can't exit that spin-loop.  I might post more details about this case later but for the purposes of this post I am just showing the hanganalyzer.

In addition we can see that threads 13, 34, 43, 44, 45, 46, 47, 49, 50, 51, 56, 63, 66, 68 and 69 are in a socket.receive waiting for some results from a socket call.  The observant reader will probably have noticed that these threads are also waiting for the GC to complete so it really won't continue reading until the GC completes.

 

Armed with the info above you can go out and look at the stacks to get more detailed information about the threads, and I have also included a number of posts of interest for each section.

 

Writing an application for windbg:

Writing an application to run from windbg is very easy. 

The command for running an app from windbg looks like this

.shell -ci ["commands"] [application] 

for example to find all the instances of the word "Threadpool Worker" in the !threads output using the DOS command FIND /C "word" you would do this

0:032> .shell -ci "!threads" FIND /C "Threadpool Worker"
47
.shell: Process exited

The FIND application will then receive the output of the commands you pass in after -ci and can read it using standard in (i.e. Console.ReadLine()), and whatever it prints out using standard out (i.e. Console.WriteLine()) gets displayed in the debugger.

So, a simple app that just reads the output of a windbg command and echoes it back to the debugger would then look something like this

        static void Main(string[] args){

		string input = Console.ReadLine();
		while (input != null){
			Console.WriteLine(input);
			input = Console.ReadLine();	
		}
	}

And you would call it from windbg like this  .shell -ci "~* kb" c:\MyApp

 

Hang analyzer implementation details: 

The HangAnalyzer takes the input ~* kb;.echo MANAGED THREADS;!threads;.echo MANAGED CALLSTACKS;~* e !clrstack  (the echos for MANAGED THREADS and MANAGED CALLSTACKS are just there so i know when to stop parsing native callstacks and !threads output... this could easily be refined by better parsing but it serves the purpose)

.shell -ci "~* kb;.echo MANAGED THREADS;!threads;.echo MANAGED CALLSTACKS;~* e !clrstack;" F:\blog\HangAnalyzer\HangAnalyzer\bin\Debug\HangAnalyzer

The application has 3 main classes

Program.cs

Main - Program control flow

ParseInput - Parses the input and populates the NativeCallstacks, NetCallstacks and NetThreads

 

Analysis functions

AnalyzeThis - This is where all analysis is done and where you can add more checks

CheckPreemptiveGCDisabled - Returns the list of threads that have preemptive GC disabled

GetFinalizerThread - Returns the ID of the Finalizer thread

CheckForCriteria - Returns a list of threadsids for threads matching the specified criteria

 

Printing functions

PrintHeader - Prints a header

PrintGCPosts - Prints out links to the posts on my blog about GC related issues

PrintLocksAndCritsecPosts - Prints out links to the posts on my blog about locks and critical sections

PrintGeneralHangPosts - Prints out links to the posts on my blog about general hang related info 

 

CallStack.cs - Contains the ThreadID and a list of frames for the callstack

AddFrame - Adds a frame to the callstack

FitsCriteria - Returns true if any of the frames in the CallStack contains the words listed as the criteria

 

NetThread.cs - Contains info about a specific .NET thread

Populate - Populates the thread info

 

Running the hang analyzer:

As mentioned before, you run the analyzer like this

.shell -ci "~* kb;.echo MANAGED THREADS;!threads;.echo MANAGED CALLSTACKS;~* e !clrstack;" F:\blog\HangAnalyzer\HangAnalyzer\bin\Debug\HangAnalyzer

Of course you would exchange the path for the location of your HangAnalyzer.  However you could also set up a command script so that you dont have to type in this path all the time, i.e. just copy this string into a text file and run the text file like this 

$><c:\TOOLS\EXTENSIONS\ANALYZEHANGS.TXT 

Note: Since the application will parse the kb and !clrstack output a few things are important

1. You have to have symbols loaded or else it will not find the text  (i.e. start off by running  .symfix c:\pubsym and .reload to make sure you have symbols set,  you may also want to run ~* kb just to check that the output looks normal).  If you don't have the correct symbols it may also falsely report that the finalizer is stuck since it cant find the call to WaitForFinalizerEvent

2. You have to have sos loaded so that !clrstack will run, otherwise you will miss info derrived from !threads and !clrstack output.

Note2:  This is not an exhaustive list of hang reasons so even if this reports nothing that doesn't neccesarily mean that there isn't a hang/performance issue.

Note3:  In its current version it works for .NET 2.0 but you can easily change it yourself to work for 1.1. or anything else by changing the parsing.

 

Download the solution:

You can download a Visual Studio.NET 2005 solution containing the project from here... 

 

Happy holidays,

Tess





Attachment: HangAnalyzer.zip
  • Link Listing - December 12, 2007

  • Sharepoint Applications Pages in SharePoint - Getting Started [Via: mattlind ] Code Camps Philly.NET...

  • This looks to be very very helpful. I hope that you are able to get this integrated into windbg going forward.

  • Here is the latest in my link-listing series .&#160; Also check out my ASP.NET Tips, Tricks and Tutorials

  • Here is the latest in my link-listing series .&#160; Also check out my ASP.NET Tips, Tricks and Tutorials

  • Hi Tess,

    Great post, and a special thanks for some tips on how to write tools to use with WinDBG. I was having some ideas of "plugins" i could use, and i guess i was too lazy to look around for info much, and the info you provided was really really useful.

    thanks!

  • Cool deal,  if you write some cool ones, be sure to let us know:)

  • Lately, no matter where you turn there is LINQ presentations and Silverlight demos. A couple of years

  • Here is the latest in my link-listing series .&#160; Also check out my ASP.NET Tips, Tricks and Tutorials

  • Here is the latest in my link-listing series .&#160; Also check out my ASP.NET Tips, Tricks and Tutorials

  • Here is another good article about debugging application hangs: http://www.correlsense.com/cto-blog/top-3-reasons-an-application-hangs/

  • This is the first in a series of about 10 labs on .NET debugging. The lab will use a site called BuggyBits,

  • In December I blogged about a little tool that i wrote to analyze hangs in dumps , and i showed the following

  • We have talked about high memory quite a bit so thought it would be a good idea to move onto hangs.&#160;

Page 1 of 2 (19 items) 12
Leave a Comment
  • Please add 3 and 7 and type the answer here:
  • Post