This post will require some basic knowledge of windbg and the sos extension. For this I recommend looking at the following posts:
For more information on Exceptions in general and why they should be avoided I'd like to recommend this post:
I thought it was time to write another post on how to use windbg for troubleshooting. A lot of my time is spent locating exceptions in various web applications, so I thought this might be a good topic to cover. I've previously written a post specifically targeting OutOfMemoryExceptions, but I thought I should broaden the terms and make it a bit more general. There are two scenarios that are exceptionally common in my line of work:
In this post I'll cover how to investigate what exceptions have been thrown by an application, as well as how to use windbg and adplus to automatically gather specific information for us.
Okay, so you have a web application that you've been monitoring and you believe it is throwing a lot of exceptions. You've taken a dump of the process and you're ready to begin the investigation. Where do you start?
If your application is running under the .NET Framework 1.1 you can use the !dumpallexceptions command (!dae) to get a list of all the exceptions still on the heap. Now, remember that exceptions are managed objects, so they will eventually be garbage collected just like everything else. This means that when looking at the heap for exceptions you will only get the exceptions still in memory, not every exception thrown by the application since startup.
Anyway, if you run !dae you'll get a list of exceptions that looks like this:
As you can see, the !dae command lists all exception types found on the heap. If possible it also gives us a callstack for each exception. Please note, however, that exceptions of the same type do not necessarily share the same callstack. In the sample above you might have ~20 different callstacks leading to the 136 NullReferenceExceptions you see.
Unfortunately the !dae command is not available in version 2.0 of sos.dll. Still, it's quite easy to get (more or less) the same result by using the !dumpheap command. If we just type "!dumpheap -type Exception" we'll get a list of all objects with the string "Exception" in their class name. This is almost as good.
As you can see, this gives us almost the same information except for the callstacks.
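For reference, the commands involved look roughly like this. It is a minimal sketch, assuming a dump from a .NET 2.0 process where the runtime module is mscorwks; the -stat option collapses the result into one line per type, which is usually the easiest way to get an overview.

```
$$ Load the SOS version that matches the runtime found in the dump
.loadby sos mscorwks
$$ Summarize heap objects whose type name contains "Exception"
!dumpheap -type Exception -stat
```

Since -type matches on a substring, the list will also include non-exception types that merely have "Exception" in their name, such as exception handlers or event args.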
When analyzing data it is always good to know how to filter the information.
There are three exceptions that are created as soon as the worker process starts: System.ExecutionEngineException, System.OutOfMemoryException and System.StackOverflowException. This means that you will always see them on the heap even if they haven't been thrown at all.
So why are they created if we haven't thrown them? - Any guesses?
The answer is quite simple: If you run into a situation where you need to throw any of these exceptions you will probably be in a state where you can't create them. For example, you've run out of memory and are no longer able to allocate even the tiniest string. How would you then be able to allocate enough memory to create a new exception?
So, provided that there's still only one of each on the heap, you can most likely ignore these three exceptions. When it comes to ExecutionEngineExceptions and OutOfMemoryExceptions you will probably have a pretty good idea that this is what you're looking for, and finding a StackOverflowException isn't that hard. If you run !clrstack and find a callstack of 200+ lines you can be more or less certain that this is your problem.
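A quick way to look for a suspiciously deep callstack is to run !clrstack across all threads instead of one at a time. This is just a sketch of one possible approach; the output can be long in a busy worker process.

```
$$ List the managed threads and any exception currently associated with each
!threads
$$ Run !clrstack on every thread to spot unusually deep callstacks
~*e !clrstack
```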
Usually when you see a ThreadAbortException it is because you've called Response.Redirect.
Whenever you call Response.Redirect, this will also result in a call to Response.End. This will terminate the thread prematurely, resulting in a System.Threading.ThreadAbortException. See the callstack below for an example.
Obviously I'm not saying you should discard all System.Threading.ThreadAbortExceptions as irrelevant. Even if you have no reason to believe that ThreadAbortExceptions are a major concern it's always a good idea to investigate a few of them. Take a minute or two to confirm that there is an underlying call to Response.End caused by a Response.Redirect. Once you think that you have enough statistical data to imply that the ThreadAbortExceptions are caused by Redirects you can move on.
Okay, so say we want to look at the callstack for the System.Data.SqlClient.SqlException. First of all we need its address. As you might remember, this is easily obtained by using !dumpheap without the -stat option.
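As a sketch, narrowing the -type filter to the full class name keeps the listing short. The first column of each row in the output is the object address we are after.

```
$$ List every SqlException instance on the heap with its address, method table and size
!dumpheap -type System.Data.SqlClient.SqlException
```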
Now we have the address for the exception. In order to investigate the exception we could use !dumpobject, but there is another command I want to use first.
Running the !printexception command on the address of an exception will give us some neat information on the exception in question. Here's the result of running !printexception on the SqlException:
This is good stuff. The command was even able to generate a callstack for us. (This may not always be the case, since the callstack may very well have gone out of scope.)
I wouldn't say that !printexception is a complete replacement for !dumpobject when it comes to examining exceptions. !printexception fits the exception into a standard template, and since some exceptions contain more data than others we sometimes want to use !dumpobject as well. The SqlException has a field called _errors that holds a System.Data.SqlClient.SqlErrorCollection that we might want to look at. This is not in the listing above, so we need to use !dumpobject to look at it.
There we have it. Now we can continue using !dumpobject to investigate it even further if we wish.
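Put together, the sequence looks something like the sketch below. The values in angle brackets are placeholders, not real addresses; substitute the addresses from your own dump. !pe and !do are the short forms of !printexception and !dumpobject.

```
$$ <exception address> is a placeholder for an address taken from !dumpheap
!pe <exception address>
$$ Dump every field of the exception object, including _errors
!do <exception address>
$$ Follow the value shown for _errors to inspect the SqlErrorCollection
!do <_errors address>
```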
If we take a look at one of the HttpUnhandledExceptions we find that it has an inner exception. It is even nice enough to let us know how to find out more about it.
This means that the System.NullReferenceException mentioned led to the HttpUnhandledException we're currently investigating. So if we want to find the root cause we'll need to investigate the inner exception as well.
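There are two ways to follow the chain, sketched below with placeholder addresses. You can either run !pe again on the inner exception's address, or, assuming your version of sos supports it, pass the -nested option to have the inner exceptions printed along with the outer one.

```
$$ Option 1: print the inner exception directly, using the address from the outer listing
!pe <inner exception address>
$$ Option 2: print the outer exception together with its nested inner exceptions
!pe -nested <outer exception address>
```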
If you read my "Advanced commands" post a while back, you saw some examples of the .foreach command. This is a great command to use, for example if you want to see the callstack for all System.ArgumentNullExceptions. Instead of manually iterating through all the exceptions we can now dump them all at once, check their callstacks, etc.
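A minimal sketch of such a loop is shown below. The variable name ex is arbitrary, and the example assumes the -short option of !dumpheap is available in your version of sos; it makes !dumpheap print only the object addresses, so each token fed to the loop is an address.

```
$$ Run !pe on every ArgumentNullException on the heap, with a separator between each
.foreach (ex {!dumpheap -type System.ArgumentNullException -short}) { .echo ******** ; !pe ex }
```

Keep the spaces around the variable name inside the braces, since .foreach only substitutes the variable when it stands as a separate token.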
Well this post was even longer than usual.
I hope you found it of value, and I'll gladly listen to any comments, feedback or wishes on future topics you might have.
/ Johan