If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

.NET Debugging Demos Lab 1: Hang


This is the first in a series of about 10 labs on .NET debugging.  The lab will use a site called BuggyBits, and as the name suggests the bits are extremely buggy.

To get started, follow the setup instructions posted here.

I have a feeling that these hands-on labs may generate a lot of questions. I will try to answer questions posted in the comments, but I can’t promise to answer them all, so please feel free to answer other readers’ comments if you know the answer. Before posting, make sure that you have followed all the installation instructions.

Note: The questions in the labs (Q: … ) are only meant as an aid when troubleshooting the problem.  I will moderate any comments containing answers to these questions until I have released the lab review (about a week after the original lab post, to give everyone a shot at the labs without answers).

Feel free to comment on the lab format, good or bad, so that I know what works well and what doesn't for future labs.

Without further ado, here comes Lab 1:

Reproduce the problem:

1. Browse to http://localhost/BuggyBits/FeaturedProducts.aspx
This should take about 5 seconds to show; you can see the start time and execution time at the bottom of the page.

2. Open up 5 browsers, all browsing to this site and refresh them simultaneously

Note the execution time for each of them and make sure that the start time is pretty much the same on all (otherwise you probably didn’t run the reg file)
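
If you would rather script step 2 than juggle browser windows, the following is a minimal Python sketch that fires several requests at once and records when each one started and how long it took. It is not part of the original lab: the names `time_requests` and `fetch` are hypothetical, and the numbers it reports are client-side wall-clock times, not the start and execution times printed at the bottom of the page.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost/BuggyBits/FeaturedProducts.aspx"  # URL from step 1


def fetch(url):
    """Download the page and discard the body (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        response.read()


def time_requests(url, count=5, fetch=fetch):
    """Issue `count` requests concurrently.

    Returns one (start_offset, duration) pair per request, in seconds,
    with start_offset measured from the moment the batch was launched.
    """
    batch_start = time.perf_counter()

    def timed(_):
        started = time.perf_counter()
        fetch(url)
        finished = time.perf_counter()
        return (started - batch_start, finished - started)

    # One worker per request so they all start at about the same time.
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(timed, range(count)))
```

With the site running, `time_requests(URL)` returns five pairs; the start offsets should be close to each other, and the durations are what you compare against the execution times shown in the browsers.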

Q:  What are the execution times?

Q:  What is the CPU usage of the w3wp.exe process when reproing the problem? High or low CPU?

Q:  What are some potential reasons for a hang with these symptoms?

Get a memory dump:

1. Start a command window and browse to your debuggers directory.  Type the following command to prepare to take the dump, but don’t hit enter quite yet.
adplus -hang -pn w3wp.exe -quiet

2. Reproduce the problem either by browsing with 5 browsers as you did before or by stressing the site with tinyget, using the following command line:

tinyget -srv:localhost -uri:/BuggyBits/FeaturedProducts.aspx -threads:30 -loop:50

3. Hit enter in the adplus command window to take the memory dump while the requests are still executing.

Q: In adplus hang mode, what triggers the generation of the memory dump?


Q: What permissions do you need to take a memory dump of a process?


Q: Where are the dumps created? Hint: check the windbg help for adplus/hang mode

Open the dump in Windbg.exe

1. Open windbg and open the memory dump (.dmp file) with File/Open Crash dump.

2. Set up the symbol path (see Information and Setup Instructions for more info)

3. Load sos (see Information and Setup Instructions for more info)

Examine the stacks

1. Examine the native callstacks

~* kb 2000

2. Examine the .net callstacks

~* e !clrstack

Q:  Do you see any patterns or recognize any callstacks that suggest a thread is waiting for a synchronization mechanism?

Troubleshoot the hang

1. Determine the ID of the thread owning the lock
!syncblk

Q: What thread owns the lock?

Q: How many threads are waiting for the lock?
Hint: MonitorHeld = 1 for each owner and 2 for each waiter.

2. Pick one of the waiters (Hint: waiters will sit in AwareLock::Enter) and take a look at what it is doing.

~5s                          (move to thread 5, replace 5 with actual thread ID)
kb 2000                    (examine native stack)
!clrstack                    (examine .net stack)

Q: In which .net function is it waiting for the lock?

3. Determine what the owning thread is doing 

~5s                          (move to thread 5, replace 5 with actual thread ID)
kb 2000                    (examine native stack)
!clrstack                    (examine .net stack)

Q: Why is it blocking?

4. Examine the code for the .NET method owning the lock to verify your theory.
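
The MonitorHeld hint from step 1 is plain arithmetic, and it can be sketched in a few lines of Python. This is only an illustration of the rule stated in the hint (1 for the owner, plus 2 for each waiter); the function name `split_monitor_held` and the sample value 7 are made up for the example and do not come from the lab's dump.

```python
def split_monitor_held(monitor_held):
    """Return (owners, waiters) implied by a MonitorHeld value.

    Assumes the rule from the hint: 1 for the owner, plus 2 per waiter.
    """
    if monitor_held < 0:
        raise ValueError("MonitorHeld cannot be negative")
    owners = monitor_held % 2    # the odd bit is the owner
    waiters = monitor_held // 2  # every remaining pair is one waiter
    return owners, waiters


# Made-up value for illustration: 7 = 1 owner + 2 * 3 waiters
print(split_monitor_held(7))  # (1, 3)
```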

Hints

The following articles may be useful when troubleshooting this hang:

Things to ignore when debugging an ASP.NET Hang - Update for .NET 2.0

A Hang Scenario, Locks and Critical Sections

.NET Hang Debugging Walkthrough

Automated .NET Hang Analysis

Have fun debugging 
/Tess

  • Thanks for these tutorials - superb!

    I've currently got a website which is hanging on the public IP address - but not on the local address - so when I look at call stacks in w3wp then everything looks good (because it is working).

    But something is hanging :( And at the moment it seems to be linked to some dodgy requests coming in from the GBPlugIn User Agent from Brazil...

    Do you have any ideas where to look? It doesn't feel like I should be looking in ASP.Net code at the moment :/

    Any pointers appreciated - I've enjoyed learning things like windbg this week!

    Stuart

  • If you can't find anything in the stacks, then perhaps you might want to try looking at fiddler traces and seeing if the requests look weird in some way.

  • Thanks for the answer.

    I hope this isn't too stupid a question... but how do I get fiddler traces on the server?

    I can't even ask the suspect users to get Fiddler traces - as far as I can tell, the requests from the users are coming through their work network - and on that network something called WEBWASHED is doing something to the HTTP requests to "protect" the users

    I'll keep looking...

  • You would get fiddler traces on the client, so if that is not an option then you would have to find another way.

  • Thanks - I'll keep looking.

    If I can't think of anything else I'll have to think about installing WinPCaps on the server - just have to get permission to do that in production environment!

  • Hi Tess, i am working on a dump to determine the reason for High CPU hang...i ran the command ~* e !clrstack. i noticed the following

    System.Reflection.Assembly.InternalGetSatelliteAssembly(System.Globalization.CultureInfo, System.Version, Boolean) line on all the stacks...

    In the code, i do subscribe to AppDomain.CurrentDomain.AssemblyResolve and in the handler i load the dll from a specific folder...

    In the handler, i do ignore loading the dlls when the assembly name ends with .resources.. (..assemblyName.EndsWith("resources"))).

    i loaded sosex and ran the !dlk command...i dont see any deadlocks..

    Any suggestions?

    Thanks,

    -Sashi.
