In part 1 we discussed the steps you need to get ready to start debugging with WinDbg. In this part we will walk through some steps and commands that might help you troubleshoot a specific problem.

The first question is: what are you trying to debug?

Troubleshooting a hang

 

The first step is to check whether this is a high-CPU or a low-CPU hang:

 

  • Open Task Manager (Start->Run->taskmgr).
  • Check the CPU usage of your process. If it is consistently high, investigate a high-CPU hang; otherwise investigate a low-CPU hang. (A command-line alternative is sketched below.)
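If you prefer the command line, typeperf can sample the same counter. This is just a sketch; MyApp is a placeholder for the name of your process (without .exe):

typeperf "\Process(MyApp)\% Processor Time" -si 1 -sc 10

This samples the process's CPU usage once a second for ten samples.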

 

High CPU hangs

 

1-    Are we spending too much time in garbage collection? Check the following perf counters in perfmon:

  • .NET CLR Memory \ % Time in GC
  • .NET CLR Memory \ Allocated Bytes/sec

       Are these too high? If so, you can follow this post.

       Also, detailed information about .NET perf counters is available here.
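You can also sample these counters without opening the perfmon UI; a quick sketch with typeperf (again, MyApp is a placeholder for your process's instance name):

typeperf "\.NET CLR Memory(MyApp)\% Time in GC" "\.NET CLR Memory(MyApp)\Allocated Bytes/sec" -si 1 -sc 10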

 

2-      If that's not the case, the next question is which thread is consuming the CPU. For this we use the !runaway command:

 0:000> !runaway
 User Mode Time
  Thread       Time
   1:358       0 days 0:00:47.408
   2:150       0 days 0:00:03.495
   0:d8        0 days 0:00:00.000
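By default !runaway reports user-mode time only; if you also want kernel and elapsed time, you can ask for all three:

 0:000> !runaway 7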

 

Ok, now we know that both threads 1:358 and 2:150 have consumed a lot of CPU time, but which one is still spinning? Let the process run for a few more seconds:

0:000> g


Or by pressing F5. After a few seconds we try again:

 0:000> !runaway
 User Mode Time
  Thread       Time
   1:358       0 days 0:00:47.408
   2:150       0 days 0:00:06.395
   0:d8        0 days 0:00:00.000

 

We can easily tell that thread 2:150 is consuming more CPU than it was initially – so we know which thread is busy. Now we can switch the context to this thread and start debugging it:

0:000> ~2s


And then dump the managed stack:

0:000> !clrstack
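To see the unmanaged side of the same thread, a regular native stack trace works as well:

0:000> kb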

  

What to look for?

 

  • Infinite loops
  • Intensive IO operations
  • Huge allocations to the Large Object Heap (a sketch of this pattern follows the list)
  • Any other CPU-intensive operations
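As a rough illustration of the first and third points, here is a minimal, hypothetical C# sketch (not taken from any real application) of the kind of code that shows up as both high CPU and high % Time in GC: a tight loop that keeps allocating arrays large enough (over ~85 KB) to land on the Large Object Heap.

using System.Collections.Generic;

class LohPressure
{
    static void Main()
    {
        var buffers = new List<byte[]>();
        while (true)                        // shows up as an "infinite loop" on the stack
        {
            buffers.Add(new byte[100000]);  // ~100 KB, allocated straight on the LOH
            if (buffers.Count > 1000)
                buffers.Clear();            // keeps running, but the GC never gets a break
        }
    }
}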

 

Useful commands and sequences

 

  • ~*e !clrstack -> show all managed stacks
  • ~*k -> show all stacks (unmanaged as well)
  • !bpmd mscorlib.dll System.Threading.Thread..ctor -> place a managed breakpoint on the Thread constructor (see the sketch below)
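For example, if you suspect the process is burning CPU by constantly creating threads, a rough sequence (a sketch, not a full session) is to set the breakpoint, resume, and dump the managed stack each time it hits:

 0:000> !bpmd mscorlib.dll System.Threading.Thread..ctor
 0:000> g
 0:000> !clrstack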

Low CPU hangs


Critical section

Check the stack traces for all your threads. Are you seeing a pattern similar to this?

0:000> ~* kb50

1  Id: 6fc.3d8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr  Args to Child
005afc14 7c90e9c0 7c91901b 000007d4 00000000 ntdll!KiFastSystemCallRet
005afc18 7c91901b 000007d4 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
005afca0 7c90104b 004a0638 00430b7f 004a0638 ntdll!RtlpWaitForCriticalSection+0x132
005afca8 00430b7f 004a0638 005afe6c 005afe78 ntdll!RtlEnterCriticalSection+0x46

 


 

  • The !locks command can help us identify the current owner of this critical section. When used without parameters, it displays the list of critical sections that are currently held by the application's threads (a short sequence follows below).
  • !dlk (from the SOSEX extension) attempts to detect deadlocks automatically.
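A hedged sequence for following the chain – the thread id 1a3c below is hypothetical; use the OwningThread value that !locks actually reports:

 0:000> !locks
 0:000> ~~[1a3c]s
 0:000> kb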

 

Locks and Deadlock

Alternatively, you might be hitting a deadlock – try the following command:

0:000> !syncblk


Index SyncBlock          MonitorHeld Recursion Owning Thread Info              SyncBlock Owner
  146 0000000005a44688            99         1 000000000588cbf0 13e0  38       000000017f71c4f8 System.Object

 

The first thing to look at is the MonitorHeld column: the owner adds 1 to the count and every waiter adds 2, so in this case we have 1 owner and (99-1)/2 = 49 waiters.

To see the stack of the owner, run ~38s followed by !clrstack; 38 is the debugger thread number shown in the Owning Thread Info columns of the output above.
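As background, here is a minimal, hypothetical C# sketch of the classic lock-ordering bug that produces exactly this picture: two threads take the same two monitors in opposite order, so each ends up owning one monitor while waiting on the other, which is the state !syncblk and !dlk expose.

using System.Threading;

class LockOrderDeadlock
{
    static readonly object A = new object();
    static readonly object B = new object();

    static void Main()
    {
        var t1 = new Thread(() => { lock (A) { Thread.Sleep(100); lock (B) { } } });
        var t2 = new Thread(() => { lock (B) { Thread.Sleep(100); lock (A) { } } });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();   // never returns: each thread waits for the lock the other holds
    }
}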

Useful commands and sequences

  • ~*e !clrstack -> show all managed stacks
  • ~*k -> show all stacks (unmanaged as well)
  • !dso -> dump stack objects
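For example, after switching to the owning thread you can dump its stack objects, or dump the locked object directly using the SyncBlock Owner address from the !syncblk output above:

 0:000> ~38s
 0:000> !dso
 0:000> !do 000000017f71c4f8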


Waiting on SQL locks


Here is another interesting deadlock I hit today – after doing ~*e !clrstack I found that all my threads were stuck here:

 

OS Thread Id: 0xb54 (35)
Child SP         IP               Call Site
000000002ccdca60 00000000772a78ca [NDirectMethodFrameStandalone: 000000002ccdca60] <Module>.SNIReadSync(SNI_Conn*, SNI_Packet**, Int32)
000000002ccdca20 000007ff00aeaf91 SNINativeMethodWrapper.SNIReadSync(System.Runtime.InteropServices.SafeHandle, IntPtr ByRef, Int32)
000000002ccdcb20 000007ff00aeaafc 000000002ccdcbc0 000007ff00aea870 System.Data.SqlClient.TdsParserStateObject.ReadNetworkPacket()
000000002ccdcc10 000007ff00aedfd4 System.Data.SqlClient.TdsParserStateObject.ReadByte()
000000002ccdcc40 000007ff00aed4fd System.Data.SqlClient.TdsParser.Run(System.Data.SqlClient.RunBehavior, System.Data.SqlClient.SqlCommand, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.BulkCopySimpleResultSet, System.Data.SqlClient.TdsParserStateObject)
000000002ccdcd10 000007ff00af5538 System.Data.SqlClient.SqlDataReader.ConsumeMetaData()
000000002ccdcd60 000007ff00af5342 System.Data.SqlClient.SqlDataReader.get_MetaData()
000000002ccdcdb0 000007ff00af4bf1 System.Data.SqlClient.SqlCommand.FinishExecuteReader(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.RunBehavior, System.String)
000000002ccdce20 000007ff00af3c0e System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, Boolean)
000000002ccdcef0 000007ff00af23a0 System.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, System.String, System.Data.Common.DbAsyncResult)
000000002ccdcf90 000007ff00af219c System.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, System.String)
000000002ccdcfd0 000007ff00af1df5 System.Data.SqlClient.SqlCommand.ExecuteScalar()

 

 This is very interesting, why are all the threads stuck executing a SQL query?

Upon doing !dso, I found that this statement was being executed:

sp_getapplock 

So, interestingly, this time we were blocked waiting on a SQL lock (taken with sp_getapplock on the server), not a lock inside our own process.
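For context, here is a minimal, hypothetical C# sketch (the resource name and connection string are placeholders) of the kind of code that produces this picture: every request calls sp_getapplock on the same resource, so once one caller holds the lock, all the other threads block inside ExecuteScalar exactly as in the stacks above.

using System.Data;
using System.Data.SqlClient;

class AppLockExample
{
    static void DoSerializedWork(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("sp_getapplock", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.CommandTimeout = 0;                                   // no client-side timeout: waits indefinitely
            cmd.Parameters.AddWithValue("@Resource", "MyResource");   // hypothetical lock name
            cmd.Parameters.AddWithValue("@LockMode", "Exclusive");
            cmd.Parameters.AddWithValue("@LockOwner", "Session");     // lock is tied to the session, not a transaction
            conn.Open();
            cmd.ExecuteScalar();   // every thread waits here until SQL Server grants the lock
            // ... do the serialized work, then call sp_releaseapplock ...
        }
    }
}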



Part 3 is coming soon with details on how to debug crashes, high CPU, and memory problems!