Written by Jeff Dailey:
Hello NTDebuggers, one of the most important things to understand when kernel debugging a hung server is the output of !locks. The output can contain a lot of data, and it’s not always clear what is going on. To understand it better, I like to draw a visual representation of the resources involved and the threads that are blocked on those resources. Before we can do that, we need to know what to look for so we can document it in our diagram.
It’s a good idea to understand ERESOURCEs in general before jumping into !locks. The following MSDN article goes into great detail: http://msdn2.microsoft.com/en-us/library/aa490224.aspx
Simply put, you will typically see threads either with access to or trying to gain access to resources. If a thread has access to a resource it will be marked by <*>. Threads that have access to a resource can block other threads from gaining access to said resource.
You will see threads waiting for shared access. These threads do not have the <*> and are listed above the threads that are Waiting on Exclusive Access.
You will also see threads that are Waiting on Exclusive Access. These threads are typically blocked waiting for the threads that have access or ownership of the resource to release it.
Let’s take a look at one section of !locks output and annotate each thread section...
Resource @ 0x896d2a68 Shared 1 owning threads << This info is the ERESOURCE in question.
Contention Count = 15292 << The amount of contention for the object.
NumberOfSharedWaiters = 1 << This is self explanatory
NumberOfExclusiveWaiters = 39 << Number of exclusive waiters in the Ex Waiter List
Threads: 89bd1234-01<*> 896d2020-01 << We have two threads here: the shared owner 89bd1234, marked with <*>, and the shared waiter 896d2020.
Threads Waiting On Exclusive Access:
888ed020 87c036f8 885dc7a0 8bc538b0 << All of these threads are waiting on exclusive access.
88e8cda0 88796988 8905fda0 8974dc10
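If you save !locks output to a text file, the annotated fields above can be pulled into a small record before you start drawing. The following Python is a minimal sketch, not part of the debugger: the function name, the dictionary keys, and the regular expressions are my own assumptions, and the sample text is the excerpt above with the annotations removed. It assumes thread addresses appear as 8 to 16 hex digits.

```python
import re

# Assumed patterns for the line shapes shown in the excerpt above.
RESOURCE_RE = re.compile(r"Resource @ (0x[0-9a-fA-F]+)\s+(Shared|Exclusively)")
COUNTER_RE = re.compile(r"(.+?)\s*=\s*(\d+)$")
OWNED_ENTRY_RE = re.compile(r"\b([0-9a-fA-F]{8,16})-\d+(<\*>)?")
BARE_THREAD_RE = re.compile(r"\b[0-9a-fA-F]{8,16}\b")


def parse_resource_block(text):
    """Parse one 'Resource @ ...' block of !locks text into a dictionary."""
    res = {
        "address": None,
        "mode": None,
        "counters": {},
        "owners": [],
        "shared_waiters": [],
        "exclusive_waiters": [],
    }
    in_exclusive_list = False
    for raw in text.splitlines():
        line = raw.strip()
        header = RESOURCE_RE.search(line)
        if header:
            res["address"] = header.group(1)
            res["mode"] = "shared" if header.group(2) == "Shared" else "exclusive"
        elif line.startswith("Threads Waiting On Exclusive Access"):
            in_exclusive_list = True
        elif line.startswith("Threads:"):
            # Entries marked <*> have access; unmarked entries are shared waiters.
            for address, mark in OWNED_ENTRY_RE.findall(line):
                (res["owners"] if mark else res["shared_waiters"]).append(address)
        elif in_exclusive_list:
            res["exclusive_waiters"].extend(BARE_THREAD_RE.findall(line))
        else:
            counter = COUNTER_RE.match(line)
            if counter:
                res["counters"][counter.group(1)] = int(counter.group(2))
    return res


SAMPLE = """\
Resource @ 0x896d2a68 Shared 1 owning threads
Contention Count = 15292
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 39
Threads: 89bd1234-01<*> 896d2020-01
Threads Waiting On Exclusive Access:
888ed020 87c036f8 885dc7a0 8bc538b0
88e8cda0 88796988 8905fda0 8974dc10
"""

parsed = parse_resource_block(SAMPLE)
print(parsed["address"], parsed["mode"])
print("owners:", parsed["owners"])
print("shared waiters:", parsed["shared_waiters"])
print("exclusive waiters listed:", len(parsed["exclusive_waiters"]))
```

The excerpt lists only eight of the 39 exclusive waiters, so the parsed list is shorter than the NumberOfExclusiveWaiters counter.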
Note: the following output is completely fabricated, so alignment and variable names may not be valid.
The following is some sample output from !locks. In this scenario I document any ERESOURCE that has threads waiting on exclusive access. I draw the ERESOURCEs as nodes and show their relationship to the threads. The key point is to show the threads involved, the resources they own, and the resources they are blocked on or trying to get exclusive access to. Ultimately you need to work your way toward the head of the blocking chain to figure out what is keeping the entire chain of execution from moving forward.
In this case you will see that a filter driver called MYFILTER has passed an invalid object to KeWaitForSingleObject. As a result the thread blocked and all the other threads and processes related to those threads froze and could not move forward. The machine was completely hung.
1: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks......
Resource @ 0x8a50ee98 Shared 4 owning threads
Threads: 896856d0-01<*> 89686778-01<*> 896862d0-01<*> 89685da0-01<*>
KD: Scanning for held locks............................................................
Resource @ 0x896dabcd Exclusively owned
Threads: 886e5678-01<*>
KD: Scanning for held locks..
Resource @ 0x896d2a68 Shared 1 owning threads
Contention Count = 15292
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 39
Threads: 89bd1234-01<*> 896d2020-01
Threads Waiting On Exclusive Access:
888ed020 87c036f8 885dc7a0 8bc538b0
88d78020 87a7dda0 88b85b20 87b78020
8936e8a0 87dd7ae8 886005a0 88557890
887b3680 87cc2790 87dd4050 87fad8a0
88179580 87b53d70 87cd2775 88ba0578
87b676f8 8886b560 87f68388 89681da0
88952720 888833c0
KD: Scanning for held locks................
Resource @ 0x8959c790 Exclusively owned
Contention Count = 4827
NumberOfExclusiveWaiters = 35
Threads: 89bd1234-01<*>
Threads Waiting On Exclusive Access:
883e3aa0 88873020 88290020 87f5f588
888154f0 88bd4b28 88cbc448 884bd6c8
881e5da0 8935f518 87bcc978 8889e020
88cb3020 88c92178 87cf9020 88daaac0
89376020 88fe9020 887b29d0 87b6f7f0
87e12020 87b4f498 894ee730 88810020
881a8020 87dd55f0 888d3020 885f6da0
881f7da0 880742e8 87a31b50 879ffb50
88451da0 88646da0 8833a020
KD: Scanning for held locks.....................................................
Resource @ 0x88ce81ff Exclusively owned
Contention Count = 108
Threads: 87ad6f78-01<*>
KD: Scanning for held locks......................................
Resource @ 0x87da48fb Exclusively owned
Threads: 87bddda0-01<*>
KD: Scanning for held locks.
Resource @ 0x87df455c Exclusively owned
Contention Count = 2
NumberOfExclusiveWaiters = 2
Threads Waiting On Exclusive Access:
89bd1234 87ad6c68
KD: Scanning for held locks............................................
Resource @ 0x87fcfe30 Shared 1 owning threads
Threads: 8a60f8a3-01<*> *** Actual Thread 8a60f8a0
KD: Scanning for held locks...........
Resource @ 0x880ef1cd Shared 1 owning threads
Threads: 8a60c3af-01<*> *** Actual Thread 8a60c3a0
27044 total locks, 9 locks currently held
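The diagram described earlier amounts to a small graph: a waiting thread points at the resource it is blocked on, and that resource points at its owner. The Python below is a rough sketch of walking that graph toward the head of the blocking chain. The ownership and waiter tables are transcribed by hand from the fabricated output above, with the long waiter lists abbreviated. The function and variable names are hypothetical. It also simplifies by following only the first listed owner of a resource.

```python
# Owners and waiters transcribed from the sample !locks output above.
# Only resources with waiters are included; waiter lists are abbreviated.
OWNERS = {
    "0x896d2a68": ["89bd1234"],
    "0x8959c790": ["89bd1234"],
    "0x87df455c": [],  # the sample output shows no owner line for this resource
}
WAITERS = {
    "0x896d2a68": ["896d2020", "888ed020", "87c036f8"],
    "0x8959c790": ["883e3aa0", "88873020"],
    "0x87df455c": ["89bd1234", "87ad6c68"],
}


def blocking_chain(thread, owners, waiters):
    """Follow thread -> resource it waits on -> owner, until the trail ends."""
    chain = [thread]
    seen = {thread}
    current = thread
    while True:
        # Find the resource this thread is waiting on, if any is listed.
        resource = next((r for r, ws in waiters.items() if current in ws), None)
        if resource is None:
            break  # not waiting on any listed resource: head of the chain
        chain.append(resource)
        holders = owners.get(resource, [])
        if not holders:
            break  # no owner listed for this resource
        current = holders[0]  # simplification: follow the first owner only
        chain.append(current)
        if current in seen:
            break  # cycle detected, which would indicate a deadlock
        seen.add(current)
    return chain


chain = blocking_chain("888ed020", OWNERS, WAITERS)
print(" -> ".join(chain))
```

With these tables the walk from 888ed020 passes through 0x896d2a68 and its owner 89bd1234, then stops at 0x87df455c because the sample lists no owner for it. That makes 89bd1234 and the resource it is waiting on the natural place to look next.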
Good luck and happy debugging.
Is this particular scenario (mass blocking on a bogus lock) the sort of thing "!analyze -hang" would find out for you without manual decoding? (Not that it's not useful to understand "!locks", mind you.)
great article, thanks for the detailed explanation.