Welcome to MSDN Blogs Sign in | Join | Help

Hello. It’s Ryan again with the second installment of my stack depletion walkthrough.  Part 1 of this blog covered the initial analysis of a kernel memory dump captured due to a Stop 0x7f EXCEPTION_DOUBLE_FAULT.  Our initial analysis revealed that kv was not able to provide us with a useful stack backtrace. Background information relating to Task States and Double Faults were also covered. If you haven’t yet reviewed this blog, you can find it at Part 1. 

 

Previously, in part one of this blog, we reviewed the memory dump of a stop 0x7f EXCEPTION_DOUBLE_FAULT  and found that we were not provided with a valid stack backtrace. Without a valid stack backtrace, we were unable to identify what depleted the stack. Whenever I am in the situation where kv is not able to properly walk the stack, my next course of action is to manually dump out the memory within the stack range using the dps command.

 

In the previous blog, we ran the !thread command to obtain the stack base and limit to view the thread in our fictiously named process, StackHog.exe.

 

!thread   

Owning Process            874c6800       Image:         StackHog.exe  

...

Base b8ae9000 Limit b8ae6000

 

Let’s pass this address range to the dps command. The dps command will display the contents of memory and treat the data dumped out as pointers. It will then try to match up these pointer addresses with symbol information in the same manner that kv does.

 

dps  b8ae6000 b8ae9000

.

.

.

 

omitting extremely long output

 

I'll spare the lengthy dps output here and instead describe what was observed.  When dumping the stack in this manner, I noticed that one product’s drivers were appearing on the stack over and over again. This product also provided the executable that was running as the current process (I obtained this information from the !thread command that I ran to obtain the limit and base values).   Running lm kv m and !lmi against these drivers verified that they were in fact all from the same product (StackHog.exe). 

 

Using dps in this manner will often provide a good idea of what may have been using the stack. However, there are a few problems that may affect the reliability of this method. Dps simply dumps out whatever data is present anywhere on the stack.  One problem is there can often be trash left over on the stack from previous activity that is unrelated to our present stack. Also, there may be trap handlers and other data that shows up to further complicate the call flow picture. In addition, the information will not clearly show the order of calls like a kv stack backtrace output will. The output can also be very lengthy since the entire range is displayed. On a stack where each call frame averages 8 frames, you may be only interested in the symbols from the return address in each frame. I have observed some stack frames where dps may resolve two or three symbols for that frame.  All of this means that while dps is a good tool and often useful, on occasion some of the output from dps may be confusing instead of revealing.  Sometimes when multiple unrelated code from various vendors (including Microsoft) are on the stack, you may need a more complete understanding of the true call flow.  For these reasons, whenever possible, I want to be able to see the kv stack backtrace to display the call flow that led up to the crash. 

 

Let's see if we can help kv to reconstruct the stack. Since the stack backtrace did not display properly on its own, we will need to help out a little to get the backtrace started. If we can provide  good starting point  values, the debugger can often dump out the rest of the stack. Also, when dealing with an overflowed stack, we will want to use the kf option instead of kv.  The kf command will give us a better idea of how much space a driver and the calls it made are occupying on the stack.  It does this by listing the amount of stack space between the stack frames. If you take these stack usage values as fact, you are assuming that the stack was backtraced correctly. Sometimes this isn’t the case when you don’t have symbols for all the code on the stack.  So to investigate the stack usage, we need to get a good backtrace. The k command accepts parameters that will help it display the stack using address values that you provide. We need to provide a few addresses to any of the various versions of the k command by using the equal sign as I will demonstrate shortly. We need to provide the BasePtr, the StackPtr, and the InstructionPtr.   For more information on the k command, refer to the msdn documentation .

Since the real problem here is that various modules have used up all the stack, we simply need to dump out most of that stack to see where most of the usage is. It isn't important that we identify what was going on when we died or what was at the very top of the stack. We just want to see as much of the stack as we can dumped out in an easy to understand format by using kf. 

 

 So where can we find these values? I'll start by dumping out the stack using the address of the stack limit. Let's dump out this area of memory. I'll start by clearing the screen so that I can examine the output

 

3: kd>.cls

 

3: kd> dps  b8ae6000 b8ae9000

 

                               <omitting output>

 

I wasn’t able to locate any valid patterns in the very top of the stack. This was probably due to lack of symbols and valid stack related register addresses combined with FPO or other optimizations. Things started making sense further down the stack in the stack range that I have listed below. Observe the patterns I have highlighted. Note that the symbol output is listed next to the return addresses. I will try to display my stack starting at this point. B8ae6100 is very close to the stack limit so we won’t be missing much of the output. We will get most of the stack output that we need to see if we start here.

 

 

b8ae60ec  b8ae6100    <----------------------------------points to the next stack frame pointer (saved ebp) below

b8ae60f0  8081df65 nt!IofCallDriver+0x45 <---possible Return address. Start here.

b8ae60f4  8763f718

b8ae60f8  87758bd8

b8ae60fc  8b4abb00

b8ae6100  b8ae6128     This points to the next frame

b8ae6104  f7a2ec45 fltmgr!FltpDispatch+0x6f <----possible return address

b8ae6108  8b4abb00

b8ae610c  87758bd8

b8ae6110  00000000

b8ae6114  89751350

b8ae6118  00000080

b8ae611c  00000000

b8ae6120  b8ae6130

b8ae6124  8084cff9 nt!MmIsAddressValid+0xf

b8ae6128  b8ae613c

b8ae612c  8081df65 nt!IofCallDriver+0x45

b8ae6130  8765b6a8

b8ae6134  87758bd8

b8ae6138  89751350

b8ae613c  b8ae6144

b8ae6140  b958e196 BossHog+0x1196

b8ae6144  b8ae61c4

b8ae6148  b958f4bc BossHog!StackEater+0x80c

 

I'll start by identifying all of the stack address in the hopes of finding any candidates for saved stack frame pointers (saved ebp values). Based on the limit and base, these addresses will all start with b8ae. The next digit will be 6, 7, 8, or 9.  I have identified all of the possible values. Next, I looked for patterns of addresses that are pointing to other pointers below them to create a chain.  Next, I will look for possible return addresses. They should appear on the line right under the saved EBP value.

 

I'll try dumping the stack using the values from the first possible frame. The format is

 

kf=BasePtr, StackPtr, InstructionPtr

 

So now, it appears that we have a pattern, let me pass in these numbers to the kf command. If this works, then the stack walker will dump the stack out from this point down up to the maximum that you have set by using the .kframes command. I’ll start by raising the number of stack frames displayed using .kframes.

 

3: kd> .kframes 200

 

Default stack trace depth is 0n512 frames

 

Now, lets dump the stack out using kf. Kf will display the number of bytes of stack space used in hex on each line before it displays the  frame.

 

3: kd> kf=b8ae6100 b8ae60ec 8081df65

  Memory  ChildEBP RetAddr 

          b8ae6100 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae6128 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae613c b958e196 nt!IofCallDriver+0x45

WARNING: Stack unwind information not available. Following frames may be wrong.

        8 b8ae6144 b958f4bc BabyHog+0x1196

       80 b8ae61c4 8081df65 BabyHog!HogFarm+0x88c

       14 b8ae61d8 b80c18a6 nt!IofCallDriver+0x45

       84 b8ae625c b80cf367 BossHog+0x78a6

       10 b8ae626c b80cf3b7 BossHog+0x15367

       28 b8ae6294 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae62a8 b76fbcf7 nt!IofCallDriver+0x45

       84 b8ae632c b7709ae6 BossHog01+0x7cf7

       10 b8ae633c b7709b36 BossHog01+0x15ae6

       28 b8ae6364 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae6378 8081e4ed nt!IofCallDriver+0x45

       18 b8ae6390 8085114a nt!IoPageRead+0x109

       9c b8ae642c 8085ea66 nt!MiDispatchFault+0xece

       84 b8ae64b0 8088c798 nt!MmAccessFault+0x89e

        0 b8ae64b0 808b64a6 nt!_KiTrap0E+0xdc

       c8 b8ae6578 bae5af2d nt!CcMapData+0x8c

       20 b8ae6598 bae5849b Ntfs!NtfsMapStream+0x4b

       74 b8ae660c bae5adf0 Ntfs!NtfsReadMftRecord+0x86

       38 b8ae6644 bae5afac Ntfs!NtfsReadFileRecord+0x7a

       38 b8ae667c bae19903 Ntfs!NtfsLookupInFileRecord+0x37

      110 b8ae678c bae1a6c4 Ntfs!NtfsLookupAllocation+0xdd

      1d0 b8ae695c bae1a87c Ntfs!NtfsPrepareBuffers+0x25d

      1dc b8ae6b38 bae1b1a6 Ntfs!NtfsNonCachedIo+0x1ee

       ec b8ae6c24 bae1b0c9 Ntfs!NtfsCommonRead+0xaf5

      1ac b8ae6dd0 8081df65 Ntfs!NtfsFsdRead+0x113

       14 b8ae6de4 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae6e0c 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae6e20 b958e196 nt!IofCallDriver+0x45

        8 b8ae6e28 b958f4bc BabyHog+0x1196

       80 b8ae6ea8 8081df65 BabyHog!HogFarm+0x88c

       14 b8ae6ebc b80c18a6 nt!IofCallDriver+0x45

       84 b8ae6f40 b80cf367 BossHog+0x78a6

       10 b8ae6f50 b80cf3b7 BossHog+0x15367

       28 b8ae6f78 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae6f8c b76fbcf7 nt!IofCallDriver+0x45

       84 b8ae7010 b7709ae6 BossHog01+0x7cf7

       10 b8ae7020 b7709b36 BossHog01+0x15ae6

       28 b8ae7048 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae705c 8081e4ed nt!IofCallDriver+0x45

       18 b8ae7074 8085114a nt!IoPageRead+0x109

       9c b8ae7110 8085ea66 nt!MiDispatchFault+0xece

       84 b8ae7194 8088c798 nt!MmAccessFault+0x89e

        0 b8ae7194 808b64a6 nt!_KiTrap0E+0xdc

       c8 b8ae725c bae5af2d nt!CcMapData+0x8c

       20 b8ae727c bae5d9d5 Ntfs!NtfsMapStream+0x4b

       30 b8ae72ac bae5f5e4 Ntfs!ReadIndexBuffer+0x8f

      174 b8ae7420 bae5f786 Ntfs!NtfsUpdateFileNameInIndex+0x62

       fc b8ae751c bae5f8c6 Ntfs!NtfsUpdateDuplicateInfo+0x2b0

      208 b8ae7724 bae5c8d9 Ntfs!NtfsCommonCleanup+0x1e82

      170 b8ae7894 8081df65 Ntfs!NtfsFsdCleanup+0xcf

       14 b8ae78a8 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae78d0 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae78e4 b958e196 nt!IofCallDriver+0x45

        8 b8ae78ec b958f472 BabyHog+0x1196

       80 b8ae796c 8081df65 BabyHog!HogFarm+0x842

       14 b8ae7980 b80c18a6 nt!IofCallDriver+0x45

       84 b8ae7a04 b80cf367 BossHog+0x78a6

       10 b8ae7a14 b80cf3b7 BossHog+0x15367

       28 b8ae7a3c 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae7a50 b770a8ac nt!IofCallDriver+0x45

       28 b8ae7a78 b76fb994 BossHog01!DEVICEDISPATCH::LowerDevicePassThrough+0x48

       7c b8ae7af4 b76fbbaf BossHog01+0x7994

       94 b8ae7b88 b7709ae6 BossHog01+0x7baf

       10 b8ae7b98 b7709b36 BossHog01+0x15ae6

       28 b8ae7bc0 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae7bd4 808f9732 nt!IofCallDriver+0x45

       30 b8ae7c04 80934bac nt!IopCloseFile+0x2ae

       30 b8ae7c34 809344ad nt!ObpDecrementHandleCount+0xcc

       28 b8ae7c5c 80934546 nt!ObpCloseHandleTableEntry+0x131

       44 b8ae7ca0 80934663 nt!ObpCloseHandle+0x82

       10 b8ae7cb0 8088978c nt!NtClose+0x1b

        0 b8ae7cb0 8082e811 nt!KiFastCallEntry+0xfc

       7c b8ae7d2c b8d8ec2d nt!ZwClose+0x11

       50 b8ae7d7c b8d8ede5 MamaHog+0x5c2d

       54 b8ae7dd0 b8d8fa85 MamaHog+0x5de5

      164 b8ae7f34 b8d917fe MamaHog+0x6a85

       40 b8ae7f74 b8d8d22a MamaHog+0x87fe

      2c4 b8ae8238 b958ecdf MamaHog+0x422a

       24 b8ae825c b958eee0 BabyHog!HogFarm+0xaf

       34 b8ae8290 8081e103 BabyHog!HogFarm+0x2b0

       30 b8ae82c0 bae1a22c nt!IopfCompleteRequest+0xcd

       10 b8ae82d0 bae5c00a Ntfs!NtfsCompleteRequest+0xc8

      104 b8ae83d4 8081df65 Ntfs!NtfsFsdCreate+0x48c

       14 b8ae83e8 f7a3c458 nt!IofCallDriver+0x45

       2c b8ae8414 8081df65 fltmgr!FltpCreate+0xe4

       14 b8ae8428 b958e196 nt!IofCallDriver+0x45

        8 b8ae8430 b958f71c BabyHog+0x1196

       60 b8ae8490 8081df65 BabyHog!HogFarm+0xaec

       14 b8ae84a4 b80d012b nt!IofCallDriver+0x45

       28 b8ae84cc b80c1862 BossHog!DEVICEDISPATCH::LowerDevicePassThrough+0x48

       8c b8ae8558 b80cf367 BossHog+0x7862

       10 b8ae8568 b80cf3b7 BossHog+0x15367

       28 b8ae8590 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae85a4 b76f9639 nt!IofCallDriver+0x45

       4c b8ae85f0 b76fbb42 BossHog01+0x5639

       94 b8ae8684 b7709ae6 BossHog01+0x7b42

       10 b8ae8694 b7709b36 BossHog01+0x15ae6

       28 b8ae86bc 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae86d0 808f8f71 nt!IofCallDriver+0x45

       e8 b8ae87b8 80937942 nt!IopParseDevice+0xa35

       80 b8ae8838 80933a76 nt!ObpLookupObjectName+0x5b0

       54 b8ae888c 808eae25 nt!ObOpenObjectByName+0xea

       7c b8ae8908 808ec136 nt!IopCreateFile+0x447

       48 b8ae8950 b76ff4ae nt!IoCreateFileSpecifyDeviceObjectHint+0x52

       9c b8ae89ec b76ff0e6 BossHog01+0xb4ae

       58 b8ae8a44 b7685f8b BossHog01+0xb0e6

       50 b8ae8a94 b76867a3 daddyHog+0x3f8b

       28 b8ae8abc b76f66e7 daddyHog+0x47a3

       38 b8ae8af4 b76f6e76 BossHog01+0x26e7

      124 b8ae8c18 b770a0d5 BossHog01+0x2e76

       44 b8ae8c5c 808f5e2f BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x5e7

       a4 b8ae8d00 808eed08 nt!IopXxxControlFile+0x255

       34 b8ae8d34 8088978c nt!NtDeviceIoControlFile+0x2a

        0 b8ae8d34 7c8285ec nt!KiFastCallEntry+0xfc

          0335e534 00000000 0x7c8285ec

 

Problem solved, we have our stack. Please note that the technique demonstrated above is only applicable to an x86 platform. The x64 architecture is completely different. For more information, please refer to Trey Nash’s previous blog

 

The same drivers I observed previously using the dps output are listed; however, it’s much easier to follow the call flow this time. Let’s see how much stack these hogs are using:

 

3: kd> ? 0x8+0x80+0x84+0x10+0x28+0x84+0x10+0x28+0x8+0x80+0x84+0x10+0x28+0x84+0x10+0x8+0x80+0x84+0x10+0x28+0x28+0x7c+0x94+0x10+0x28+0x50+0x54+0x164+0x40+0x2c4+0x24+0x34+0x8+0x60+0x28+0x8c+0x10+0x28+0x9c+0x58+0x50+0x28+0x38+0x124+0x44

Evaluate expression: 4176 = 00001050

 

How many kb is this?

 

3: kd> ? 4176/1024

Evaluate expression: 4 = 00000004

 

I would also like to dump out one of the functions that is using a large amount of stack space to show you how you can dig deeper into problems like this. Sometimes this level of granularity is needed when a vendor is attempting to optimize their code by showing them what caused their large allocations to occur.

 

Let’s review the frame listed below that used 0x2c4 bytes of stack space

 

       40 b8ae7f74 b8d8d22a MamaHog+0x87fe

      2c4 b8ae8238 b958ecdf MamaHog+0x422a

       24 b8ae825c b958eee0 BabyHog!HogFarm+0xaf

 

I have underlined the return address of this stack frame. This is the address right after the instruction where the call to MamaHog took place. After MamaHog would have completed, execution would of course have continued with the line of code right after the call. To view the call to MamaHog, let’s unassemble backwards one instruction to see the call being made.

 

3: kd> ub b958ecdf  L1

BabyHog!HogFarm+0xa8:

             

call    dword ptr BabyHog!HogFarm +0x7460 (b9596090)   <-stores the location we called

 

Let’s dump out this location to see what we actually called. The code dereferences this location to make the actual call.

 

3: kd> dps b9596090 L1

b9596090  b8d8c0d0 BabyHog +0x30d0<-----highlighted address is the function called

 

3: kd> uf b8d8c0d0

b8d8c0d0 55              push    ebp               

b8d8c0d1 8bec            mov     ebp,esp

b8d8c0d3 6aff            push    0FFFFFFFFh

b8d8c0d5 6840e2d9b8      push    offset BabyHog +0x15240 (b8d9e240)

b8d8c0da 6808b2d9b8      push    offset BabyHog +0x12208 (b8d9b208)

b8d8c0df 64a100000000    mov     eax,dword ptr fs:[00000000h]

b8d8c0e5 50              push    eax

b8d8c0e6 64892500000000  mov     dword ptr fs:[0],esp

b8d8c0ed 81c470fdffff    add     esp,0FFFFFD70h         

 

To subtract 656, we are adding -656 to esp to bump the stack up. This is where the compiler is allocating stack space for storage of local variables. I am guessing there were multiple structures being allocated directly on the stack.  If the programmer had instead called ExAllocatePoolWithTag to obtain memory, we could have instead only stored the pointers to this memory on the stack. Simply using the stack for storage is of course faster than calling out to get memory. The developer should balance the performance needs of the code with the need to conserve the stack space which is a limited resource.

 

3: kd> .formats 0x0FFFFFD70

Evaluate expression:

  Hex:     fffffd70

  Decimal: -656

 

So this one instruction raised the stack more than .5 kb all by itself. The function made a bunch of other pushes (some of which are listed above) which when combined with the return address pushed by the call resulted in 708 (0x2c4 was listed by kf above) bytes of stack space to be allocated by this one function.

 

3: kd> .formats 2c4

Evaluate expression:

  Hex:     000002c4

  Decimal: 708

 

Two allocations like this will use up almost 1.5kb of the 12k stack space. By itself, this isn’t necessarily an issue; however, you can see from the output above how liberal stack usage by multiple calls from stacked up drivers can quickly add up. So what have we discovered here? Without even considering the fact that this products drivers may also be responsible for some of the stack space used by calls that they may have initiated, they have used at least 4k of the 12k stack. Also, the full 12k of stack space is not available for driver use. This is because the operating system also requires stack space for such overhead as the I/O operation, the file system components, thread startup, etc. The customer removed the product which prevented further bugchecks while the vendor was being engaged to assist or provide leaner hogs.

 

Summary

 

This two part blog has covered CPU task states, stack overflows, x86 stack reconstruction techniques, and examining functions to observe the stack allocations taking place. Hopefully, this blog will help you to understand what went wrong the next time that you encounter a Stop 0x7f (EXCEPTION_DOUBLE_FAULT).

 

You don’t have to wait until you encounter an UNEXPECTED_KERNEL_MODE_TRAP (7f)  Arg1: 00000008, EXCEPTION_DOUBLE_FAULT to get familiar with this. Open up any dump file and see if you can find the limit and base. Then observe the values of your ebp and esp registers. If you are feeling truly geeky, dump the stack range out, identify patterns, and try passing various values to kf.

 

Please check out our previous blog post on this topic:

http://blogs.msdn.com/ntdebugging/archive/2008/02/01/kernel-stack-overflows.aspx

 

Keep in mind that this two part blog has only discussed stop 0x7f bugchecks where Arg1 lists value 0x8 EXCEPTION_DOUBLE_FAULT. There are other causes for a 0x7f bugcheck. For more information, please refer to:

 

314102  General Causes of STOP 0x0000007F Errors

 

 

Bug Check 0x7F: UNEXPECTED_KERNEL_MODE_TRAP

 

 

References:

“Windows Internals, Fifth Edition” By Mark E. Russinovich, David A. Solomon, with Alex Ionescu

Chapter 9, page 786, “Kernel Stacks”

Share this post :

My name is Ryan Mangipano (ryanman) and I am a Sr. Support Escalation Engineer at Microsoft.  This two part blog will consist of a complete walkthrough of a bugcheck that occurred due to an overflowed stack condition. What is unique about this situation is the stack backtrace wasn’t being displayed.  As we proceed with the walkthrough of the dump analysis, I will be providing demonstrations and background information relating to Task States and Double Faults. 

 

I began my review with the command !analyze –v

 

!analyze -v

UNEXPECTED_KERNEL_MODE_TRAP (7f)

Arg1: 00000008, EXCEPTION_DOUBLE_FAULT

 

You can see from the output above that an unexpected kernel mode trap has taken place. Arg1 reported by the output from !analyze –v indicates the type of trap that occurred was a double fault. A double fault indicates that an exception occurred during a call to the handler for a prior exception. Although a double fault can be caused by other reasons (hardware or a corrupt stack pointer value), we most commonly observe this bugcheck when the drivers executing on the system have caused all of the available 12k of Kernel mode stack space to become exhausted.

 

 Threads on a 32-bit system are given 12k of kernel-mode stack space.  16k of kernel virtual address space will actually be consumed due to the occupation of 4K of virtual address space by an invalid PTE. This guard PTE is used to guard the virtual address range before the kernel stack limit (The stack grows toward lower memory addresses).  This 4k guard page is placed in this location to catch stack overflows. The 12k stack size is not configurable because it is hard coded into the kernel.  For more information, please refer to “Windows Internals, Fifth Edition, page 786, Kernel Stacks”.  

 

If the 12k of kernel stack space is all used up and drivers attempt to use stack space beyond the valid range, a page fault exception will occur as the invalid virtual addresses related to the guard PTE are referenced.   

 

When this page fault exception occurs, the CPU will automatically attempt to push some data onto the stack before transferring control to the page fault handler (Thank you to one of our reader's for correcting this information). More details on what data is pushed to the stack is available in the Intel Processor Family Developer’s Manual, Vol. 3 Chapter 14 (Protected-Mode Exceptions and Interrupts). However when the CPU tries to push this data, another fault will occur due to the stack pointer still providing an invalid address. This causes a double exception (AKA EXCEPTION_DOUBLE_FAULT).

 

So how can the OS handle this type of situation in order to write out the dump file?  The code associated with TRAP 0x8 (EXCEPTION_DOUBLE_FAULT) will perform a task state segment switch and obtain a new stack pointer which is valid. Task State Segment switching is a CPU provided mechanism that allows us to switch to a new task state and store a link to the previously executed task state. The information that is needed to restore a task is stored in a task-state segment (TSS).  The debugger command .tss can later be used to switch back to the previous task state to examine the context at the time of failure. More information regarding Task-State Segment (TSS) is available in the Intel Processor Manual Set (volume 3, Chapter 6).

 

 

In addition to the bugcheck data listed above, the output from the command !analyze -v has also provided me with the .tss command that I needed to type into the debugger.:

 

TSS:  00000028 -- (.tss 0x28)

 

You can type .tss 0x28 in the command window but I simply clicked the DML (debugger Markup Language) hyperlink which entered the .tss command for me. As discussed above, this command accepts the address of the saved Task State Segment (TSS) information for the current processor. This command will set the appropriate context just like the .trap or .cxr commands. 

 

The processor provides a Task Register which contains a 16-bit segment selector.  The register is actually larger. There is other data stored in this register, however it is only viewable by the processor for caching the segment descriptor. Windbg’s r command can be used to dump out the usable portion of this register.

 

3: kd> rtr

tr=00000050

 

So the task register was pointing to a different task (.tss 0x50) at the time of the second exception. But where did !analyze –v get this .tss 0x28 value from?

 

Let’s do some digging. You can get the address of the TSS for the current processor by using the !pcr command

 

3: kd> !pcr

KPCR for Processor 3 at f7737000:

.
.

(omitted several fields for this blog)

 

              TSS: f773a2e0

 

Extensions like !pcr are great, but I also like to understand how the values were obtained. So instead of just getting the value from !pcr,  How else can we find it?

The fs register points to the memory segment that the _KPCR for the current processor is stored. This structure is stored at the base, offset 0x0.

 

3: kd> rfs

Last set context:

     fs=00000030       ßpoints to the segment where the nt!_KPCR is stored at the base of.

 

Let’s see where the _KTSS pointer is stored within the KPCR structure.

 

3: kd> dt nt!*PCR*

          ntkrpamp!_KPCR

 

3: kd> dt ntkrpamp!_KPCR TSS

   +0x040 TSS : Ptr32 _KTSS  0x40 is the offset that the pointer to the TSS is stored.

 Let’s use those two values to dump this out. The 0030: represents the memory segment. Note that I have added 0x40 from the base and dumped out this location 

 

3: kd> dd 0030:00000040 L1   

0030:00000040  f773a2e0    ßpointer to the nt!_KTSS

 

3: kd> dt nt!_KTSS f773a2e0 Backlink

 +0x000 Backlink : 0x28    ß And here is our pointer to the previous task state.

 

This is why !analyze –v   has directed us to type in .tss 0x28

 

But where did !pcr get the address of the KPCR itself? !pcr is listing f7737000. We can find that out also.

 

3: kd> dt ntkrpamp!_KPCR SelfPcr

   +0x01c SelfPcr : Ptr32 _KPCR   ßso the pointer is stored at 0x1c

This command demonstrates the use of fs: instead of 0030: (BTW - I then provided the offset of 1c to get the pointer)

 

3: kd> dd fs:0x1c L1                 

0030:0000001c  f7737000         there it is, we found it

 

To demonstrate that both addresses reference the same data, let’s  dump it out using the size given below.

 

3: kd> dd f7737000 L0x54         

f7737000  b8ae60dc 00000000 00000000 f7737fe0

f7737010  19d5c42c 00000008 7ff9c000 f7737000

f7737020  f7737120 0000001f 00000000 00000000

f7737030  ffffffff 00000000 f773d800 f773d400

f7737040  f773a2e0 00010001 00000008 00000e56

f7737050  08000300 00000000 00000000 00000000

f7737060  00000000 00000000 00000000 00000000

f7737070  00000000 00000000 00000000 00000000

f7737080  00000000 00000000 00000000 00000000

f7737090  00100000 00000003 09f15190 00000000

f77370a0  09f15190 dabc6620 00000000 334e730f

f77370b0  00000000 00000000 00000000 00000000

f77370c0  00000000 00000000 00000000 00000000

f77370d0  00000000 00000000 00000000 00000000

f77370e0  00000000 00000000 00000000 00000000

f77370f0  00000000 00000000 00000000 00000000

f7737100  00000000 00000000 00000000 00000000

f7737110  00000000 00000000 00000000 00000000

f7737120  00010001 87d68438 00000000 f773a090

f7737130  00000003 00000008 0401010f 00000000

f7737140  00000000 00000000 00000000 00000000

 

3: kd> dd fs:0 L0x54

0030:00000000  b8ae60dc 00000000 00000000 f7737fe0

0030:00000010  19d5c42c 00000008 7ff9c000 f7737000

0030:00000020  f7737120 0000001f 00000000 00000000

0030:00000030  ffffffff 00000000 f773d800 f773d400

0030:00000040  f773a2e0 00010001 00000008 00000e56

0030:00000050  08000300 00000000 00000000 00000000

0030:00000060  00000000 00000000 00000000 00000000

0030:00000070  00000000 00000000 00000000 00000000

0030:00000080  00000000 00000000 00000000 00000000

0030:00000090  00100000 00000003 09f15190 00000000

0030:000000a0  09f15190 dabc6620 00000000 334e730f

0030:000000b0  00000000 00000000 00000000 00000000

0030:000000c0  00000000 00000000 00000000 00000000

0030:000000d0  00000000 00000000 00000000 00000000

0030:000000e0  00000000 00000000 00000000 00000000

0030:000000f0  00000000 00000000 00000000 00000000

0030:00000100  00000000 00000000 00000000 00000000

0030:00000110  00000000 00000000 00000000 00000000

0030:00000120  00010001 87d68438 00000000 f773a090

0030:00000130  00000003 00000008 0401010f 00000000

0030:00000140  00000000 00000000 00000000 00000000

 

 

Now that you have an idea of what a state is, let’s examine the stack output of the two states. First, we shall use .tss 0x50 to examine the stack backtrace associated with this state. We shall use the kC command to dump the stack after the .tss command. Notice that we have used the ; command to enter multiple commands on each line.

 

3: kd> .tss 0x50;kC

eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000

eip=8088b702 esp=f773d3c0 ebp=00000000 iopl=0         nv up di pl nz na po nc

cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00000000

nt!_KiTrap08:

8088b702 fa              cli

  *** Stack trace for last set context - .thread/.cxr resets it

 

nt!_KiTrap08  

nt!_KiTrap0E       

 

The stack backtrace shows two trap handlers and nothing else. According to this stack output, we were first attempting to handle a Trap 0x0E which is a page fault. The page fault handler was invoked in an attempt to handle the invalid address that we accessed in the guard page when we overflowed the stack.  You can see that after the page fault, we encountered another exception represented by the KiTrap08. This is a result of the EXCEPTION_DOUBLE_FAULT indicating that the page fault handler has also encountered an exception. This matches what is listed as arg1 in the bugcheck data that !analyze –v has output. So, the stack backtrace for .tss 0x50 shows that we were first executing the task referenced by 0x28 for Trap0E/Page Fault, when a task state switch occurred and we switched to .tss 0x50 to handle the Trap08/DoubleFault. 

 

Next, we will use the command .tss 0x28 and dump the stack backtrace associated with that task state

 

3: kd> .tss 0x28;kC

eax=b8ae0023 ebx=b8ae60ec ecx=87d68438 edx=87758bd8 esi=b8ae6068 edi=808813d8

eip=8088c718 esp=b8ae5fe4 ebp=b8ae5fe4 iopl=0         nv up di pl zr na pe nc

cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010046

nt!_KiTrap0E+0x5c:

8088c718 89550c          mov     dword ptr [ebp+0Ch],edx ss:0010:b8ae5ff0=????????

  *** Stack trace for last set context - .thread/.cxr resets it

 nt!_KiTrap0E       

 

You can see in the output above that the stack backtrace has only displayed KiTrap0xE. We should see multiple stack frames listed. I’m a bit concerned about the fact that I do not see a valid stack backtrace listed in the output above. Nevertheless, let's proceed with our examination.

 

 

Now that we have set the proper task state using .tss 0x28 which loaded the registers with the appropriate context, our next step will be to determine where the stack related registers are pointing and how they relate to the 12k range of valid stack addresses for the current thread. This will help us to validate that we did in fact overflow the stack. The easiest way to examine the stack range that this thread was given is to use !thread

 

!thread   

Owning Process            874c6800       Image:         StackHog.exe

...

Base b8ae9000 Limit b8ae6000

 

 

3: kd> resp;rebp

Last set context:

esp=b8ae5fe4   ßNotice that this is outside of the Base and Limit ranges listed above.

Last set context:

ebp=b8ae5fe4

 

Since the stack grows toward lower addresses, an overflow of the b8ae6000 limit will result in a value that is below the address of the limit, you can see that the address of esp has fallen out of the valid range of stack space.

 

3: kd> dd b8ae5fe4 L1

b8ae5fe4  ????????

 

Let’s look at this memory range in more detail. The invalid ranges displayed by ????? represent the guard page. The range of valid stack addresses starts (or ends depending on how you look at it) at b8ae6000.

 

3: kd> dd b8ae5fe0 L10

b8ae5fe0  ???????? ???????? ???????? ????????

b8ae5ff0  ???????? ???????? ???????? ????????

b8ae6000  00000000 00000000 00000000 00000000

b8ae6010  00000000 b8ae0000 b8ae0023 00000023

 

Also, note that we are running in trap handler 0x0E.  This is the page fault handler on x86 (refer to your Intel Processor Manuals for more details).

 

3: kd> u . L1

nt!_KiTrap0E+0x5c

8088c718 89550c          mov     dword ptr [ebp+0Ch],edx

 

The address we we're attempting to access may be in cr2. Let’s dump it out.

 

3: kd> rcr2

Last set context:

cr2=b8ae5fe0     ßThis address is just beyond the stack limit for this thread

 

What is the present instruction in the trap handler doing?

 

3: kd> u . L1

nt!_KiTrap0E+0x5c

8088c718 89550c          mov     dword ptr [ebp+0Ch],edx

 

Ok, so were dereferencing ebp plus an offset of 0x0C. What does that add up to be?

 

3: kd> ? ebp+0x0c

Evaluate expression: -1196531728 = b8ae5ff0

 

3: kd> dd b8ae5ff0 L1

b8ae5ff0  ????????

 

Once the stack overflowed, we can see there were many access attempts to addresses which are not in the valid stack range. This led us to the 7f bugcheck with the double fault parameter.

 

When the system bugchecks because the entire 12k range of a thread’s kernel-mode stack space has been filled up, there can be a few causes. Drivers on the stack may have made very large allocations on the stack instead of using other methods of obtaining memory such as calling ExAllocatePoolWithTag(). Sometimes this is done since it is quicker to use the stack instead of making calls to allocate and free memory from the operating system pools. Other times a driver will have made calls in a manner that causes too many other calls to be made filling up the stack. It is possible for nested functions that never encounter an exit condition to continuously call themselves to exhaust the stack. Often a system will have software from many different vendors that all install heavy stack consuming drivers into the I/O path. Each driver will use a portion of stack space that will add up to a lot since there are so many drivers installed.  For example, if a system has too many file system filter drivers installed in the file system stack and they use more than the minimum amount of stack space possible, it’s not uncommon for all of them put together to cause a stack overflow.

 

Sometimes when dealing with this error, we need to realize that there may not be any one product to blame. A stop 7f sometimes isn’t about identifying the faulting component as it often is in other areas of troubleshooting. It is more about understanding that stack space isn’t an unlimited resource and developing a clear picture of what lead up to the stack space filling up. Sometimes this will result in the need to engage multiple vendors for assistance when there are a combination of drivers on the stack that are all using a large amount of stack space.  Sometimes vendors will provide newer updated drivers that have been optimized to use less stack space. Other times, we simply have too much I/O related software installed and the only answer is to simply remove some of the drivers by uninstalling the product.

 

NTFS and some 3rd party file system filter drivers employ a technique to avoid a stack overflow.  What they do is probe the stack by calling IoGetRemainingStackSize() and if there is not enough stack space left, they will offload the remainder of the work to a dedicated kernel thread that they created just for that purpose. On Vista or Later (or 2003 x64), developers can call KeExpandKernelStackAndCallout, which will allow chaining to another 16k stack.  For more information, see http://msdn.microsoft.com/en-us/library/aa906738.aspx.

 

The easiest way to figure out why we have overflowed the stack is to dump it out and examine the stack backtrace. Therefore, this is typically the first and sometimes the only step necessary to perform when reviewing an EXCEPTION_DOUBLE_FAULT memory dump.  We will now proceed to dump out the stack and examine the stack usage of the different drivers and the calls that they made to determine if further investigation is needed. So, let’s do that now.  I will use the L200 option; otherwise the debugger will only display the default number of frames which won’t display the entire stack. It doesn’t make much sense to review only the top of the stack since the entire stack is full.  I dumped the stack and only got one stack frame listed.

 

3: kd> kfL200

  Memory  ChildEBP RetAddr 

          b8ae5fe4 00000000 nt!_KiTrap0E+0x5c

 

This is not what I was hoping to see. We don’t have a stack. Let me try using kv to see if there is a trap frame

 

3: kd> kvL200

ChildEBP RetAddr  Args to Child             

b8ae5fe4 00000000 00000000 00000000 00000000 nt!_KiTrap0E+0x5c (FPO: [0,0] TrapFrame-EDITED @ 00000000)

 

So, I don’t see a valid trap frame either.  I went back to my !analyze –v output and verified that it had also displayed this one frame only. How will we see what filled up the stack to provide recommendations to the customer or analysis on what happened without the stack? In part two of this blog, we will review how to manually reconstruct the stack and pass values into the kf command in order to get a useful stack backtrace to display.

 

 

Share this post :

Today I want to talk about tracking down leaking pool. Back with Server 2003 and before, leaking pool was a major issue because it was a limited resource. In Vista and beyond, it isn’t as much of an issue since pool is allocated dynamically, but it can still cause system performance issues if a component uses too much of it. Since all nonpaged pool allocations are locked in RAM, it leaves the system with a smaller amount of RAM for anything else.

I recently worked a case involving leaking nonpaged pool and it was easy to troubleshoot. I thought I would walk through the steps I used to narrow down the issue in this article.  So let's get started. Note - The following can be done on a live machine or with a kernel memory dump as well.

Running the !vm command, you can get statistics about the virtual memory usage on the system.

  

       Virtual Memory Usage

 

 

       Physical Memory:   655219   ( 2620876 Kb)

 

       Page File: \??\C:\pagefile.sys

          Current:   4190208Kb Free Space:   4115148Kb

          Minimum:   4190208Kb Maximum:      4190208Kb

 

       Available Pages:   303760   ( 1215040 Kb)

       ResAvail Pages:    571159   ( 2284636 Kb)

       Modified Pages:       677   (    2708 Kb)

       NonPagedPool Usage: 59613   (  238452 Kb)

       NonPagedPool Max:   69377   (  277508 Kb)

                                                          NonPaged Pool Usage :  85%

       ********** Excessive NonPaged Pool Usage *****

       PagedPool 0 Usage:  31684   (  126736 Kb)

       PagedPool 1 Usage:   1824   (    7296 Kb)

       PagedPool 2 Usage:   1821   (    7284 Kb)

       PagedPool 3 Usage:   1824   (    7296 Kb)

       PagedPool 4 Usage:   1827   (    7308 Kb)

       PagedPool Usage:    38980   (  155920 Kb)

       PagedPool Maximum:  86016   (  344064 Kb)

                                                          Paged Pool Usage :  45%

 

Hmm….looks like something is using a lot of nonpaged pool. To determine what is using the maximum amount of nonpaged pool, run the !poolused command.

0: kd>!poolused 2

   Sorting by NonPaged Pool Consumed

 

  Pool Used:

            NonPaged            Paged

 Tag    Allocs     Used    Allocs     Used

 Thre   326514 208968960         0       0

 MmCm      647  2830624         0        0

 LSwi        1  2293760         0        0

 Npfs     1681  1276672      1202   208320

 File     5863   945472         0        0

 Dmio      131   525760         0        0

 Irp      1158   508800         0        0

 

Adding a value of 2 to end of the command instructs the debugger to sort by nonpaged pool used. For more information on the command, please review the help file.

According to the output, the “Thre” tag is the highest user of nonpaged pool. To get a rough idea of all the modules that might be allocating pool using this tag, we can search all the modules to see if they contain the string. This can help narrow down the problem to a few drivers. To search all the modules, just use the for_each_module command.

      !for_each_module s -a @#Base @#End "Thre"

 

Look at David Butler’s blog post if you want to narrow it down even further by looking at each driver’s import table to see if they are importing the ExAllocatePool or ExAllocatePoolWithTag functions. Though, in this case I already know  the “Thre” tag is used by the kernel to allocate thread objects. Could the kernel be leaking thread objects? Most likely not, otherwise we would have seen a lot of customers reporting this issue. I am not ruling it out, but to start off, I am going to focus on the third party software installed on the machine.

Let’s dump out a few sample pool allocations, to see what exactly they are. I was sure there were going to be a ton of them, so I just hit the Break button once the output started flying by.

0: kd> !poolfind Thre 0

Scanning large pool allocation table for Tag: Thre

Searching NonPaged pool (81093000 : 89093000) for Tag: Thre

 

81098000 size:  280 previous size:    0  (Allocated) Thre (Protected)

81098380 size:  280 previous size:  100  (Allocated) Thre (Protected)

81098600 size:  280 previous size:  280  (Allocated) Thre (Protected)

81098880 size:  280 previous size:  280  (Allocated) Thre (Protected)

81098b00 size:  280 previous size:  280  (Allocated) Thre (Protected)

81098d80 size:  280 previous size:  280  (Allocated) Thre (Protected)

81099000 size:  280 previous size:    0  (Allocated) Thre (Protected)

810992c0 size:  280 previous size:   40  (Allocated) Thre (Protected)

810996e0 size:  280 previous size:  1a0  (Allocated) Thre (Protected)

81099960 size:  280 previous size:  280  (Allocated) Thre (Protected)

81099d80 size:  280 previous size:  1a0  (Allocated) Thre (Protected)

8109a000 size:  280 previous size:    0  (Allocated) Thre (Protected)

8109a380 size:  280 previous size:  100  (Allocated) Thre (Protected)

8109a600 size:  280 previous size:  280  (Allocated) Thre (Protected)

8109a880 size:  280 previous size:  280  (Allocated) Thre (Protected)

.

.

.

 

It would be interesting to see the thread stacks for the KTHREAD structures in these pools. To dump out the thread using the !thread command, I needed to figure out where the base of the KTHREAD structure started in the pool. To figure out the offset I dumped out all the threads in the System process, picked a random thread, and then calculated the thread’s offset from the start of it’s pool block. You can use any thread in the entire system because the offset will always be the same.

0: kd> !thread  8904a180

THREAD 8904a180  Cid 8.18  Teb: 00000000  Win32Thread: 00000000 WAIT: (WrEventPairLow) KernelMode Non-Alertable

    80478d40  Unknown

Not impersonating

Owning Process 890754a0

Wait Start TickCount    3831270       Elapsed Ticks: 209711

Context Switch Count    28238              

UserTime                  0:00:00.0000

KernelTime                0:00:00.0875

Start Address nt!ExpWorkerThread (0x80417a9a)

Stack Init f6444000 Current f6443d34 Base f6444000 Limit f6441000 Call 0

Priority 13 BasePriority 13 PriorityDecrement 0 DecrementCount 0

ChildEBP RetAddr  Args to Child

f6443d4c 8042ef5f bfec0068 80478d40 8904a180 nt!KiSwapThread+0x1b1

f6443d70 80417b0d 00000000 80442f00 00000000 nt!KeRemoveQueue+0x197

f6443da8 804578c2 00000000 00000000 00000000 nt!ExpWorkerThread+0x73

f6443ddc 8046c966 80417a9a 00000000 00000000 nt!PspSystemThreadStartup+0x54

00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

 

0: kd> !pool 8904a180 

 8904a080 size:   e0 previous size:   20  (Allocated)  Dire (Protected)

*8904a160 size:  280 previous size:   e0  (Allocated) *Thre (Protected)

 8904a3e0 size:  280 previous size:  280  (Allocated)  Thre (Protected)

 

0: kd> ?8904a180-8904a160

Evaluate expression: 32 = 00000020

 

So based on my calculations, the Kthread structure starts 32 bytes into the Thre pool.

Now I can use this offset to dump threads stacks from the pools returned by the poolfind command.  Some of the threads are going to be valid, but most of them are going to be doing something fishy. Basically I need to use a large sample to figure out what would be causing the Thre tag to leak. After a few tries, I was able to find a lot of threads which looked like this -

kd> !Thread 81098620 

THREAD 81098620Cid 8e8.a391c  Teb: 00000000  Win32Thread: 00000000 TERMINATED

Not impersonating

Owning Process 886f5020

Wait Start TickCount    1949088       Elapsed Ticks: 2091893

Context Switch Count    2              

UserTime                  0:00:00.0000

KernelTime                0:00:00.0000

Start Address 0x7c57b740

Win32 Start Address 0x20003474

Stack Init 0 Current bca4fc44 Base bca50000 Limit bca4d000 Call 0

Priority 16 BasePriority 8 PriorityDecrement 0 DecrementCount 0

 

 

kd> !PROCESS 886f5020 0

PROCESS 886f5020SessionId: 0  Cid: 08e8    Peb: 7ffdf000  ParentCid: 016c

    DirBase: 694bc000  ObjectTable: 88b3ec68  TableSize: 391189.

    Image: xxxxxx.EXE

 

Looks like the thread is terminated. But why is it still hanging around?

0: kd> !object 81098620

Object: 81098620  Type: (89075900) Thread

    ObjectHeader: 81098608

    HandleCount: 1  PointerCount: 1

 

Aha! Since the handle count and pointer count are not 0, the system has not freed this Thre allocation. Most likely something has an open handle to this object. Dumping out all the processes using the !process command, I found a particular process with 388,836 open handles. This is the culprit. Interesting enough, this is the same process which was listed in the thread’s process field.

       1  Token

       1  Desktop

       1  Section

       2  WindowStation

       3  Directory

       4  IoCompletion

       5  Mutant

       7  Port

       9  Semaphore

      15  File

      73  Event

   65110  Key

  323155  Thread

 

  388386  Handles in process

 

Running !handle 0 2 886f5020 and analyzing the output, I found a ton of thread handles. Most likely the process is leaking thread handles and preventing the system from cleaning up the thread allocations. This is resulting in high usage of nonpaged pool, which is causing the performance issues seen by the customer.

We contacted the software vendor responsible for the process, and it turned out to be a known bug in their software. An update was provided to the customer, and it fixed the issue immediately.

 

Share this post :

Hi All. Recently I had a Windows 2000 case where the machine wouldn’t shut down. After initiating the shutdown process, we saw the user get logged off, and on the console we watched the services shut down. The final “Windows is shutting down…” message was displayed on the screen, and then the screen would go blank. At that point we expected the machine to power down, but it continued running. Even after waiting 10 minutes the machine stayed powered up. The same behavior was exhibited after a restart as well. We tried the same thing in safe mode, but the machine still wouldn’t shut down.

We setup the machine for a full memory dump, and crashed the box once the screen went blank. The customer uploaded the dump, and I started digging.

The first thing I noticed was a bunch of processes still running. I expected csrss.exe, smss.exe, winlogon.exe and the System process , but to my surprise I found other processes such as svchost.exe and services.exe still lingering. Why were they still running?

 

0: kd> !process 0 0

**** NT ACTIVE PROCESS DUMP ****

PROCESS f95ed480  SessionId: 0  Cid: 0008    Peb: 00000000  ParentCid: 0000

    DirBase: 00030000  ObjectTable: f95edf68  TableSize:  62.

    Image: System

 

PROCESS f94de520  SessionId: 0  Cid: 00fc    Peb: 7ffff000  ParentCid: 0008

    DirBase: 2465c000  ObjectTable: f94def08  TableSize:  34.

    Image: SMSS.EXE

 

PROCESS f94b01c0  SessionId: 0  Cid: 0118    Peb: 7ffff000  ParentCid: 00fc

    DirBase: 26515000  ObjectTable: f94b29e8  TableSize: 103.

    Image: CSRSS.EXE

 

PROCESS f94a5d60  SessionId: 0  Cid: 0130    Peb: 7ffff000  ParentCid: 00fc

    DirBase: 2671a000  ObjectTable: f94b18a8  TableSize: 114.

    Image: WINLOGON.EXE

 

PROCESS f9495a60  SessionId: 0  Cid: 014c    Peb: 7ffff000  ParentCid: 0130

    DirBase: 26814000  ObjectTable: f94b2848  TableSize: 204.

    Image: SERVICES.EXE

 

PROCESS f9494860  SessionId: 0  Cid: 0158    Peb: bffdf000  ParentCid: 0130

    DirBase: 2681d000  ObjectTable: f9499608  TableSize: 247.

    Image: LSASS.EXE

 

PROCESS f9476500  SessionId: 0  Cid: 01e8    Peb: 7ffff000  ParentCid: 014c

    DirBase: 272a1000  ObjectTable: f94833a8  TableSize: 136.

    Image: svchost.exe

 

 

Let’s find out.

How is Shutdown Performed

At first I thought some service was stuck and not shutting down, causing the entire system to stall. I pulled out my copy of Windows Internals and thumbed through the Startup and Shutdown chapter. I found that to initiate a shutdown, Csrss sends a shutdown message to Winlogon. Winlogon then calls ExitWindowsEx, which in turn sends a message to each session’s Csrss. Inside each session, Csrss then sends messages to all the threads that own a Windows message loop to shut down. If the threads do not respond, Csrss waits for a timeout specified by HKCU\Control Panel\Desktop\HungAppTimeout. Once all the threads that own windows exit, Csrss then terminates the processes as well. If Csrss finds a console application, it sends it a message to shut down. It will wait for a timeout specified by HKCU\Control Panel\Desktop\WaitToKillAppTimeout, and then displays a message on the desktop.

If you want Windows to terminate the processes immediately after timeout period, you can set the HKCU\Control Panel\Desktop\AutoEndTasks to 1.

At this point all the processes in the interactive user process have been shut down. Csrss in session 0 then sends messages to the Service Control Manager(SCM) to shut down all the services. When a service is registered, it also specifies a wait hint. During shutdown, this hint is used by the SCM to wait for services to terminate. Prior to Vista, Csrss waits for the SCM to shut down, but if that does not happen by the timeout value in HKLM\SYSTEM\CurrentControlSet\Control\WaitToKillServiceTimeout, Csrss would just kill the SCM and proceed with the shutdown. Since this could lead to services not completing their shutdown routines, going forward in Vista and beyond Windows now implements preshutdown notification routines. Services that want to get these notifications can register themselves by using the SetServiceStatus API.

Once Csrss has finished its pass notifying system processes that the system is shutting down, Winlogon in session 0 finishes the shutdown process by calling the NtShutdownSystem. Note, many system processes such as Csrss, Smss, WinInit, Services, Lsass are still running when the machine actually powers down. They are ready to be shut down, but the processes still exist. NtShutdownSystem calls PoSetSystemPowerState to shut down all the drivers and the rest of the executive subsystems.

One of the things that PoSetSystemPowerState does is call the I/O manager to send notifications to all the drivers that have requested the shutdown notification. Once the drivers receive these notifications, they can then clean up and perform any special steps that might be needed for their devices. Other subsystems also perform clean up, and the page file is cleared by the Memory Manager if that option was turned on. The I/O manager is called a second time, and then the file system drivers are notified that the system is shutting down. The System actually shuts down in the power manager. The power manager tells the BIOS to turn the power off to the machine.

The Memory Dump Analysis

I started by looking at the Winlogon process in session 0 and found the main thread had called NtShutdownSystem as we expected. I always start with this thread when troubleshooting shutdown problems.

 

0: kd> !process f94a5d60 17

PROCESS f94a5d60  SessionId: 0  Cid: 0130    Peb: 7ffff000  ParentCid: 00fc

    DirBase: 2671a000  ObjectTable: f94b18a8  TableSize: 114.

    Image: WINLOGON.EXE

    VadRoot f93eb748 Clone 0 Private 515. Modified 396. Locked 0.

    DeviceMap f95c8448

    Token                             e246cdf0

    ElapsedTime                        0:34:42.0937

    UserTime                          0:00:00.0109

    KernelTime                        0:00:01.0015

    QuotaPoolUsage[PagedPool]         30112

    QuotaPoolUsage[NonPagedPool]      31264

    Working Set Sizes (now,min,max)  (1172, 50, 345) (4688KB, 200KB, 1380KB)

    PeakWorkingSetSize                1866

    VirtualSize                       29 Mb

    PeakVirtualSize                   31 Mb

    PageFaultCount                    3603

    MemoryPriority                    FOREGROUND

    BasePriority                      13

    CommitCharge                      888

 

        THREAD f94a5ae0  Cid 130.114  Teb: 7fffe000  Win32Thread: e002e328 WAIT: (Suspended) KernelMode Non-Alertable

            f93c2010  SynchronizationEvent

            f94a5bc8  NotificationTimer

        Not impersonating

        Owning Process f94a5d60

        Wait Start TickCount    138381        Elapsed Ticks: 842

        Context Switch Count    1914                   LargeStack

        UserTime                  0:00:00.0078

        KernelTime                0:00:00.0875

        Start Address winlogon!_setargv (0x0100ae2c)

        Stack Init f1d00000 Current f1cffa60 Base f1d00000 Limit f1cfb000 Call 0

        Priority 15 BasePriority 15 PriorityDecrement 0 DecrementCount 0

 

        ChildEBP RetAddr  Args to Child

        f1cffa78 dd42d893 dd0655c0 f93c2008 dd0654f0 nt!KiSwapThread+0x1b1

        f1cffaa0 dd490be9 f93c2010 00000005 00000000 nt!KeWaitForSingleObject+0x1a3

        f1cffae0 dd4908c5 f93c2000 00000001 00000001 nt!PopWaitForSystemPowerIrp+0xf2

        f1cffb04 dd4903db 00000000 f1cffbf4 f1cffc78 nt!PopSetDevicesSystemState+0xfb

        f1cffbe0 dd468389 00000005 00000004 c0000004 nt!NtSetSystemPowerState+0x329

        f1cffbe0 dd432197 00000005 00000004 c0000004 nt!_KiSystemService+0xc9

        f1cffc64 dd490122 00000005 00000004 c0000004 nt!ZwSetSystemPowerState+0xb

        f1cffd48 dd494db4 00000005 00000004 c0000004 nt!NtSetSystemPowerState+0x70

        f1cffd58 dd468389 00000001 00000000 00000000 nt!NtShutdownSystem+0x2e

        f1cffd58 77f88e07 00000001 00000000 00000000 nt!_KiSystemService+0xc9

        0006fe90 01014c3d 00000001 00000001 000767b8 ntdll!NtShutdownSystem+0xb

        0006fea8 01014f6c 00000060 0000000b 0000000b winlogon!ShutdownMachine+0x165

        0006ff04 0100e20f 000767b8 0000000b 5ffa0000 winlogon!Logoff+0x216

        0006ff20 01007e8c 000767b8 00000005 000735dc winlogon!MainLoop+0x1fb

        0006ff58 0100af70 00071fc8 00000000 000735dc winlogon!WinMain+0x37a

        0006fff4 00000000 7ffff000 000000c8 00000100 winlogon!WinMainCRTStartup+0x156

 

Looks like the I/O Manager has sent shutdown notifications to all the devices and this winlogon thread is waiting for it to complete. Dumping out the IRP, I found it had been completed, but for some reason the completion routine had never been called. That is why we are waiting forever. Digging further, it looked like someone had messed up the current IO stack location on the IRP, which resulted in the IRP completion routine never being called.

Let’s take a look at the IRP.

Based on the thread stack you can see Winlogon sent an IRP down to Plug and Play and is waiting for the action to complete. Since there is no obvious way to get a pointer to the IRP from the stack we’ll need to do some digging. We see that nt!PopWaitForSystemPowerIrp is waiting for some type of dispatch object to signal. According to MSDN the first parameter passed to KeWaitForSingleObject() is the object. Here I’m dumping the raw memory for the Object parameter.

0: kd> dc f93c2010

f93c2010  00040001 00000000 f94a5b4c f94a5b4c  ........L[J.L[J.

f93c2020  00000000 f94a5ae0 00000000 00000063  .....ZJ.....c...

f93c2030  00000028 00000005 dd47bd20 f93c203c  (....... .G.< <.

f93c2040  f93c203c f93c2044 f93c2044 f938fea8  < <.D <.D <...8.

f93c2050  00000000 00000000 01010000 00000100  ................

f93c2060  f946c860 f946c860 f93c23ac 00000000  `.F.`.F..#<.....

f93c2070  00000000 f93c2360 f93c2360 f93c207c  ....`#<.`#<.| <.

f93c2080  f93c207c f93c2084 f93c2084 f93c208c  | <.. <.. <.. <.

 

0: kd> !pool f93c2010 2

*f93c2000 size:  420 previous size:    0  (Allocated) *PDss

 

You may ask yourself, “are any of these values in the PDss pool pointers to IRP pool blocks?”  If you wanted to know that, you could iterate over each one of them passing them to !pool or !address.  That sounds tedious to do manually.  But thankfully, the debugger has some nice command tokens such as .foreach that make this a breeze.  If you want to pass each of the above values to !pool, you can perform the following in the debugger:

0: kd>  .foreach /pS 1 /ps 1 ( value { dp /c 1 f93c2000 L 420/@$ptrsize } ) { .if( value != 0 ) {.printf "**** %p ****\n", ${value}; !pool ${value} 0x2} }

 

**** 21018000 ****

21018000: Unable to get contents of pool block

**** 73734450 ****

73734000: Unable to get contents of pool block

**** 00000002 ****

00000000: Unable to get contents of pool block

**** 00000006 ****

00000000: Unable to get contents of pool block

.

. (I truncated this output for the blog. There were many more entries)

.

**** f946c860 ****

**** f946c860 ****

*f946c800 size:  1e0 previous size:   40  (Allocated) *Irp

**** f946c860 ****

*f946c800 size:  1e0 previous size:   40  (Allocated) *Irp

**** f93c23ac ****

 

**** f93c207c ****

*f93c2000 size:  420 previous size:    0  (Allocated) *PDss

 

Bingo! There is a pointer to an IRP block in our PDss pool. Let’s take a look at f946c800. I’m dumping it’s pool.

 

        0: kd> !pool f946c800

 f946c000 size:   80 previous size:    0  (Allocated)  MmCa

 f946c080 size:   40 previous size:   80  (Free)       ....

 f946c0c0 size:   40 previous size:   40  (Allocated)  Ntfn

 f946c100 size:   a0 previous size:   40  (Allocated)  File (Protected)

 f946c1a0 size:   40 previous size:   a0  (Allocated)  Ntfr.

.

.

.

*f946c800 size:  1e0 previous size:   40  (Allocated) *Irp

 

Here you see the IRP tag in this raw memory output. The first two DWORDs contain the POOL_HEADER.

 

0: kd> dc f946c800

f946c800  0f018002 20707249 01d80006 00000000  ....Irp ........

f946c810  00000000 00000000 f946c818 f946c818  ..........F...F.

f946c820  c00000bb 00000000 0b0a0000 04000000  ................

f946c830  00000000 00000000 00000000 00000000  ................

f946c840  00000000 00000000 00000000 00000000  ................

f946c850  00000000 00000000 00000000 00000000  ................

f946c860  f93c2060 f93c2060 f946c9e0 00000000  ` <.` <...F.....

f946c870  00000000 00000000 00000000 00000000  ................

 

Since the pool header is 8 bytes long we need to add 8 to f946c800 to skip over the header and get to the actual IRP.

 

0: kd> ? f946c800 + 8

Evaluate expression: -112801784 = f946c808

 

0: kd> !irp f946c808

Irp is active with 10 stacks 11 is current (= 0xf946c9e0)

 No Mdl Thread 00000000:  Irp is completed. 

     cmd  flg cl Device   File     Completion-Context

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

 

                    Args: 00000000 00000000 00000000 00000000

 [ 16, 2]   0 e1 f9533ba0 00000000 dd491028-f93c2354 Success Error Cancel pending

              \Driver\Diskperf    nt!PopCompleteSystemPowerIrp

                    Args: 00000000 00000000 00000006 00000005

 

The Major function is 16 (IRP_MJ_POWER) which synchs up with the power management activity we saw in the stack. You can find all the major function codes for IRPs in the standard header files such as WDM.H or NTDDK.H available with the WDK. There is an excellent chance this is the IRP Winlogon sent down to Plug and Play. Now that we have the IRP in question with the IO Stack locations we see that the pending status is set.

 

                     Args: 00000000 00000000 00000000 00000000

 [ 16, 2]   0 e1 f9533ba0 00000000 dd491028-f93c2354 Success Error Cancel pending

              \Driver\Diskperf    nt!PopCompleteSystemPowerIrp

                     Args: 00000000 00000000 00000006 00000005

 

 

It appears to be stuck in \Driver\Diskperf which is the device stack for the device associated with Diskperf.  At this point I determined it was most likely due to one of the devices in it’s devstack.

0: kd> !devstack f9533ba0

  !DevObj   !DrvObj            !DevExt   ObjectName

  f94d5660  \FileSystem\Foo      f94d5718 

  f95338a0  \Driver\VSP          f9533958 

> f9533ba0  \Driver\Diskperf     f9533c58 

  f9539150  \Driver\Ftdisk       f9539208  HarddiskVolume1

!DevNode f953aba8 :

  DeviceInst is "STORAGE\Volume\1&30a96598&0&Signature80Offset7E00Length2730C00"

  ServiceName is "DiskPerf"

 

Doing a bing search for the drivers in the list above (and shutdown), I found a file system filter driver known to cause this issue in the past. Updating to the latest version of the driver fixed the issue, and the machine would shutdown normally.

BTW - For more information on how Windows performs a shutdown check out this past article from Ntdebugging - http://blogs.msdn.com/ntdebugging/archive/2007/06/09/how-windows-shuts-down.aspx

 

Share this post :

 

Here's one from the Rube Goldberg debug collection!

The dripping sarcasm is because I'm about to show the reeeeally long way to figure out what's eating MmCm, skip down to the end if you are in a time crunch.

 

Otherwise, do resist the temptation to skip ahead as some of the techniques can be used in other debugs.

 

-Tate

 

 

What is this Pool Tag?

 

MmCm is a Non Paged Pool (NPP) tag allocated via a kernel mode call (a driver or the OS itself…applications cannot allocate this memory directly) to either MmAllocateContiguousMemory  or MmAllocateContiguousMemorySpecifyCache.

 

Great comments from msdn…

 

MmAllocateContiguousMemory can be called to allocate a contiguous block of physical memory for a long-term internal buffer, usually from the DriverEntry routine…

 

A device driver that must use contiguous memory should allocate only what it needs during driver initialization because physical memory is likely to become fragmented as the operating system runs.

 

Warning  If you use the MmAllocateContiguousMemory routine on computers with large amounts of memory, the operating system's performance might severely degrade when the system tries to create a contiguous chunk of memory. This degradation is reduced for Windows Server 2008. Memory that MmAllocateContiguousMemory allocates is uninitialized...

 

 

First, how much is normal?

 

As the comments indicate it is contingent mostly on the driver's needs, done at initialization time.  We have seen consumption in the single MB range up to tens of MB consumed depending on what drivers are loaded.

 

 

What drivers are the most likely consumers of this memory and why would the amount vary?

 

The most likely consumers on any machine would be the following types of drivers. Note that each will likely allocate a multiple of some amount per instance for each physical adapter exposed to Windows. (i.e. A single NIC driver used for 3 NICs in a system will likely allocate 3 times the MmCm vs. a single NIC.)

 

Storage Adapters for your SAN...SCSI Controller, Fibre Channel Controller, etc.

Video Adapters

Network Adapters

 

These drivers need to pre-allocate memory to service transfers and as mentioned need to do this right at boot time since memory can become fragmented later.  In other words, this allocation pattern attempts to "bake-in" a known set of scratch space if you will for the adapter to function properly given the features you wish to use.

 

 

So what's the problem?

 

Not everyone plays nice and the machine can run out of memory!  This is downright painful on x86 Exchange servers and their limited NPP ceiling.

 

Recall from an earlier post and your likely experience, that running out of such memory can cause hangs or general system instability as the machine runs out of NPP.  The scary part is that this usually happens unexpectedly under incrementally higher than normal user demand. i.e when you most need the server it fails.  Not coincidently most often we see older  x86 machines, especially /3GB configured Exchange servers, have this memory consumption problem. I often wonder just how many machines are out there just waiting for a liiiiitle more user load and are hovering at the brink of NPP exhaustion…

 

<Ok, queue the sad music for the heart breaking x86 Exchange server example>

 

This server has the standard 128MB maximum for NPP because of /3GB

 

0: kd> !vm

 

*** Virtual Memory Usage ***

Physical Memory:      851418 (   3405672 Kb)

Page File: \??\C:\pagefile.sys

  Current:   4193280 Kb  Free Space:   4038364 Kb

  Minimum:   4193280 Kb  Maximum:      4193280 Kb

Available Pages:      639032 (   2556128 Kb)

ResAvail Pages:       796753 (   3187012 Kb)

Locked IO Pages:         272 (      1088 Kb)

Free System PTEs:      27083 (    108332 Kb)

Free NP PTEs:           5778 (     23112 Kb)

Free Special NP:           0 (         0 Kb)

Modified Pages:         4138 (     16552 Kb)

Modified PF Pages:      4116 (     16464 Kb)

NonPagedPool Usage:    25151 (    100604 Kb)

NonPagedPool Max:      32026 (    128104 Kb)

PagedPool 0 Usage:     11497 (     45988 Kb)

PagedPool 1 Usage:      1645 (      6580 Kb)

PagedPool 2 Usage:      1667 (      6668 Kb)

PagedPool 3 Usage:      1662 (      6648 Kb)

PagedPool 4 Usage:      1679 (      6716 Kb)

PagedPool Usage:       18150 (     72600 Kb)

PagedPool Maximum:     63488 (    253952 Kb)

Session Commit:          586 (      2344 Kb)

Shared Commit:          4720 (     18880 Kb)

Special Pool:              0 (         0 Kb)

Shared Process:         7517 (     30068 Kb)

PagedPool Commit:      18214 (     72856 Kb)

Driver Commit:          8779 (     35116 Kb)

Committed pages:      237424 (    949696 Kb)

Commit limit:        1867524 (   7470096 Kb)

 

Given 100Megs used, 28MB of that is MmCm

 

0: kd> !poolused /t 10 2

   Sorting by  NonPaged Pool Consumed

 

  Pool Used:

            NonPaged            Paged

Tag    Allocs     Used    Allocs     Used

MmCm     3210 28779488         0        0        Calls made to MmAllocateContiguousMemory , Binary: nt!mm

NDpp     1013  4076960         0        0        packet pool , Binary: ndis.sys

MPIO   121985  3457752         0        0        UNKNOWN pooltag 'MPIO', please update pooltag.txt

elxs        6  3299344         0        0        UNKNOWN pooltag 'elxs', please update pooltag.txt

BCM0       24  3057232         0        0        UNKNOWN pooltag 'BCM0', please update pooltag.txt

LSwi        1  2654208         0        0        initial work context

 RaME        3  2572288         0        0        RiAllocateMiniportDeviceExtension , Binary: storport.sys

ElxA        5  2360208         3      160        UNKNOWN pooltag 'ElxA', please update pooltag.txt

Io        223  2319712       123     5136        general IO allocations , Binary: nt!io

TPLA      512  2097152         0        0        UNKNOWN pooltag 'TPLA', please update pooltag.txt

TCPt       40  1662448         0        0        TCP/IP network protocol , Binary: TCP

Mdl      7225  1435160         0        0        Io, Mdls

 Pool        3  1134592         0        0        Pool tables, etc.

 Devi      602  1124400         0        0        Device objects

 RcpI        1  1048576         0        0        Internal memory mgr initial heap block , Binary: sacdrv.sys

brcm       38  1032528         0        0        UNKNOWN pooltag 'brcm', please update pooltag.txt

TOTAL      219941 75555240     70157 56589336

 

That's not a lot right?  Well, it may be more than you like.  Depending on load cycles on this server it could be too much, you'd have to know a bit more history of the maximum amount of NPP demanded at peak usage, etc. to know how close to the edge this server may be.  Let's go with "what's using the ~28MB?" as the question to answer...

 

 

Who uses this memory?

 

Here's where it gets interesting.  A useful heuristic here is to group the sizes of the allocations in Excel.

You get this data from a !poolfind MmCm

 

0: kd> !poolfind MmCm

 

Scanning large pool allocation table for Tag: MmCm (f9a67000 : f9b67000)

 

*f7fc7000 :free large page allocation, Tag was MmCm, size was 0x2000 bytes

*f8298000 :free large page allocation, Tag was MmCm, size was 0x2000 bytes

*faeae000 :large page allocation, Tag  is MmCm, size  is 0x4000 bytes

*faeb2000 :large page allocation, Tag  is MmCm, size  is 0xd000 bytes

*fae84000 :large page allocation, Tag  is MmCm, size  is 0x11000 bytes

*fae95000 :large page allocation, Tag  is MmCm, size  is 0x11000 bytes

*fae23000 :large page allocation, Tag  is MmCm, size  is 0x1000 bytes

*fae24000 :large page allocation, Tag  is MmCm, size  is 0x3000 bytes

*fae2d000 :large page allocation, Tag  is MmCm, size  is 0x1000 bytes

.

.

.

fdf2e000 size:  f18 previous size:    0  (Allocated) MmCm

fdf2f000 size:  f18 previous size:    0  (Allocated) MmCm

fdf30000 size:  f18 previous size:    0  (Allocated) MmCm

fdf31000 size:  f18 previous size:    0  (Allocated) MmCm

fdf32000 size:  f18 previous size:    0  (Allocated) MmCm

fdf33000 size:  f18 previous size:    0  (Allocated) MmCm

fdf34000 size:  f18 previous size:    0  (Allocated) MmCm

fdf35000 size:  f18 previous size:    0  (Allocated) MmCm

fdf36000 size:  f18 previous size:    0  (Allocated) MmCm

fdf37000 size:  f18 previous size:    0  (Allocated) MmCm

fdf38000 size:  f18 previous size:    0  (Allocated) MmCm

fdf39000 size:  f18 previous size:    0  (Allocated) MmCm

fdf3a000 size:  f18 previous size:    0  (Allocated) MmCm

fdf3b000 size:  f18 previous size:    0  (Allocated) MmCm

fdf3c000 size:  f18 previous size:    0  (Allocated) MmCm

 

Searching NonPaged pool (fe000000 : ffb7e000) for Tag: MmCm

 

 

When I have to do this, usually some variation of following works.

 

.logopen

!poolfind

.logclose

 

Open the txt file up in notepad, clean it up a bit (remove the frees and extra lines)

Open the txt in Excel and use the old standby Text Import Wizard via file open

I usually also save the large pool allocs to one txt file and the regular to another and then cut paste to combine them in Excel.

 

There's probably an easier way to do this import, but this works well enough.

image

You can really go crazy getting the data all pretty but don't bother as even something as ugly as this is useful because a quick sort on Column B shows the allocation size pattern.  Note the Text Import Wizard allows you to skip columns here, so you can import only the address column and the size column.  i.e. faeae000’s column and the 0x4000 column in the above.

 

image

Via Excel I just generated another table on the fly to keep track of our totals here.  Note, I’m just counting the number of each size of alloc by visual inspection.  i.e. There are 9 allocations of 0x1000 size in my list.

 

Allocation Sizes in Bytes (h)  

Number of Allocations

Total MmCm in Bytes

0xf18 bytes 

2014

7,782,096

0x1000 bytes

9

36,864

0x2000 bytes

1017

8,331,264

0xf000 bytes

23

1,413,120

0x11000 bytes

92

6,406,144

0x156000 bytes

2

2,801,664

 

Ahhh…..In total we've got 26 of 28 Megs sampled here, well representing the MmCm usage minus some insignificant allocs and rounding, great.

 

(I love the new Win7 calc.exe)

clip_image003

So how does this help me again?  Here's the crucial step.  Look at samples of these allocations (which is why I kept the address of the alloc in the Excel import).  There are very often telltale tags or strings inside the alloc that give you a strong indication if not proof positive who allocated this memory.

 

Here's a common allocation example:

 

Dump several samples of the f18 sized allocations…

 

0: kd> dc fdf18000 fdf18000+f18

fdf18000  0be30000 6d436d4d fd1df008 00000000  ....MmCm........<--the pooltag, okay…knew that...

fdf18010  1f2e3d4c 00000000 00000000 00000000  L=..............<-------hum, this 1f2e3d4c seems to be in every one of these puppies...

fdf18020  00000000 00000000 00000000 00000000  ................

fdf18030  00000000 00000000 00000000 00000000  ................

fdf18040  00000000 00000000 00000000 00000000  ................

 

Remember we learned that these are predominantly allocated on boot?  Turns out that Storport actually allocates 1000 of these(per adapter) based on the extension size of the adapter and guess what it does, it puts a handy little tag in here defined as 1f2e3d4c.    Here's the proof.

 

storport!RaInitializeRaidResources+0x47:

f5d6aeb3 8bd7            mov     edx,edi

f5d6aeb5 8bce            mov     ecx,esi

f5d6aeb7 ff1510d1d6f5    call    dword ptr [storport!_imp_InterlockedPushEntrySList (f5d6d110)]

f5d6aebd ff4508          inc     dword ptr [ebp+8]

f5d6aec0 8b4508          mov     eax,dword ptr [ebp+8]

f5d6aec3 3b450c          cmp     eax,dword ptr [ebp+0Ch]

f5d6aec6 c747084c3d2e1f  mov     dword ptr [edi+8],1F2E3D4Ch<------------------ah ha!

f5d6aecd 72be            jb      storport!RaInitializeRaidResources+0x21 (f5d6ae8d)

 

 

Cool! So now I've accounted for 7.8 Megs of the 28, and I know at this point that since I've got ~2000 allocs I've got at least two storage adapters on this machine.

Actually, If you are suspicious about the roughly 1000 0x2000 sized allocations and you think it's another adapter with a larger extension size, you'd be right!

 

0: kd> dc fcc20000

fcc20000  fcc22000 00000000 1f2e3d4c 00000000  . ......L=......<-------again!

fcc20010  f7b7e000 00000000 00000000 00000000  ................

fcc20020  00000000 00000000 00000000 00000000  ................

fcc20030  00000000 00000000 00000000 00000000  ................

fcc20040  00000000 00000000 00000000 00000000  ................

fcc20050  00000000 00000000 00000000 00000000  ................

fcc20060  00000000 00000000 00000000 00000000  ................

fcc20070  00000000 00000000 00000000 00000000  ................

 

 

Now, I've accounted for likely all my storage adapters (7,782,096 + 8,331,264  =  16,113,360) which are between the three of them consuming 16 of 28 Megs!

(I say three because of the ~3000 allocations in total, with 1000 per adapter as stated.  So two adapters of size 0xf18 and one of size 0x2000)

Sure enough checking msinfo32 on this machine shows three adapters.

 

So what about the rest of the odd ball sized allocations and 12 Megs?  Especially the 92 0x11000 and the two whopping 0x156000 bytes ones?

 

First , the two whopper 0x156000 byte allocs.  Here's where you are limited usually only by your own creativity and patience...( this particular sample is courtesy of a fellow persistent debugger,  Pushkar)

 

//Dump the beginning of one…

 

0: kd> dc fdb24000

fdb24000  00036c40 fdb24080 fdc43100 08b24080  @l...@...1...@..

fdb24010  00000000 08c43100 00000000 00000001  .....1..........

fdb24020  00000000 00000000 00000000 00000000  ................

fdb24030  00000000 00000000 00000000 00000000  ................

fdb24040  00000000 00000000 00000000 00000000  ................

fdb24050  00000000 00000000 00000000 00000000  ................

fdb24060  00000000 00000000 00000000 00000000  ................

fdb24070  00000000 00000000 00000000 00000000  ................

 

On the line below with the highlight we are looking for interesting pointers and !pool'ing them to figure out who's associated with this alloc.

 

0: kd> dc

fdb24080  00000003 00000940 08b24100 00000000  ....@....A......

fdb24090  fdef0080 00000000 08b241d8 00000000  .........A......<-----This is the line

fdb240a0  00000080 08b25080 00000000 00000080  .....P..........

fdb240b0  08b35080 00000000 00000080 08b29080  .P..............

fdb240c0  00000000 00000080 08b37080 00000000  .........p......

fdb240d0  00000080 08b2d080 00000000 00000080  ................

fdb240e0  08b39080 00000000 00000080 08b31080  ................

fdb240f0  00000000 00000080 08b3b080 00000000  ................

 

//Oh, it happens to be a Device object...

 

0: kd> !pool fdef0080

Pool page fdef0080 region is Nonpaged pool

*fdec1000 : large page allocation, Tag is Dev., size is 0x33000 bytes

Owning component : Unknown (update pooltag.txt)

 

//Dump it out via devobj, adding the object offset and poolheader length…this one looks like it's going to belong to my Emulex adapters

 

0: kd> !devobj fdec1000+38

Device object (fdec1038) is for:

ElxPlus*** ERROR: Module load completed but symbols could not be loaded for elxplus.sys

\Driver\elxplus DriverObject fdef41c0

Current Irp 00000000 RefCount 1 Type 0000002a Flags 00000048

DevExt fdec10f0 DevObjExt fdef3ca0

ExtensionFlags (0000000000) 

AttachedTo (Lower) fdf68ba0 \Driver\PnpManager

Device queue is not busy.

 

On to the 0x11000 sized, first pass is to just dc out the memory like before and look for interesting tags or strings…At the end of Network card adapter driver allocations via their calls into the Ndis layer for instance, you can flag them by the END of the allocation may have an ND** string.  I don't see any of those here but it "looks like" the 0x11000 sized allocations have network related data and strings as a common factor throughout, so I'm guessing these are associated with one of the four network adapters on this machine.  Turns out if you sample more of these they may have pointers back to other interesting pool allocations that can flag the network miniport adapter, etc.

 

You may be thinking, hum…I get the storport one but these last two sound like mere correlation?  Absolutely.  However, given the lack of randomness sampling multiple allocations it's often strong enough to at least inform your hypothesis and test, quickly.

 

 

Yes, there is an easier way!

 

If this memory is allocated at driver initialization and most drivers initialize at system boot time, it stands to reason that most of this memory will be consumed on boot.  Cool!  Then it also stands to reason we can use our friend Poolmon.exe to quickly check how much is being used shortly after boot and do some quick testing by excluding some of the usual suspect adapters.  First, you could just trust me and check for ancillary storage, NIC, and enhanced video adapters and disable/remove these first as a low risk test.

 

In any case, the testing sequence is:

 

1.      Promptly record the amount of MmCm after a reboot via Poolmon.exe

(Sort by Non Paged Pool (toggle through by hitting 'P') and then descending by bytes (same, but 'B')

2.      Disable the ancillary test adapter(s)

3.      Reboot, check poolmon.exe again

4.      What's the decrease?

 

5.      Perhaps repeat testing with updated drivers or disabled features if you must have the additional hardware in place (hoping they use less MmCm).

 

I hope this post saved you time by being wary of configuring more adapters than absolutely necessary, especially /3GB x86 Exchange servers and if you have to reactively triage this a quick and dirty way to determine the primary consumers of this memory.

 

Enjoy!

 

-Tate

 

P.S. If you can't scale down the number or features of let's say one of the suspect adapters, do check with your driver vendor to see if there is a way to throttle back the consumption.  For instance, when we shipped the Scalable Networking Pack feature, we often saw higher consumption in the NIC drivers supporting this functionality, later driver releases from hardware vendors reportedly used much less.

 

 

 

 

Bonus:

 

Here's another handy debug trick to isolate who can call these two Memory manager functions and narrow down the driver population...

 

//Find the address of MmAllocateContiguousMemorySpecifyCache

0: kd> x nt!MmAllocateContiguousMemorySpecifyCache

e080efac nt!MmAllocateContiguousMemorySpecifyCache = <no type information>

 

//Use the handy !for_each_module to scan each module for the address (e080efac )

0: kd> !for_each_module ".echo ${@#ModuleName} ;s -d ${@#ModuleName} L?${@#Size} e080efac"

 

//Hits will look like this, with the matching address listed…

ATMFD

nt

e0a81044  e080efac e0806d12 e082e237 e082ddc7  .....m..7.......

hal

e0a81044  e080efac e0806d12 e082e237 e082ddc7  .....m..7.......

RDPWD

.

.

.

ati2mtag

f5575234  e080efac e083b69c e08454c9 e082f651  .........T..Q...

f5836084  e080efac e08329f9 e0839c13 e08e2d78  .....)......x-..

.

.

.

ql2300

f5d6d174  e080efac e0806d12 e08412f9 e08eaeb1  .....m..........

f5e170cc  e080efac e08d15a3 e08e8543 e0815d48  ........C...H]..

storport

f5d6d174  e080efac e0806d12 e08412f9 e08eaeb1  .....m..........

f5e170cc  e080efac e08d15a3 e08e8543 e0815d48  ........C...H]..

elxstor

f5e170cc  e080efac e08d15a3 e08e8543 e0815d48  ........C...H]..

SCSIPORT

f5e170cc  e080efac e08d15a3 e08e8543 e0815d48  ........C...H]..

 

 

//Example match in the import table for ati2mtag...

0: kd> dps f5575234

f5575234  e080efac nt!MmAllocateContiguousMemorySpecifyCache

f5575238  e083b69c nt!ZwQueryInformationProcess

f557523c  e08454c9 nt!PsGetCurrentThreadProcessId

 

 

//Repeat for the other call...

0: kd> x nt!MmAllocateContiguousMemory

e080be42 nt!MmAllocateContiguousMemory = <no type information>

 

 

 

 

Share this post :

 

 

 

 

It's Venkatesh with a WMI nugget. While troubleshooting or debugging WMI issues you may come across WMI queries wherein you don’t know which provider implemented the WMI class used in the query. You may want to know the binary and the product that implemented the provider so you can contact the vendor or upgrade the binary to see if it resolves the issue you are investigating.

To help make the investigation easier, Windows Vista and all post operating systems have the cool new 'Analytic and Debug log' for WMI tracing. To get similar tracing in Windows 2003 and earlier versions, WMI has a “Verbose Logging” option enabled through the WMI control (WmiMgmt.msc). Check out this blog for detailed steps to enable the WMI Analytic and Debug -

http://blogs.msdn.com/wmi/archive/2009/05/27/is-wmiprvse-a-real-villain.aspx

After enabling WMI logging you will see events like the ones pictured below. There may be a particular WMI query relevant to the issue you're troubleshooting requiring you to know the provider binary for the WMI class in the query. Apart from the class name (InstProvSamp) in the query you need the WMI namespace to which this class belongs to such as '\\.\root\default' as shown below.

image

Fig 1: Event ID 1 in the Windows Vista’s Event Viewer\Windows Logs\Applications and Services Logs\Microsoft\Windows\WMI-Activity\Trace log

Now we have the WMI Namespace and the class name so let’s find the provider for it.

1. Get the provider name from the Class Name and Namespace:

Every WMI class has a qualifier “provider” containing the name of the provider to which it belongs. Using wbemtest.exe connect to the namespace we mentioned above(\\.\root\default). Then get the class information for our class, InstProvSamp using the “Open Class…” button. After clicking "OK" the Object editor for InstrProvSamp displays the class information for InstProvSamp including the name of the provider it belongs to (CIM_STRING qualifier “provider” as shown below).

image

image

2. Get the class identifier (CLSID) of the provider from the provider name:

For every WMI provider WMI maintains an instance of system class __Win32Provider which contains information about the physical implementation of the provider like CLSID which contains the GUID that represents the class identifier (CLSID) of the provider COM object. Now query the repository to get the provider information for “InstProvSamp” provider as I did below.

image

image 

image

3. Get the provider binary from the provider’s CLSID:

During registration every COM object adds an entry under HKLM\SOFTWARE\Classes\CLSID. Using regedit.exe you can browse to provider’s COM registration information using the CLSID obtained from Step 2 as shown below.

image

This should help you connect the dots while debugging your WMI issue. Let me know if you have additional questions!

Share this post :

Introduction and Overview

In this article we will explore a practical use for ETW tracing, and discover what ETW (Event Tracing for Windows) tracing is available for a popular Windows user-mode component, Internet Explorer. In my previous article ETW Introduction and Overview, we covered what ETW tracing is and how it could be used.

The goal in this exercise is to learn about ETW tracing in general, how to self-discover what tracing is available in a component, and some ways you can leverage the tracing to self-troubleshoot issues. ETW logging is essentially allowing Microsoft code to speak for itself. It tells you what code ran, what that code did, and any errors produced. Also ETW logs can be used along with a more traditional toolset for troubleshooting that class of issue such as the SysInternals tools, Network Monitor, etc.

For this article, we are using the RC Build of Windows 7. The concepts and examples should be very similar and work fine for Windows Vista.

Getting Started “Exploring”

First we start off by opening Internet Explorer 8 whose home page is set to the NTDebugging Blog - http://blogs.msdn.com/ntdebugging

clip_image001

In order to find what Internet Explorer is logging with ETW, we list all of the registered ETW providers on a box, which includes all of the installed code on the machine (the component doesn’t need to be actively running).

We start off by running “logman query providers” and look through the list for relevant hits. However, the results below return more than 400+ hits on Vista, and 600+ on Win7. You will probably quickly see though that searching through this large list of providers might not always be best way to go about finding which providers Internet Explorer is logging with.

clip_image003

Using this list we can also filter or browse for our component. This may work just fine, however, one issue with this approach is that sometimes you may not know the particular naming convention for a process. Also, many software components use shared dlll’s to do a variety of the under the hood work. As such, you may not know what each of those pieces of software are, or how they are named.

Here we try running a built-in cmd line tool logman ‘query providers’ and searching for “Internet”, which didn’t turn up anything on this particular search. It looks like the naming convention for Internet Explorer might be a bit different than our first search.

clip_image005

While we could continue to browse the large list of providers, I have decided to use another useful method to find which providers IE uses, which is to filter all providers that a certain ProcessId uses. We start out by getting the ProcessId of Internet Explorer using two methods shown here, Task Manager and the command line tasklist filtering on iexplore.exe.

clip_image007

clip_image009

Now that we know Internet Explorer is running with process id 6200, we can do another query with logman to find out the ETW providers.

Internet Explorer ETW Providers

‘Logman query providers –pid 6200’ is used to list all of the user-mode ETW providers that Internet Explorer uses and the associated GUIDs (GUIDs are the Globally Unique ids that enable tracing for a component).

In looking at the screenshot below, notice that a wide variety of providers are shown, and at first glance, you may wonder how some relate to Internet Explorer. As you investigate further though, all these components enable certain features within Internet Explorer, but even with tracing enabled, may not log anything unless you specifically use the part of Internet Explorer which runs that code at run-time.

It is also important to note that Internet Explorer will also use a lot of kernel services to eventually do its work, such as NDIS. There is ETW tracing for these kernel components, but would not show up under a specific process as they are used by all processes at the kernel level. Here we see a specific component of interest that we want to follow – Microsoft-Windows-WinINet. A quick Bing search of WinINet turns up this on MSDN, which sounds relevant.

Extracted from MSDN: The Microsoft Windows Internet (WinINet) application programming interface (API) enables applications to access standard Internet protocols, such as FTP and HTTP. For ease of use, WinINet abstracts these protocols into a high-level interface.

So Microsoft-Windows-WinINet looks like a good component to enable tracing in.

clip_image011

From using SysInternals Process Explorer you can also confirm and that WININET.dll is loaded into the address space of iexplore.exe as shown below.

clip_image013

Now that we know the ETW Provider name and GUID, we will launch Computer Mgmt and turn on tracing for that component. Note that not every Provider can be decoded this way (due to architectural and security reasons), but many can, and at the very least, the log can be enabled and provided to Microsoft Support to fully decode.

Launch Computer Management by typing ‘compmgmt.msc’ into the start search box or right clicking Manage on Computer Management.

clip_image014

Enabling ETW Logging

Once in Computer Management, Navigate to and click on Event Viewer -> Applications and Services Logs. Once there, make sure View -> Show Analytic and Debug Logs is enabled as shown below, which will provide a much greater set of logs to look at. Most logs are under Event Viewer -> Applications and Services Logs -> Microsoft -> Windows, where we will spend the majority of our time. There are generally four channels that can show up under each provider. Channels are targeted at different roles. Admin and Operational channels target IT professionals and Administrators and are enabled by default, while Analytic and Debug Channels are more in depth, and not usually enabled by default.

The logs that show up here are XML Manifest based tracing called Windows Events. ETW tracing splits up the collection and decoding of traces into two separate steps for performance and security reasons. In the manifest based tracing first available with Vista, most components and events are defined in a XML manifest compiled along with the binary, and which are defined in a resource file language dll. Most user-mode resource dlls show up under C:\Windows\System32\en-US (for English US). The good part about manifest based tracing is that many logs are self-discoverable and customer decodable!

clip_image016

Next we will browse to: Event Viewer -> Applications and Services Logs -> Microsoft -> Windows -> WinINet, right click on the Analytic channel, and choose ‘Enable Log’. Remember that we had to enable this log because Analytic and Debug logs are not enabled by default. Enabling the channel will automatically enable the ETW tracing for that component and logs will start being decoded similar to the Event Log.

clip_image017

Capture and view WinINet logging for our scenario

After enabling the log, I am going to close Internet Explorer which is running code that is now logging using ETW and navigate to the website http://blogs.msdn.com/ntdebugging.

clip_image001[1]

Now we can look at the result of our work and tracing, which decodes the tracing for us. Below, we are on the WININET_REQUEST_HEADER category, showing the GET request. Note that the tracing here is showing us proxy interaction, DNS requests and responses, cookies, TCP requests/responses, as well as HTTP requests and responses.

Now, one might be able to get the same information using a network capturing program such as NetMon or WireShark, but the nice thing about the ETW tracing here is that is

a) In box without having to install additional tools

b) The aggregate view of what the WinINet component is seen from its point of view (composed of DNS, TCP, Proxy, Cookies, etc).

This is a good example of using this tracing in combination with other troubleshooting tools (if required), which may or may not be useful depending on the scenario.

clip_image019

Dumping the trace log out using other methods:

You can use the Event Viewer GUI, or if you prefer to view the log in a text editor (or spreadsheet). You can dump the log to text or CSV format using a couple of different methods.

The first method is using the Windows Event Log GUI to export or save the event log

clip_image020

Or you can output the entire log to screen in human readable text format:

clip_image022

We can also output to the screen filtering for any messages with error level using an XPath Query. We can start right away if you know the XPath Syntax for your query, or you can use the GUI to help build the query for you. Once you choose ‘Filter Current Log’ you can filter by Event Level, or a variety of other criteria.

clip_image024

The XML view will show you the raw XPATH syntax needed to use wevtutil to query for only Error events.

clip_image025

Now we actually run the XPath query using ‘wevtutil query-events Microsoft-Windows-WinINet/Analytic /f:Text "/q:*[System[(Level=2)]]"’

clip_image027

If you prefer a more complex format containing processors, threads, etc, then you can output to a csv log file. This is useful for sorting, filtering, comparing, and doing more complex analysis on the log file entries.

clip_image029

Here we use ‘tracerpt c:\windows\system32\winevet\logs\Microsoft-Windows-WinINet%4Analytic.etl –of CSV –o c:\temp\Microsoft-Windows-WinINet%4Analytic.csv’ to output to a CSV.

clip_image031

Finally, you can use PowerShell v2 to dump out the log. The PowerShell scenario is interesting, because you can write powerful scripts around the event logs.

clip_image033

Look at ‘get-help get-winevent –examples” for many more really cool examples about how to list and filter these logs:

clip_image035

Behind the Scenes

We also need to explore behind the scenes of what is happening when you enable the analytic channel of the WinInet provider. Part of the definition of that channel is to enable an ETW Trace Session with certain pre-defined Keywords (Flags) and Levels. Keywords usually specify functional sub-components (HTTP, COOKIES, CONNECTION, etc) while Levels control the level of detail (Error, Informational, Verbose) events are logged. This is useful to fine-tune logging, but the channels make it easy not to have to mess with these unless required.

Under Performance -> Data Collector Sets -> Event Trace Sessions we see a new session created for us and auto-named “EventLog-Microsoft-Windows-WinINet-Analytic”. This ETW logger was automatically created for us when we enabled the WinInet Analytic channel. You can also manually create the logger, which we may cover in another blog post. If you open up the properties for this session, you'll notice our provider listed as well as all the Keywords (Any) and Level specified. The screenshot also show which Keywords/Flags or categories a provider supports.

clip_image037

Once you know a provider name, you can also query for it on the command line, which is useful to get all the Keywords(Flags) and Levels a provider supports, along with the processes that are using that provider.

clip_image039

For the curious, we wanted to finish this post and show you how to find what tracing is available, and its format. Here we again use the wevtutil to get full publishing event and message information.

‘wevtutil get-publisher Microsoft-Windows-WinInet /getevents /getmessage’

clip_image041

This can be combined with another version of the same utility searching for a certain string or log – e.g. “wevtutil gp Microsoft-Windows-WinINet /ge /gm | findstr /i httpopen”

clip_image043

To wrap-up, here is the overall architecture of ETW covered in a previous blog post, along with the specifics of the scenario we covered. In our scenario the MMC Event Viewer console has two roles – enabling the tracing and decoding the real-time delivery of events.

clip_image045

Where to go from here?

This was meant as a practical introduction to using and self-discovering how to use the ETW tracing that is available right now in newer Windows releases (Vista/2008 and above). There are many, many more components that use tracing – 400+ in Vista and 600+ in Win7, which you can start exploring.

For these specific examples, it is worth noting that as far as the networking aspect is concerned, similar information might be obtained from using a packet capture utility such as NetMon. What is interesting about NetMon in Win7 is that it actually uses ETW under the hood to do most of its tracing. Here is an interesting post about Network Tracing in Windows 7 from the Netmon team.

The use of ETW under the hood of many tools is a general trend in Windows and Microsoft software. You might notice different diagnostics and tracing tools in Windows using ETW under the hood. These tools can add value to the raw ETW tracing such as further parsing, filtering, and rich views into the data. You can mix and match these tools along with the tracing for powerful views into your Windows box or server.

It’s also important to note that ETW tracing isn’t just for Microsoft software. Since any software running on Windows uses many services provided by the OS, there is a variety of ETW logging that is available for all software. For example, try running ‘logman query providers –pid PROCESSID_OF_FIREFOX’ on FireFox! You should find a great deal of logging similar to Internet Explorer.

In future posts, we hope to explore other ways of using ETW tracing, and touch on some of the kernel level tracing available.

 

 

Share this post :

Pushlocks were a new locking primitive first introduced in Windows Server 2003 and are primarily used in place of spinlocks to protect key kernel data structures. Unfortunately, Pushlocks are not documented in the WDK, and are not available for public use; however, a few internal drivers do use them, so you might see them while debugging a machine. Also, I was digging around on MSDN for Pushlocks, and I found that the Filter Manager does expose certain APIs to use Pushlocks, so you are in luck if you are developing a filter driver!

 

Gate objects

 

Pushlocks are built on primitive gate objects, which are defined by KGATE structures. The gate object is a highly optimized version of the basic event object. By not having both the notification and synchronization versions of the basic event object, and by being the exclusive object that a thread can wait on, the code for acquiring and releasing a gate is heavily optimized. Gates even have their own dispatcher lock instead of acquiring the entire dispatcher database.

 

Unlike spinlocks, which must be acquired exclusively for all operations on a data structure, pushlocks can be shared by multiple “readers” and need only be acquired exclusively when a thread needs to modify the data structure.

 

Operation

 

When a thread acquires a normal pushlock, the pushlock code marks the pushlock as owned, if it is not owned already. If someone owns the pushlock exclusively, or the thread wants to own the pushlock exclusively while someone else has it in shared mode, the thread allocates a wait block on its stack, initializes a gate object in the wait block, and then add the wait block to the wait list associated with the pushlock. When the thread holding the pushlock finally releases it, it wakes the next waiter by signaling the event in the waiters wait block.

 

When debugging a machine, there is no easy way to figure out the current owner of the pushlock, apart from doing code review. By looking at the waitlist, you can always figure out the threads trying to get access to it, but since the gate does not keep track of the owner like a regular mutex, it is much harder to find the current owner.

For more details on the operation and structure of a pushlock, please review the Pushlocks section in Windows Internals book, under the System Mechanisms Chapter.

 

Advantages of using a PushLock

 

If a pushlock is held by one or more readers, threads that want to modify the data structure are queued for exclusive access. This queuing mechanism provides some of the same benefits of queued spinlocks—for example, FIFO ordering, elimination of race conditions, and reduced cache thrashing when more than one thread is waiting for the pushlock.

 

Another advantage of using a pushlock is the size. A regular resource object is 56 bytes, however a pushlock is the size of a pointer. Apart from a small memory footprint, this helps especially in the non-contended case, since pushlocks do not require lengthy operations to perform acquisition or release. Because the pushlock fits in one “machine word”, the CPU can use atomic operations to compare and exchange the old lock with the new one.

 

Pushlocks are also self-optimizing in the sense that the list of threads waiting on a pushlock will be periodically rearranged to provide fairer behavior when the pushlock is released.

 

Cache Aware Pushlocks

 

A cache-aware pushlock adds to the basic pushlock by allocating a normal pushlock for each processor in the system and associating it with the cache-aware pushlock. When a thread wants to acquire a cache-aware pushlock for shared access, it simply acquires the pushlock on that processor; however if it needs to acquire the lock for exclusive access, it has to acquire the pushlocks for each processor in exclusive mode.

 

What does a Pushlock look like?

 

3: kd> !thread 8c9764c0

THREAD 8c9764c0  Cid 2410.1be4  Teb: 7ff9f000 Win32Thread: e5c6f298 GATEWAIT

Stack Init b386b000 Current b386a978 Base b386b000 Limit b3867000 Call 0

ChildEBP RetAddr  Args to Child             

b386a990 80833485 8c9764c0 8c9764e4 00000003 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

b386a9bc 8082ffe0 b06a6a03 e11e0b18 b386aa54 nt!KiSwapThread+0x2e5 (FPO: [Non-Fpo]) (CONV: fastcall)

b386a9e4 8087d722 00000000 e11e0b08 e11e0b18 nt!KeWaitForGate+0x152 (FPO: [Non-Fpo]) (CONV: fastcall)

e11e0b18 00000000 0c050204 7346744e e37b2808 nt!ExfAcquirePushLockExclusive+0x112 (FPO: [Non-Fpo]) (CONV: fastcall)

 

Above is a snipped output from a dump that I was recently looking at. From the stack, you can see the ExfAcquirePushLockExclusive call trying to acquire the pushlock, which then calls KEWaitForGate. In this case, the lock was already acquired, so this thread allocated a wait block on its stack, and then added itself to the waitlist.

Also, the stack is broken due to the fastcall, therefore the debugger cannot display it entirely. So we can manually reconstruct the stack by passing parameters to the kb command.

k[b|p|P|v] = BasePtr StackPtr InstructionPtr

 

To get the arguments, we first dump the stack manually using the dps command with the current esp.

3: kd> dps b386a978 l50

b386a978  b386ad40

b386a97c  00000000

b386a980  8088dafe nt!KiSwapContext+0x26

b386a984  b386a9bc

b386a988  b386aa00

b386a98c  f773f120

b386a990  8c9764c0

b386a994  80833485 nt!KiSwapThread+0x2e5

b386a998  8c9764c0

b386a99c  8c9764e4

b386a9a0  00000003

b386a9a4  8c9764c0

b386a9a8  00000003

b386a9ac  00000002

b386a9b0  00000002

b386a9b4  f773fa7c

b386a9b8  008c0030

b386a9bc  b386a9e4

b386a9c0  8082ffe0 nt!KeWaitForGate+0x152

b386a9c4  b06a6a03

b386a9c8  e11e0b18

b386a9cc  b386aa54

b386a9d0  00000000

b386a9d4  8c976504

b386a9d8  00000000

b386a9dc  0000001c

b386a9e0  00000000

b386a9e4  b386aa40

b386a9e8  8087d722 nt!ExfAcquirePushLockExclusive+0x112

b386a9ec  00000000

b386a9f0  e11e0b08

b386a9f4  e11e0b18

b386a9f8  b386aa40

b386a9fc  8096e9a9 nt!SeOpenObjectAuditAlarm+0x1cf

b386aa00  00040007

b386aa04  00000000

b386aa08  8c976568

b386aa0c  8c976568

b386aa10  b06a6a00

b386aa14  b4ee0a00

b386aa18  b127cc10

b386aa1c  00000000

b386aa20  00000001

b386aa24  80a60456 hal!KfLowerIrql+0x62

b386aa28  b386ac04

b386aa2c  8d117800

b386aa30  00000000

b386aa34  00000000

b386aa38  b386aa20

b386aa3c  01943080

b386aa40  b386aa64

b386aa44  808b7a14 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a

 

From the output above, we can see the stack. To reconstruct the stack, we can get the ebp, esp, and eip from the stack for the ExfAcquirePushLockExclusive frame, and pass it to the kb command. Voila!

 

3: kd> kb = b386aa40 b386a9e4 8087d722

ChildEBP RetAddr  Args to Child             

b386aa40 808b7a14 b386ac04 e11e0b18 e11e0b18 nt!ExfAcquirePushLockExclusive+0x112

b386aa64 808b7b09 e11e0b18 b386aa80 e101bf40 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a

b386aaa4 808da118 0000001c b386ab58 00000001 nt!CmpCallCallBacks+0x6b

b386ab90 80937942 e101bf40 00000000 89f13648 nt!CmpParseKey+0xd4

b386ac10 80933a76 00000000 b386ac50 00000040 nt!ObpLookupObjectName+0x5b0

b386ac64 808bb471 00000000 8e930480 00000d01 nt!ObOpenObjectByName+0xea

b386ad50 808897bc 0243eba0 00020019 0243eb68 nt!NtOpenKey+0x1ad

b386ad50 7c8285ec 0243eba0 00020019 0243eb68 nt!KiFastCallEntry+0xfc

WARNING: Frame IP not in any known module. Following frames may be wrong.

0243eba4 00000000 00000000 00000000 00000000 0x7c8285ec

 

 

Share this post :

This isn’t our typical debugging type article however I found the information very useful for the developer community.

Windows 7 is slated to launch in the next few weeks. If your applications aren’t quite ready for Windows 7 and having you’re having issues we may be able to help you out!

Last Monday we launched a new pilot program in our Advisory Services space.  Advisory Services is a consultative support option that provides support beyond standard break-fix issues.  The new program involves remote, phone-based support for issues such as product migration, code review or new program development.  The service is intended for Developers and IT Professionals for shorter engagements that don’t require traditional onsite consulting or sustained account management services available via other Microsoft support options.

For the Application Compatibility engagements, we’ll start off with some basic scoping questions such as whether the application is 16-, 32-, or 64-bit.  Is it a client-server application?  What compatibility issues are you experiencing?  Slow Performance?  Hang or Crash?  Installation problems?  The support engineers will be using tools such as the Application Compatibility Toolkit, the the Standard User Analyzer Wizard, and the Setup Analysis Tool.

The KB Article referenced below has more details about the program and how to engage Microsoft.  So, if you’re working on a pesky Windows Vista or Windows 7 Application Compatibility issue, give us a call – we can help!

 

Additional Resources:

Cheers,

Ronsto

 

Share this post :

Introduction:

 

Event Tracing for Windows (ETW) is a system and software diagnostic, troubleshooting and performance monitoring component of Windows that has been around since Windows 2000. However, it wasn't until Windows Vista that major components of the OS were updated to heavily use ETW tracing; making it much more practical and useful.

ETW is useful from a variety of scenarios, including:

 -User & Admin: Being in control of your system and knowing what is going on.
 -User & Admin: Troubleshooting performance, hardware and OS issues.
 -Enthusiast: Learning further about the OS and the low level guts of the OS.
 -Software Developer/ISV/OEM: Investigating issues with your software's interaction with Microsoft OS & technologies
 -Hardware Developer/IHV/OEM: Investigating issues with hardware interaction with the OS, including kernel, driver subsystems, up to the user stack.

ETW is a set of technologies and tools that can absolutely complement existing tools while providing a look into the guts of the OS at a very low level.

A great article from the April 2007 edition of MSDN Magazine - http://msdn.microsoft.com/en-us/magazine/cc163437.aspx, covers ETW in great depth, and is recommended reading.

Here is graphical overview of the ETW infrastructure that covers how provider(s) log to high performance in memory buffers, which can be kept in memory in a circular buffer, or written to disk in a sequential or circular fashion.

 

image

Some uses of ETW:

 

Today on Windows Vista, Windows 7, and sparingly on earlier OSes; ETW is used by Microsoft Development and Microsoft Support, as well as others; to help troubleshoot issues, find root-cause of bugs, analyze performance; and a large variety of other tasks.

As it stands today, ETW has two major thrusts:

1. Exposing time-based event data similar in practicality to plain old text based logged (without the normal performance overhead). Troubleshooting logs have long been used across the industry by software developers and IT professionals to troubleshoot issues. Here are some examples of how ETW logging is used.

First off, shown below is Windows Event Viewer enabling viewing of analytic and debug logs (which show you many more logs than the default view):

image

A variety of logs can be viewed graphically using the Event Viewer or via the built-in command line OS utility – wevtutil. There are a quite a variety of logs that are enabled by default and are available via the Event Viewer or in a kernel memory dump.

Since ETW is also a foundational component, you might find it appearing in other troubleshooting utilities. For example, Office Communications Server 2007 uses ETW for troubleshooting.

ETW is used to monitor for network events from the kernel which are then added to Process Monitor, a utility from Microsoft Sysinternals. You should also see tools such netsh and NetMon being able to capture and decode ETW traces.

 

image

There are many other methods and utilities to view the ETW tracing providers available, which may be found in the OS, via the Device Driver Kit (DDK), enabled with Microsoft products (such as Exchange or IIS), or used by tools, such as Office Communication System Logging.

The second major thrust of ETW is exposing performance data.

2. Performance data is exposed from the lowest levels of the kernel, driver subsystems, and every major user-mode component of Windows (such as Windows Media Player).

Performance data can be used in a hybrid fashion both for troubleshooting and performance reasons. Take for example a built in OS tool that uses ETW tracing to provide rich data - Resource Monitor.

Resource Monitor is available from the Windows Task Manager - Performance Tab. The cool thing about Resource Monitor is that it internally uses kernel ETW tracing to gather its data, providing rich data from the kernel itself; and thus providing a practical exposure of ETW technology for Administrators to troubleshoot their systems. This goes beyond the data that Task Manager provides, which often tends to be based on performance counters which you cannot drill down into further.

image

image

Resource Monitor is useful from both a performance standpoint, as well as troubleshooting. For example, you can view top cpu users, top disk users (why your disk is thrashing), or search for a locked file handle that is open.

For a deeper drill-down into performance data XPerf can be used to capture ETW traces and make them available for decoding offline (on another box).

Here, Xperf (available with the Windows DDK and as a standalone download) is shown looking at Internet Explorer's file access time.

image

There have been several posts on the NTDebugging Blog regarding XPerf, other blog posts, as well as the Windows Performance Analysis Dev Center.

Where to go from here:

 

In Windows 7 and Windows Server 2008 R2, ETW tracing has even been further improved with many more providers including 600+ inbox providers’ registered on the box. This number increases when add-on features or products are installed (such as Hyper-V).

The tracing provided by ETW, whether it is used directly in the OS, or via add-on tools, provides deep insight into Windows and how it works. We will explore ETW in depth in future blog posts.

 

 

Share this post :

Our Global Escalation Services team at Microsoft hosted the first annual Global Engineering Conference on February 25th & 26th in 2009. This year’s theme was “Building the Engineering Community” with a focus on increased knowledge sharing, collaboration, and best practices throughout the engineering community. Several vendors attended the conference including our good friends at Citrix. In this video Citrix Escalation engineers Nicholas Vasile, Dmitry Vostokov, and Kapil Ramlal share tools they created to take advantage of the ETW tracing infrastructure in Windows, debugging scripts, and best practices for engaging issues that require multi-vendor support.

Video Link 

Video Table of Contents-

Who is Citrix?

Pattern-Drive Memory Dump Analysis (Presenter: Dmitry Vostokov)

Citrix & ETW: A case study with CDFControl (Presenter: Kapil Ramlal)

Working with Citrix Technical Support (Presenter:Nicholas Vasile)

Citrix’s Public Symbol Server

 

Citrix Engineers:

image

image

image


Feel free to send feedback!

-Ronsto

My name is Nischay Anikar from the Escalation Engineer team in Global Escalation Services. In today’s post I’ll present a weird problem I worked through with a client. When we started to work on the problem, we found the following:

  • Ping to the box worked.
  • Keyboard was responding.
  • Shares on the system were accessible remotely.
  • Could not Remote Desktop into the box.
  • Existing sessions were responsive to some extent – no new processes were getting created.
  • No new processes/application could be launched. Sometimes application would get launched but after waiting for a long time.
  • Remote computer management would work, but not all snap-ins would work. (Event logs showed up, but disk management did not respond).

At this time, perfmon was collected and nothing in it indicated any kind of resource contention. This was certainly not the case of any process/thread pegging the CPU. The above observations told us this was not a hard hang, as the system was responsive at DPC level. Rather system was even responding to certain extent at passive level – remember, shares were accessible. SMB requests are processed basically by the worker threads created by SRV.SYS.

Remote management snap-ins and remote registry responsiveness showed that RPC was working fine. However some snap-ins like disk management were behaving inconsistently.

This is the stage at which we had the Kernel dump of the system and we started our normal analysis trying to find the root of the problem.

From the dump there were no blocked threads on locks (ERESOURCE, !locks), no memory pressure (perfmon confirmed it too - !vm 1), no CPUs stuck in spinlocks, no DPCs pending(!dpcs), no ready threads pending to execute(!ready), and no alarming LPC wait chain among threads leading to system hang. These are some of the common causes that could lead to system hang. None of these were seen in the dump.

Then I said, enough of running behind the debugger commands to look for known problems, they didn’t yield me anything useful up front. When the dump was given we were told that they attempted to launch notepad from explorer (Start->Run->Notepad) which never launched. If we start chasing from this point we are likely going to hit the root of the problem or at least get some leads. With this in mind, when we dumped out the explorer threads we saw one of the threads that was indeed trying to launch notepad (there was one more in the same state but trying to launch some other application).

THREAD 892ef4a0  Cid 0d2c.0ea8  Teb: 7ffd8000 Win32Thread: e108e6c0 WAIT: (Unknown) KernelMode Non-Alertable

    f573bc2c  NotificationEvent

    892ef518  NotificationTimer

Not impersonating

DeviceMap                 e12bf190

Owning Process            892027f0       Image:         explorer.exe

Wait Start TickCount      40848          Ticks: 8313 (0:00:02:09.890)

Context Switch Count      284                 LargeStack

UserTime                  00:00:00.000

KernelTime                00:00:00.078

Win32 Start Address ntdll!RtlpWorkerThread (0x7c839f2b)

Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

Stack Init f573c000 Current f573bb8c Base f573c000 Limit f5736000 Call 0

Priority 14 BasePriority 8 PriorityDecrement 0

ChildEBP RetAddr  Args to Child

f573bba4 8082ffd7 892ef4a0 892ef548 00000100 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

f573bbbc 808287d4 895c6548 80a560c6 00000000 nt!KiSwapThread+0x83 (FPO: [Non-Fpo])

f573bc00 80810135 f573bc2c 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo])

f573bc48 80842608 005c6548 00000000 00000000 nt!CcWaitForUninitializeCacheMap+0xa5 (FPO: [Non-Fpo])

f573bcd0 8091f8e7 f573bd20 000f001f 00000000 nt!MmCreateSection+0x1fc (FPO: [Non-Fpo])

f573bd40 80883938 0190d51c 000f001f 00000000 nt!NtCreateSection+0x12f (FPO: [Non-Fpo])

f573bd40 7c82860c 0190d51c 000f001f 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f573bd64)

0190d174 7c826ed9 77e6cc9a 0190d51c 000f001f ntdll!KiFastSystemCallRet (FPO: [0,0,0])

0190d178 77e6cc9a 0190d51c 000f001f 00000000 ntdll!NtCreateSection+0xc (FPO: [7,0,0])

0190d994 77e424b0 00000000 001394f4 0013725c kernel32!CreateProcessInternalW+0x99c (FPO: [Non-Fpo])

0190d9cc 7c916750 001394f4 0013725c 00000000 kernel32!CreateProcessW+0x2c (FPO: [Non-Fpo])

0190e450 7c916b45 00030064 00000000 00139904 SHELL32!_SHCreateProcess+0x387 (FPO: [Non-Fpo])

0190e4a4 7c91617b 00136008 0190e4c4 7c915a76 SHELL32!CShellExecute::_DoExecCommand+0xb4 (FPO: [Non-Fpo])

0190e4b0 7c915a76 00000001 00000009 00136008 SHELL32!CShellExecute::_TryInvokeApplication+0x49 (FPO: [Non-Fpo])

0190e4c4 7c91599f 00000000 00000009 0190e500 SHELL32!CShellExecute::ExecuteNormal+0xb1 (FPO: [Non-Fpo])

0190e4d8 7c915933 0190e500 00000000 00000009 SHELL32!ShellExecuteNormal+0x30 (FPO: [Non-Fpo])

0190e4f4 7c9a3416 0190e500 0000003c 04000b00 SHELL32!ShellExecuteExW+0x8d (FPO: [Non-Fpo])

0190e954 7c9e3f92 00030064 0190e988 0190f828 SHELL32!ShellExecCmdLine+0x143 (FPO: [Non-Fpo])

0190ee20 7c9e4517 0190eea8 7c9e43f6 0190ee5c SHELL32!CRunDlg::OKPushed+0x179 (FPO: [Non-Fpo])

0190ee30 7739b6e3 000400b4 00000111 00000001 SHELL32!RunDlgProc+0x121 (FPO: [Non-Fpo])

0190ee5c 77395f82 7c9e43f6 000400b4 00000111 USER32!InternalCallWinProc+0x28

0190eed8 77395e22 00095fb4 7c9e43f6 000400b4 USER32!UserCallDlgProcCheckWow+0x147 (FPO: [Non-Fpo])

0190ef20 77395ffa 00000000 00000111 00000001 USER32!DefDlgProcWorker+0xa8 (FPO: [Non-Fpo])

0190ef3c 7739b6e3 000400b4 00000111 00000001 USER32!DefDlgProcW+0x22 (FPO: [Non-Fpo])

0190ef68 7739b874 77395fd8 000400b4 00000111 USER32!InternalCallWinProc+0x28

0190efe0 7739bfce 00095fb4 77395fd8 000400b4 USER32!UserCallWinProcCheckWow+0x151 (FPO: [Non-Fpo])

0190f010 7739bf74 77395fd8 000400b4 00000111 USER32!CallWindowProcAorW+0x98 (FPO: [Non-Fpo])

0190f030 77431848 77395fd8 000400b4 00000111 USER32!CallWindowProcW+0x1b (FPO: [Non-Fpo])

0190f04c 77431b9b 000400b4 00000111 00000001 comctl32!CallOriginalWndProc+0x1a (FPO: [Non-Fpo])

0190f0a8 77431d5d 001060a8 000400b4 00000111 comctl32!CallNextSubclassProc+0x3c (FPO: [Non-Fpo])

0190f0cc 75ed2f80 000400b4 00000111 00000001 comctl32!DefSubclassProc+0x46 (FPO: [Non-Fpo])

0190f0f0 77431b9b 000400b4 00000111 00000001 BROWSEUI!CAutoComplete::s_ParentWndProc+0xec (FPO: [Non-Fpo])

0190f14c 77431dc0 001060a8 000400b4 00000111 comctl32!CallNextSubclassProc+0x3c (FPO: [Non-Fpo])

0190f1a0 7739b6e3 000400b4 00000111 00000001 comctl32!MasterSubclassProc+0x54 (FPO: [Non-Fpo])

0190f1cc 7739b874 77431d6c 000400b4 00000111 USER32!InternalCallWinProc+0x28

0190f244 7739c2d3 00095fb4 77431d6c 000400b4 USER32!UserCallWinProcCheckWow+0x151 (FPO: [Non-Fpo])

0190f280 7739c337 004f8a78 004f5df0 00000001 USER32!SendMessageWorker+0x4bd (FPO: [Non-Fpo])

0190f2a0 77386cea 000400b4 00000111 00000001 USER32!SendMessageW+0x7f (FPO: [Non-Fpo])

0190f2d0 77396199 000400b4 00503dc8 00030064 USER32!IsDialogMessageW+0x41c (FPO: [Non-Fpo])

0190f30c 7738965e 000400b4 00030064 00000001 USER32!DialogBox2+0x144 (FPO: [Non-Fpo])

 

This is the first parameter to CreateProcess, and this thread coincided with the notepad launch from explorer.

 

kd> du 001394f4

001394f4  "C:\WINDOWS\system32\notepad.exe"

 

This thread has been waiting longer than 2 minutes. Looking at what this thread is doing, we see that it’s waiting for a Cache Map to be uninitialized (tear down of the existing references on this cache map) as part of creating the Image section during process creation.

Examining the state of threads in the whole box, you see there are a few more threads in different processes that are waiting on the CreateProcess while creating an Image section and waiting to un-initialize the cache map.

kd> !thread 891910a8

THREAD 891910a8  Cid 0180.0184  Teb: 7ffdf000 Win32Thread: e1442bb8 WAIT: (Unknown) KernelMode Non-Alertable

    f6d44c2c  NotificationEvent

    89191120  NotificationTimer

IRP List:

    894f0298: (0006,0094) Flags: 00000800  Mdl: 00000000

Impersonation token:  e105d028 (Level Impersonation)

DeviceMap                 e12bf190

Owning Process            89138708       Image:         winlogon.exe

Wait Start TickCount      48380          Ticks: 781 (0:00:00:12.203)

Context Switch Count      1617                 LargeStack

UserTime                  00:00:00.156

KernelTime                00:00:00.468

Start Address winlogon!__report_gsfailure (0x0103e1b0)

Stack Init f6d45000 Current f6d44b8c Base f6d45000 Limit f6d40000 Call 0

Priority 15 BasePriority 15 PriorityDecrement 0

ChildEBP RetAddr  Args to Child

f6d44ba4 8082ffd7 891910a8 89191150 00000100 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

f6d44bbc 808287d4 896b20e8 80a560c6 00000000 nt!KiSwapThread+0x83 (FPO: [Non-Fpo])

f6d44c00 80810135 f6d44c2c 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo])

f6d44c48 80842608 006b20e8 00000000 00000000 nt!CcWaitForUninitializeCacheMap+0xa5 (FPO: [Non-Fpo])

f6d44cd0 8091f8e7 f6d44d20 000f001f 00000000 nt!MmCreateSection+0x1fc (FPO: [Non-Fpo])

f6d44d40 80883938 0006eedc 000f001f 00000000 nt!NtCreateSection+0x12f (FPO: [Non-Fpo])

f6d44d40 7c82860c 0006eedc 000f001f 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f6d44d64)

0006eb34 7c826ed9 77e6cc9a 0006eedc 000f001f ntdll!KiFastSystemCallRet (FPO: [0,0,0])

0006eb38 77e6cc9a 0006eedc 000f001f 00000000 ntdll!NtCreateSection+0xc (FPO: [7,0,0])

0006f354 7d1ec670 00000818 00000000 0006fadc kernel32!CreateProcessInternalW+0x99c (FPO: [Non-Fpo])

0006f3a0 75842db7 00000818 00000000 0006fadc ADVAPI32!CreateProcessAsUserW+0x108 (FPO: [Non-Fpo])

0006f424 75842f3a 0008c260 0006f8d4 0008c208 MSGINA!ExecApplication+0x8e (FPO: [Non-Fpo])

0006f884 0103be76 0008c208 0006f8d4 00710000 MSGINA!WlxStartApplication+0xbb (FPO: [Non-Fpo])

0006f8a8 01036d59 0007a868 0006f8d4 00008001 winlogon!StartApplication+0x40 (FPO: [Non-Fpo])

0006faf8 01036fa4 0007a868 00000001 0007a868 winlogon!HandleLoggedOn+0x203 (FPO: [Non-Fpo])

0006fb14 0103b24d 00050020 00000659 00000001 winlogon!LoggedonDlgProc+0x8b (FPO: [Non-Fpo])

0006fb38 7739b6e3 00050020 00000659 00000001 winlogon!RootDlgProc+0x6e (FPO: [Non-Fpo])

0006fb64 77395f82 0103b1df 00050020 00000659 USER32!InternalCallWinProc+0x28

0006fbe0 77395e22 0008fca4 0103b1df 00050020 USER32!UserCallDlgProcCheckWow+0x147 (FPO: [Non-Fpo])

0006fc28 77395ffa 00000000 00000659 00000001 USER32!DefDlgProcWorker+0xa8 (FPO: [Non-Fpo])

0006fc44 7739b6e3 00050020 00000659 00000001 USER32!DefDlgProcW+0x22 (FPO: [Non-Fpo])

0006fc70 7739b874 77395fd8 00050020 00000659 USER32!InternalCallWinProc+0x28

0006fce8 7739ba92 0008fca4 77395fd8 00050020 USER32!UserCallWinProcCheckWow+0x151 (FPO: [Non-Fpo])

0006fd50 7739bad0 0006fda0 00000000 0006fd84 USER32!DispatchMessageWorker+0x327 (FPO: [Non-Fpo])

0006fd60 77395d78 0006fda0 00000000 004f2cd0 USER32!DispatchMessageW+0xf (FPO: [Non-Fpo])

0006fd84 77396199 00050020 004f2cd0 00000000 USER32!IsDialogMessageW+0x56b (FPO: [Non-Fpo])

0006fdc0 7738965e 00050020 00000000 00000010 USER32!DialogBox2+0x144 (FPO: [Non-Fpo])

0006fde8 773896a0 01000000 0107cbc8 00000000 USER32!InternalDialogBox+0xd0 (FPO: [Non-Fpo])

0006fe08 773896e8 01000000 0107cbc8 00000000 USER32!DialogBoxIndirectParamAorW+0x37 (FPO: [Non-Fpo])

0006fe2c 0103de0a 01000000 00000578 00000000 USER32!DialogBoxParamW+0x3f (FPO: [Non-Fpo])

0006fe50 0102d838 01000000 00000578 00000000 winlogon!Fusion_DialogBoxParam+0x24 (FPO: [Non-Fpo])

0006fe8c 0103b6e0 0007a868 01000000 00000578 winlogon!TimeoutDialogBoxParam+0x28 (FPO: [Non-Fpo])

0006fec4 0103746e 0007a868 01000000 00000578 winlogon!WlxDialogBoxParam+0x80 (FPO: [Non-Fpo])

0006fee4 01038042 0007a868 77e62f9d 77e42014 winlogon!BlockWaitForUserAction+0x3a (FPO: [Non-Fpo])

0006ff08 01031b33 0007a868 ffffffff 00000004 winlogon!MainLoop+0x42d (FPO: [Non-Fpo])

0006ff50 0103e33b 0007a868 00000000 000724e4 winlogon!WUNotify+0x515 (FPO: [Non-Fpo])

0006fff4 00000000 7ffd7000 000000c8 000001c9 winlogon!__report_gsfailure+0x267 (FPO: [Non-Fpo])

 

 

kd> !thread 88a1c3a0

THREAD 88a1c3a0  Cid 01b0.072c  Teb: 7ff9d000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable

    f5ea7c2c  NotificationEvent

    88a1c418  NotificationTimer

Not impersonating

DeviceMap                 e1000128

Owning Process            8911fd88       Image:         services.exe

Wait Start TickCount      32679          Ticks: 16482 (0:00:04:17.531)

Context Switch Count      2043

UserTime                  00:00:00.015

KernelTime                00:00:00.140

Win32 Start Address 0x0000a1d5

LPC Server thread working on message Id a1d5

Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

Stack Init f5ea8000 Current f5ea7b8c Base f5ea8000 Limit f5ea5000 Call 0

Priority 10 BasePriority 9 PriorityDecrement 0

ChildEBP RetAddr  Args to Child

f5ea7ba4 8082ffd7 88a1c3a0 88a1c448 00000100 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

f5ea7bbc 808287d4 895c83f0 80a560c6 00000000 nt!KiSwapThread+0x83 (FPO: [Non-Fpo])

f5ea7c00 80810135 f5ea7c2c 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo])

f5ea7c48 80842608 005c83f0 00000000 00000000 nt!CcWaitForUninitializeCacheMap+0xa5 (FPO: [Non-Fpo])

f5ea7cd0 8091f8e7 f5ea7d20 000f001f 00000000 nt!MmCreateSection+0x1fc (FPO: [Non-Fpo])

f5ea7d40 80883938 0359f270 000f001f 00000000 nt!NtCreateSection+0x12f (FPO: [Non-Fpo])

f5ea7d40 7c82860c 0359f270 000f001f 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f5ea7d64)

0359eec8 7c826ed9 77e6cc9a 0359f270 000f001f ntdll!KiFastSystemCallRet (FPO: [0,0,0])

0359eecc 77e6cc9a 0359f270 000f001f 00000000 ntdll!NtCreateSection+0xc (FPO: [7,0,0])

0359f6e8 77e424b0 00000000 00000000 000b5100 kernel32!CreateProcessInternalW+0x99c (FPO: [Non-Fpo])

0359f720 0100928b 00000000 000b5100 00000000 kernel32!CreateProcessW+0x2c (FPO: [Non-Fpo])

0359f80c 01008a4c 0064a8b0 000b5100 0359f844 services!ScLogonAndStartImage+0x28b (FPO: [Non-Fpo])

0359f84c 010069b1 0064a8b0 00000000 00000000 services!ScStartService+0x1c6 (FPO: [Non-Fpo])

0359f87c 01005e57 0064a8b0 00000000 00000000 services!ScStartMarkedServices+0x9c (FPO: [Non-Fpo])

0359f8b4 01005ee0 0064a8b0 00000000 00000000 services!ScStartServiceAndDependencies+0x1f1 (FPO: [Non-Fpo])

0359f8d8 77c80193 000a0180 00000000 00000000 services!RStartServiceW+0x8c (FPO: [Non-Fpo])

0359f8f8 77ce33e1 01005e78 0359fae0 00000003 RPCRT4!Invoke+0x30

0359fcf8 77ce35c4 00000000 00000000 000abe9c RPCRT4!NdrStubCall2+0x299 (FPO: [Non-Fpo])

0359fd14 77c7ff7a 000abe9c 000a06d0 000abe9c RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])

0359fd48 77c8042d 010024ef 000abe9c 0359fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

0359fd9c 77c80353 00000013 00000000 0101c148 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x11f (FPO: [Non-Fpo])

0359fdc0 77c811dc 000abe9c 00000000 0101c148 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

0359fdfc 77c812f0 000abc30 0009ff08 000d5c58 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

0359fe20 77c88678 0009ff40 0359fe38 000abc30 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

0359ff84 77c88792 0359ffac 77c8872d 0009ff08 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

0359ff8c 77c8872d 0009ff08 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

0359ffac 77c7b110 0009e2b8 0359ffec 77e6482f RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

0359ffb8 77e6482f 000d4d78 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

0359ffec 00000000 77c7b0f5 000d4d78 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

 

These threads stuck in Cache Manager while attempting to launch a process, can potentially lead to the symptoms that were described to us. Let’s try to prove it.

While we will not go into the details of Cache Manager mechanics (Refer to Cache Manager, Chapter 11 in Windows Internals), a quick note on how these threads will be unblocked is needed for the sake of this problem. When image sections are created if there is any existing shared cache map associated, we wait for any references on the shared cache map for this image section to drop to zero. The thread waiting on the cache map to be un-initialized will get signaled when the reference drops to zero on the shared cache map. The code that signals the un-initialization executes in the context of Cache Manager Worker and is queued onto a System Worker thread. Looking at so many threads, all waiting for Cache Manager Worker thread to signal the cleanup of the section, it appears that either-

  • The Cache Manager Worker kicked off but never reached a point to signal these blocked threads.
  • Cache Manager Worker has not had a chance to run yet.

The Cache Manager globals below indicate the maximum number of CC worker that can be active or queued at any time, and current active count. The counts below indicate we are already at the peak. The “nt!CcNumberActiveWorkerThreads” counter indicates the number of threads that already have work to do, but not necessarily currently executing Cache manager worker.

kd> x nt!CcNumberActiveWorkerThreads

80896144 nt!CcNumberActiveWorkerThreads = <no type information>

kd> dd 80896144 l1

80896144  00000008              <<This indicates the work items queued that will/or have engaged worker

kd> x nt!CcNumberWorkerThreads

80896140 nt!CcNumberWorkerThreads = <no type information>

kd> dd 80896140 l1

80896140  00000008              <<This is the counter for Max Cc worker threads

kd> x nt!CcWorkerThread

8081211e nt!CcWorkerThread = <no type information>

 

So what are these work queue items that are being executed?

If the first condition is true then we should find these worker (nt!CcWorkerThread) executing on top of a system worker thread. Yes we did search the stacks of all the threads in the dump, but we weren’t fortunate enough to find any System Worker Threads executing the Cache Manager Worker.

Only other possibility is these Cache Manager Worker threads never got a chance to run, likely system has no System Worker Threads idle enough to pick these Cache Manager work. So how do we prove/disprove this? (We could have started dumping out the System Worker Queues and its associated threads) We take a quicker approach - !exqueue. This command displays information and state of system worker queue and work items queued in each of its worker queue.

Let’s dump out the state of the System Worker Queue/Threads.

kd> !exqueue

Dumping ExWorkerQueue: 808A76C0

 

**** Critical WorkQueue( current = 0 maximum = 1 )

THREAD 898f9b40  Cid 0004.0010  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f98d0  Cid 0004.0014  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f8020  Cid 0004.0018  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f8db0  Cid 0004.001c  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f8b40  Cid 0004.0020  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f88d0  Cid 0004.0024  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f8660  Cid 0004.0028  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f83f0  Cid 0004.002c  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f7020  Cid 0004.0030  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f7db0  Cid 0004.0034  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 89652868  Cid 0004.0ed0  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 895faa40  Cid 0004.0ed4  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 891fb9b8  Cid 0004.0ed8  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 89129db0  Cid 0004.0edc  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 892c4780  Cid 0004.0ee0  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 8961b6a0  Cid 0004.0ee4  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 8917a730  Cid 0004.0ee8  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 88a31b10  Cid 0004.0eec  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 895eacb0  Cid 0004.0ef0  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 891d7db0  Cid 0004.0ef8  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 89667b08  Cid 0004.0f14  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 8920a490  Cid 0004.0f48  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 892f3cb0  Cid 0004.0fa8  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 8962bdb0  Cid 0004.0fb0  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 89661350  Cid 0004.0fb8  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 8918adb0  Cid 0004.0fbc  Teb: 00000000 Win32Thread: 00000000 WAIT

 

<Pending Work Items list for this queue>

 

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898f51e0

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898f9670

PENDING: WorkerRoutine nt!IopProcessWorkItem (808e419a) Parameter 891f8648

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898fa1d8

PENDING: WorkerRoutine Ntfs!NtfsCheckpointAllVolumes (f7135a57) Parameter 00000000

PENDING: WorkerRoutine srv!SrvResourceAllocThread (f5edfa28) Parameter 00000000

PENDING: WorkerRoutine nt!IopProcessWorkItem (808e419a) Parameter 89308f00

PENDING: WorkerRoutine nt!ObpProcessRemoveObjectQueue (8092b70e) Parameter 00000000

PENDING: WorkerRoutine srv!SrvResourceThread (f5ee026d) Parameter 00000000

PENDING: WorkerRoutine netbt!NTExecuteWorker (f67cdbc2) Parameter f67eb6bc

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89191008

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8965d1e8

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 895edea0

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 892b8be8

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 895e11e8

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89607210

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 896634a8

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8915dce0

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89221110

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8922a968

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898f7278

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 8998cd38

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898f9688

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 898f8298

PENDING: WorkerRoutine nt!CcWorkerThread (8081211e) Parameter 8998c030

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 891fe578

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 891817c0

PENDING: WorkerRoutine Ntfs!NtfsCheckUsnTimeOut (f71489b8) Parameter 00000000

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89648fd0

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89207618

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 895fc7d0

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89268950

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8921e008

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 88acbe98

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 89685e98

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8921ae60

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 896521a0

PENDING: WorkerRoutine termdd!_IcaDelayedWorker (f767d29a) Parameter 8920ab68

 

**** Delayed WorkQueue( current = 0 maximum = 1 )

THREAD 898f7b40  Cid 0004.0038  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f78d0  Cid 0004.003c  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f7660  Cid 0004.0040  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f73f0  Cid 0004.0044  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f6020  Cid 0004.0048  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f6db0  Cid 0004.004c  Teb: 00000000 Win32Thread: 00000000 WAIT

THREAD 898f6b40  Cid 0004.0050  Teb: 00000000 Win32Thread: 00000000 WAIT

 

**** HyperCritical WorkQueue( current = 0 maximum = 1 )

THREAD 898f68d0  Cid 0004.0054  Teb: 00000000 Win32Thread: 00000000 WAIT

 

This command examines the state of the System Work queue and associated System Worker threads. It’s telling us that there are three set of queues prioritized as hyper-Critical, Critical and Delayed-Worker queues. While Delayed-worker and Hyper-Critical queues are empty, the Critical Worker queue has enough pending items to keep it busy. This is not good. On an ideal case we expect all the work to be processed immediately and almost no work pending in the queue.

Before we move on, let’s take a step back and see why we came here. We were chasing down the cache manager workers and we came here to find if there is any Cache Manager work pending in the worker queue to be picked up. Indeed yes, we can see all of the 8 (“nt!CcNumberActiveWorkerThreads”) still pending. This answers the puzzle as far as threads that were blocked at Cache Manager’s shared cache map un-initialization. And “!exqueue” did come to our rescue here.

It’s always like this! You get an answer to one question, but at the same time the next question is readied for you, i.e. why are these work items still pending and not being processed?

For this we need a little bit of background on how System Worker Threads work. Several system components and drivers may need to execute the code at PASSIVE LEVEL and in a thread context. For this they could always create new threads and use them to execute the code they want. Other option is to rely on the pre-created threads by the system called “System worker Threads” and get relieved from the burden of thread management itself. Based on the priority of the work, work is queued to any of the three queues (Critical, Hyper-Critical, and Delayed-Worker). By default there will be certain number of worker threads (Refer to Chapter 3, System Mechanisms - System Worker Threads – in Windows Internals) created for each of these queues and they will wait on the respective queues for any new work to come in, pick the work and get back to wait on the queue after the completion of the work.

At a certain point it could so happen that all these pre-created threads would be executing some work, and may get blocked on another work item to complete. But as there are no idle worker threads to pick up this work, it would sit in the pending queue, resulting in blockage of all the work to be done by these set of system worker threads.

The Operating System tries to address this kind of a problem to some extent by running deadlock detection algorithm in a timely manner. When this code runs and the system sees that the pending work items are increasing (that is to say that no work items are being picked up, or work is coming in a higher rate than the existing number of threads could handle), it may decide to create additional worker threads to help with the pending work items load. These threads are special worker threads called “Dynamic Worker Threads”. These threads exist as long as there is enough work to be done. However they terminate on being idle for a certain amount of time, so the system doesn’t tie up resources for unused worker threads. Even if these dynamic worker threads get blocked, the system cannot keep creating the additional dynamic worker threads forever, as this will lead to the system filling up with worker threads and all getting blocked.

The sole intention of Dynamic worker threads is to try to help any immediate additional load or help system recover from deadlock among existing worker threads. However a couple of dynamic threads should suffice this need if it’s indeed a transient state. But if this is not a transient state and there is a real software problem then System should have to stop creating these dynamic threads at some point. This will eventually lead to hung Worker threads with work items getting just queued.

This dynamic thread count is limited to 16 for the Critical worker queue, and System will not create any more as soon we reach this limit.

(Refer to Chapter 3, System Mechanisms - System Worker Threads – in Windows Internals and/or Documentation in DDK/WDK).

With this knowledge on System Worker threads, the next step ahead is to determine what the existing Worker threads in the Critical Worker queue are doing which is preventing them from picking up our work items. Below is one of those threads, waiting on a Notification event as part of processing the work from WorkerDrv.SYS. Checking what every single thread in the Critical Worker queue is doing, we see they all are waiting in WorkerDrv.SYS driver (All these threads may not be occupied by the same driver always, but could be a similar deadlock among different drivers).

kd> !thread 898f9b40

THREAD 898f9b40  Cid 0004.0010  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Alertable

    f78aed5c  NotificationEvent

Not impersonating

DeviceMap                 e1000128

Owning Process            898fa648       Image:         System

Wait Start TickCount      28506          Ticks: 20655 (0:00:05:22.734)

Context Switch Count      2

UserTime                  00:00:00.000

KernelTime                00:00:00.000

Start Address nt!ExpWorkerThread (0x8087acfe)

Stack Init f78af000 Current f78aecc4 Base f78af000 Limit f78ac000 Call 0

Priority 13 BasePriority 13 PriorityDecrement 0

ChildEBP RetAddr  Args to Child

f78aecdc 8082ffd7 898f9b40 898f9be8 00000400 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

f78aecf4 808287d4 891eac00 00000000 895b3268 nt!KiSwapThread+0x83 (FPO: [Non-Fpo])

f78aed38 f77b30fe f78aed5c 00000000 00000000 nt!KeWaitForSingleObject+0x2e0 (FPO: [Non-Fpo])

WARNING: Stack unwind information not available. Following frames may be wrong.

f78aed6c 808e41ad 88a80c08 f78ced5c 808a76c0 WorkerDrv+0x40fe

f78aed80 8087ade9 895b3268 00000000 898f9b40 nt!IopProcessWorkItem+0x13 (FPO: [Non-Fpo])

f78aedac 809418f4 895b3268 00000000 00000000 nt!ExpWorkerThread+0xeb (FPO: [Non-Fpo])

f78aeddc 80887f7a 8087acfe 00000000 00000000 nt!PspSystemThreadStartup+0x2e (FPO: [Non-Fpo])

00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

 

And yes, we know who the culprit is. This Driver has utilized all of the default System Critical Worker threads and the additional Dynamic threads for this queue.

kd> x nt!ExWorkerQueue

808a76c0 nt!ExWorkerQueue = <no type information>

 

kd> dt nt!_EX_WORK_QUEUE 808a76c0 .

   +0x000 WorkerQueue      :

      +0x000 Header           : _DISPATCHER_HEADER

      +0x010 EntryListHead    : _LIST_ENTRY [ 0x898f51e0 - 0x8920ab70 ]<<---Pending (QueueDepthLastPass)

      +0x018 CurrentCount     : 0

      +0x01c MaximumCount     : 1

      +0x020 ThreadListHead   : _LIST_ENTRY [ 0x898f9c48 - 0x8918aeb8 ]<<---Threads attached to this queue

   +0x028 DynamicThreadCount : 0x10 <<--------------------------------------Count of additional threads created as per deadlock detection

   +0x02c WorkItemsProcessed : 0x10f3

   +0x030 WorkItemsProcessedLastPass : 0x10f3

   +0x034 QueueDepthLastPass : 0x26 <<Count of pending work items

   +0x038 Info             :

      +0x000 QueueDisabled    : 0y0

      +0x000 MakeThreadsAsNecessary : 0y1

      +0x000 WaitMode         : 0y0

      +0x000 WorkerCount      : 0y00000000000000000000000011010 (0x1a)

      +0x000 QueueWorkerInfo  : 210

 

Looking at the pending work items we know what kind of impact this deadlock could have on the system. Any operation that is dependent on this set of worker threads will surely be impacted, and over a period of time you expect the system to crawl and slowly could possibly reach a dead end with components having direct/indirect dependency on this component of the system. We see Termdd, NTFS, and Cache manager work items in the pending queue which explains RDP not working, new processes not getting launched and so on.

 

Closure:

At the point when we know this driver has consumed all the Critical Worker threads, the quickest way to get the system up and running is to disable this driver. And I could work on fixing our WorkerDrv.SYS so that this driver understands the importance of System Worker threads and doesn’t flood the worker queue with work items that will block for a long time or with work items that are dependent on other work items, leading to this situation.

 

To summarize, we started with a problem description of crawling/almost hung system (a few components were indeed responding). We found why application launch was being blocked, which lead us to cache manager threads. Chasing down cache manager threads, we ended up with System Worker Threads. Then to my driver WorkerDrv.SYS which never understood the importance of System Worker threads, and used them too freely.

 

Hope you enjoyed reading this post and could use some of it in case you encounter a problem of a similar kind.

 

 

Share this post :

Debuggers - This is the list of the upcoming articles “in the works”.

The Story of a Hung Box
ETW_BlogPost_ETW Introduction and Overview
ETW_BlogPost_Exploring and Decoding ETW...
ETW_BlogPost_MethodsOfTracing
Got Stack?   No. We ran out and kv won’t tell me why!
Part 2: Got Stack?   No. We ran out and kv won’t tell me why!
WMI: Finding provider binary

Also we’re planning to blog about the cool features in Windows 7 and 2008 R2 with debugging in mind. Feel free to send your feedback to help us tweak the blog.

Cheers,

Ron Stock

Debuggers,

Here’s another short video of the type of triage possible with Xperf, this time Bob shows us how to track down a slow I/O issue. Since I/O Manager is instrumented at IoCallDriver and IoCompleteRequest we can often use this information to diagnose a Slow I/O or SAN issue. The steps to enable the tracing are in the video description and here again below.

Thanks,

Tate

Here’s a link to download the WMV  - http://ntdebugging.members.winisp.net/videos/SlowIOv2.wmv

Video Description: (http://www.youtube.com/watch?v=0h-i_EfzadA)

Here's how you can use the Windows Performance Toolkit aka Xperf to investigate slow I/O issues on Windows 2008, Vista, and higher versions. Put the following into a batch file (modifying the -f to point the etl file generation elsewhere than the slow storage) and it will start a circular trace which can be stopped when the slow I/O occurs.

@echo off

xperf -on PROC_THREAD+LOADER+CSWITCH+FILENAME+FILE_IO+FILE_IO_INIT+DRIVERS -f kernel.etl -stackwalk CSwitch+DiskReadInit+DiskWriteInit+DiskFlushInit+FileCreate+FileCleanup+FileClose+FileRead+FileWrite -BufferSize 1024 -MaxBuffers 1024 -MaxFile 1024 -FileMode Circular

echo Reproduce the problem now, hit any key to finish, collect Merged Etl file

pause

echo merging etl...

xperf -d XperfSlowIOcir.etl

@echo on

 

Share this post :

My name is Ryan Mangipano (ryanman) and I am a Sr. Support Escalation Engineer at Microsoft.  Today’s blog will be a quick walkthrough of the analysis of a bugcheck 0xF4 and how I determined that the action plan going forward should consist of enabling pool tagging on this system.

 

I began my review with !analyze –v. From the output I can see that a process required for the system to function properly unexpectedly exited or was terminated. The goal of this debugging session will be to determine what failed and why.

 

0: kd> !analyze -v

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

CRITICAL_OBJECT_TERMINATION (f4)

A process or thread crucial to system operation has unexpectedly exited or been

terminated.

Several processes and threads are necessary for the operation of the

system; when they are terminated (for any reason), the system can no

longer function.

Arguments:

Arg1: 00000003, Process                                A value of 0x3 in this parameter indicated that it was a process that terminated, not a thread

Arg2: 8a03ada0, Terminating object        This value is a pointer to the _EPROCESS object that terminated

Arg3: 8a03af14, Process image file name              Process Name

Arg4: 805d1204, Explanatory message (ascii)         text message about the problem

 

We shall begin by dumping out all the parameters of the bugcheck. Let’s dump out the "Terminating Object" below

0: kd> !object 8a03ada0

Object: 8a03ada0  Type: (8a490900) Process

    ObjectHeader: 8a03ad88 (old version)

    HandleCount: 3  PointerCount: 228

 

First, let’s dump out the process image file name from the bugcheck parameter 3 above.

0: kd> dc 8a03af14

8a03af14  73727363 78652e73 00000065 00000000  csrss.exe

 

 

0: kd> dt _EPROCESS 8a03ada0 imageFileName

CSRSRV!_EPROCESS

   +0x174 ImageFileName : [16]  "csrss.exe"

 

Notice that if we add the base of the _EPROCESS object (8a03ada0- Parameter 2)to the offset of the imageFileName field (+0x174) we get parameter 3. The imageFileName field.

0: kd> ? 8a03ada0+0x174

Evaluate expression: -1979470060 = 8a03af14

 

0: kd> dc 8a03af14

8a03af14  73727363 78652e73 00000065 00000000  csrss.exe.......

 

Let’s dump out the ASCII message from parameter number 4

0: kd> dc 805d1204

805d1204  6d726554 74616e69 20676e69 74697263  Terminating crit

805d1214  6c616369 6f727020 73736563 25783020  ical process

 

Let’s review the debugger help file for more information. We can see that this bugcheck occurs when a critical process or thread terminates. “Several processes and threads are necessary for the operation of the system. When they are terminated for any reason, the system can no longer function. “

0: kd> .hh bug check 0xf4

 

 

 

 

 

 

Next, we need to determine why this process terminated. !analyze –v also provided us with an exception record which provides us with an error code:

 

PROCESS_NAME:  csrss.exe

 

EXCEPTION_RECORD:  9a85e9d8 -- (.exr 0xffffffff9a85e9d8)

ExceptionAddress: 7c92c375 (ntdll!RtlFindMessage+0x0000007c)

   ExceptionCode: c0000006 (In-page I/O error)

  ExceptionFlags: 00000000

NumberParameters: 3

   Parameter[0]: 00000000

   Parameter[1]: 7c99c3d8

   Parameter[2]: c000009a

Inpage operation failed at 7c99c3d8, due to I/O error c000009a

 

EXCEPTION_CODE: (NTSTATUS) 0xc0000006 - The instruction at 0x%p referenced memory at 0x%p. The required data was not placed into memory because of an I/O error status of 0x%x.

 

Since we have an error code, let’s investigate that error code. We can quickly perform this operation from within the debugger using the !error command

 

0: kd> !error c000009a

Error code: (NTSTATUS) 0xc000009a (3221225626) - Insufficient system resources exist to complete the API.

 

0: kd > .hh !error

 

 

Let’s check the output of the !vm command

 

0: kd> !vm 2

 

*** Virtual Memory Usage ***

                Physical Memory:      760875 (   3043500 Kb)

                Page File: \??\C:\pagefile.sys

                  Current:   4190208 Kb  Free Space:   4156380 Kb

                  Minimum:   4190208 Kb  Maximum:      4190208 Kb

                Available Pages:      579241 (   2316964 Kb)

                ResAvail Pages:       673481 (   2693924 Kb)

                Locked IO Pages:          69 (       276 Kb)

                Free System PTEs:     115226 (    460904 Kb)

                Free NP PTEs:              0 (         0 Kb)

                Free Special NP:           0 (         0 Kb)

                Modified Pages:          221 (       884 Kb)

                Modified PF Pages:       219 (       876 Kb)

                NonPagedPool Usage:    65534 (    262136 Kb)

                NonPagedPool Max:      65536 (    262144 Kb)

                ********** Excessive NonPaged Pool Usage *****

                PagedPool 0 Usage:     24167 (     96668 Kb)

                PagedPool 1 Usage:       967 (      3868 Kb)

                PagedPool 2 Usage:       967 (      3868 Kb)

                PagedPool 3 Usage:       984 (      3936 Kb)

                PagedPool 4 Usage:       977 (      3908 Kb)

                PagedPool Usage:       28062 (    112248 Kb)

                PagedPool Maximum:     92160 (    368640 Kb)

 

                ********** 2075 pool allocations have failed **********

 

                Session Commit:         1562 (      6248 Kb)

                Shared Commit:          2526 (     10104 Kb)

                Special Pool:              0 (         0 Kb)

                Shared Process:         4821 (     19284 Kb)

                PagedPool Commit:      28062 (    112248 Kb)

                Driver Commit:          5138 (     20552 Kb)

                Committed pages:      153449 (    613796 Kb)

                Commit limit:        1767229 (   7068916 Kb)

 

0: kd> !poolused

unable to get PoolTrackTable - pool tagging is disabled, enable it to use this command

Use gflags.exe and check the box that says "Enable pool tagging".

 

The output above has informed us that pool tagging is disabled. Let’s demonstrate how you can verify that it is disabled:

 

0: kd> dd nt!NtGlobalFlag L1

805597ec  00000000

 

0: kd> !gflag

Current NtGlobalFlag contents: 0x00000000

 

Let’s explore the debugging help file entry on the !poolused command

 

0: kd > .hh !poolused

 

 

 

Reading the text above, we are informed that “Pool tagging is permanently enabled on Windows Server 2003 and later versions of Windows. On Windows XP and earlier versions of Windows, you must enable pool tagging by using Gflags.”

 

 

Using the vertarget command, I can see that this system was running Windows XP.

0: kd> vertarget

Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible

 

 

 

0: kd > .hh !gflag

 

 

 

By reviewing the help file entry for the !gflag extension, I was able to determine that if pooltagging was set, the following bit would have been set:

 

0x400 "ptg"         Enable pool tagging.

 

0: kd> .formats 0x400

Evaluate expression:

….

  Binary:  00000000 00000000 00000100 00000000                0x00000400

 

Gflags is included in the Debugging Tools for Windows. The screenshot below is from a Windows 7 system. Notice that Pool Tagging is enabled permanently as described above.

 

 

 

Summary: This system bugchecked when the critical process csrss.exe failed an I/O operation due to insufficient non-paged pool. For an action plan, we recommended the use of gflags to enable pool tagging in order to obtain more information about pool consumption.

 

 

Share this post :

More Posts Next page »
 
Page view tracker