Hello, it's the Debug Ninja back again for another NtDebugging Blog article. For as long as I can remember user mode debuggers have had an easy way to get call stacks for heap allocations. On more recent versions of Windows this has been as simple as using gflags +ust and umdh or !heap -k. Kernel debuggers have not always had an easy way to determine who allocated a pool block. Sure, we have pool tags to help us out, but often a programmer will use the same tag in many places and devalue this as a troubleshooting technique.
Fortunately, starting in Windows Vista and Server 2008, kernel debuggers can get call stacks from pool allocations. We can even get call stacks from pool frees. This little known technique is not quite as useful as gflags +ust is for heap, but when it is needed it is very useful.
First, you need to turn on special pool using driver verifier. Verifier will obtain and track the call stack for the allocation and the free, so this technique will not work with traditional special pool as documented in KB188831 because those settings do not use driver verifier. Because special pool requires additional memory overhead to run, this technique is not valuable for large memory leaks. However, this technique is a good way to determine what code allocated or freed your pool block in other conditions. For example, this works well if you find that pool has been freed when you expected it to be allocated. This also works for smaller memory leaks, especially those for which you can easily reproduce the leak. Analyzing the allocations and stacks for a leak must be done by hand, as there is no umdh-like tool for kernel mode.
Step 1 - Turning on verifier
In this example I am using Sysinternals’ notmyfault tool to generate the pool allocations. Because I know the driver in question I set verifier to only monitor that driver. Note that a reboot is required to make this setting take effect.
Verifier /flags 1 /driver myfault.sys
Step 2 - Finding the pool allocation to analyze
For this example I am going to find the call stack of a leaked pool allocation. First find the tag that is using the most pool by using !poolused.
kd> !poolused 4
Sorting by Paged Pool Consumed
Tag Allocs Used Allocs Used
Leak 0 0 23 23552000 UNKNOWN pooltag 'Leak', please update pooltag.txt
CM31 0 0 20520 18514560 Internal Configuration manager allocations , Binary: nt!cm
CIcr 0 0 2977 8511504 Code Integrity allocations for image integrity checking , Binary: ci.dll
Next find the pool allocations for that tag with !poolfind. There is some guessing to be done with all pool leak debugging techniques; you can’t be sure that the allocation you’re looking at has really been leaked and is not just in a state where it has not yet been freed. You need to make an educated guess because there is no umdh-type functionality to analyze allocates and frees. If you have the benefit of a live debug you can go the debugger and check back later to see if the memory has been freed or not.
kd> !poolfind Leak
Scanning large pool allocation table for Tag: Leak (fffffa8002e00000 : fffffa8002f80000)
*fffff8a006a00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0058fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006200000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0068fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0060fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005a00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006c00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006400000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0062fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005afa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005c00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006e00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006600000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0064fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005cfa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006afa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005e00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006800000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0066fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005efa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006cfa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006000000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005800000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
Step 3 – Dump the call stack for the allocation
This step is the easy one. Once you have the address of the allocation use !verifier 0x80 Address. If you were interested in all of the call stacks in the log you can run !verifier 0x80 without the Address parameter.
kd> !verifier 0x80 fffff8a005800000
Log of recent kernel pool Allocate and Free operations:
There are up to 0x10000 entries in the log.
Parsing 0x0000000000010000 log entries, searching for address 0xfffff8a005800000.
Pool block fffff8a005800000, Size 00000000000fa000, Thread fffffa8002be4060
Pool block fffff8a005800000, Size 0000000000001000, Thread fffffa8002187060
Parsed entry 0000000000010000/0000000000010000...
Finished parsing all pool tracking information.
Keep in mind that the log may contain allocate and free information that predates the current use of the pool block, and that the log is of a fixed size so eventually old data will fall off the end. The most recent use of the pool will be at the top of the output. Usually this is the stack at the top of the output is what you are interested in, I have highlighted the relevant call stack in red. In this instance we can see that the pool was most recently allocated by myfault.sys.
Sometimes it is useful to have historical information about previous uses of the pool block such as when dealing with pool that was improperly freed. In that scenario the most recent call stack may be from an allocate call when the pool block was reused by the memory manager and so you may need to go down several levels to find out where the pool was improperly freed.
For more information on using !verifier you can refer to the debugger help in MSDN, http://msdn.microsoft.com/en-us/library/ff565591.aspx.
Hi debuggers, Andrew Richards here for my first NT Debugging post. I thought I’d share a recent case that used a lot of discovery techniques to uncover the details of what was going on. Most bugchecks give you the information you need as arguments, but in the case of bugcheck 0x101, I had to go digging for a trap frame, the thread stack, look at the disassembly of the running threads, and lots of other goodies.
As usual, the first thing I did was run !analyze -v to get a more detailed explanation of what a bugcheck 0x101 “CLOCK_WATCHDOG_TIMEOUT” is.
0: kd> !analyze -v ******************************************************************************* * * * Bugcheck Analysis * * * *******************************************************************************
CLOCK_WATCHDOG_TIMEOUT (101) An expected clock interrupt was not received on a secondary processor in an MP system within the allocated interval. This indicates that the specified processor is hung and not processing interrupts. Arguments: Arg1: 00000031, Clock interrupt time out interval in nominal clock ticks. Arg2: 00000000, 0. Arg3: 87337120, The PRCB address of the hung processor. Arg4: 00000003, 0.
A bugcheck 0x101 occurs when the Clock interrupt (IRQL #28) has not been processed by each processor within a timeout. The Clock interrupt is quite high in the IRQL table for x86; only the Inter-Processor Interrupt (IPI), Power-Fail and High interrupts are higher.
The immediate hypothesis was that one of the processors was stuck processing an IPI (a common interrupt), causing it to be above the Clock IRQ level.
The questions were:
1. At what IRQL was the problem processor? Was it at IPI?
2. Why was it stuck?
3. Who did it?
This was a 4 core box, so I ran !prcb four times to view the Processor Resource control Block (PRCB) of each processor. Processor #3 matched the PRCB specified in the bugcheck’s 3rd argument (Arg3 = 87337120). The first question was answered; Processor #3 was the culprit. Interestingly, the Arg4 value in the bugcheck has a value of 3. Per the documentation it should be 0, but it seems that the processor number is provided.
0: kd> !prcb 0 PRCB for Processor 0 at 82b34d20: Current IRQL -- 28 Threads-- Current 82b3e380 Next 00000000 Idle 82b3e380 Processor Index 0 Number (0, 0) GroupSetMember 1 Interrupt Count -- 000475e1 Times -- Dpc 000008d6 Interrupt 0000007b Kernel 0001920f User 00003b0b 0: kd> !prcb 1 PRCB for Processor 1 at 807c7120: Current IRQL -- 0 Threads-- Current 9bb65d48 Next 9bf52c10 Idle 807cc800 Processor Index 1 Number (0, 1) GroupSetMember 2 Interrupt Count -- 00030ae8 Times -- Dpc 000005ca Interrupt 00000098 Kernel 00017e4a User 00004eab 0: kd> !prcb 2 PRCB for Processor 2 at 87300120: Current IRQL -- 0 Threads-- Current 9b41b6e8 Next 00000000 Idle 87305800 Processor Index 2 Number (0, 2) GroupSetMember 4 Interrupt Count -- 0002ab35 Times -- Dpc 00000568 Interrupt 000000ac Kernel 0001a788 User 00002565 0: kd> !prcb 3 PRCB for Processor 3 at 87337120: Current IRQL -- 0 Threads-- Current 8aaa17c8 Next 00000000 Idle 8733c800 Processor Index 3 Number (0, 3) GroupSetMember 8 Interrupt Count -- 00026c0d Times -- Dpc 00000620 Interrupt 0000008b Kernel 0001ac65 User 00001e33
The Processor Context Records reported that Processors #0, #1 and #2 were at IRQL 31 (1f - HIGH), and Processor #3 was at IRQL 27 (1b - SYNCH). Having a value of 31 is expected at bugcheck as that is how the bugcheck gains control of the processor to gather the current context. It was strange that Processor #3 was different and that is matched the processor mentioned in the bugcheck
0: kd> !pcr 0 KPCR for Processor 0 at 82b34c00: ... Irql: 0000001f ...
0: kd> !pcr 1 KPCR for Processor 1 at 807c7000: ... Irql: 0000001f ...
0: kd> !pcr 2 KPCR for Processor 2 at 87300000: ... Irql: 0000001f ...
0: kd> !pcr 3 KPCR for Processor 3 at 87337000: ... Irql: 0000001b ...
The next step was to look at the stacks of the processors to see what the threads were all involved in.
To determine the stack of the Processor #0’s thread before the bugcheck, the trap frame needed to be found. The trap frame is stored immediately above the interrupt handler. To find that, I looked at the Interrupt Descriptor Table to find out the name of the handler for clock interrupt and then I searched for that symbol on the stack.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #0 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 0: kd> knL # ChildEBP RetAddr 00 82b31674 82a84a6f nt!KeBugCheckEx+0x1e 01 82b316b0 82a840be nt!KeAccumulateTicks+0x242 02 82b316f0 82a83f6b nt!KeUpdateRunTime+0x145 03 82b3174c 82a88c17 nt!KeUpdateSystemTime+0x613 04 82b3174c 82a85e79 nt!KeUpdateSystemTimeAssist+0x13 05 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd 06 82b31820 82af0866 nt!KeFlushSingleTb+0x136 07 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4 08 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6 09 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf 0a 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3 0b 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47 WARNING: Stack unwind information not available. Following frames may be wrong. 0c 82b319c4 82d3acd4 irsir+0x2986 0d 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b 0e 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128 0f 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133 10 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61 11 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27 12 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a 13 82b31b38 82a83039 serial!SerialReadTimeout+0x68 14 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50 15 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101 16 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c 17 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb 18 82b31d24 00000000 nt!KiIdleLoop+0x38
0: kd> !irql Debugger saved IRQL for processor 0x0 -- 28 (CLOCK2_LEVEL)
0: kd> !idt
37: 82e35104 hal!PicSpuriousService37
51: 8aba1558 serial!SerialCIsrSw (KINTERRUPT 8aba1500)
71: 89003cd8 i8042prt!I8042KeyboardInterruptService (KINTERRUPT 89003c80)
72: 8aba17d8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1780)
82: 89123058 ataport!IdePortInterrupt (KINTERRUPT 89123000)
ataport!IdePortInterrupt (KINTERRUPT 89123a00)
ataport!IdePortInterrupt (KINTERRUPT 89123780)
ataport!IdePortInterrupt (KINTERRUPT 89123500)
92: 8aba1058 Impcd+0x8540 (KINTERRUPT 8aba1000)
a0: 8aba1a58 ndis!ndisMiniportMessageIsr (KINTERRUPT 8aba1a00)
a2: 8aba1cd8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1c80)
b0: 891232d8 ndis!ndisMiniportMessageIsr (KINTERRUPT 89123280)
b1: 89123cd8 ACPI!ACPIInterruptServiceRoutine (KINTERRUPT 89123c80)
b2: 8aba12d8 serial!SerialCIsrSw (KINTERRUPT 8aba1280)
c1: 82e353f4 hal!HalpBroadcastCallService
d1: 82e1d634 hal!HalpHpetClockInterrupt
d2: 82e1d898 hal!HalpHpetRolloverInterrupt
df: 82e351dc hal!HalpApicRebootService
e1: 82e35958 hal!HalpIpiHandler
e3: 82e356f8 hal!HalpLocalApicErrorService
fd: 82e35f2c hal!HalpProfileInterrupt
fe: 82e361a8 hal!HalpPerfInterrupt
The search was between the nt!KiIpiSendPacket call (I just chose it as it didn’t seem to be bugcheck related) and the current stack pointer. Using the dereferenced pointer (poi) and some maths, the trap frame location is retrieved.
0: kd> dps @esp 82b317e0 ... 82b31744 82b31760 nt!KiDoubleFaultStack+0x2760 <<<<< This is not a real symbol; it actually is a part of the trap frame 82b31748 82e1d72a hal!HalpHpetClockInterrupt+0xf6 82b3174c 82b31760 nt!KiDoubleFaultStack+0x2760 82b31750 82a88c17 nt!KeUpdateSystemTimeAssist+0x13 ...
0: kd> .trap poi(82b31748-4) ErrCode = 00000000 eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4 eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0 nv up ei pl nz na po nc cs=0008 ss=0010 ds=0000 es=dea0 fs=040f gs=0008 efl=00000202 nt!KiIpiSendPacket+0xdd: 82a85e79 f390 pause
So you might be asking yourself, can this be done an easier way? The answer is definitely Yes. You just need to use kv instead; it adds the trap frame information on the end of the line.
0: kd> kv
ChildEBP RetAddr Args to Child
82b31674 82a84a6f 00000101 00000031 00000000 nt!KeBugCheckEx+0x1e
82b316b0 82a840be 00026161 00000000 0001cd00 nt!KeAccumulateTicks+0x242
82b316f0 82a83f6b 82a85e79 807c7120 00000000 nt!KeUpdateRunTime+0x145
82b3174c 82a88c17 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTime+0x613
82b3174c 82a85e79 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTimeAssist+0x13 (FPO: [0,2] TrapFrame @ 82b31760)
82b317e0 82abfa17 00000001 00000000 82a3cbe1 nt!KiIpiSendPacket+0xdd (FPO: [6,2,0])
82b31820 82af0866 a33d6f20 00000001 a33d00cf nt!KeFlushSingleTb+0x136
82b3190c 82b2ab90 a33d6f20 8a2096d8 8a2096d8 nt!MmFreeSpecialPool+0x2b4
82b31970 82d4a06e a33d6f20 00000000 82b31998 nt!ExFreePoolWithTag+0xd6
82b31980 82d3fab2 a33d6f20 a33d6f20 a33d6f20 nt!ViIrpFree+0xf
82b31998 82d3a1c0 8ced5986 905cadb0 82b319c4 nt!VfIoFreeIrp+0xd3
82b319a8 8ced5986 a33d6f20 8ced58f0 82b31a6c nt!IovFreeIrpPrivate+0x47
WARNING: Stack unwind information not available. Following frames may be wrong.
82b319c4 82d3acd4 00000000 a33d6f20 905cadb0 irsir+0x2986
82b319f4 82a81933 00000000 a33d6f20 82b31a6c nt!IovpLocalCompletionRoutine+0x14b
82b31a3c 82d3ab64 a33d6f20 8ab151ac 8ab150f0 nt!IopfCompleteRequest+0x128
82b31aa4 8b7c6abd 00000000 8ab151ac 82b31ad8 nt!IovCompleteRequest+0x133
82b31ab4 8b7c6461 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrpLocked+0x61 (FPO: [Non-Fpo])
82b31ad8 8b7c7567 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrp+0x27 (FPO: [Non-Fpo])
82b31b00 8b7b9eb7 8ab150f0 00000000 8ab15002 serial!SerialTryToCompleteCurrent+0x7a (FPO: [Non-Fpo])
82b31b38 82a83039 8ab15314 8ab15002 3c171d26 serial!SerialReadTimeout+0x68 (FPO: [Non-Fpo])
82b31b7c 82a82fdd 82b34d20 82b31ca8 00000001 nt!KiProcessTimerDpcTable+0x50
82b31c68 82a82e9a 82b34d20 82b31ca8 00000000 nt!KiProcessExpiredTimerList+0x101
82b31cdc 82a8100e 0001cacf 9bb374c0 82b3e380 nt!KiTimerExpiration+0x25c
82b31d20 82a80e38 00000000 0000000e 00000000 nt!KiRetireDpcList+0xcb
82b31d24 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x38 (FPO: [0,0,0])
By using the stack based maths, or the value provide by kv, the trap frame address is used to set the context (.trap <addr>) to the code running before the interrupt. These are the stored registers and the stack at the time of the interrupt.
0: kd> .trap 82b31760
ErrCode = 00000000
eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4
eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0000 es=dea0 fs=040f gs=0008 efl=00000202
82a85e79 f390 pause
0: kd> knL *** Stack trace for last set context - .thread/.cxr resets it # ChildEBP RetAddr 00 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd 01 82b31820 82af0866 nt!KeFlushSingleTb+0x136 02 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4 03 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6 04 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf 05 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3 06 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47 WARNING: Stack unwind information not available. Following frames may be wrong. 07 82b319c4 82d3acd4 irsir+0x2986 08 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b 09 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128 0a 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133 0b 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61 0c 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27 0d 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a 0e 82b31b38 82a83039 serial!SerialReadTimeout+0x68 0f 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50 10 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101 11 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c 12 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb 13 82b31d24 00000000 nt!KiIdleLoop+0x38
Dissassembling the first few instructions reveals a jump (jmp) that is back up in the nt!KiIpiSendPacket function. Using the jmp location and the instruction after the jmp as the bound, we can disassemble the loop. At the time of the bugcheck, the thread was executing a pause (a CPU based delay), and seemingly doing this in a loop while waiting for to release it.
0: kd> u @eip nt!KiIpiSendPacket+0xdd: 82a85e79 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b) 82a85e7d 8d4900 lea ecx,[ecx]
0: kd> u 82a85e1b 82a85e7d nt!KiIpiSendPacket+0x7f: 82a85e1b 8b86a4180000 mov eax,dword ptr [esi+18A4h] 82a85e21 0bc0 or eax,eax <<<<<< Checking if value is nonzero 82a85e23 7538 jne nt!KiIpiSendPacket+0xc1 (82a85e5d) <<<<<< Take jmp to stay in loop 82a85e25 f00fb196a4180000 lock cmpxchg dword ptr [esi+18A4h],edx 82a85e2d 75ec jne nt!KiIpiSendPacket+0x7f (82a85e1b) 82a85e2f 59 pop ecx 82a85e30 d1e9 shr ecx,1 82a85e32 8d7f04 lea edi,[edi+4] 82a85e35 72df jb nt!KiIpiSendPacket+0x7a (82a85e16) 82a85e37 75f7 jne nt!KiIpiSendPacket+0x94 (82a85e30) 82a85e39 59 pop ecx 82a85e3a 64890d8c190000 mov dword ptr fs:[198Ch],ecx 82a85e41 8b4c240c mov ecx,dword ptr [esp+0Ch] 82a85e45 8b542410 mov edx,dword ptr [esp+10h] 82a85e49 64ff0590360000 inc dword ptr fs:[3690h] 82a85e50 52 push edx 82a85e51 51 push ecx 82a85e52 ff15a4a0a082 call dword ptr [nt!_imp__HalRequestIpi (82a0a0a4)] 82a85e58 5f pop edi 82a85e59 5e pop esi 82a85e5a c21800 ret 18h 82a85e5d 41 inc ecx 82a85e5e 850d7c3ab782 test dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],ecx 82a85e64 7513 jne nt!KiIpiSendPacket+0xdd (82a85e79) 82a85e66 f605783ab78240 test byte ptr [nt!HvlEnlightenments (82b73a78)],40h <<<<<< Don’t spin if you’re an enlightened VM, just pause 82a85e6d 740a je nt!KiIpiSendPacket+0xdd (82a85e79) 82a85e6f 52 push edx 82a85e70 51 push ecx 82a85e71 51 push ecx 82a85e72 e8c8b60500 call nt!HvlNotifyLongSpinWait (82ae153f) 82a85e77 59 pop ecx 82a85e78 5a pop edx 82a85e79 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b)
The summary so far: Since processor #0 was the thread that created the bugcheck, it must have been interrupted by a Clock interrupt so as to trigger a CLOCK_WATCHDOG_TIMEOUT bugcheck. It is not surprising then that the value of !irql is CLOCK2.
The existence of the nt!KiIpiSendPacket function lends weight to the thought that this was the creator of an IPI and is not one of the processors that wasn’t processing IPIs.
Using the same technique as Processor #0, the disassembly for the loop on Processor #1 is determined. Processor #1 is in a tight loop within the nt!KeFlushMultipleRangeTb function, an ancestor function of nt!ExAllocatePoolWithTag, a memory related operation. This is an interesting function to see, considering that nt!ExFreePoolWithTag, the counterpart, is on Processor #0’s stack.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #1 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 1: kd> knL # ChildEBP RetAddr 00 95bfc7bc 82abb217 nt!KeFlushMultipleRangeTb+0x2d3 01 95bfc89c 82aa7d11 nt!MiFlushTbAsNeeded+0x12e 02 95bfc8dc 82b29487 nt!MiAllocatePagedPoolPages+0x567 03 95bfc940 82aa4674 nt!MiAllocatePoolPages+0x1f 04 95bfc998 82b2a132 nt!ExpAllocateBigPool+0xa6 05 95bfc9fc 82aab6b1 nt!ExAllocatePoolWithTag+0x12d 06 95bfca20 82c4cc62 nt!ExAllocatePoolWithTagPriority+0x196 07 95bfca78 82c7a662 nt!IopQueryNameInternal+0x60 08 95bfca98 82c57d88 nt!IopQueryName+0x1b 09 95bfcb1c 82c71a50 nt!ObpQueryNameString+0x7f 0a 95bfcb38 82c75e12 nt!ObQueryNameString+0x18 0b 95bfcc14 82c65788 nt!EtwTraceProcess+0xa2 0c 95bfcc38 82c73625 nt!PspExitProcess+0x37 0d 95bfccb4 82c87051 nt!PspExitThread+0x59a 0e 95bfcccc 82aba8c0 nt!PsExitSpecialApc+0x22 0f 95bfcd1c 82a472a4 nt!KiDeliverApc+0x28b 10 95bfcd1c 77556fc0 nt!KiServiceExit+0x64 WARNING: Frame IP not in any known module. Following frames may be wrong. 11 014ef918 00000000 0x77556fc0
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL) <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0
1: kd> u @eip nt!KeFlushMultipleRangeTb+0x2d3: 82a40c31 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a40c33 8b07 mov eax,dword ptr [edi] 82a40c35 85c0 test eax,eax 82a40c37 75de jne nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17) 82a40c39 8a4d0f mov cl,byte ptr [ebp+0Fh] 82a40c3c ff1558a1a082 call dword ptr [nt!_imp_KfLowerIrql (82a0a158)] 82a40c42 5f pop edi 82a40c43 5e pop esi
1: kd> u 82a40c17 82a40c39 nt!KeFlushMultipleRangeTb+0x2b9: 82a40c17 46 inc esi 82a40c18 85357c3ab782 test dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],esi 82a40c1e 7511 jne nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31) 82a40c20 f605783ab78240 test byte ptr [nt!HvlEnlightenments (82b73a78)],40h 82a40c27 7408 je nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31) 82a40c29 56 push esi 82a40c2a e810090a00 call nt!HvlNotifyLongSpinWait (82ae153f) 82a40c2f eb02 jmp nt!KeFlushMultipleRangeTb+0x2d5 (82a40c33) 82a40c31 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a40c33 8b07 mov eax,dword ptr [edi] 82a40c35 85c0 test eax,eax 82a40c37 75de jne nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17)
This function tests a variable at edi in each loop. While the signal is not set, the thread goes around loop, eventually executes a pause instruction and then tries the test again. This thread seems to be waiting on someone to set a flag.
Processor #2 is also trying to send an IPI using nt!KiIpiSendPacket. It was caught in the same function that Processor #0 is in. In this case, it is executing the jmp instruction that is one instruction after the pause which Processor #0 is executing.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #2 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 2: kd> knL # ChildEBP RetAddr 00 a0c4fac0 82a40bcd nt!KiIpiSendPacket+0xdf 01 a0c4fafc 82b29431 nt!KeFlushMultipleRangeTb+0x26f 02 a0c4fbe8 82b2aef1 nt!MiFreePoolPages+0x42c 03 a0c4fc50 82aa6b37 nt!ExFreePoolWithTag+0x436 04 a0c4fc64 82c3745d nt!MmFreeAccessPfnBuffer+0x2f 05 a0c4fcc0 82c4b83a nt!PfpFlushBuffers+0x2ba 06 a0c4fd50 82c12f5e nt!PfTLoggingWorker+0xaa 07 a0c4fd90 82aba219 nt!PspSystemThreadStartup+0x9e 08 00000000 00000000 nt!KiThreadStartup+0x19
2: kd> !irql
Debugger saved IRQL for processor 0x2 -- 0 (LOW_LEVEL) <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0
2: kd> u @eip nt!KiIpiSendPacket+0xdf: 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7d 8d4900 lea ecx,[ecx] ...
Summary so far
So at this point we can say (without any real knowledge of how IPI handling is achieved) that:
Processor #0 is waiting for Processor #1, #2 and/or #3 to respond to its IPI
Processor #0 is running nt!ExFreePoolWithTag
Processor #0 is reported to be at IRQL 28 (CLOCK2_LEVEL)
Processor #1 is waiting for a flag to bet set; probably an IPI flag
Processor #0 is running nt!ExAllocatePoolWithTag
Processor #1 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect
Processor #2 is waiting for Processor #0, #1 and/or #3 to respond to its IPI
Processor #0 is running nt!ExFreePoolWithTag
Processor #2 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect
Processor #3 has been tagged by bugcheck as being the cause
Processor #3 hasn’t been looked at yet
Analyzing Processor #3 was problematic. The Processor Context Record (PCR) was not completely captured in the dump. This happens when the bugcheck thread is unable to interrupt the target processor to gather the context – usually because it is running at the HIGH (31) IRQL.
When the context is missing, all of the registers are treated as being zero. The zero value instruction pointer causes the stack to be incorrect when calling knL and various other commands.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #3 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 1: kd> ~3 WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000 WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000
3: kd> r eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000 eip=00000000 esp=00000000 ebp=00000000 iopl=0 nv up di pl nz na po nc cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000 00000000 ?? ???
3: kd> knL # ChildEBP RetAddr WARNING: Frame IP not in any known module. Following frames may be wrong. 00 00000000 00000000 0x0
To determine the thread running on the processor, the !thread command was used. The value for the _ETHREAD (8aaa17c8) comes from the PCR that Windows maintains as part of scheduling. Once again, like the knL command, because the context was missing, the stack was not shown.
3: kd> !pcr KPCR for Processor 3 at 87337000: ... CurrentThread: 8aaa17c8 ...
3: kd> !thread THREAD 8aaa17c8 Cid 0454.08ec Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 3 IRP List: 8f8c6f68: (0006,0094) Flags: 40000404 Mdl: 00000000 Not impersonating DeviceMap 89970d50 Owning Process 89f93ac0 Image: ZyxApp.exe Attached Process N/A Image: N/A Wait Start TickCount 117454 Ticks: 589 (0:00:00:09.188) Context Switch Count 37 IdealProcessor: 3 UserTime 00:00:00.000 KernelTime 00:00:00.000 Win32 Start Address 0x7753fd0f Stack Init 91aecfd0 Current 91aeca78 Base 91aed000 Limit 91aea000 Call 0 Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5 ChildEBP RetAddr Args to Child 00000000 00000000 00000000 00000000 00000000 0x0
The important piece of information in the !thread output was the stack limits. These values allow me to do a search of the raw stack for known symbols. I used dps <limit> <base>. What I was looking for was the first symbol after the 0xffffffff entries which has a value above it that points to a location within this stack. In this case, is it the hal!KfLowerIrql symbol that has a value (base pointer) above it. This symbol is the deepest function that the stack has got to; it doesn’t necessarily mean that the stack is at this depth now.
3: kd> dps 91aea000 91aed000 91aea000 ffffffff 91aea004 ffffffff … 91aec85c ffffffff 91aec860 ffffffff 91aec864 ffffffff 91aec868 ffffffff 91aec86c ffffffff 91aec870 ffffffff 91aec874 0001187f 91aec878 00000010 91aec87c 00011a1f 91aec880 ffffffff 91aec884 0001187f 91aec888 82b7561f nt!MmSystemSpaceLock+0x1f 91aec88c 91aec89c 91aec890 82e20ba9 hal!KfLowerIrql+0x61 91aec894 00011a00 91aec898 82b7561f nt!MmSystemSpaceLock+0x1f 91aec89c 91aec95c 91aec8a0 82e1e92d hal!KeReleaseQueuedSpinLock+0x2d 91aec8a4 82a3bdc7 nt!MiReturnNonPagedPoolVa+0x1d4 91aec8a8 00000001
91aec8d0 82e21cee hal!HalpLegacyApicWriteIcr+0xa
91aec8d8 82e2aea0 hal!HalpRequestIpiSpecifyVector+0x40
91aec8e4 82b738cc nt!KiProcessorBlock+0xc
91aec8f4 82aa6e36 nt!MiInsertPageInFreeOrZeroedList+0x25b
91aec91c 82a40bcd nt!KeFlushMultipleRangeTb+0x26f
91aec928 82a9fa43 nt!KiFlushTargetMultipleRangeTb
91aec958 82b29431 nt!MiFreePoolPages+0x42c
Using the first location found, I used the stack variables to build up a k= command. With an x86 stack, if you select the stack location above the symbol as the base pointer and stack pointer, and the symbol’s address as the instruction pointer, you’ll get a reconstructed stack.
3: kd> k= 91aec89c 91aec88c 82e20ba9
91aec89c 82e1e92d hal!KfLowerIrql+0x61
91aec8a0 82a3bdc7 hal!KeReleaseQueuedSpinLock+0x2d
91aec9a8 82aab049 nt!MiReturnNonPagedPoolVa+0x1d4
91aec9c0 82ab1685 nt!FindNodeOrParent+0x2091aec9d8 82abd820 nt!RtlLookupElementGenericTableFullAvl+0x16
91aeca0c 82e20ba9 nt!RtlLookupElementGenericTableAvl+0x18
91aeca1c 82b199d2 hal!KfLowerIrql+0x61
91aeca24 82d41ad1 nt!VfAvlUnlockShared+0x15
91aeca40 82b2aef1 nt!VfRemLockDeleteMemoryRange+0x5c
91aecaa8 963a55c5 nt!ExFreePoolWithTag+0x436
91aecac8 963a61d4 Zyx+0x5c5 91aecb00 82d3a6c3 Zyx+0x11d4 91aecb24 82a4054a nt!IovCallDriver+0x258 91aecb38 82c3b975 nt!IofCallDriver+0x1b 91aecb7c 82c2c591 nt!IopDeleteFile+0x10c 91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59 91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88 91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd 91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d 91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f 91aecc40 82c73666 nt!ObKillProcess+0x54 91aeccb4 82c87051 nt!PspExitThread+0x5db 91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22 91aecd1c 82a472a4 nt!KiDeliverApc+0x28b 91aecd1c 775570b4 nt!KiServiceExit+0x64 0147ff88 00000000 0x775570b4
In this case, because nt!KiFlushTargetMultipleRangeTb was present on Processor #1 as well as being in the Processor #3 raw stack, I found the first symbol lower than this symbol on the (raw) stack that had a valid base pointer above it. The stack was thus built around the nt!MiFreePoolPages function instead.
3: kd> k= 91aeca40 91aec954 82b29431
91aeca40 82b2aef1 nt!MiFreePoolPages+0x42c
91aecac8 963a61d4 Zyx+0x5c5
91aecb00 82d3a6c3 Zyx+0x11d4
91aecb24 82a4054a nt!IovCallDriver+0x258
91aecb38 82c3b975 nt!IofCallDriver+0x1b
91aecb7c 82c2c591 nt!IopDeleteFile+0x10c
91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59
91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88
91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd
91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d
91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f
91aecc40 82c73666 nt!ObKillProcess+0x54
91aeccb4 82c87051 nt!PspExitThread+0x5db
91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22
91aecd1c 82a472a4 nt!KiDeliverApc+0x28b
91aecd1c 775570b4 nt!KiServiceExit+0x64
0147ff88 00000000 0x775570b4
It’s very hard to get much further this stack since the exact registers are not known. In particular, it is hard to determine what function above nt!MiFreePoolPages the thread is executing at the moment. As with the Processor #1 investigation, it is interesting to note that Processor #3 is also involved in a memory operation; specifically, it is doing a nt!ExFreePoolWithTag much like Processor #0 is.
The only time an IPI (interrupt) is not processed immediately is when the target processor is at IPI IRQL or higher. The most common example being when it is already processing an IPI (interrupts of the same level cannot interrupt the handler for the same IRQL). In this case, the interrupt has to be queued until the interrupt mask allows its arrival. Usually, this design allows only one IPI to be processed at any one time.
A deadlock (like) condition can arise though if an IPI is issued to a processor that is at a higher IRQL, and this processor (thread) attempts to send an IPI. The IPI logic blocks the send if there is an outstanding IPI to complete on the processor. The assumption being that interrupt queuing should be avoided as there is a probability of loss if the queue overflows.
Looking in the Windows Internals book, there is a single sentence that says “Each interrupt level has a specific purpose. For example, the kernel issues an interprocessor interrupt (IPI) to request that another processor perform an action, such as dispatching a particular thread for execution or updating its translation look-aside buffer cache.”. This is very interesting as the translation look-aside buffer is part of the memory manager, and the memory operations are being undertaken on all the processors.
3rd Party Driver
Instead of pulling my hair out combing through the threads on the system, the IPI code, the Memory Manager code or bugcheck code, I decided to look at the 3rd party driver in processor #3 to see if it was changing the IRQL.
The first step was to find the bound of the Zyx+0x5c5 function. The end address of the assembler is easy to determine, it is Zyx+0x5c5. The question is, how big is the function? To work that out, you look at the assembler of the caller. The caller’s assembler will point to the starting instruction in the Zyx+0x5c5 function.
3: kd> ub Zyx+0x11d4 Zyx+0x11b7: 963a61b7 ff7008 push dword ptr [eax+8] 963a61ba 52 push edx 963a61bb 6a01 push 1 963a61bd ff7018 push dword ptr [eax+18h] 963a61c0 e827fdffff call Zyx+0xeec (963a5eec) 963a61c5 e9f1000000 jmp Zyx+0x12bb (963a62bb) 963a61ca e8d3fcffff call Zyx+0xea2 (963a5ea2) 963a61cf e89ef3ffff call Zyx+0x572 (963a5572)
3: kd> u 963a5572 Zyx+0x5c5 Zyx+0x572: 963a5572 8bff mov edi,edi 963a5574 55 push ebp 963a5575 8bec mov ebp,esp 963a5577 51 push ecx 963a5578 53 push ebx 963a5579 56 push esi 963a557a 57 push edi 963a557b b11f mov cl,1Fh <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0x1F = 0n31 (HIGH) 963a557d ff15006b3a96 call dword ptr [Zyx+0x1b00 (963a6b00)] <<<<<<<< hal!KfRaiseIrql (via lookup) 963a5583 bb14813a96 mov ebx,offset Zyx+0x3114 (963a8114) 963a5588 8bcb mov ecx,ebx 963a558a 8845ff mov byte ptr [ebp-1],al 963a558d e8c2100000 call Zyx+0x1654 (963a6654) <<<<<<<<<<<<<<<< [Unresolved] 963a5592 a1746d3a96 mov eax,dword ptr [Zyx+0x1d74 (963a6d74)] 963a5597 85c0 test eax,eax 963a5599 8b35586b3a96 mov esi,dword ptr [Zyx+0x1b58 (963a6b58)] <<<< nt!ExFreePoolWithTag (via lookup) 963a559f 7413 je Zyx+0x5b4 (963a55b4) 963a55a1 8b7808 mov edi,dword ptr [eax+8] 963a55a4 6a00 push 0 963a55a6 50 push eax 963a55a7 ffd6 call esi <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599) 963a55a9 85ff test edi,edi 963a55ab 8bc7 mov eax,edi 963a55ad a3746d3a96 mov dword ptr [Zyx+0x1d74 (963a6d74)],eax 963a55b2 75ed jne Zyx+0x5a1 (963a55a1) 963a55b4 a1a06d3a96 mov eax,dword ptr [Zyx+0x1da0 (963a6da0)] 963a55b9 85c0 test eax,eax 963a55bb 7413 je Zyx+0x5d0 (963a55d0) 963a55bd 8b7808 mov edi,dword ptr [eax+8] 963a55c0 6a00 push 0 963a55c2 50 push eax 963a55c3 ffd6 call esi <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599)
3: kd> ln poi(963a6b00) (82e20844) hal!KfRaiseIrql | (82e208c6) hal!HalpDispatchSoftwareInterrupt Exact matches:
3: kd> ln poi(963a6b58) (82b2aaba) nt!ExFreePoolWithTag | (82b2b660) nt!ExDeferredFreePool Exact matches:
The next step was to work out what functions the Zyx+0x5c5 function called, and with what parameters. The assembler didn’t make this immediately obvious. The address of the function calls were lookups. This is common when the address is imported by the code, instead of being bound. To get the function name of the call, I dereferenced the pointer (poi(<addr>) and passed the address to ln so that it listed the nearest symbols.
The first function confirmed some the theory. A call to hal!KfRaiseIrql was being made with a IRQL of HIGH (31). And this was prior to a code path that called functions.
The value of the @esi register was determined to be nt!ExFreePoolWithTag. This matched the function name in the reconstructed stack (nt!ExFreePoolWithTag+0x436) and confirmed that the stack was reconstructed in this area correctly.
The Zyx function raised IRQL to HIGH_LEVEL before calling nt!ExFreePoolWithTag. This function caused a Translation Look-aside Buffer (TLB) flush via nt!KiFlushTargetMultipleRangeTb) to occur. This in turn caused an IPI notification to the other processors (via nt!KiIpiSendPacket) to indicate the release of the TLB memory.
The Processor #0 IPI was stuck as incomplete because Processor #3 could not be interrupted, and Processor #3 did not lower its IRQL as it was waiting to send its own IPI upon completion of the Processor #0 IPI – thus a deadlock was formed.
I contacted the driver’s author and organized a new version that used the SYNCH (27) IRQL (the level immediately under IPI). The new build was sent to the customer and the immediate bugcheck that they were observing in their test environment disappeared. Although this is not an ideal solution, it was sufficient for the time being. An ideal solution would be to follow the documentation for ExFreePoolWithTag and not make the call with an IRQL greater than DISPATCH_LEVEL.
I hope this case has provided a good foundation in how to recover and navigate through stacks without symbols or context. Once you understand the mechanics of calling conventions, the values needed to bound a disassembling operation and/or a stack reassemble are quite easy to determine.