Hi, this is Bob Golding; I wanted to write a blog about an interesting hardware issue I ran into. Hardware problems can be tricky to isolate. I recently came across one that I thought was interesting and gave an example of how to trace code execution. The machine executed the filler “int 3” instructions generated by the compiler. Execution should never reach these filler instructions, so we needed to determine how the instruction pointer got there.
What was the issue?
The issue was a bug check 8E (unhandled exception). The exception was a debug exception (80000003), because a filler INT 3 instruction was executed.
1: kd> .bugcheck
Bugcheck code 0000008E
Arguments 80000003 8082e3e0 f78aec38 00000000
Below is the trap frame (the trap frame is the third argument in the bugcheck code). Note that the actual trapping instruction is 8082e3e0, the instruction pointer is incremented before the INT 3 generates a trap. The correct EIP is reported in the bug check values.
1: kd> .trap f78aec38
ErrCode = 00000000
eax=0e364a01 ebx=f78aed50 ecx=f78aecb8 edx=000000e1 esi=f7727ec0 edi=f75d9ca2
eip=8082e3e1 esp=f78aecac ebp=f78aecc0 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000202
nt!KiCheckForKernelApcDelivery+0x37:
8082e3e1 cc int 3
The machine executed code at the end of a function that should never be executed. This should never happen.
1: kd> ub 8082e3e1
nt!KiCheckForKernelApcDelivery+0x1c:
8082e3c6 32c9 xor cl,cl
8082e3c8 ff2508118080 jmp dword ptr [nt!_imp_KfLowerIrql (80801108)]
8082e3ce 64a124010000 mov eax,dword ptr fs:[00000124h]
8082e3d4 c6403d01 mov byte ptr [eax+3Dh],1
8082e3d8 ff2598108080 jmp dword ptr [nt!_imp_HalRequestSoftwareInterrupt (80801098)]
8082e3de cc int 3 <<< This is after the function.
8082e3df cc int 3
8082e3e0 cc int 3
OK, So Now What?
Now, we need to find the execution path that caused the machine to execute this INT 3. There are places to look to find clues that will tell us. The first place to start looking is the stack. If a “call” instruction was made, the return will be pushed on the stack. This way we can try to determine if we arrived at this bad instruction pointer from a call or a ret instruction.
Using the value from esp in the above trap frame, let’s dump the stack.
1: kd> dps f78aecac
f78aecac 80a5d8fc hal!HalpIpiHandler+0xcc <<< Interesting?
f78aecb0 f78aecc0
f78aecb4 00000000
f78aecb8 00000002
f78aecbc 000000e1
f78aecc0 f78aed50
f78aecc4 f75d9ca2
f78aecc8 badb0d00
f78aeccc 000086a8
In looking at the stack dump, we see that there may have been a call from HalpIpiHandler. Let’s dump the code leading up to hal!HalpIpiHandler+0xcc to see what it did.
1: kd> ub 80a5d8fc
hal!HalpIpiHandler+0xb1:
80a5d8e1 e8d82b0000 call hal!HalBeginSystemInterrupt (80a604be)
80a5d8e6 0ac0 or al,al
80a5d8e8 7509 jne hal!HalpIpiHandler+0xc3 (80a5d8f3)
80a5d8ea 83c408 add esp,8
80a5d8ed ff2510b0a580 jmp dword ptr [hal!_imp_Kei386EoiHelper (80a5b010)]
80a5d8f3 6a00 push 0
80a5d8f5 55 push ebp
80a5d8f6 ff1520b0a580 call dword ptr [hal!_imp__KiIpiServiceRoutine (80a5b020)]
In the above assembly, we can see that there is a call made using a pointer in the import table. Now, let’s have a look at that pointer.
1: kd> dd 80a5b020 l 1
80a5b020 8082e3e4
The pointer is very close to the instruction we trapped on. Is this a coincidence?
It looks like due to an effective address calculation failure, the machine starting executing at 8082e3e0 instead of 8082e3e4. Somewhere in the data path the processor executing this instruction stream dropped bit three, turning a 4 into a 0.
1: kd> ?0y0100
Evaluate expression: 4 = 00000000`00000004
1: kd> ?0y0000
Evaluate expression: 0 = 00000000`00000000
What does all of this mean?
There is some circumstantial evidence here that the machine was in the IPI handler. The IPI Handler is used in multiprocessor systems so that one processor may interrupt another. So, how can we further prove this is where we were? Let’s try to match the trap frame registers to the assembly from HalpIpiHandler before it calls KiIpiServiceRoutine.
1: kd> ub 80a5d8fc l10
hal!HalpIpiHandler+0x94:
80a5d8c4 c74508000ddbba mov dword ptr [ebp+8],0BADB0D00h <<< EBP+ 8 does equal badb0d00
80a5d8cb 895d00 mov dword ptr [ebp],ebx <<< ebp does equal ebx
80a5d8ce 897d04 mov dword ptr [ebp+4],edi <<< ebp+4 does equal edi
80a5d8d1 68e1000000 push 0E1h
80a5d8d6 83ec04 sub esp,4
80a5d8d9 54 push esp
80a5d8da 68e1000000 push 0E1h
80a5d8df 6a1d push 1Dh
80a5d8e6 0ac0 or al,al <<< al is equal to 1
Below is the stack data that the instructions above produced (borrowed from the dps output above). It matches, so the machine did execute the above instructions.
f78aecac 80a5d8fc hal!HalpIpiHandler+0xcc <<< Return Pushed by call instruction
f78aecb0 f78aecc0 <<< Pushed EBP
f78aecb4 00000000 <<< Pushed a 0
Finally, the last bit of evidence:
Let's view the IPI state using the debugger command !ipi. From this output we can see that processor 1 is the receiver of a cross interrupt from Processor 0. This is consistent with the data we found on the stack.
1: kd> !ipi
IPI State for Processor 0
As a sender, awaiting packet completion from processor 1.
TargetSet 2 PacketBarrier e IpiFrozen 2 [Frozen]
IpiFrame 8089a570 SignalDone 00000000 RequestSummary 0
Packet State Active WorkerRoutine nt!KiFlushTargetMultipleTb
Parameter[0] 00000000 Parameter[1] 80899f10 Parameter[2] 80899f04
IPI State for Processor 1
As a receiver, the following unhandled requests are pending: [Packet] [DPC]
TargetSet 0 PacketBarrier 0 IpiFrozen 0 [Running]
IpiFrame b5ad7be8 SignalDone ffdff120 RequestSummary 2 [DPC]
Packet State Stale WorkerRoutine nt!KiFlushTargetMultipleTb
Parameter[0] 00000000 Parameter[1] b5ad7950 Parameter[2] b5ad7948
What went wrong?
Based on the evidence in this dump it appears that call instruction transferred execution to the wrong address. The machine ended up executing at address 8082e3e0 instead of 8082e3e4, a single bit difference. This same bit was flipped in several crashes from this machine, so all the evidence pointed to a faulty processor. After replacing the processor we were running on when we bugchecked, the issue did not occur again.
Hardware can sometimes cause pretty specific failures, such as the flipped bit we see here. To determine that this failure was a hardware issue, we had to reconstruct the execution path and trace how we ended up at the failing instruction. We were able to match the register contents to what they would have been before the call to KiIpiServiceRoutine. This demonstrated that the call should have been made to KiIpiServiceRoutine, but it unexpectedly went to the wrong address.
Hi, it's the Debug Ninja back again with another debugging adventure. Recently I have encountered several instances where processes fail to initialize, and a review of available resources showed that there was no obvious resource exhaustion. A more in depth review found that there were no available string atoms in the global atom table.
Global atoms are organized on a per-session basis. If atoms cannot be allocated in session 0, services may fail to start or processes launched by various services may fail to start. However, a user logged in to a different session will not experience any such failures.
String atoms are numbered from 0xC000 through 0xFFFF, providing a maximum of 0x4000 atoms per session. For more information on atoms, and atom tables, see http://technet.microsoft.com/en-us/query/ms649053.
When there are no more string atoms available, calls to APIs that allocate string atoms will fail. Because atoms are often allocated at process or dll init time, the most common symptom is that processes fail to initialize. The process may cleanly exit without an error. You are likely experiencing this problem if you debug your application and find that the failure originates from an API that allocates string atoms such as RegisterClass, RegisterClassEx, GlobalAddAtom, or AddAtom.
To determine if the global string atom table is full you will need to perform a kernel debug. This can be a live debug or a post-mortem debug using a dump.
First identify the session where the failures have occurred and set the process context to a process in this session. In my example, w3wp.exe was launching a process and this process failed to initialize.
2: kd> !process 0 0 w3wp.exe
PROCESS fffffa8005083060
SessionId: 0 Cid: 1668 Peb: fffdf000 ParentCid: 08ec
DirBase: 8a2df000 ObjectTable: fffff8a0128bbe40 HandleCount: 441.
Image: w3wp.exe
2: kd> .process /p /r fffffa8005083060
Implicit process is now fffffa80`05083060
Loading User Symbols
.....
Next we need to analyze the global atom table. The pointer to the table is stored in the UserAtomTableHandle global.
2: kd> dq win32k!UserAtomTableHandle l1
fffff960`003bf7a8 fffff8a0`05e5bc70
The UserAtomTableHandle has a pointer to a handle table at offset 0x10 in 64-bit, and offset 0x8 in 32-bit. Note that although the atom table is defined as a _RTL_ATOM_TABLE, the format shown by dt is for user mode and does not apply to the UserAtomTableHandle in kernel mode.
2: kd> dq fffff8a0`05e5bc70+10 l1
fffff8a0`05e5bc80 fffff8a0`05db7740
2: kd> dt nt!_HANDLE_TABLE fffff8a0`05db7740
+0x000 TableCode : 0xfffff8a0`109c8001
+0x008 QuotaProcess : (null)
+0x010 UniqueProcessId : 0x00000000`00000184 Void
+0x018 HandleLock : _EX_PUSH_LOCK
+0x020 HandleTableList : _LIST_ENTRY [ 0xfffff8a0`05db7760 - 0xfffff8a0`05db7760 ]
+0x030 HandleContentionEvent : _EX_PUSH_LOCK
+0x038 DebugInfo : (null)
+0x040 ExtraInfoPages : 0n0
+0x044 Flags : 0
+0x044 StrictFIFO : 0y0
+0x048 FirstFreeHandle : 0x10004
+0x050 LastFreeHandleEntry : 0xfffff8a0`10ca4ff0 _HANDLE_TABLE_ENTRY
+0x058 HandleCount : 0x3fc0
+0x05c NextHandleNeedingPool : 0x10400
+0x060 HandleCountHighWatermark : 0x3fc1
The FirstFreeHandle contains the handle number that will be given to the next handle allocated from this table. This value is encoded, to get the next handle number we need to right shift the FirstFreeHandle by 2 bits.
2: kd> ?00010004>>2
Evaluate expression: 16385 = 00000000`00004001
The result from above, 0x4001, is greater than the number of possible string atoms. As I mentioned earlier, there is a limit of 0x4000 string atoms. Now we know that the session is out of string atoms.
The next step is to dump the string atoms to identify whether there is an observable pattern in the leaked strings. The !atom command only works in user mode, so we need to dump the kernel mode strings manually. An atom table is comprised of multiple buckets. Each bucket is the head of a list of atoms. The buckets start at offset 0x20 in the atom table in 64-bit, and offset 0x10 in 32-bit.
2: kd> dq fffff8a0`05e5bc70+20
fffff8a0`05e5bc90 fffff8a0`05e5ba60 fffff8a0`05db7be0
fffff8a0`05e5bca0 fffff8a0`08cf1770 fffff8a0`05e5b3d0
fffff8a0`05e5bcb0 fffff8a0`05ea9020 fffff8a0`05e5b8e0
fffff8a0`05e5bcc0 fffff8a0`05ea9b10 fffff8a0`05ea9910
fffff8a0`05e5bcd0 fffff8a0`05ea9f00 fffff8a0`05e5b650
fffff8a0`05e5bce0 fffff8a0`05cda290 fffff8a0`05ea9e80
fffff8a0`05e5bcf0 fffff8a0`05e5b200 fffff8a0`05ea9e30
fffff8a0`05e5bd00 fffff8a0`05e5b7e0 fffff8a0`06c56210
2: kd> dq
fffff8a0`05e5bd10 fffff8a0`06d6b5a0 fffff8a0`05ea9d50
fffff8a0`05e5bd20 fffff8a0`05e5b790 fffff8a0`05e5b9d0
fffff8a0`05e5bd30 fffff8a0`06bd9bc0 fffff8a0`05ea9c90
fffff8a0`05e5bd40 fffff8a0`05e5b0c0 fffff8a0`06ae2020
fffff8a0`05e5bd50 fffff8a0`05e5b930 fffff8a0`04d2af40
fffff8a0`05e5bd60 fffff8a0`05e5b690 fffff8a0`05e5b980
fffff8a0`05e5bd70 fffff8a0`05e5b490 fffff8a0`05e5b410
fffff8a0`05e5bd80 fffff8a0`05e5ba20 fffff8a0`05e5b4f0
fffff8a0`05e5bd90 fffff8a0`05e5baa0 fffff8a0`05e5b390
fffff8a0`05e5bda0 fffff8a0`05e5b840 fffff8a0`05ea9c50
fffff8a0`05e5bdb0 fffff8a0`05e5b250 00000000`00000000
fffff8a0`05e5bdc0 00000000`00000000 00000000`00000000
fffff8a0`05e5bdd0 00000000`00000000 00000000`00000000
fffff8a0`05e5bde0 00000000`00000000 00000000`00000000
fffff8a0`05e5bdf0 00000000`00000000 00000000`00000000
fffff8a0`05e5be00 00000000`00000000 00000000`00000000
The quick and dirty way to dump the buckets is with !list. I am sure that some will say it is tedious to dump each bucket list by hand and that there are easier ways to accomplish this. To prevent this article from becoming a lesson on debugger scripting, I am leaving that as an exercise to the reader.
2: kd> !list "-t nt!_RTL_ATOM_TABLE_ENTRY.HashLink -e -x \"du @$extret+10\" fffff8a0`05e5ba60"
du @$extret+10
fffff8a0`05e5ba70 "Native"
<snip strings that don't match a pattern>
fffff8a0`0838a120 "ControlOfs0210000000000700"
fffff8a0`0f7ff430 "ControlOfs021A000000000C30"
fffff8a0`162168c0 "ControlOfs020E000000001774"
fffff8a0`08c33870 "ControlOfs01F70000000007F4"
fffff8a0`07c46910 "ControlOfs0202000000000BF8"
fffff8a0`062aab50 "ControlOfs01F5000000001274"
fffff8a0`0777b150 "ControlOfs0202000000000C80"
fffff8a0`07dd3410 "ControlOfs0207000000000F00"
fffff8a0`0f01d190 "ControlOfs0214000000000DAC"
Dumping the atoms I found that there is a continuous pattern of the string ControlOfs followed by 16 hexadecimal numbers. Some time spent with your favorite search engine should find other reports of atom leaks involving the string ControlOfs, and that these leaks have been identified as a problem in some specific software. In this instance the programmer using that software needs to change their application to avoid the problem.
Hello all, Scott Olson here again to share another interesting issue I recently debugged with pool corruption and found that using special pool does not work with large pool allocations (pool allocations greater than a PAGE_SIZE).
Here is an example of a valid large page allocation. Notice the size is 0x1fb0 and a PAGE_SIZE is 0x1000 or 4kb.
0: kd> !pool fffffa80`0dba6fa0
Pool page fffffa800dba6fa0 region is Nonpaged pool
*fffffa800dba5000 : large page allocation, Tag is Io , size is 0x1fb0 bytes
Pooltag Io : general IO allocations, Binary : nt!io
In Windows 7, at the end of the large pool allocation it will have an allocation tag of “Frag” then a “Free” tag with the rest of the page size and is stored on the free pool list for allocation less than a page in size.
0: kd> dc fffffa800dba5000 fffffa800dba5000+0x1fb0-4
fffffa80`0dba5000 00558001 32373242 00000000 00000000 ..U.B272........
fffffa80`0dba5010 55555555 55555555 98764321 01b75f55 UUUUUUUU!Cv.U_..
fffffa80`0dba5020 00000001 00000001 704e6ff0 fffff981 .........oNp....
…<cut>
fffffa80`0dba6f80 55555555 55555555 55555555 55555555 UUUUUUUUUUUUUUUU
fffffa80`0dba6f90 55555555 55555555 55555555 55555555 UUUUUUUUUUUUUUUU
fffffa80`0dba6fa0 55555555 55555555 00001fb0 00000000 UUUUUUUU........
0: kd> dc
fffffa80`0dba6fb0 02010100 67617246 55555555 55555555 ....FragUUUUUUUU
fffffa80`0dba6fc0 00040101 65657246 55555555 55555555 ....FreeUUUUUUUU
fffffa80`0dba6fd0 00802170 fffff880 0e49cf70 fffffa80 p!......p.I.....
fffffa80`0dba6fe0 15cc8fe8 fffff981 3b9c50a7 00000005 .........P.;....
Displayed with the !pool command:
0: kd> !pool fffffa80`0dba6fb0
Pool page fffffa800dba6fb0 region is Nonpaged pool
*fffffa800dba6fb0 size: 10 previous size: 0 (Allocated) *Frag
Owning component : Unknown (update pooltag.txt)
fffffa800dba6fc0 size: 40 previous size: 10 (Free) Free
The example above demonstrates how this normally works. The downside to this architecture is that if a driver were to overrun its pool allocation then special pool would not be useful because the large pool allocation has to be page-aligned. Special pool detects pool overruns by putting the data at the end of the page, which would not be feasible with a large pool allocation.
In Windows 7 there is a check while freeing the pool memory that will determine if this allocation had written past the end of its allocation, and if so will bug check the machine with a Stop 0x19 BAD_POOL_HEADER with the first parameter being a 0x21. Here is the definition along with what each parameter means:
BAD_POOL_HEADER (19)
The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of
the problem, and then special pool applied to the suspect tags or the driver
verifier to a suspect driver.
Arguments:
Arg1: 0000000000000021, the data following the pool block being freed is corrupt. Typically this means the consumer (call stack ) has overrun the block.
Arg2: fffffa800dc57000, The pool pointer being freed.
Arg3: 0000000000002180, The number of bytes allocated for the pool block.
Arg4: 006b0072006f0077, The corrupted value found following the pool block.
Here is an example of what this corruption looks like compared to the above valid large pool allocation:
0: kd> !pool fffffa800dc57000
Pool page fffffa800dc57000 region is Nonpaged pool
fffffa800dc57000 is not a valid large pool allocation, checking large session pool...
fffffa800dc57000 is freed (or corrupt) pool
Bad allocation size @fffffa800dc57000, zero is invalid
***
*** An error (or corruption) in the pool was detected;
*** Attempting to diagnose the problem.
*** Use !poolval fffffa800dc57000 for more details.
Pool page [ fffffa800dc57000 ] is __inVALID.
Analyzing linked list...
[ fffffa800dc57000 ]: invalid previous size [ 0x38 ] should be [ 0x0 ]
Scanning for single bit errors...
None found
Next, I dump the allocation from the start to the end. Notice the size of the allocation is stored in the bugcheck code as argument 3.
0: kd> dc fffffa800dc57000 fffffa800dc57000+2180-4
fffffa80`0dc57000 00000038 0000000e 00000000 00000000 8...............
fffffa80`0dc57010 a24da497 01ccc5d6 c827993c 41946d1f ..M.....<.'..m.A
fffffa80`0dc57020 c0d75c9b b7cff1a5 00000000 00000020 .\.......... ...
fffffa80`0dc57030 000021e0 00000006 0000006c 00000110 .!......l.......
fffffa80`0dc57040 00000208 000003b8 00000208 00000660 ............`...
fffffa80`0dc57050 00000208 00000910 00000208 00000bb0 ................
<cut>
fffffa80`0dc59150 002d0033 00300031 0063002e 006d006f 3.-.1.0...c.o.m.
fffffa80`0dc59160 006c002e 00660065 00680074 006e0061 ..l.e.f.t.h.a.n.
fffffa80`0dc59170 006e0064 00740065 006f0077 006b0072 d.n.e.t.w.o.r.k.
This should be the end of the allocation. The next thing we see should be the “Frag” and “Free” tags.
fffffa80`0dc59180 003a0073 0061006d 0061006e 00650067 s.:.m.a.n.a.g.e.
fffffa80`0dc59190 0065006d 0074006e 0038003a 00390036 m.e.n.t.:.8.6.9.
fffffa80`0dc591a0 0062003a 00670069 0075006c 00790063 :.b.i.g.l.u.c.y.
fffffa80`0dc591b0 0064002d 00740061 002d0061 006e0069 -.d.a.t.a.-.i.n.
fffffa80`0dc591c0 00650064 00650078 002d0073 00740063 d.e.x.e.s.-.c.t.
fffffa80`0dc591d0 006c0072 0031005f 00000031 00000000 r.l._.1.1.......
We clearly see that the Frag and Free tag have been overwritten with some string value which is causing the corruption. At this point, you would need to look at the current stack to determine which driver had allocated the memory, and review the code to investigate when this corruption could have occurred.
Yesterday's blog prompted some questions about how to set up a debugger for a Windows OS running in a Hyper-V VM. I was surprised that I wasn't able to find good, publicly available, Microsoft issued documentation for this configuration.
The first step is to configure the Windows OS in the VM to enable a kernel debugger on COM1. One would use these same steps if you were preparing the OS to be debugged using a null modem cable. Hyper-V will allow us to redirect the COM port so that we don't need such a cable.
Next, configure Hyper-V to redirect the COM1 port to a named pipe. We will use this pipe in place of a traditional null modem cable.
After the OS and the VM are configured for debugging, we need to connect a debugger.
To test the debugger connection in windbg, from the ‘Debug’ menu choose ‘Break’. This should cause the server to break into the debugger and display a kd> prompt. Please note that breaking into the debugger will cause the OS running in the VM to halt until you tell the debugger to go, the OS will appear to be hung during this time. The command 'g' followed by Enter will tell the debugger to ‘go’ causing the VM to resume operation.
Hello ntdebugging readers, the Debug Ninja is back again with a quick blog this holiday season. I recently encountered a situation where the kernel debugger could not connect to a Windows Server 2008 R2 system running in a Hyper-V virtual machine. The configuration appeared correct; however, the debugger would not connect to the VM.
In windbg you can use Ctrl+Alt+D to view the debugger’s internal information flow. In KD use Ctrl+D followed by ENTER to toggle the output. Enabling this output I could see that the debugger was unable to read from the debug port, and that it was getting timeouts. The error "SYNCTARGET: Timeout." is a clear indication that the debug host cannot communicate with the debug target, especially when this error appears after a “Send Break in” message.
Because I was using a named pipe on a Hyper-V VM I knew that I didn't have a bad cable, although this is a common cause of kernel debug failures. I also knew that the configuration of the VM was correct, and I could use the debugger for other VMs on this server. The problem was most likely with the OS running in the VM.
By checking Device Manager I was able to confirm that there was a problem with the configuration of the OS running in the VM. The bcdedit settings were configured to use COM1, and this should make COM1 unavailable in the OS, however, COM1 was present in device manager. For some reason the debugger was not capturing COM1 on boot as it was configured to.
Examining the bcd configuration of this server I found that the bcd configuration was not correct. In the bcd store of normal Windows 7 or Windows Server 2008 R2 OS, the Windows Boot Loader sections of bcdedit have an inherit setting. You can view this information on your system from an elevated command prompt using the command ‘bcdedit /enum all’. Ordinarily the Windows Boot Loader inherits the {bootloadersettings}, the {bootloadersettings} inherit the {globalsettings}, and the {globalsettings} inherit the {dbgsettings}. Without the inherit settings, the debugger configuration will not be read by the boot loader.
Below are the bcd settings from the broken VM. You can see that all of the normal inherited settings are missing.
C:\Windows\system32>bcdedit /enum all
Windows Boot Manager
--------------------
identifier {bootmgr}
device partition=C:
path \bootmgr
description Windows Boot Manager
locale en-US
default {current}
displayorder {current}
timeout 30
Windows Boot Loader
-------------------
identifier {current}
path \Windows\system32\winload.exe
description Windows Server 2008 R2 Standard (recovered)
osdevice partition=C:
systemroot \Windows
resumeobject {2ec5363f-2a92-11e1-bbe4-806e6f6e6963}
usefirmwarepcisettings No
debug Yes
Resume from Hibernate
---------------------
identifier {2ec5363f-2a92-11e1-bbe4-806e6f6e6963}
path \Windows\system32\winresume.exe
inherit {resumeloadersettings}
filedevice partition=C:
filepath \hiberfil.sys
debugoptionenabled Yes
Windows Memory Tester
identifier {memdiag}
path \boot\memtest.exe
description Windows Memory Diagnostic
Debugger Settings
-----------------
identifier {dbgsettings}
debugtype Serial
debugport 1
baudrate 115200
Because my only interest in this VM was to get the debugger working, I did not add all of the missing settings to the bcd store. I was able to force the debugger configuration to be read on boot using this command:
bcdedit /set inherit {dbgsettings}
I hope this helps the next time you are trying to configure a debugger and it does not work. Remember that we don't just need the debugger to be turned on and be configured; we need the settings to be inherited as well.
Hi debuggers, Andrew Richards here with a NTDebugging post that is a little different to what is usually posted. Instead of talking about debugging, I’m going to talk about an issue I just faced while writing a debugger.
This debugger work is an extension of an upcoming article that I’ve written for MSDN Magazine (scheduled for the December 2011 issue). The MSDN Magazine article goes over how to write a native debugger using the DbgHelp API. It also explains how you can use this code to then make a plugin for Sysinternals ProcDump.
When debugging a managed application, you can take debugging one step further by being both a managed and unmanaged (native) debugger. To do this, you use the CLR Debugger API instead of the DbgHelp API.
What prompted this post was an issue that I hit while implementing the ICorDebugUnmanagedCallback::DebugEvent function of my unmanaged interface implementation. I was finding that the target process was hung after I processed in-band debug events but not out-of-band debug events. This was despite calling ICorDebugController::Continue, with or without calling ICorDebugProcess::ClearCurrentException first.
ICorDebug Interface:
Firstly, let’s take a step back and look at what it takes to get to the point of my issue. The goal in the initialization code is to get an instance of an ICorDebug based object.
Below is an abridged version of the code to do this using .NET 4.0; I have omitted the error handling and some of the cleanup (IUnknown::Release) to keep the code brief.
// Start COM
CoInitialize(NULL);
// Get a ICLRMetaHost instance (from .NET 4.0)
ICLRMetaHost* pCLRMetaHost = NULL;
CLRCreateInstance(CLSID_CLRMetaHost, IID_ICLRMetaHost, (LPVOID*)&pCLRMetaHost);
// Get an enumeration of the loaded runtimes in the target process (opened prior with OpenProcess)
IEnumUnknown* pEnumUnknown = NULL;
pCLRMetaHost->EnumerateLoadedRuntimes(hProcess, &pEnumUnknown);
// Use the first runtime found (Note, you can only debug one runtime at once)
IUnknown* pUnknown = NULL;
ULONG ulFetched = 0;
pEnumUnknown->Next(1, &pUnknown, &ulFetched);
// QueryInterface for the ICLRRuntimeInfo interface
ICLRRuntimeInfo* pCLRRuntimeInfo = NULL;
pUnknown->QueryInterface(__uuidof(ICLRRuntimeInfo), (void **)&pCLRRuntimeInfo);
// Get the ICorDebug interface (this allows you to debug .NET 2.0 targets with the .NET 4.0 API)
pCLRRuntimeInfo->GetInterface(CLSID_CLRDebuggingLegacy, IID_ICorDebug, (void **)&pCorDebug);
// Initialize the .NET 2.0 debugging interface
pCorDebug->Initialize();
// Allocate our ICorDebugManagedCallback2 implementation and apply it to ICorDebug
CCorDebugManagedCallback2* pCorDebugManagedCallback2 = new CCorDebugManagedCallback2();
pCorDebug->SetManagedHandler((ICorDebugManagedCallback*)pCorDebugManagedCallback2);
// Allocate our ICorDebugUnmanagedCallback implementation and apply it to ICorDebug
CCorDebugUnmanagedCallback* pCorDebugUnmanagedCallback = new CCorDebugUnmanagedCallback();
pCorDebug->SetUnmanagedHandler((ICorDebugUnmanagedCallback*)pCorDebugUnmanagedCallback);
// Start debugging the process; returns the ICorDebugProcess we’ll need in the callbacks
pCorDebug->DebugActiveProcess(nProcessId, TRUE, &pCorDebugProcess);
This code is pretty linear; if any call fails you are out of luck. By the end, you have associated your own managed and unmanaged callback classes with the ICorDebug object and are attached as a debugger. The code supports a target process using any of the.NET versions (v1.0, v1.1, v2.0, v4.0). Note that .NET v3.0 and v3.5 applications are actually v2.0 applications from a debugger point-of-view as these .NET releases just contain additional class libraries.
My managed callback implementation supports the IUnknown, ICorDebugManagedCallback and ICorDebugManagedCallback2 interfaces. (I’m not going to discuss this code here).
My unmanaged callback implementation supports the IUnknown and ICorDebugUnmanagedCallback interfaces. It is in this class that I had the issue.
ICorDebugUnmanagedCallback Interface:
The ICorDebugUnmanagedCallback interface has just one function:
HRESULT DebugEvent (
[in] LPDEBUG_EVENT pDebugEvent,
[in] BOOL fOutOfBand
);
The function provides a DEBUG_EVENT structure in the same way that WaitForDebugEvent does. This is not surprising as under the covers, that is what the .NET 4.0 API is using – it is just passing it to us. As such, the rules for handling a DEBUG_EVENT structure apply here too. Namely, close the handle passed with the CREATE_PROCESS_DEBUG_EVENT and LOAD_DLL_DEBUG_EVENT events.
Following the DebugEvent documentation, I ended up with (roughly) the code below – which hangs the target process.
STDMETHODIMP CCorDebugUnmanagedCallback::DebugEvent(LPDEBUG_EVENT pDebugEvent, BOOL fOutOfBand)
{
BOOL bClear = TRUE;
switch (pDebugEvent->dwDebugEventCode)
case EXCEPTION_DEBUG_EVENT:
if (pDebugEvent->u.Exception.dwFirstChance != 0)
bClear = FALSE;
break;
case CREATE_PROCESS_DEBUG_EVENT:
if (pDebugEvent->u.CreateProcessInfo.hFile)
CloseHandle(pDebugEvent->u.CreateProcessInfo.hFile);
case LOAD_DLL_DEBUG_EVENT:
if (pDebugEvent->u.LoadDll.hFile)
CloseHandle(pDebugEvent->u.LoadDll.hFile);
}
if (bClear)
pCorDebugProcess->ClearCurrentException(pDebugEvent->dwThreadId);
pCorDebugProcess->Continue(fOutOfBand);
return S_OK;
If you know what to look for, the answer to the ‘hang’ issue is on the MSDN page:
You can call ICorDebugController::Continue only on a Win32 thread and only when continuing past an out-of-band event.
So what does this really mean?
What is means is that you must call ICorDebugController::Continue from any other thread than the one servicing the callback if the debug event is in-band (fOutOfBand == FALSE). The reason for this is to stop a race condition. In-band debug events can be interrupted by out-of-band debug events – that is, the DebugEvent function can be firing multiple times concurrently. By forcing the continuation on an alternate thread, the race condition is averted.
I’m being brief here (on purpose) as I don’t want to incorrectly dissect for you the extremely complex internals of the CLR. You just need to know that you must use another thread for the hang to be averted.
So what does the code look like now? It’s something like this:
if (fOutOfBand)
pCorDebugProcess->Continue(TRUE);
else
SetEvent(hEventContinueBegin);
WaitForSingleEvent(hEventContinueDone, INFINITE);
DWORD WINAPI CCorDebugUnmanagedCallbackThreadProc(LPVOID lpParameter)
while (!bQuit)
switch (WaitForSingleObject(hEventContinueBegin, 1000))
case WAIT_OBJECT_0:
pCorDebugProcess->Continue(FALSE);
SetEvent(hEventContinueDone);
return 0;
For out-of-band debug events, nothing has changed; the ICorDebugProcess::Continue call is made locally.
For in-band debug events, an event is set to trigger the ICorDebugProcess::Continue on a dedicated thread. The dedicated thread sets an event to tell the callback thread that the Continue has been done.
Note that the above code is a massive simplification of what is actually required – there is a ton of code missing that passes all the interface pointers & handles around and to create & shutdown the thread at the correct time.
In-band vs. Out-of-band:
So what is the difference between In-band vs. Out-of-band debug events?
An out-of-band debug event causes all threads in the target process to suspend (it’s exactly the same as native debugger induced suspend). As such, it is not possible to use the managed debugging interfaces to gather information from the target – as the managed debugging thread is suspended.
An in-band debug event only causes the managed threads in the target process to suspend – the managed debugging thread is still running. As such, it is possible to use the managed debugging interfaces to gather information from the target.
The act of using the managed debugging thread from within an in-band debug event can cause an out-of-band debug event (the common examples being first chance exceptions).
Cleanup/Detach:
Just to be complete, below is the code to cleanup and (optionally) detach from the ICorDebug session. In .NET 4.0, the ICorDebugController::Detach will terminate the process if interop debugging (passing TRUE to ICorDebug::DebugActiveProcess) is used. Interop debugging is not supported in .NET 2.0 on x64 - so this is less of an issue.
// If the target process is still running, we need to detach.
if (bDetachNeeded)
ICorDebugController* pCorDebugController = NULL;
pCorDebugProcess->QueryInterface(__uuidof(ICorDebugController), (void**)&pCorDebugController);
pCorDebugController->Stop(INFINITE /* Note: Value is ignored – always INFINITE */);
pCorDebugController->Detach();
pCorDebugController->Release();
pCorDebug->SetUnmanagedHandler(NULL);
pCorDebugUnmanagedCallback->Release();
pCorDebug->SetManagedHandler(NULL);
pCorDebugManagedCallback2->Release();
pCorDebug->Terminate();
pCorDebug->Release();
CoUninitialize();
There is still quite a big bit of code required to implement the debugger completely.
You’ll need an ICorDebugManagedCallback implementation that handles process exiting, attaching to an application domain (ICorDebugAppDomain::Attach), handling name changes, and continuation.
Plus, if you want to support .NET 2.0 debugging without .NET 4.0 installed, you’ll need to use LoadLibrary/GetProcAddress to call .NET 4.0 (optionally), and fall back to the .NET 2.0 GetVersionFromProcess and CreateDebuggingInterfaceFromVersion functions.
Conclusion:
The CLR Debugging API is not for the faint at heart. There are numerous pitfalls when using the ICorDebug interface against different versions of the CLR, different versions of Windows, different architectures, and with or without interop debugging.
If you have any questions about the API, post a comment here and I’ll do my best to answer them for you.
Hello, Mr. Ninja back again. I recently discovered that although my team often tracks I/O from the file system through to the disk controller, we have never publicly documented the steps required to do this. This seems like a great opportunity for a blog because most of the structures are known, and they are even included in the public symbols.
When debugging a system that is hung, slow, or otherwise unresponsive you will likely encounter an IRP that has been sent from ntfs to the disk. Running !irp against such a request will show that the request has gone to disk.sys, but that is not really where the story ends.
Below is one such example of ntfs waiting with an IRP that appears to be stuck in disk.sys. You can determine what driver last handled the IRP by looking for the > character, this points to the current io stack location.
2: kd> !thread fffffa804f151040 e
THREAD fffffa804f151040 Cid 0004.33f8 Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
fffff8803836e730 NotificationEvent
IRP List:
fffffa804f379440: (0006,0310) Flags: 00060043 Mdl: fffffa803c854090
Not impersonating
DeviceMap fffff8a000008720
Owning Process fffffa8030cdeb30 Image: System
Attached Process N/A Image: N/A
Wait Start TickCount 34797397 Ticks: 1118 (0:00:00:17.440)
Context Switch Count 5893
UserTime 00:00:00.000
KernelTime 00:00:00.296
Win32 Start Address nt!ExpWorkerThread (0xfffff80002ae2ef0)
Stack Init fffff88038370db0 Current fffff8803836e0d0
Base fffff88038371000 Limit fffff8803836b000 Call 0
Priority 16 BasePriority 13 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
fffff880`3836e110 fffff800`02addf32 nt!KiSwapContext+0x7a
fffff880`3836e250 fffff800`02ae074f nt!KiCommitThreadWait+0x1d2
fffff880`3836e2e0 fffff880`0164b3ff nt!KeWaitForSingleObject+0x19f
fffff880`3836e380 fffff880`01654224 Ntfs!NtfsNonCachedIo+0x23f
fffff880`3836e550 fffff880`0164f507 Ntfs!NtfsNonCachedUsaWrite+0x64
fffff880`3836e5e0 fffff880`016501a3 Ntfs!NtfsCommonWrite+0x2ca4
fffff880`3836e790 fffff800`02abebff Ntfs!NtfsFsdWrite+0x1c3
fffff880`3836ea10 fffff800`02b1cc00 nt!IoSynchronousPageWrite+0x24f
fffff880`3836ea90 fffff800`02b1b2d8 nt!MiFlushSectionInternal+0xb30
fffff880`3836ecc0 fffff800`02b1a83c nt!MmFlushSection+0x1f4
fffff880`3836ed80 fffff880`01653bb7 nt!CcFlushCache+0x7bc
fffff880`3836ee80 fffff880`01700037 Ntfs!LfsFlushLfcb+0x647
fffff880`3836f000 fffff880`017025b0 Ntfs!LfsFlushToLsnPriv+0x143
fffff880`3836f090 fffff880`0172445f Ntfs!LfsWriteLfsRestart+0xf0
fffff880`3836f0d0 fffff880`017242d0 Ntfs!LfsCloseLogFile+0x17f
fffff880`3836f190 fffff880`01715810 Ntfs!NtfsStopLogFile+0x70
fffff880`3836f1d0 fffff880`0172bfdb Ntfs!NtfsPerformDismountOnVcb+0x184
2: kd> !irp fffffa804f379440
Irp is active with 8 stacks 5 is current (= 0xfffffa804f379630)
Mdl=fffffa803c854090: No System Buffer: Thread fffffa804f151040: Irp stack trace.
cmd flg cl Device File Completion-Context
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[ 4,34] 1c e0 fffffa8032052060 00000000 fffff880011bb010-fffffa803a604c90 Success Error Cancel
\Driver\Disk volmgr!VmpReadWriteCompletionRoutine
Args: 00001000 00000000 b5f8a000 00000000
[ 4, 0] c e0 fffffa803a604b40 00000000 fffff88001cb5150-fffffa803a1ec180 Success Error Cancel
\Driver\volmgr volsnap!VspRefCountCompletionRoutine
Args: 00001000 00000000 b5e8a000 00000000
[ 4, 0] c e1 fffffa803a1ec030 00000000 fffff8800164c344-fffff8803836e728 Success Error Cancel pending
\Driver\volsnap Ntfs!NtfsMasterIrpSyncCompletionRoutine
[ 4, 0] 0 0 fffffa803d1bf030 fffffa803b268540 00000000-00000000
\FileSystem\Ntfs
Args: 00001000 00000000 01d0c000 00000000
To learn more about what disk.sys is doing with this request we start by looking at the device extension. Disk.sys is a miniclass driver, it depends on classpnp.sys to do most of the work. The device extension will be a FUNCTIONAL_DEVICE_EXTENSION structure from classpnp.
2: kd> !devobj fffffa8032052060
Device object (fffffa8032052060) is for:
DR36 \Driver\Disk DriverObject fffffa80319fa990
Current Irp 00000000 RefCount 0 Type 00000007 Flags 01002050
Vpb fffffa803204aba0 Dacl fffff9a100463450 DevExt fffffa80320521b0 DevObjExt fffffa8032052858 Dope fffffa803204ab30
ExtensionFlags (0x00000800)
Unknown flags 0x00000800
AttachedDevice (Upper) fffffa8032052b90 \Driver\partmgr
AttachedTo (Lower) fffffa8031dcc060 \Driver\mpio
Device queue is not busy.
2: kd> dt classpnp!_FUNCTIONAL_DEVICE_EXTENSION fffffa80320521b0
+0x000 Version : 3
+0x008 DeviceObject : 0xfffffa80`32052060 _DEVICE_OBJECT
+0x000 CommonExtension : _COMMON_DEVICE_EXTENSION
+0x200 LowerPdo : 0xfffffa80`31dcc060 _DEVICE_OBJECT
+0x208 DeviceDescriptor : 0xfffffa80`320afeb0 _STORAGE_DEVICE_DESCRIPTOR
+0x210 AdapterDescriptor : 0xfffffa80`32043910 _STORAGE_ADAPTER_DESCRIPTOR
+0x218 DevicePowerState : 1 ( PowerDeviceD0 )
+0x21c DMByteSkew : 0
+0x220 DMSkew : 0
+0x224 DMActive : 0 ''
+0x228 DiskGeometry : _DISK_GEOMETRY
+0x240 SenseData : 0xfffffa80`320a65c0 _SENSE_DATA
+0x248 TimeOutValue : 0x3c
+0x24c DeviceNumber : 0x24
+0x250 SrbFlags : 0x200102
+0x254 ErrorCount : 0
+0x258 LockCount : 0n1
+0x25c ProtectedLockCount : 0n0
+0x260 InternalLockCount : 0n0
+0x268 EjectSynchronizationEvent : _KEVENT
+0x280 DeviceFlags : 4
+0x282 SectorShift : 0x9 ''
+0x283 CdbForceUnitAccess : 0 ''
+0x288 MediaChangeDetectionInfo : (null)
+0x290 Unused1 : (null)
+0x298 Unused2 : (null)
+0x2a0 KernelModeMcnContext : _FILE_OBJECT_EXTENSION
+0x2b8 MediaChangeCount : 6
+0x2c0 DeviceDirectory : 0xffffffff`800003cc Void
+0x2c8 ReleaseQueueSpinLock : 0
+0x2d0 ReleaseQueueIrp : (null)
+0x2d8 ReleaseQueueSrb : _SCSI_REQUEST_BLOCK
+0x330 ReleaseQueueNeeded : 0 ''
+0x331 ReleaseQueueInProgress : 0 ''
+0x332 ReleaseQueueIrpFromPool : 0 ''
+0x333 FailurePredicted : 0 ''
+0x334 FailureReason : 0
+0x338 FailurePredictionInfo : (null)
+0x340 PowerDownInProgress : 0 ''
+0x344 EnumerationInterlock : 0
+0x348 ChildLock : _KEVENT
+0x360 ChildLockOwner : (null)
+0x368 ChildLockAcquisitionCount : 0
+0x36c ScanForSpecialFlags : 0
+0x370 PowerRetryDpc : _KDPC
+0x3b0 PowerRetryTimer : _KTIMER
+0x3f0 PowerContext : _CLASS_POWER_CONTEXT
+0x478 PrivateFdoData : 0xfffffa80`320bc010 _CLASS_PRIVATE_FDO_DATA
+0x480 Reserved2 : 0
+0x488 Reserved3 : 0
+0x490 Reserved4 : 0
The information about requests is stored in the PrivateFdoData .
2: kd> dt 0xfffffa80`320bc010 _CLASS_PRIVATE_FDO_DATA
CLASSPNP!_CLASS_PRIVATE_FDO_DATA
+0x000 SqmData : 0x62a05
+0x008 TrackingFlags : 0
+0x00c UpdateDiskPropertiesWorkItemActive : 0
+0x010 LocalMinWorkingSetTransferPackets : 0x200
+0x014 LocalMaxWorkingSetTransferPackets : 0x800
+0x018 AllFdosListEntry : _LIST_ENTRY [ 0xfffffa80`320be028 - 0xfffffa80`320b8028 ]
+0x028 Perf : <unnamed-tag>
+0x038 HackFlags : 0
+0x040 HotplugInfo : _STORAGE_HOTPLUG_INFO
+0x048 Retry : <unnamed-tag>
+0x0f0 TimerInitialized : 0 ''
+0x0f1 LoggedTURFailureSinceLastIO : 0 ''
+0x0f2 LoggedSYNCFailure : 0 ''
+0x0f3 ReleaseQueueIrpAllocated : 0x1 ''
+0x0f8 ReleaseQueueIrp : 0xfffffa80`320bcc40 _IRP
+0x100 AllTransferPacketsList : _LIST_ENTRY [ 0xfffffa80`320bbe60 - 0xfffffa80`4ed53d10 ]
+0x110 FreeTransferPacketsList : _SLIST_HEADER
+0x120 NumFreeTransferPackets : 0xff
+0x124 NumTotalTransferPackets : 0x100
+0x128 DbgPeakNumTransferPackets : 0x100
+0x130 DeferredClientIrpList : _LIST_ENTRY [ 0xfffffa80`320bc140 - 0xfffffa80`320bc140 ]
+0x140 HwMaxXferLen : 0x80000
+0x148 SrbTemplate : _SCSI_REQUEST_BLOCK
+0x1a0 SpinLock : 0
+0x1a8 LastKnownDriveCapacityData : _READ_CAPACITY_DATA_EX
+0x1b4 IsCachedDriveCapDataValid : 0x1 ''
+0x1b8 ErrorLogNextIndex : 6
+0x1c0 ErrorLogs : [16] _CLASS_ERROR_LOG_DATA
+0x9c0 NumHighPriorityPagingIo : 0
+0x9c4 MaxInterleavedNormalIo : 0
+0x9c8 ThrottleStartTime : _LARGE_INTEGER 0x0
+0x9d0 ThrottleStopTime : _LARGE_INTEGER 0x0
+0x9d8 LongestThrottlePeriod : _LARGE_INTEGER 0x0
+0x9e0 IdlePrioritySupported : 0x1 ''
+0x9e8 IdleListLock : 0
+0x9f0 IdleIrpList : _LIST_ENTRY [ 0xfffffa80`320bca00 - 0xfffffa80`320bca00 ]
+0xa00 IdleTimer : _KTIMER
+0xa40 IdleDpc : _KDPC
+0xa80 IdleTimerInterval : 0x19
+0xa82 StarvationCount : 0x14
+0xa84 IdleTimerTicks : 0
+0xa88 IdleTicks : 0
+0xa8c IdleIoCount : 0
+0xa90 IdleTimerStarted : 0n0
+0xa98 LastIoTime : _LARGE_INTEGER 0x1cc8bde`4f571cca
+0xaa0 ActiveIoCount : 0n1
+0xaa4 ActiveIdleIoCount : 0n0
+0xaa8 InterpretSenseInfo : (null)
+0xab0 MaxPowerOperationRetryCount : 0
+0xab8 PowerProcessIrp : 0xfffffa80`320bd010 _IRP
+0xac0 PerfCounterFrequency : _LARGE_INTEGER 0x23c3c4
The outstanding requests are stored in the AllTransferPacketsList. Classpnp uses a transfer packet to send the request to the lower level drivers. This allows the request to be split into smaller packets if necessary, and for the request to be retried if there is a failure.
We can dump the AllTransferPacketsList with !list and then search for our irp, it will be in the OriginalIrp field of one of the transfer packets. Note that the output from dt will displayed with a `, while the output from !thread does not, so you will need to add a ` when searching through the !list output. Also, there may be multiple transfer packets with the same OriginalIrp.
2: kd> !list "-t classpnp!_TRANSFER_PACKET.AllPktsListEntry.Flink -e -x \"??@$extret; dt classpnp!_TRANSFER_PACKET @$extret\" 0xfffffa80`320bbe60"
…
??@$extret; dt classpnp!_TRANSFER_PACKET @$extret
unsigned int64 0xfffffa80`399ad5e0
+0x000 AllPktsListEntry : _LIST_ENTRY [ 0xfffffa80`3bae2b40 - 0xfffffa80`3bc7cb30 ]
+0x010 SlistEntry : _SLIST_ENTRY
+0x020 Irp : 0xfffffa80`3bb71570 _IRP
+0x028 Fdo : 0xfffffa80`32052060 _DEVICE_OBJECT
+0x030 OriginalIrp : 0xfffffa80`4f379440 _IRP
+0x038 CompleteOriginalIrpWhenLastPacketCompletes : 0x1 ''
+0x03c NumRetries : 8
+0x040 RetryTimer : _KTIMER
+0x080 RetryTimerDPC : _KDPC
+0x0c0 RetryIn100nsUnits : 0n0
+0x0c8 SyncEventPtr : (null)
+0x0d0 DriverUsesStartIO : 0 ''
+0x0d1 InLowMemRetry : 0 ''
+0x0d8 LowMemRetry_remainingBufPtr : (null)
+0x0e0 LowMemRetry_remainingBufLen : 0
+0x0e8 LowMemRetry_nextChunkTargetLocation : _LARGE_INTEGER 0x0
+0x0f0 BufPtrCopy : 0xfffffa80`40d79000 "RCRD("
+0x0f8 BufLenCopy : 0x1000
+0x100 TargetLocationCopy : _LARGE_INTEGER 0xb5f8a000
+0x108 SrbErrorSenseData : _SENSE_DATA
+0x120 Srb : _SCSI_REQUEST_BLOCK
+0x178 UsePartialMdl : 0 ''
+0x180 PartialMdl : 0xfffffa80`3bfda010 _MDL
+0x188 RetryHistory : (null)
+0x190 RequestStartTime : 0
Now we can view the irp that classpnp sent to the lower level drivers and determine what it is doing.
2: kd> !irp 0xfffffa80`3bb71570
Irp is active with 3 stacks 3 is current (= 0xfffffa80`3bb71688)
Mdl=fffffa803c854090: No System Buffer: Thread 00000000: Irp stack trace. Pending has been returned
[ 0, 0] 0 2 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 ffffffffc0000185
>[ f, 0] 1c 0 fffffa8031dcc060 00000000 fffff8800107d1a0-fffffa80413ec4c0
\Driver\elxstor mpio!MPIOPdoCompletion
Args: fffffa80399ad700 00000000 00000000 fffffa80413ec4c0
[ f, 0] 1c e1 fffffa8031dcc060 00000000 fffff88001d61a00-fffffa80399ad5e0 Success Error Cancel pending
\Driver\mpio CLASSPNP!TransferPktComplete
We can see that the request has been sent to the disk driver. More specifically the request has been sent to the storport miniport driver elxstor. From this data we can usually assume that the request has been sent to the disk drive and we are waiting for the disk to respond. There may be conditions where the request is stuck in storport, or in the miniport, however those conditions are beyond the scope of this article.
As you can see, there are several drivers between the disk.sys mini class driver and the actual physical disk drive. It is often necessary to determine how far down the storage driver stack a request has been before you can determine where it is stuck.
Hello, it's the Debug Ninja back again for another NtDebugging Blog article. For as long as I can remember user mode debuggers have had an easy way to get call stacks for heap allocations. On more recent versions of Windows this has been as simple as using gflags +ust and umdh or !heap -k. Kernel debuggers have not always had an easy way to determine who allocated a pool block. Sure, we have pool tags to help us out, but often a programmer will use the same tag in many places and devalue this as a troubleshooting technique.
Fortunately, starting in Windows Vista and Server 2008, kernel debuggers can get call stacks from pool allocations. We can even get call stacks from pool frees. This little known technique is not quite as useful as gflags +ust is for heap, but when it is needed it is very useful.
First, you need to turn on special pool using driver verifier. Verifier will obtain and track the call stack for the allocation and the free, so this technique will not work with traditional special pool as documented in KB188831 because those settings do not use driver verifier. Because special pool requires additional memory overhead to run, this technique is not valuable for large memory leaks. However, this technique is a good way to determine what code allocated or freed your pool block in other conditions. For example, this works well if you find that pool has been freed when you expected it to be allocated. This also works for smaller memory leaks, especially those for which you can easily reproduce the leak. Analyzing the allocations and stacks for a leak must be done by hand, as there is no umdh-like tool for kernel mode.
Step 1 - Turning on verifier
In this example I am using Sysinternals’ notmyfault tool to generate the pool allocations. Because I know the driver in question I set verifier to only monitor that driver. Note that a reboot is required to make this setting take effect.
Verifier /flags 1 /driver myfault.sys
Step 2 - Finding the pool allocation to analyze
For this example I am going to find the call stack of a leaked pool allocation. First find the tag that is using the most pool by using !poolused.
kd> !poolused 4
Sorting by Paged Pool Consumed
Pool Used:
NonPaged Paged
Tag Allocs Used Allocs Used
Leak 0 0 23 23552000 UNKNOWN pooltag 'Leak', please update pooltag.txt
CM31 0 0 20520 18514560 Internal Configuration manager allocations , Binary: nt!cm
CIcr 0 0 2977 8511504 Code Integrity allocations for image integrity checking , Binary: ci.dll
Next find the pool allocations for that tag with !poolfind. There is some guessing to be done with all pool leak debugging techniques; you can’t be sure that the allocation you’re looking at has really been leaked and is not just in a state where it has not yet been freed. You need to make an educated guess because there is no umdh-type functionality to analyze allocates and frees. If you have the benefit of a live debug you can go the debugger and check back later to see if the memory has been freed or not.
kd> !poolfind Leak
Scanning large pool allocation table for Tag: Leak (fffffa8002e00000 : fffffa8002f80000)
*fffff8a006a00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0058fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006200000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0068fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0060fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005a00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006c00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006400000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0062fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005afa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005c00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006e00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006600000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0064fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005cfa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006afa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005e00000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006800000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a0066fa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005efa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006cfa000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a006000000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
*fffff8a005800000 :large page allocation, Tag is Leak, size is 0xfa000 bytes
Step 3 – Dump the call stack for the allocation
This step is the easy one. Once you have the address of the allocation use !verifier 0x80 Address. If you were interested in all of the call stacks in the log you can run !verifier 0x80 without the Address parameter.
kd> !verifier 0x80 fffff8a005800000
Log of recent kernel pool Allocate and Free operations:
There are up to 0x10000 entries in the log.
Parsing 0x0000000000010000 log entries, searching for address 0xfffff8a005800000.
======================================================================
Pool block fffff8a005800000, Size 00000000000fa000, Thread fffffa8002be4060
fffff80001923cc6 nt!VeAllocatePoolWithTagPriority+0x2b6
fffff80001923d3d nt!VerifierExAllocatePoolEx+0x1d
fffff880042881f6 myfault+0x11f6
fffff8800428842f myfault+0x142f
fffff8000192e750 nt!IovCallDriver+0xa0
fffff800017a3a97 nt!IopXxxControlFile+0x607
fffff800017a42f6 nt!NtDeviceIoControlFile+0x56
fffff80001487ed3 nt!KiSystemServiceCopyEnd+0x13
Pool block fffff8a005800000, Size 0000000000001000, Thread fffffa8002187060
fffff8000192393a nt!VfFreePoolNotification+0x4a
fffff800015b6a6f nt!ExDeferredFreePool+0x107b
fffff800017273eb nt!HvFreeDirtyData+0x7f
fffff800017269bb nt!HvOptimizedSyncHive+0x53
fffff80001726303 nt!CmFlushKey+0xaf
fffff80001726b22 nt!NtFlushKey+0x142
Parsed entry 0000000000010000/0000000000010000...
Finished parsing all pool tracking information.
Keep in mind that the log may contain allocate and free information that predates the current use of the pool block, and that the log is of a fixed size so eventually old data will fall off the end. The most recent use of the pool will be at the top of the output. Usually this is the stack at the top of the output is what you are interested in, I have highlighted the relevant call stack in red. In this instance we can see that the pool was most recently allocated by myfault.sys.
Sometimes it is useful to have historical information about previous uses of the pool block such as when dealing with pool that was improperly freed. In that scenario the most recent call stack may be from an allocate call when the pool block was reused by the memory manager and so you may need to go down several levels to find out where the pool was improperly freed.
For more information on using !verifier you can refer to the debugger help in MSDN, http://msdn.microsoft.com/en-us/library/ff565591.aspx.
Hi debuggers, Andrew Richards here for my first NT Debugging post. I thought I’d share a recent case that used a lot of discovery techniques to uncover the details of what was going on. Most bugchecks give you the information you need as arguments, but in the case of bugcheck 0x101, I had to go digging for a trap frame, the thread stack, look at the disassembly of the running threads, and lots of other goodies.
As usual, the first thing I did was run !analyze -v to get a more detailed explanation of what a bugcheck 0x101 “CLOCK_WATCHDOG_TIMEOUT” is.
0: kd> !analyze -v ******************************************************************************* * * * Bugcheck Analysis * * * *******************************************************************************
CLOCK_WATCHDOG_TIMEOUT (101) An expected clock interrupt was not received on a secondary processor in an MP system within the allocated interval. This indicates that the specified processor is hung and not processing interrupts. Arguments: Arg1: 00000031, Clock interrupt time out interval in nominal clock ticks. Arg2: 00000000, 0. Arg3: 87337120, The PRCB address of the hung processor. Arg4: 00000003, 0.
A bugcheck 0x101 occurs when the Clock interrupt (IRQL #28) has not been processed by each processor within a timeout. The Clock interrupt is quite high in the IRQL table for x86; only the Inter-Processor Interrupt (IPI), Power-Fail and High interrupts are higher.
31
High
30
Power Fail
29
Inter-processor Interrupt
28
Clock
27
Profile/Synch
26
Device n
5
CMCI
4
3
Device 1
2
DPC/Dispatch
1
APC
0
Passive
The immediate hypothesis was that one of the processors was stuck processing an IPI (a common interrupt), causing it to be above the Clock IRQ level.
The questions were:
1. At what IRQL was the problem processor? Was it at IPI?
2. Why was it stuck?
3. Who did it?
This was a 4 core box, so I ran !prcb four times to view the Processor Resource control Block (PRCB) of each processor. Processor #3 matched the PRCB specified in the bugcheck’s 3rd argument (Arg3 = 87337120). The first question was answered; Processor #3 was the culprit. Interestingly, the Arg4 value in the bugcheck has a value of 3. Per the documentation it should be 0, but it seems that the processor number is provided.
0: kd> !prcb 0 PRCB for Processor 0 at 82b34d20: Current IRQL -- 28 Threads-- Current 82b3e380 Next 00000000 Idle 82b3e380 Processor Index 0 Number (0, 0) GroupSetMember 1 Interrupt Count -- 000475e1 Times -- Dpc 000008d6 Interrupt 0000007b Kernel 0001920f User 00003b0b 0: kd> !prcb 1 PRCB for Processor 1 at 807c7120: Current IRQL -- 0 Threads-- Current 9bb65d48 Next 9bf52c10 Idle 807cc800 Processor Index 1 Number (0, 1) GroupSetMember 2 Interrupt Count -- 00030ae8 Times -- Dpc 000005ca Interrupt 00000098 Kernel 00017e4a User 00004eab 0: kd> !prcb 2 PRCB for Processor 2 at 87300120: Current IRQL -- 0 Threads-- Current 9b41b6e8 Next 00000000 Idle 87305800 Processor Index 2 Number (0, 2) GroupSetMember 4 Interrupt Count -- 0002ab35 Times -- Dpc 00000568 Interrupt 000000ac Kernel 0001a788 User 00002565 0: kd> !prcb 3 PRCB for Processor 3 at 87337120: Current IRQL -- 0 Threads-- Current 8aaa17c8 Next 00000000 Idle 8733c800 Processor Index 3 Number (0, 3) GroupSetMember 8 Interrupt Count -- 00026c0d Times -- Dpc 00000620 Interrupt 0000008b Kernel 0001ac65 User 00001e33
IRQL
The Processor Context Records reported that Processors #0, #1 and #2 were at IRQL 31 (1f - HIGH), and Processor #3 was at IRQL 27 (1b - SYNCH). Having a value of 31 is expected at bugcheck as that is how the bugcheck gains control of the processor to gather the current context. It was strange that Processor #3 was different and that is matched the processor mentioned in the bugcheck
0: kd> !pcr 0 KPCR for Processor 0 at 82b34c00: ... Irql: 0000001f ...
0: kd> !pcr 1 KPCR for Processor 1 at 807c7000: ... Irql: 0000001f ...
0: kd> !pcr 2 KPCR for Processor 2 at 87300000: ... Irql: 0000001f ...
0: kd> !pcr 3 KPCR for Processor 3 at 87337000: ... Irql: 0000001b ...
The next step was to look at the stacks of the processors to see what the threads were all involved in.
Processor #0
To determine the stack of the Processor #0’s thread before the bugcheck, the trap frame needed to be found. The trap frame is stored immediately above the interrupt handler. To find that, I looked at the Interrupt Descriptor Table to find out the name of the handler for clock interrupt and then I searched for that symbol on the stack.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #0 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 0: kd> knL # ChildEBP RetAddr 00 82b31674 82a84a6f nt!KeBugCheckEx+0x1e 01 82b316b0 82a840be nt!KeAccumulateTicks+0x242 02 82b316f0 82a83f6b nt!KeUpdateRunTime+0x145 03 82b3174c 82a88c17 nt!KeUpdateSystemTime+0x613 04 82b3174c 82a85e79 nt!KeUpdateSystemTimeAssist+0x13 05 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd 06 82b31820 82af0866 nt!KeFlushSingleTb+0x136 07 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4 08 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6 09 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf 0a 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3 0b 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47 WARNING: Stack unwind information not available. Following frames may be wrong. 0c 82b319c4 82d3acd4 irsir+0x2986 0d 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b 0e 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128 0f 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133 10 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61 11 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27 12 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a 13 82b31b38 82a83039 serial!SerialReadTimeout+0x68 14 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50 15 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101 16 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c 17 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb 18 82b31d24 00000000 nt!KiIdleLoop+0x38
0: kd> !irql Debugger saved IRQL for processor 0x0 -- 28 (CLOCK2_LEVEL)
0: kd> !idt
Dumping IDT:
37: 82e35104 hal!PicSpuriousService37
51: 8aba1558 serial!SerialCIsrSw (KINTERRUPT 8aba1500)
71: 89003cd8 i8042prt!I8042KeyboardInterruptService (KINTERRUPT 89003c80)
72: 8aba17d8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1780)
82: 89123058 ataport!IdePortInterrupt (KINTERRUPT 89123000)
ataport!IdePortInterrupt (KINTERRUPT 89123a00)
ataport!IdePortInterrupt (KINTERRUPT 89123780)
ataport!IdePortInterrupt (KINTERRUPT 89123500)
92: 8aba1058 Impcd+0x8540 (KINTERRUPT 8aba1000)
a0: 8aba1a58 ndis!ndisMiniportMessageIsr (KINTERRUPT 8aba1a00)
a2: 8aba1cd8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1c80)
b0: 891232d8 ndis!ndisMiniportMessageIsr (KINTERRUPT 89123280)
b1: 89123cd8 ACPI!ACPIInterruptServiceRoutine (KINTERRUPT 89123c80)
b2: 8aba12d8 serial!SerialCIsrSw (KINTERRUPT 8aba1280)
c1: 82e353f4 hal!HalpBroadcastCallService
d1: 82e1d634 hal!HalpHpetClockInterrupt
d2: 82e1d898 hal!HalpHpetRolloverInterrupt
df: 82e351dc hal!HalpApicRebootService
e1: 82e35958 hal!HalpIpiHandler
e3: 82e356f8 hal!HalpLocalApicErrorService
fd: 82e35f2c hal!HalpProfileInterrupt
fe: 82e361a8 hal!HalpPerfInterrupt
The search was between the nt!KiIpiSendPacket call (I just chose it as it didn’t seem to be bugcheck related) and the current stack pointer. Using the dereferenced pointer (poi) and some maths, the trap frame location is retrieved.
0: kd> dps @esp 82b317e0 ... 82b31744 82b31760 nt!KiDoubleFaultStack+0x2760 <<<<< This is not a real symbol; it actually is a part of the trap frame 82b31748 82e1d72a hal!HalpHpetClockInterrupt+0xf6 82b3174c 82b31760 nt!KiDoubleFaultStack+0x2760 82b31750 82a88c17 nt!KeUpdateSystemTimeAssist+0x13 ...
0: kd> .trap poi(82b31748-4) ErrCode = 00000000 eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4 eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0 nv up ei pl nz na po nc cs=0008 ss=0010 ds=0000 es=dea0 fs=040f gs=0008 efl=00000202 nt!KiIpiSendPacket+0xdd: 82a85e79 f390 pause
So you might be asking yourself, can this be done an easier way? The answer is definitely Yes. You just need to use kv instead; it adds the trap frame information on the end of the line.
0: kd> kv
ChildEBP RetAddr Args to Child
82b31674 82a84a6f 00000101 00000031 00000000 nt!KeBugCheckEx+0x1e
82b316b0 82a840be 00026161 00000000 0001cd00 nt!KeAccumulateTicks+0x242
82b316f0 82a83f6b 82a85e79 807c7120 00000000 nt!KeUpdateRunTime+0x145
82b3174c 82a88c17 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTime+0x613
82b3174c 82a85e79 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTimeAssist+0x13 (FPO: [0,2] TrapFrame @ 82b31760)
82b317e0 82abfa17 00000001 00000000 82a3cbe1 nt!KiIpiSendPacket+0xdd (FPO: [6,2,0])
82b31820 82af0866 a33d6f20 00000001 a33d00cf nt!KeFlushSingleTb+0x136
82b3190c 82b2ab90 a33d6f20 8a2096d8 8a2096d8 nt!MmFreeSpecialPool+0x2b4
82b31970 82d4a06e a33d6f20 00000000 82b31998 nt!ExFreePoolWithTag+0xd6
82b31980 82d3fab2 a33d6f20 a33d6f20 a33d6f20 nt!ViIrpFree+0xf
82b31998 82d3a1c0 8ced5986 905cadb0 82b319c4 nt!VfIoFreeIrp+0xd3
82b319a8 8ced5986 a33d6f20 8ced58f0 82b31a6c nt!IovFreeIrpPrivate+0x47
WARNING: Stack unwind information not available. Following frames may be wrong.
82b319c4 82d3acd4 00000000 a33d6f20 905cadb0 irsir+0x2986
82b319f4 82a81933 00000000 a33d6f20 82b31a6c nt!IovpLocalCompletionRoutine+0x14b
82b31a3c 82d3ab64 a33d6f20 8ab151ac 8ab150f0 nt!IopfCompleteRequest+0x128
82b31aa4 8b7c6abd 00000000 8ab151ac 82b31ad8 nt!IovCompleteRequest+0x133
82b31ab4 8b7c6461 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrpLocked+0x61 (FPO: [Non-Fpo])
82b31ad8 8b7c7567 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrp+0x27 (FPO: [Non-Fpo])
82b31b00 8b7b9eb7 8ab150f0 00000000 8ab15002 serial!SerialTryToCompleteCurrent+0x7a (FPO: [Non-Fpo])
82b31b38 82a83039 8ab15314 8ab15002 3c171d26 serial!SerialReadTimeout+0x68 (FPO: [Non-Fpo])
82b31b7c 82a82fdd 82b34d20 82b31ca8 00000001 nt!KiProcessTimerDpcTable+0x50
82b31c68 82a82e9a 82b34d20 82b31ca8 00000000 nt!KiProcessExpiredTimerList+0x101
82b31cdc 82a8100e 0001cacf 9bb374c0 82b3e380 nt!KiTimerExpiration+0x25c
82b31d20 82a80e38 00000000 0000000e 00000000 nt!KiRetireDpcList+0xcb
82b31d24 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x38 (FPO: [0,0,0])
By using the stack based maths, or the value provide by kv, the trap frame address is used to set the context (.trap <addr>) to the code running before the interrupt. These are the stored registers and the stack at the time of the interrupt.
0: kd> .trap 82b31760
eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4
eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0000 es=dea0 fs=040f gs=0008 efl=00000202
nt!KiIpiSendPacket+0xdd:
82a85e79 f390 pause
0: kd> knL *** Stack trace for last set context - .thread/.cxr resets it # ChildEBP RetAddr 00 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd 01 82b31820 82af0866 nt!KeFlushSingleTb+0x136 02 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4 03 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6 04 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf 05 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3 06 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47 WARNING: Stack unwind information not available. Following frames may be wrong. 07 82b319c4 82d3acd4 irsir+0x2986 08 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b 09 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128 0a 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133 0b 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61 0c 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27 0d 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a 0e 82b31b38 82a83039 serial!SerialReadTimeout+0x68 0f 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50 10 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101 11 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c 12 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb 13 82b31d24 00000000 nt!KiIdleLoop+0x38
Dissassembling the first few instructions reveals a jump (jmp) that is back up in the nt!KiIpiSendPacket function. Using the jmp location and the instruction after the jmp as the bound, we can disassemble the loop. At the time of the bugcheck, the thread was executing a pause (a CPU based delay), and seemingly doing this in a loop while waiting for to release it.
0: kd> u @eip nt!KiIpiSendPacket+0xdd: 82a85e79 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b) 82a85e7d 8d4900 lea ecx,[ecx]
...
0: kd> u 82a85e1b 82a85e7d nt!KiIpiSendPacket+0x7f: 82a85e1b 8b86a4180000 mov eax,dword ptr [esi+18A4h] 82a85e21 0bc0 or eax,eax <<<<<< Checking if value is nonzero 82a85e23 7538 jne nt!KiIpiSendPacket+0xc1 (82a85e5d) <<<<<< Take jmp to stay in loop 82a85e25 f00fb196a4180000 lock cmpxchg dword ptr [esi+18A4h],edx 82a85e2d 75ec jne nt!KiIpiSendPacket+0x7f (82a85e1b) 82a85e2f 59 pop ecx 82a85e30 d1e9 shr ecx,1 82a85e32 8d7f04 lea edi,[edi+4] 82a85e35 72df jb nt!KiIpiSendPacket+0x7a (82a85e16) 82a85e37 75f7 jne nt!KiIpiSendPacket+0x94 (82a85e30) 82a85e39 59 pop ecx 82a85e3a 64890d8c190000 mov dword ptr fs:[198Ch],ecx 82a85e41 8b4c240c mov ecx,dword ptr [esp+0Ch] 82a85e45 8b542410 mov edx,dword ptr [esp+10h] 82a85e49 64ff0590360000 inc dword ptr fs:[3690h] 82a85e50 52 push edx 82a85e51 51 push ecx 82a85e52 ff15a4a0a082 call dword ptr [nt!_imp__HalRequestIpi (82a0a0a4)] 82a85e58 5f pop edi 82a85e59 5e pop esi 82a85e5a c21800 ret 18h 82a85e5d 41 inc ecx 82a85e5e 850d7c3ab782 test dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],ecx 82a85e64 7513 jne nt!KiIpiSendPacket+0xdd (82a85e79) 82a85e66 f605783ab78240 test byte ptr [nt!HvlEnlightenments (82b73a78)],40h <<<<<< Don’t spin if you’re an enlightened VM, just pause 82a85e6d 740a je nt!KiIpiSendPacket+0xdd (82a85e79) 82a85e6f 52 push edx 82a85e70 51 push ecx 82a85e71 51 push ecx 82a85e72 e8c8b60500 call nt!HvlNotifyLongSpinWait (82ae153f) 82a85e77 59 pop ecx 82a85e78 5a pop edx 82a85e79 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b)
The summary so far: Since processor #0 was the thread that created the bugcheck, it must have been interrupted by a Clock interrupt so as to trigger a CLOCK_WATCHDOG_TIMEOUT bugcheck. It is not surprising then that the value of !irql is CLOCK2.
The existence of the nt!KiIpiSendPacket function lends weight to the thought that this was the creator of an IPI and is not one of the processors that wasn’t processing IPIs.
Processor #1
Using the same technique as Processor #0, the disassembly for the loop on Processor #1 is determined. Processor #1 is in a tight loop within the nt!KeFlushMultipleRangeTb function, an ancestor function of nt!ExAllocatePoolWithTag, a memory related operation. This is an interesting function to see, considering that nt!ExFreePoolWithTag, the counterpart, is on Processor #0’s stack.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #1 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 1: kd> knL # ChildEBP RetAddr 00 95bfc7bc 82abb217 nt!KeFlushMultipleRangeTb+0x2d3 01 95bfc89c 82aa7d11 nt!MiFlushTbAsNeeded+0x12e 02 95bfc8dc 82b29487 nt!MiAllocatePagedPoolPages+0x567 03 95bfc940 82aa4674 nt!MiAllocatePoolPages+0x1f 04 95bfc998 82b2a132 nt!ExpAllocateBigPool+0xa6 05 95bfc9fc 82aab6b1 nt!ExAllocatePoolWithTag+0x12d 06 95bfca20 82c4cc62 nt!ExAllocatePoolWithTagPriority+0x196 07 95bfca78 82c7a662 nt!IopQueryNameInternal+0x60 08 95bfca98 82c57d88 nt!IopQueryName+0x1b 09 95bfcb1c 82c71a50 nt!ObpQueryNameString+0x7f 0a 95bfcb38 82c75e12 nt!ObQueryNameString+0x18 0b 95bfcc14 82c65788 nt!EtwTraceProcess+0xa2 0c 95bfcc38 82c73625 nt!PspExitProcess+0x37 0d 95bfccb4 82c87051 nt!PspExitThread+0x59a 0e 95bfcccc 82aba8c0 nt!PsExitSpecialApc+0x22 0f 95bfcd1c 82a472a4 nt!KiDeliverApc+0x28b 10 95bfcd1c 77556fc0 nt!KiServiceExit+0x64 WARNING: Frame IP not in any known module. Following frames may be wrong. 11 014ef918 00000000 0x77556fc0
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL) <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0
1: kd> u @eip nt!KeFlushMultipleRangeTb+0x2d3: 82a40c31 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a40c33 8b07 mov eax,dword ptr [edi] 82a40c35 85c0 test eax,eax 82a40c37 75de jne nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17) 82a40c39 8a4d0f mov cl,byte ptr [ebp+0Fh] 82a40c3c ff1558a1a082 call dword ptr [nt!_imp_KfLowerIrql (82a0a158)] 82a40c42 5f pop edi 82a40c43 5e pop esi
1: kd> u 82a40c17 82a40c39 nt!KeFlushMultipleRangeTb+0x2b9: 82a40c17 46 inc esi 82a40c18 85357c3ab782 test dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],esi 82a40c1e 7511 jne nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31) 82a40c20 f605783ab78240 test byte ptr [nt!HvlEnlightenments (82b73a78)],40h 82a40c27 7408 je nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31) 82a40c29 56 push esi 82a40c2a e810090a00 call nt!HvlNotifyLongSpinWait (82ae153f) 82a40c2f eb02 jmp nt!KeFlushMultipleRangeTb+0x2d5 (82a40c33) 82a40c31 f390 pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a40c33 8b07 mov eax,dword ptr [edi] 82a40c35 85c0 test eax,eax 82a40c37 75de jne nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17)
This function tests a variable at edi in each loop. While the signal is not set, the thread goes around loop, eventually executes a pause instruction and then tries the test again. This thread seems to be waiting on someone to set a flag.
Processor #2
Processor #2 is also trying to send an IPI using nt!KiIpiSendPacket. It was caught in the same function that Processor #0 is in. In this case, it is executing the jmp instruction that is one instruction after the pause which Processor #0 is executing.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #2 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 2: kd> knL # ChildEBP RetAddr 00 a0c4fac0 82a40bcd nt!KiIpiSendPacket+0xdf 01 a0c4fafc 82b29431 nt!KeFlushMultipleRangeTb+0x26f 02 a0c4fbe8 82b2aef1 nt!MiFreePoolPages+0x42c 03 a0c4fc50 82aa6b37 nt!ExFreePoolWithTag+0x436 04 a0c4fc64 82c3745d nt!MmFreeAccessPfnBuffer+0x2f 05 a0c4fcc0 82c4b83a nt!PfpFlushBuffers+0x2ba 06 a0c4fd50 82c12f5e nt!PfTLoggingWorker+0xaa 07 a0c4fd90 82aba219 nt!PspSystemThreadStartup+0x9e 08 00000000 00000000 nt!KiThreadStartup+0x19
2: kd> !irql
Debugger saved IRQL for processor 0x2 -- 0 (LOW_LEVEL) <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0
2: kd> u @eip nt!KiIpiSendPacket+0xdf: 82a85e7b eb9e jmp nt!KiIpiSendPacket+0x7f (82a85e1b) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip 82a85e7d 8d4900 lea ecx,[ecx] ...
Summary so far
So at this point we can say (without any real knowledge of how IPI handling is achieved) that:
Processor #0 is waiting for Processor #1, #2 and/or #3 to respond to its IPI
Processor #0 is running nt!ExFreePoolWithTag
Processor #0 is reported to be at IRQL 28 (CLOCK2_LEVEL)
Processor #1 is waiting for a flag to bet set; probably an IPI flag
Processor #0 is running nt!ExAllocatePoolWithTag
Processor #1 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect
Processor #2 is waiting for Processor #0, #1 and/or #3 to respond to its IPI
Processor #2 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect
Processor #3 has been tagged by bugcheck as being the cause
Processor #3 hasn’t been looked at yet
Processor #3
Analyzing Processor #3 was problematic. The Processor Context Record (PCR) was not completely captured in the dump. This happens when the bugcheck thread is unable to interrupt the target processor to gather the context – usually because it is running at the HIGH (31) IRQL.
When the context is missing, all of the registers are treated as being zero. The zero value instruction pointer causes the stack to be incorrect when calling knL and various other commands.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ==> Processor #3 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 1: kd> ~3 WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000 WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000
3: kd> r eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000 eip=00000000 esp=00000000 ebp=00000000 iopl=0 nv up di pl nz na po nc cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000 00000000 ?? ???
3: kd> knL # ChildEBP RetAddr WARNING: Frame IP not in any known module. Following frames may be wrong. 00 00000000 00000000 0x0
To determine the thread running on the processor, the !thread command was used. The value for the _ETHREAD (8aaa17c8) comes from the PCR that Windows maintains as part of scheduling. Once again, like the knL command, because the context was missing, the stack was not shown.
3: kd> !pcr KPCR for Processor 3 at 87337000: ... CurrentThread: 8aaa17c8 ...
3: kd> !thread THREAD 8aaa17c8 Cid 0454.08ec Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 3 IRP List: 8f8c6f68: (0006,0094) Flags: 40000404 Mdl: 00000000 Not impersonating DeviceMap 89970d50 Owning Process 89f93ac0 Image: ZyxApp.exe Attached Process N/A Image: N/A Wait Start TickCount 117454 Ticks: 589 (0:00:00:09.188) Context Switch Count 37 IdealProcessor: 3 UserTime 00:00:00.000 KernelTime 00:00:00.000 Win32 Start Address 0x7753fd0f Stack Init 91aecfd0 Current 91aeca78 Base 91aed000 Limit 91aea000 Call 0 Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5 ChildEBP RetAddr Args to Child 00000000 00000000 00000000 00000000 00000000 0x0
The important piece of information in the !thread output was the stack limits. These values allow me to do a search of the raw stack for known symbols. I used dps <limit> <base>. What I was looking for was the first symbol after the 0xffffffff entries which has a value above it that points to a location within this stack. In this case, is it the hal!KfLowerIrql symbol that has a value (base pointer) above it. This symbol is the deepest function that the stack has got to; it doesn’t necessarily mean that the stack is at this depth now.
3: kd> dps 91aea000 91aed000 91aea000 ffffffff 91aea004 ffffffff … 91aec85c ffffffff 91aec860 ffffffff 91aec864 ffffffff 91aec868 ffffffff 91aec86c ffffffff 91aec870 ffffffff 91aec874 0001187f 91aec878 00000010 91aec87c 00011a1f 91aec880 ffffffff 91aec884 0001187f 91aec888 82b7561f nt!MmSystemSpaceLock+0x1f 91aec88c 91aec89c 91aec890 82e20ba9 hal!KfLowerIrql+0x61 91aec894 00011a00 91aec898 82b7561f nt!MmSystemSpaceLock+0x1f 91aec89c 91aec95c 91aec8a0 82e1e92d hal!KeReleaseQueuedSpinLock+0x2d 91aec8a4 82a3bdc7 nt!MiReturnNonPagedPoolVa+0x1d4 91aec8a8 00000001
91aec8ac 00000000
91aec8b0 83cac9e8
91aec8b4 ffffffff
91aec8b8 ffffffff
91aec8bc ffffffff
91aec8c0 ffffffff
91aec8c4 ffffffff
91aec8c8 ffffffff
91aec8cc ffffffff
91aec8d0 82e21cee hal!HalpLegacyApicWriteIcr+0xa
91aec8d4 91aec8f0
91aec8d8 82e2aea0 hal!HalpRequestIpiSpecifyVector+0x40
91aec8dc 00000000
91aec8e0 83cac9e8
91aec8e4 82b738cc nt!KiProcessorBlock+0xc
91aec8e8 87300120
91aec8ec 00000202
91aec8f0 91aec914
91aec8f4 82aa6e36 nt!MiInsertPageInFreeOrZeroedList+0x25b
91aec8f8 83cac9e8
91aec8fc 00000000
91aec900 0000001f
91aec904 00000001
91aec908 83cac9e8
91aec90c 00000003
91aec910 00000007
91aec914 00000fff
91aec918 00000003
91aec91c 82a40bcd nt!KeFlushMultipleRangeTb+0x26f
91aec920 00000001
91aec924 00000000
91aec928 82a9fa43 nt!KiFlushTargetMultipleRangeTb
91aec92c 00000000
91aec930 91aec9bc
91aec934 91aec9b8
91aec938 00000001
91aec93c 00000000
91aec940 83e9cc04
91aec944 00018736
91aec948 8733a480
91aec94c 00000003
91aec950 91aec9bc
91aec954 91aeca40
91aec958 82b29431 nt!MiFreePoolPages+0x42c
91aec95c 91aec9b8
91aec960 1f000000
91aec964 91aec9b8
91aec968 00000000
Using the first location found, I used the stack variables to build up a k= command. With an x86 stack, if you select the stack location above the symbol as the base pointer and stack pointer, and the symbol’s address as the instruction pointer, you’ll get a reconstructed stack.
3: kd> k= 91aec89c 91aec88c 82e20ba9
ChildEBP RetAddr
91aec89c 82e1e92d hal!KfLowerIrql+0x61
91aec8a0 82a3bdc7 hal!KeReleaseQueuedSpinLock+0x2d
91aec9a8 82aab049 nt!MiReturnNonPagedPoolVa+0x1d4
91aec9c0 82ab1685 nt!FindNodeOrParent+0x2091aec9d8 82abd820 nt!RtlLookupElementGenericTableFullAvl+0x16
91aeca0c 82e20ba9 nt!RtlLookupElementGenericTableAvl+0x18
91aeca1c 82b199d2 hal!KfLowerIrql+0x61
91aeca24 82d41ad1 nt!VfAvlUnlockShared+0x15
91aeca40 82b2aef1 nt!VfRemLockDeleteMemoryRange+0x5c
91aecaa8 963a55c5 nt!ExFreePoolWithTag+0x436
91aecac8 963a61d4 Zyx+0x5c5 91aecb00 82d3a6c3 Zyx+0x11d4 91aecb24 82a4054a nt!IovCallDriver+0x258 91aecb38 82c3b975 nt!IofCallDriver+0x1b 91aecb7c 82c2c591 nt!IopDeleteFile+0x10c 91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59 91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88 91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd 91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d 91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f 91aecc40 82c73666 nt!ObKillProcess+0x54 91aeccb4 82c87051 nt!PspExitThread+0x5db 91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22 91aecd1c 82a472a4 nt!KiDeliverApc+0x28b 91aecd1c 775570b4 nt!KiServiceExit+0x64 0147ff88 00000000 0x775570b4
In this case, because nt!KiFlushTargetMultipleRangeTb was present on Processor #1 as well as being in the Processor #3 raw stack, I found the first symbol lower than this symbol on the (raw) stack that had a valid base pointer above it. The stack was thus built around the nt!MiFreePoolPages function instead.
3: kd> k= 91aeca40 91aec954 82b29431
91aeca40 82b2aef1 nt!MiFreePoolPages+0x42c
91aecac8 963a61d4 Zyx+0x5c5
91aecb00 82d3a6c3 Zyx+0x11d4
91aecb24 82a4054a nt!IovCallDriver+0x258
91aecb38 82c3b975 nt!IofCallDriver+0x1b
91aecb7c 82c2c591 nt!IopDeleteFile+0x10c
91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59
91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88
91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd
91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d
91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f
91aecc40 82c73666 nt!ObKillProcess+0x54
91aeccb4 82c87051 nt!PspExitThread+0x5db
91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22
91aecd1c 82a472a4 nt!KiDeliverApc+0x28b
91aecd1c 775570b4 nt!KiServiceExit+0x64
0147ff88 00000000 0x775570b4
It’s very hard to get much further this stack since the exact registers are not known. In particular, it is hard to determine what function above nt!MiFreePoolPages the thread is executing at the moment. As with the Processor #1 investigation, it is interesting to note that Processor #3 is also involved in a memory operation; specifically, it is doing a nt!ExFreePoolWithTag much like Processor #0 is.
Inter-Processor Interrupts
The only time an IPI (interrupt) is not processed immediately is when the target processor is at IPI IRQL or higher. The most common example being when it is already processing an IPI (interrupts of the same level cannot interrupt the handler for the same IRQL). In this case, the interrupt has to be queued until the interrupt mask allows its arrival. Usually, this design allows only one IPI to be processed at any one time.
A deadlock (like) condition can arise though if an IPI is issued to a processor that is at a higher IRQL, and this processor (thread) attempts to send an IPI. The IPI logic blocks the send if there is an outstanding IPI to complete on the processor. The assumption being that interrupt queuing should be avoided as there is a probability of loss if the queue overflows.
Looking in the Windows Internals book, there is a single sentence that says “Each interrupt level has a specific purpose. For example, the kernel issues an interprocessor interrupt (IPI) to request that another processor perform an action, such as dispatching a particular thread for execution or updating its translation look-aside buffer cache.”. This is very interesting as the translation look-aside buffer is part of the memory manager, and the memory operations are being undertaken on all the processors.
3rd Party Driver
Instead of pulling my hair out combing through the threads on the system, the IPI code, the Memory Manager code or bugcheck code, I decided to look at the 3rd party driver in processor #3 to see if it was changing the IRQL.
The first step was to find the bound of the Zyx+0x5c5 function. The end address of the assembler is easy to determine, it is Zyx+0x5c5. The question is, how big is the function? To work that out, you look at the assembler of the caller. The caller’s assembler will point to the starting instruction in the Zyx+0x5c5 function.
3: kd> ub Zyx+0x11d4 Zyx+0x11b7: 963a61b7 ff7008 push dword ptr [eax+8] 963a61ba 52 push edx 963a61bb 6a01 push 1 963a61bd ff7018 push dword ptr [eax+18h] 963a61c0 e827fdffff call Zyx+0xeec (963a5eec) 963a61c5 e9f1000000 jmp Zyx+0x12bb (963a62bb) 963a61ca e8d3fcffff call Zyx+0xea2 (963a5ea2) 963a61cf e89ef3ffff call Zyx+0x572 (963a5572)
3: kd> u 963a5572 Zyx+0x5c5 Zyx+0x572: 963a5572 8bff mov edi,edi 963a5574 55 push ebp 963a5575 8bec mov ebp,esp 963a5577 51 push ecx 963a5578 53 push ebx 963a5579 56 push esi 963a557a 57 push edi 963a557b b11f mov cl,1Fh <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0x1F = 0n31 (HIGH) 963a557d ff15006b3a96 call dword ptr [Zyx+0x1b00 (963a6b00)] <<<<<<<< hal!KfRaiseIrql (via lookup) 963a5583 bb14813a96 mov ebx,offset Zyx+0x3114 (963a8114) 963a5588 8bcb mov ecx,ebx 963a558a 8845ff mov byte ptr [ebp-1],al 963a558d e8c2100000 call Zyx+0x1654 (963a6654) <<<<<<<<<<<<<<<< [Unresolved] 963a5592 a1746d3a96 mov eax,dword ptr [Zyx+0x1d74 (963a6d74)] 963a5597 85c0 test eax,eax 963a5599 8b35586b3a96 mov esi,dword ptr [Zyx+0x1b58 (963a6b58)] <<<< nt!ExFreePoolWithTag (via lookup) 963a559f 7413 je Zyx+0x5b4 (963a55b4) 963a55a1 8b7808 mov edi,dword ptr [eax+8] 963a55a4 6a00 push 0 963a55a6 50 push eax 963a55a7 ffd6 call esi <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599) 963a55a9 85ff test edi,edi 963a55ab 8bc7 mov eax,edi 963a55ad a3746d3a96 mov dword ptr [Zyx+0x1d74 (963a6d74)],eax 963a55b2 75ed jne Zyx+0x5a1 (963a55a1) 963a55b4 a1a06d3a96 mov eax,dword ptr [Zyx+0x1da0 (963a6da0)] 963a55b9 85c0 test eax,eax 963a55bb 7413 je Zyx+0x5d0 (963a55d0) 963a55bd 8b7808 mov edi,dword ptr [eax+8] 963a55c0 6a00 push 0 963a55c2 50 push eax 963a55c3 ffd6 call esi <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599)
3: kd> ln poi(963a6b00) (82e20844) hal!KfRaiseIrql | (82e208c6) hal!HalpDispatchSoftwareInterrupt Exact matches:
3: kd> ln poi(963a6b58) (82b2aaba) nt!ExFreePoolWithTag | (82b2b660) nt!ExDeferredFreePool Exact matches:
The next step was to work out what functions the Zyx+0x5c5 function called, and with what parameters. The assembler didn’t make this immediately obvious. The address of the function calls were lookups. This is common when the address is imported by the code, instead of being bound. To get the function name of the call, I dereferenced the pointer (poi(<addr>) and passed the address to ln so that it listed the nearest symbols.
The first function confirmed some the theory. A call to hal!KfRaiseIrql was being made with a IRQL of HIGH (31). And this was prior to a code path that called functions.
The value of the @esi register was determined to be nt!ExFreePoolWithTag. This matched the function name in the reconstructed stack (nt!ExFreePoolWithTag+0x436) and confirmed that the stack was reconstructed in this area correctly.
Conclusion
The Zyx function raised IRQL to HIGH_LEVEL before calling nt!ExFreePoolWithTag. This function caused a Translation Look-aside Buffer (TLB) flush via nt!KiFlushTargetMultipleRangeTb) to occur. This in turn caused an IPI notification to the other processors (via nt!KiIpiSendPacket) to indicate the release of the TLB memory.
The Processor #0 IPI was stuck as incomplete because Processor #3 could not be interrupted, and Processor #3 did not lower its IRQL as it was waiting to send its own IPI upon completion of the Processor #0 IPI – thus a deadlock was formed.
I contacted the driver’s author and organized a new version that used the SYNCH (27) IRQL (the level immediately under IPI). The new build was sent to the customer and the immediate bugcheck that they were observing in their test environment disappeared. Although this is not an ideal solution, it was sufficient for the time being. An ideal solution would be to follow the documentation for ExFreePoolWithTag and not make the call with an IRQL greater than DISPATCH_LEVEL.
I hope this case has provided a good foundation in how to recover and navigate through stacks without symbols or context. Once you understand the mechanics of calling conventions, the values needed to bound a disassembling operation and/or a stack reassemble are quite easy to determine.
Hello everyone, my name is Sean Walker, and I am on the Platforms OEM team in Washington. This article is for those people who have had a hard time switching from the old boot.ini configuration to the new BCD store (myself included). Doing the simple tasks such as enabling kernel debugging over com1 are easy to do with bcdedit.exe or the msconfig GUI, you just enable them and reboot the computer. However, if you need to do something more advanced such as break into the early boot process during resume from hibernation, things get a lot more complicated.
This article has some samples for enabling and disabling debug settings that you may not be familiar with, and a list of bcdedit debug settings for Windows Vista/Server 2008 and Windows 7/Server 2008 R2. This information has been helpful to me for quickly and accurately getting to the debug at hand rather than fumbling around with bcdedit. Much of the following information has been taken from various sources, including the windbg help files, the OEM team blog, the MSDN bcdedit reference, and the WHDC debugger site.
NOTE: For the examples below, you will need to run bcdedit.exe from an administrator (UAC-elevated) command prompt. To output a summary view of the current state of the BCD store, just run "bcdedit.exe" from the command prompt. To get detailed information about all of the store(s) that Windows knows about, use the following command:
bcdedit /enum all
What is a BCD store?
A BCD store is a binary file that contains boot configuration data for Windows, basically it is a small registry file. Boot applications use the system BCD store, located on the system partition, during the boot process. You can also create additional BCD stores in separate files but only one store at a time can be designated as the system store.
NOTE: The "/store" switch can be used to specify a particular BCD store for bcdedit commands (instead of the default store). To enumerate all the settings in another BCD store, in this case e:\bcd_store\BCD, use the following command:
bcdedit /store e:\bcd_store\BCD /enum all
This will show you which options are currently set, and what their values are. When /store switch is omitted, the system store is used.
To enable debugging for early boot problems, you may need to enable the bootdebug switch. This is easy to do with bcdedit:
bcdedit /set bootdebug on
However, this only sets bootdebug for the current "boot application", which is generally winload.exe, so it does not break into the very early boot process. There are multiple applications used for booting, hibernating, and resuming (bootmgr.exe, winload.exe and winresume.exe are examples of these). Each application (called BCD Objects) has its own settings (called BCD Elements) in the BCD store and each can be modified globally and/or individually.
So, to deal with different (or multiple) debug scenarios, you just enable boot debugging based on the boot application you are concerned with. For early debugging, you can enable bootdebug for bootmgr:
bcdedit /set {bootmgr} bootdebug on
To set bootdebug for winload.exe (which will most often be your current, and default, boot object) all three of the following will give you the same result:
bcdedit /set {current} bootdebug on
bcdedit /set {default} bootdebug on
If you are modifying the settings in another store, or are booted into another OS on the same computer (such as WinPE), you need to specify the location of the BCD store:
bcdedit /store d:\Boot\BCD /set {default} bootdebug on
Not all of the boot objects have "friendly" names, so you may need to specify the full GUID (Globally Unique ID) to modify it. As an example, if you wanted to enable bootdebug on resume from hibernation, you would include the identifier (see figure 1) for the "Resume from Hibernate" object:
bcdedit /set {89a932d0-d5bc-11e0-a0af-00215add5ebc} bootdebug on
Figure 1: Color coded bcdedit output
Why won't my USB or 1394 debug work?
When there are multiple debug ports of a certain type in a computer Windows may not default to the correct one for your situation. This happens most commonly when there are either multiple 1394 host controllers or USB EHCI controllers. When this occurs it can range from a slight inconvenience (different port is used so the cable needs to be plugged into another port), to complete failure (internal port is used, which is not accessible). In the case of USB debugging the Intel USB 2.0 specification only provides one debug port, so debugging is not possible if the wrong host controller is used.
There are several caveats with USB debugging, not the least of which is that you need to buy a separate, expensive, debug cable. Some of the difficulties and implementation details necessary to get USB debugging to work are encompassed in the WHDC USB FAQ and in Setting Up Kernel Debugging with USB 2.0.
NOTE: A correction to the WHDC USB documentation for Windows 7/Windows 2008 R2 is that the busparams switch now takes decimal rather than hexadecimal values, and the "loadoptions" parameter is no longer required. So, to enable the busparams element (for USB or 1394 debugging) in Vista/2008, you would use something like this:
bcdedit /set {current} loadoptions busparams=0.1D.7
And the Win7/2008 R2 example would be:
bcdedit /set {current} busparams 0.29.7
In the case of loadoptions or busparams, deleting the setting is not as easy as changing a flag from yes to no. You must specifically delete the value to get rid of it, and one of the examples below can be used:
For Vista/2008:
bcdedit /deletevalue {current} loadoptions
And Windows 7/2008 R2:
bcdedit /deletevalue {current} busparams
Bcdedit settings and examples
This is just scratching the surface of using bcdedit for your troubleshooting and/or debugging needs, so there are more articles to follow. Part 2 will include some more detailed debugging scenarios, such as Hyper-V guest and host debugging. Below is a consolidated table with many of the debugging switches/settings as well as a number of different usage examples.
Table of debug-related bcdedit settings
Option
Description
bootdebug
Enables or disables the boot debugger for a specified boot entry. Although this command works for any boot entry, it is effective only for boot applications.
Enable value(s): on, 1
Disable value(s): off, 0
Bcdedit /set bootdebug on
debug
Enables or disables the kernel debugger for a specified boot entry.
/dbgsettings
Used to modify the global settings for the debug connection (does not include hypervisor). Values:
Can change all settings at once instead of using the /set command to change them individually. Usage example:
bcdedit /dbgsettings 1394 channel:30
debugport
Used to specify the debugger type.
Values:
Serial port – com1, com2, comx
1394 port – 1394
USB port - USB
channel
Specifies 1394 channel used.
Decimal integer between 0 and 62, inclusive.
baudrate
Used to specify the baud rate of a serial debug port.
Values: 9600, 19200, 38400, 57600, 115200
targetname
Specifies a string to use as the identification for the USB 2.0 connection. This string can be any value.
Usage example:
bcdedit /dbgsettings usb targetname:usbdebug
/hypervisorsettings
Used the same way as /dbgsettings to configure all settings at once.
bcdedit /hypervisorsettings 1394 channel:10
hypervisordebug
Enables or disables hypervisor debug mode. This is for debugging a Hyper-V host system.
bcdedit /set {current} hypervisordebug on
/noumex
Specifies that the kernel debugger ignores user-mode exceptions. By default, the kernel debugger breaks for certain user-mode exceptions, such as STATUS_BREAKPOINT and STATUS_SINGLE_STEP. The /noumex parameter is effective only when there is no user-mode debugger attached to the process.
/start
This option specifies the debugger start policy. If a start policy is not specified, ACTIVE is the default.
Values: active, disable, autoenable
loadoptions
Used to describe settings that are not covered by other types. One setting that is relevant here is busparams.
Values: Any value followed by the setting.
Usage example (Vista/2008):
bcdedit /set {current} loadoptions busparams=0.1d.0
busparams
A boot setting (specified with loadoptions key word) used to point to the PCI address of the debugger in use. The PCI bus, device, and function are used, in the format bb.dd.ff. This is generally used to identify the location of a 1394 or USB debug port. In Vista/2008, hexadecimal values are used, whereas decimal values are used for Win7.
Values: Decimal values between 0 and 255.
In Win7 - bcdedit /set busparams 0.29.0
In Vista - bcdedit /set loadoptions busparams=0.1d.0
kernel
The loadoptions parameter used to point to a different kernel binary. This can be used to test with a checked or instrumented version of the kernel without replacing the existing one. The updated binary MUST be placed in the %windir%\system32 folder to be used
Values: The 8.3 filename of the replacement kernel include the exe extension.
Usage examples:
In Win7 – bcdedit /set kernel kernchk.exe
In Vista - bcdedit /set loadoptions kernel=kernchk.exe
hal
The loadoptions parameter used to point to a different hal binary. This can be used to test with a checked or instrumented version of the kernel without replacing the existing one. The updated binary MUST be placed in the %windir%\system32 folder to be used
Values: the 8.3 filename of the replacement kernel include the .dll extension.
In Win7 – bcdedit /set hal halchk.dll
In Vista - bcdedit /set loadoptions hal=halchk.dll
testsigning
Controls whether Windows 7, Windows Server 2008, or Windows Vista will load any type of test-signed kernel-mode code. This option is not set by default, which means test-signed kernel-mode drivers on 64-bit versions of Windows 7, Windows Server 2008, and Windows Vista will not load without setting the testsigning switch
Bcdedit /set testsigning on