• Ntdebugging Blog

    Debugging a CLOCK_WATCHDOG_TIMEOUT Bugcheck


    Hi debuggers, Andrew Richards here for my first NT Debugging post. I thought I’d share a recent case that used a lot of discovery techniques to uncover the details of what was going on. Most bugchecks give you the information you need as arguments, but in the case of bugcheck 0x101, I had to go digging for a trap frame, the thread stack, look at the disassembly of the running threads, and lots of other goodies.


    As usual, the first thing I did was run !analyze -v to get a more detailed explanation of what a bugcheck 0x101 “CLOCK_WATCHDOG_TIMEOUT” is.


    0: kd> !analyze -v
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *

    An expected clock interrupt was not received on a secondary processor in an
    MP system within the allocated interval. This indicates that the specified
    processor is hung and not processing interrupts.
    Arg1: 00000031, Clock interrupt time out interval in nominal clock ticks.
    Arg2: 00000000, 0.
    Arg3: 87337120, The PRCB address of the hung processor.
    Arg4: 00000003, 0.


    A bugcheck 0x101 occurs when the Clock interrupt (IRQL #28) has not been processed by each processor within a timeout.  The Clock interrupt is quite high in the IRQL table for x86; only the Inter-Processor Interrupt (IPI), Power-Fail and High interrupts are higher.





    Power Fail


    Inter-processor Interrupt






    Device n






    Device 1








    The immediate hypothesis was that one of the processors was stuck processing an IPI (a common interrupt), causing it to be above the Clock IRQ level.

    The questions were:

    1.  At what IRQL was the problem processor?  Was it at IPI?

    2.  Why was it stuck?

    3.  Who did it?


    This was a 4 core box, so I ran !prcb four times to view the Processor Resource control Block (PRCB) of each processor. Processor #3 matched the PRCB specified in the bugcheck’s 3rd argument (Arg3 = 87337120). The first question was answered; Processor #3 was the culprit. Interestingly, the Arg4 value in the bugcheck has a value of 3.  Per the documentation it should be 0, but it seems that the processor number is provided.


    0: kd> !prcb 0
    PRCB for Processor 0 at 82b34d20:
    Current IRQL -- 28
    Threads--  Current 82b3e380 Next 00000000 Idle 82b3e380
    Processor Index 0 Number (0, 0) GroupSetMember 1
    Interrupt Count -- 000475e1
    Times -- Dpc    000008d6 Interrupt 0000007b
             Kernel 0001920f User      00003b0b

    0: kd> !prcb 1
    PRCB for Processor 1 at 807c7120:
    Current IRQL -- 0
    Threads--  Current 9bb65d48 Next 9bf52c10 Idle 807cc800
    Processor Index 1 Number (0, 1) GroupSetMember 2
    Interrupt Count -- 00030ae8
    Times -- Dpc    000005ca Interrupt 00000098
             Kernel 00017e4a User      00004eab

    0: kd> !prcb 2
    PRCB for Processor 2 at 87300120:
    Current IRQL -- 0
    Threads--  Current 9b41b6e8 Next 00000000 Idle 87305800
    Processor Index 2 Number (0, 2) GroupSetMember 4
    Interrupt Count -- 0002ab35
    Times -- Dpc    00000568 Interrupt 000000ac
             Kernel 0001a788 User      00002565

    0: kd> !prcb 3
    PRCB for Processor 3 at 87337120:
    Current IRQL -- 0
    Threads--  Current 8aaa17c8 Next 00000000 Idle 8733c800
    Processor Index 3 Number (0, 3) GroupSetMember 8
    Interrupt Count -- 00026c0d
    Times -- Dpc    00000620 Interrupt 0000008b
             Kernel 0001ac65 User      00001e33



    The Processor Context Records reported that Processors #0, #1 and #2 were at IRQL 31 (1f - HIGH), and Processor #3 was at IRQL 27 (1b - SYNCH). Having a value of 31 is expected at bugcheck as that is how the bugcheck gains control of the processor to gather the current context. It was strange that Processor #3 was different and that is matched the processor mentioned in the bugcheck

    0: kd> !pcr 0
    KPCR for Processor 0 at 82b34c00:
                    Irql: 0000001f

    0: kd> !pcr 1
    KPCR for Processor 1 at 807c7000:
                    Irql: 0000001f

    0: kd> !pcr 2
    KPCR for Processor 2 at 87300000:
                    Irql: 0000001f

    0: kd> !pcr 3
    KPCR for Processor 3 at 87337000:
                    Irql: 0000001b


    The next step was to look at the stacks of the processors to see what the threads were all involved in.


    Processor #0

    To determine the stack of the Processor #0’s thread before the bugcheck, the trap frame needed to be found. The trap frame is stored immediately above the interrupt handler. To find that, I looked at the Interrupt Descriptor Table to find out the name of the handler for clock interrupt and then I searched for that symbol on the stack.


    ==> Processor #0
    0: kd> knL
    # ChildEBP RetAddr 
    00 82b31674 82a84a6f nt!KeBugCheckEx+0x1e
    01 82b316b0 82a840be nt!KeAccumulateTicks+0x242
    02 82b316f0 82a83f6b nt!KeUpdateRunTime+0x145
    03 82b3174c 82a88c17 nt!KeUpdateSystemTime+0x613
    04 82b3174c 82a85e79 nt!KeUpdateSystemTimeAssist+0x13
    05 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd
    06 82b31820 82af0866 nt!KeFlushSingleTb+0x136
    07 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4
    08 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6
    09 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf
    0a 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3
    0b 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0c 82b319c4 82d3acd4 irsir+0x2986
    0d 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b
    0e 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128
    0f 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133
    10 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61
    11 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27
    12 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a
    13 82b31b38 82a83039 serial!SerialReadTimeout+0x68
    14 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50
    15 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101
    16 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c
    17 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb
    18 82b31d24 00000000 nt!KiIdleLoop+0x38


    0: kd> !irql
    Debugger saved IRQL for processor 0x0 -- 28 (CLOCK2_LEVEL)


    0: kd> !idt

    Dumping IDT:

    37:   82e35104 hal!PicSpuriousService37

    51:   8aba1558 serial!SerialCIsrSw (KINTERRUPT 8aba1500)

    71:   89003cd8 i8042prt!I8042KeyboardInterruptService (KINTERRUPT 89003c80)

    72:   8aba17d8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1780)

    82:   89123058 ataport!IdePortInterrupt (KINTERRUPT 89123000)

                   ataport!IdePortInterrupt (KINTERRUPT 89123a00)

                   ataport!IdePortInterrupt (KINTERRUPT 89123780)

                   ataport!IdePortInterrupt (KINTERRUPT 89123500)

    92:   8aba1058 Impcd+0x8540 (KINTERRUPT 8aba1000)

    a0:   8aba1a58 ndis!ndisMiniportMessageIsr (KINTERRUPT 8aba1a00)

    a2:   8aba1cd8 USBPORT!USBPORT_InterruptService (KINTERRUPT 8aba1c80)

    b0:   891232d8 ndis!ndisMiniportMessageIsr (KINTERRUPT 89123280)

    b1:   89123cd8 ACPI!ACPIInterruptServiceRoutine (KINTERRUPT 89123c80)

    b2:   8aba12d8 serial!SerialCIsrSw (KINTERRUPT 8aba1280)

    c1:   82e353f4 hal!HalpBroadcastCallService

    d1:   82e1d634 hal!HalpHpetClockInterrupt

    d2:   82e1d898 hal!HalpHpetRolloverInterrupt

    df:   82e351dc hal!HalpApicRebootService

    e1:   82e35958 hal!HalpIpiHandler

    e3:   82e356f8 hal!HalpLocalApicErrorService

    fd:   82e35f2c hal!HalpProfileInterrupt

    fe:   82e361a8 hal!HalpPerfInterrupt


    The search was between the nt!KiIpiSendPacket call (I just chose it as it didn’t seem to be bugcheck related) and the current stack pointer.  Using the dereferenced pointer (poi) and some maths, the trap frame location is retrieved.


    0: kd> dps @esp 82b317e0
    82b31744  82b31760 nt!KiDoubleFaultStack+0x2760 <<<<< This is not a real symbol; it actually is a part of the trap frame
    82b31748  82e1d72a hal!HalpHpetClockInterrupt+0xf6
    82b3174c  82b31760 nt!KiDoubleFaultStack+0x2760
    82b31750  82a88c17 nt!KeUpdateSystemTimeAssist+0x13


    0: kd> .trap poi(82b31748-4)
    ErrCode = 00000000
    eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4
    eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0         nv up ei pl nz na po nc
    cs=0008  ss=0010  ds=0000  es=dea0  fs=040f  gs=0008             efl=00000202
    82a85e79 f390            pause


    So you might be asking yourself, can this be done an easier way?  The answer is definitely Yes. You just need to use kv instead; it adds the trap frame information on the end of the line.


    0: kd> kv

    ChildEBP RetAddr  Args to Child             

    82b31674 82a84a6f 00000101 00000031 00000000 nt!KeBugCheckEx+0x1e

    82b316b0 82a840be 00026161 00000000 0001cd00 nt!KeAccumulateTicks+0x242

    82b316f0 82a83f6b 82a85e79 807c7120 00000000 nt!KeUpdateRunTime+0x145

    82b3174c 82a88c17 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTime+0x613

    82b3174c 82a85e79 ffffff1b ffffff1b 000000d1 nt!KeUpdateSystemTimeAssist+0x13 (FPO: [0,2] TrapFrame @ 82b31760)

    82b317e0 82abfa17 00000001 00000000 82a3cbe1 nt!KiIpiSendPacket+0xdd (FPO: [6,2,0])

    82b31820 82af0866 a33d6f20 00000001 a33d00cf nt!KeFlushSingleTb+0x136

    82b3190c 82b2ab90 a33d6f20 8a2096d8 8a2096d8 nt!MmFreeSpecialPool+0x2b4

    82b31970 82d4a06e a33d6f20 00000000 82b31998 nt!ExFreePoolWithTag+0xd6

    82b31980 82d3fab2 a33d6f20 a33d6f20 a33d6f20 nt!ViIrpFree+0xf

    82b31998 82d3a1c0 8ced5986 905cadb0 82b319c4 nt!VfIoFreeIrp+0xd3

    82b319a8 8ced5986 a33d6f20 8ced58f0 82b31a6c nt!IovFreeIrpPrivate+0x47

    WARNING: Stack unwind information not available. Following frames may be wrong.

    82b319c4 82d3acd4 00000000 a33d6f20 905cadb0 irsir+0x2986

    82b319f4 82a81933 00000000 a33d6f20 82b31a6c nt!IovpLocalCompletionRoutine+0x14b

    82b31a3c 82d3ab64 a33d6f20 8ab151ac 8ab150f0 nt!IopfCompleteRequest+0x128

    82b31aa4 8b7c6abd 00000000 8ab151ac 82b31ad8 nt!IovCompleteRequest+0x133

    82b31ab4 8b7c6461 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrpLocked+0x61 (FPO: [Non-Fpo])

    82b31ad8 8b7c7567 8ab151ac 8ab1518c 82b31b0c serial!SerialGetNextIrp+0x27 (FPO: [Non-Fpo])

    82b31b00 8b7b9eb7 8ab150f0 00000000 8ab15002 serial!SerialTryToCompleteCurrent+0x7a (FPO: [Non-Fpo])

    82b31b38 82a83039 8ab15314 8ab15002 3c171d26 serial!SerialReadTimeout+0x68 (FPO: [Non-Fpo])

    82b31b7c 82a82fdd 82b34d20 82b31ca8 00000001 nt!KiProcessTimerDpcTable+0x50

    82b31c68 82a82e9a 82b34d20 82b31ca8 00000000 nt!KiProcessExpiredTimerList+0x101

    82b31cdc 82a8100e 0001cacf 9bb374c0 82b3e380 nt!KiTimerExpiration+0x25c

    82b31d20 82a80e38 00000000 0000000e 00000000 nt!KiRetireDpcList+0xcb

    82b31d24 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x38 (FPO: [0,0,0])


    By using the stack based maths, or the value provide by kv, the trap frame address is used to set the context (.trap <addr>) to the code running before the interrupt.  These are the stored registers and the stack at the time of the interrupt.


    0: kd> .trap 82b31760

    ErrCode = 00000000

    eax=87300120 ebx=841882dc ecx=4cdfc4c4 edx=82b34d20 esi=807c7120 edi=82b738c4

    eip=82a85e79 esp=82b317d4 ebp=82b31820 iopl=0         nv up ei pl nz na po nc

    cs=0008  ss=0010  ds=0000  es=dea0  fs=040f  gs=0008             efl=00000202


    82a85e79 f390            pause


    0: kd> knL
      *** Stack trace for last set context - .thread/.cxr resets it
    # ChildEBP RetAddr 
    00 82b317e0 82abfa17 nt!KiIpiSendPacket+0xdd
    01 82b31820 82af0866 nt!KeFlushSingleTb+0x136
    02 82b3190c 82b2ab90 nt!MmFreeSpecialPool+0x2b4
    03 82b31970 82d4a06e nt!ExFreePoolWithTag+0xd6
    04 82b31980 82d3fab2 nt!ViCtxFreeIsrContext+0xf
    05 82b31998 82d3a1c0 nt!VfIoFreeIrp+0xd3
    06 82b319a8 8ced5986 nt!IovFreeIrpPrivate+0x47
    WARNING: Stack unwind information not available. Following frames may be wrong.
    07 82b319c4 82d3acd4 irsir+0x2986
    08 82b319f4 82a81933 nt!IovpLocalCompletionRoutine+0x14b
    09 82b31a3c 82d3ab64 nt!IopfCompleteRequest+0x128
    0a 82b31aa4 8b7c6abd nt!IovCompleteRequest+0x133
    0b 82b31ab4 8b7c6461 serial!SerialGetNextIrpLocked+0x61
    0c 82b31ad8 8b7c7567 serial!SerialGetNextIrp+0x27
    0d 82b31b00 8b7b9eb7 serial!SerialTryToCompleteCurrent+0x7a
    0e 82b31b38 82a83039 serial!SerialReadTimeout+0x68
    0f 82b31b7c 82a82fdd nt!KiProcessTimerDpcTable+0x50
    10 82b31c68 82a82e9a nt!KiProcessExpiredTimerList+0x101
    11 82b31cdc 82a8100e nt!KiTimerExpiration+0x25c
    12 82b31d20 82a80e38 nt!KiRetireDpcList+0xcb
    13 82b31d24 00000000 nt!KiIdleLoop+0x38


    Dissassembling the first few instructions reveals a jump (jmp) that is back up in the nt!KiIpiSendPacket function. Using the jmp location and the instruction after the jmp as the bound, we can disassemble the loop. At the time of the bugcheck, the thread was executing a pause (a CPU based delay), and seemingly doing this in a loop while waiting for to release it.


    0: kd> u @eip
    82a85e79 f390            pause  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip
    82a85e7b eb9e            jmp     nt!KiIpiSendPacket+0x7f (82a85e1b)
    82a85e7d 8d4900          lea     ecx,[ecx]


    0: kd> u 82a85e1b 82a85e7d
    82a85e1b 8b86a4180000    mov     eax,dword ptr [esi+18A4h]
    82a85e21 0bc0            or      eax,eax <<<<<< Checking if value is nonzero
    82a85e23 7538            jne     nt!KiIpiSendPacket+0xc1 (82a85e5d) <<<<<< Take jmp to stay in loop
    82a85e25 f00fb196a4180000 lock cmpxchg dword ptr [esi+18A4h],edx
    82a85e2d 75ec            jne     nt!KiIpiSendPacket+0x7f (82a85e1b)
    82a85e2f 59              pop     ecx
    82a85e30 d1e9            shr     ecx,1
    82a85e32 8d7f04          lea     edi,[edi+4]
    82a85e35 72df            jb      nt!KiIpiSendPacket+0x7a (82a85e16)
    82a85e37 75f7            jne     nt!KiIpiSendPacket+0x94 (82a85e30)
    82a85e39 59              pop     ecx
    82a85e3a 64890d8c190000  mov     dword ptr fs:[198Ch],ecx
    82a85e41 8b4c240c        mov     ecx,dword ptr [esp+0Ch]
    82a85e45 8b542410        mov     edx,dword ptr [esp+10h]
    82a85e49 64ff0590360000  inc     dword ptr fs:[3690h]
    82a85e50 52              push    edx
    82a85e51 51              push    ecx
    82a85e52 ff15a4a0a082    call    dword ptr [nt!_imp__HalRequestIpi (82a0a0a4)]
    82a85e58 5f              pop     edi
    82a85e59 5e              pop     esi
    82a85e5a c21800          ret     18h
    82a85e5d 41              inc     ecx
    82a85e5e 850d7c3ab782    test    dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],ecx
    82a85e64 7513            jne     nt!KiIpiSendPacket+0xdd (82a85e79)
    82a85e66 f605783ab78240  test    byte ptr [nt!HvlEnlightenments (82b73a78)],40h <<<<<< Don’t spin if you’re an enlightened VM, just pause
    82a85e6d 740a            je      nt!KiIpiSendPacket+0xdd (82a85e79)
    82a85e6f 52              push    edx
    82a85e70 51              push    ecx
    82a85e71 51              push    ecx
    82a85e72 e8c8b60500      call    nt!HvlNotifyLongSpinWait (82ae153f)
    82a85e77 59              pop     ecx
    82a85e78 5a              pop     edx
    82a85e79 f390            pause  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip
    82a85e7b eb9e            jmp     nt!KiIpiSendPacket+0x7f (82a85e1b)


    The summary so far: Since processor #0 was the thread that created the bugcheck, it must have been interrupted by a Clock interrupt so as to trigger a CLOCK_WATCHDOG_TIMEOUT bugcheck. It is not surprising then that the value of !irql is CLOCK2.

    The existence of the nt!KiIpiSendPacket function lends weight to the thought that this was the creator of an IPI and is not one of the processors that wasn’t processing IPIs.


    Processor #1

    Using the same technique as Processor #0, the disassembly for the loop on Processor #1 is determined. Processor #1 is in a tight loop within the nt!KeFlushMultipleRangeTb function, an ancestor function of nt!ExAllocatePoolWithTag, a memory related operation.  This is an interesting function to see, considering that nt!ExFreePoolWithTag, the counterpart, is on Processor #0’s stack.


    ==> Processor #1
    1: kd> knL
    # ChildEBP RetAddr 
    00 95bfc7bc 82abb217 nt!KeFlushMultipleRangeTb+0x2d3
    01 95bfc89c 82aa7d11 nt!MiFlushTbAsNeeded+0x12e
    02 95bfc8dc 82b29487 nt!MiAllocatePagedPoolPages+0x567
    03 95bfc940 82aa4674 nt!MiAllocatePoolPages+0x1f
    04 95bfc998 82b2a132 nt!ExpAllocateBigPool+0xa6
    05 95bfc9fc 82aab6b1 nt!ExAllocatePoolWithTag+0x12d
    06 95bfca20 82c4cc62 nt!ExAllocatePoolWithTagPriority+0x196
    07 95bfca78 82c7a662 nt!IopQueryNameInternal+0x60
    08 95bfca98 82c57d88 nt!IopQueryName+0x1b
    09 95bfcb1c 82c71a50 nt!ObpQueryNameString+0x7f
    0a 95bfcb38 82c75e12 nt!ObQueryNameString+0x18
    0b 95bfcc14 82c65788 nt!EtwTraceProcess+0xa2
    0c 95bfcc38 82c73625 nt!PspExitProcess+0x37
    0d 95bfccb4 82c87051 nt!PspExitThread+0x59a
    0e 95bfcccc 82aba8c0 nt!PsExitSpecialApc+0x22
    0f 95bfcd1c 82a472a4 nt!KiDeliverApc+0x28b
    10 95bfcd1c 77556fc0 nt!KiServiceExit+0x64
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    11 014ef918 00000000 0x77556fc0


    1: kd> !irql

    Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)  <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0


    1: kd> u @eip
    82a40c31 f390            pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip
    82a40c33 8b07            mov     eax,dword ptr [edi]
    82a40c35 85c0            test    eax,eax
    82a40c37 75de            jne     nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17)
    82a40c39 8a4d0f          mov     cl,byte ptr [ebp+0Fh]
    82a40c3c ff1558a1a082    call    dword ptr [nt!_imp_KfLowerIrql (82a0a158)]
    82a40c42 5f              pop     edi
    82a40c43 5e              pop     esi


    1: kd> u 82a40c17 82a40c39
    82a40c17 46              inc     esi
    82a40c18 85357c3ab782    test    dword ptr [nt!HvlLongSpinCountMask (82b73a7c)],esi
    82a40c1e 7511            jne     nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31)
    82a40c20 f605783ab78240  test    byte ptr [nt!HvlEnlightenments (82b73a78)],40h
    82a40c27 7408            je      nt!KeFlushMultipleRangeTb+0x2d3 (82a40c31)
    82a40c29 56              push    esi
    82a40c2a e810090a00      call    nt!HvlNotifyLongSpinWait (82ae153f)
    82a40c2f eb02            jmp     nt!KeFlushMultipleRangeTb+0x2d5 (82a40c33)
    82a40c31 f390            pause <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip
    82a40c33 8b07            mov     eax,dword ptr [edi]
    82a40c35 85c0            test    eax,eax
    82a40c37 75de            jne     nt!KeFlushMultipleRangeTb+0x2b9 (82a40c17)


    This function tests a variable at edi in each loop. While the signal is not set, the thread goes around loop, eventually executes a pause instruction and then tries the test again. This thread seems to be waiting on someone to set a flag.


    Processor #2

    Processor #2 is also trying to send an IPI using nt!KiIpiSendPacket. It was caught in the same function that Processor #0 is in.  In this case, it is executing the jmp instruction that is one instruction after the pause which Processor #0 is executing.


    ==> Processor #2
    2: kd> knL
    # ChildEBP RetAddr 
    00 a0c4fac0 82a40bcd nt!KiIpiSendPacket+0xdf
    01 a0c4fafc 82b29431 nt!KeFlushMultipleRangeTb+0x26f
    02 a0c4fbe8 82b2aef1 nt!MiFreePoolPages+0x42c
    03 a0c4fc50 82aa6b37 nt!ExFreePoolWithTag+0x436
    04 a0c4fc64 82c3745d nt!MmFreeAccessPfnBuffer+0x2f
    05 a0c4fcc0 82c4b83a nt!PfpFlushBuffers+0x2ba
    06 a0c4fd50 82c12f5e nt!PfTLoggingWorker+0xaa
    07 a0c4fd90 82aba219 nt!PspSystemThreadStartup+0x9e
    08 00000000 00000000 nt!KiThreadStartup+0x19


    2: kd> !irql

    Debugger saved IRQL for processor 0x2 -- 0 (LOW_LEVEL)  <<<<<< Windows Internals 4th Edition notes that IRQL may not be saved; this explains the 0


    2: kd> u @eip
    82a85e7b eb9e            jmp     nt!KiIpiSendPacket+0x7f (82a85e1b) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @eip
    82a85e7d 8d4900          lea     ecx,[ecx]


    Summary so far

    So at this point we can say (without any real knowledge of how IPI handling is achieved) that:


    Processor #0 is waiting for Processor #1, #2 and/or #3 to respond to its IPI

    Processor #0 is running nt!ExFreePoolWithTag

    Processor #0 is reported to be at IRQL 28 (CLOCK2_LEVEL)


    Processor #1 is waiting for a flag to bet set; probably an IPI flag

    Processor #0 is running nt!ExAllocatePoolWithTag

    Processor #1 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect


    Processor #2 is waiting for Processor #0, #1 and/or #3 to respond to its IPI

    Processor #0 is running nt!ExFreePoolWithTag

    Processor #2 is reported to be at IRQL 0 (LOW_LEVEL); probably incorrect


    Processor #3 has been tagged by bugcheck as being the cause

    Processor #3 hasn’t been looked at yet


    Processor #3

    Analyzing Processor #3 was problematic. The Processor Context Record (PCR) was not completely captured in the dump. This happens when the bugcheck thread is unable to interrupt the target processor to gather the context – usually because it is running at the HIGH (31) IRQL.

    When the context is missing, all of the registers are treated as being zero. The zero value instruction pointer causes the stack to be incorrect when calling knL and various other commands.


    ==> Processor #3
    1: kd> ~3
    WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000
    WARNING: Process directory table base 7B59C400 doesn't match CR3 00185000


    3: kd> r
    eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
    eip=00000000 esp=00000000 ebp=00000000 iopl=0         nv up di pl nz na po nc
    cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000
    00000000 ??              ???


    3: kd> knL
    # ChildEBP RetAddr 
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    00 00000000 00000000 0x0


    To determine the thread running on the processor, the !thread command was used. The value for the _ETHREAD (8aaa17c8) comes from the PCR that Windows maintains as part of scheduling. Once again, like the knL command, because the context was missing, the stack was not shown.


    3: kd> !pcr
    KPCR for Processor 3 at 87337000:


    3: kd> !thread
    8aaa17c8  Cid 0454.08ec  Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 3
    IRP List:
        8f8c6f68: (0006,0094) Flags: 40000404  Mdl: 00000000
    Not impersonating
    DeviceMap                 89970d50
    Owning Process            89f93ac0       Image:         ZyxApp.exe
    Attached Process          N/A            Image:         N/A
    Wait Start TickCount      117454         Ticks: 589 (0:00:00:09.188)
    Context Switch Count      37             IdealProcessor: 3            
    UserTime                  00:00:00.000
    KernelTime                00:00:00.000
    Win32 Start Address 0x7753fd0f
    Stack Init 91aecfd0 Current 91aeca78 Base
    91aed000 Limit 91aea000 Call 0
    Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
    ChildEBP RetAddr  Args to Child             
    00000000 00000000 00000000 00000000 00000000 0x0


    The important piece of information in the !thread output was the stack limits. These values allow me to do a search of the raw stack for known symbols. I used dps <limit> <base>. What I was looking for was the first symbol after the 0xffffffff entries which has a value above it that points to a location within this stack. In this case, is it the hal!KfLowerIrql symbol that has a value (base pointer) above it. This symbol is the deepest function that the stack has got to; it doesn’t necessarily mean that the stack is at this depth now.


    3: kd> dps 91aea000 91aed000
    91aea000  ffffffff
    91aea004  ffffffff

    91aec85c  ffffffff
    91aec860  ffffffff
    91aec864  ffffffff
    91aec868  ffffffff
    91aec86c  ffffffff
    91aec870  ffffffff
    91aec874  0001187f
    91aec878  00000010
    91aec87c  00011a1f
    91aec880  ffffffff
    91aec884  0001187f
    91aec888  82b7561f nt!MmSystemSpaceLock+0x1f
    91aec88c  91aec89c
    91aec890  82e20ba9 hal!KfLowerIrql+0x61
    91aec894  00011a00
    91aec898  82b7561f nt!MmSystemSpaceLock+0x1f
    91aec89c  91aec95c
    91aec8a0  82e1e92d hal!KeReleaseQueuedSpinLock+0x2d
    91aec8a4  82a3bdc7 nt!MiReturnNonPagedPoolVa+0x1d4
    91aec8a8  00000001

    91aec8ac  00000000

    91aec8b0  83cac9e8

    91aec8b4  ffffffff

    91aec8b8  ffffffff

    91aec8bc  ffffffff

    91aec8c0  ffffffff

    91aec8c4  ffffffff

    91aec8c8  ffffffff

    91aec8cc  ffffffff

    91aec8d0  82e21cee hal!HalpLegacyApicWriteIcr+0xa

    91aec8d4  91aec8f0

    91aec8d8  82e2aea0 hal!HalpRequestIpiSpecifyVector+0x40

    91aec8dc  00000000

    91aec8e0  83cac9e8

    91aec8e4  82b738cc nt!KiProcessorBlock+0xc

    91aec8e8  87300120

    91aec8ec  00000202

    91aec8f0  91aec914

    91aec8f4  82aa6e36 nt!MiInsertPageInFreeOrZeroedList+0x25b

    91aec8f8  83cac9e8

    91aec8fc  00000000

    91aec900  0000001f

    91aec904  00000001

    91aec908  83cac9e8

    91aec90c  00000003

    91aec910  00000007

    91aec914  00000fff

    91aec918  00000003

    91aec91c  82a40bcd nt!KeFlushMultipleRangeTb+0x26f

    91aec920  00000001

    91aec924  00000000

    91aec928  82a9fa43 nt!KiFlushTargetMultipleRangeTb

    91aec92c  00000000

    91aec930  91aec9bc

    91aec934  91aec9b8

    91aec938  00000001

    91aec93c  00000000

    91aec940  83e9cc04

    91aec944  00018736

    91aec948  8733a480

    91aec94c  00000003

    91aec950  91aec9bc

    91aec954  91aeca40

    91aec958  82b29431 nt!MiFreePoolPages+0x42c

    91aec95c  91aec9b8

    91aec960  1f000000

    91aec964  91aec9b8

    91aec968  00000000



    Using the first location found, I used the stack variables to build up a k= command. With an x86 stack, if you select the stack location above the symbol as the base pointer and stack pointer, and the symbol’s address as the instruction pointer, you’ll get a reconstructed stack.


    3: kd> k= 91aec89c 91aec88c 82e20ba9

    ChildEBP RetAddr 

    91aec89c 82e1e92d hal!KfLowerIrql+0x61

    91aec8a0 82a3bdc7 hal!KeReleaseQueuedSpinLock+0x2d

    91aec9a8 82aab049 nt!MiReturnNonPagedPoolVa+0x1d4

    91aec9c0 82ab1685 nt!FindNodeOrParent+0x2091aec9d8 82abd820 nt!RtlLookupElementGenericTableFullAvl+0x16

    91aeca0c 82e20ba9 nt!RtlLookupElementGenericTableAvl+0x18

    91aeca1c 82b199d2 hal!KfLowerIrql+0x61

    91aeca24 82d41ad1 nt!VfAvlUnlockShared+0x15

    91aeca40 82b2aef1 nt!VfRemLockDeleteMemoryRange+0x5c

    91aecaa8 963a55c5 nt!ExFreePoolWithTag+0x436

    91aecac8 963a61d4 Zyx+0x5c5
    91aecb00 82d3a6c3 Zyx+0x11d4
    91aecb24 82a4054a nt!IovCallDriver+0x258
    91aecb38 82c3b975 nt!IofCallDriver+0x1b
    91aecb7c 82c2c591 nt!IopDeleteFile+0x10c
    91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59
    91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88
    91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd
    91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d
    91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f
    91aecc40 82c73666 nt!ObKillProcess+0x54
    91aeccb4 82c87051 nt!PspExitThread+0x5db
    91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22
    91aecd1c 82a472a4 nt!KiDeliverApc+0x28b
    91aecd1c 775570b4 nt!KiServiceExit+0x64
    0147ff88 00000000 0x775570b4


    In this case, because nt!KiFlushTargetMultipleRangeTb was present on Processor #1 as well as being in the Processor #3 raw stack, I found the first symbol lower than this symbol on the (raw) stack that had a valid base pointer above it.  The stack was thus built around the nt!MiFreePoolPages function instead.


    91aec94c  00000003

    91aec950  91aec9bc

    91aec954  91aeca40

    91aec958  82b29431 nt!MiFreePoolPages+0x42c

    91aec95c  91aec9b8

    91aec960  1f000000

    91aec964  91aec9b8

    91aec968  00000000


    3: kd> k= 91aeca40 91aec954 82b29431

    ChildEBP RetAddr 

    91aeca40 82b2aef1 nt!MiFreePoolPages+0x42c

    91aecaa8 963a55c5 nt!ExFreePoolWithTag+0x436

    WARNING: Stack unwind information not available. Following frames may be wrong.

    91aecac8 963a61d4 Zyx+0x5c5

    91aecb00 82d3a6c3 Zyx+0x11d4

    91aecb24 82a4054a nt!IovCallDriver+0x258

    91aecb38 82c3b975 nt!IofCallDriver+0x1b

    91aecb7c 82c2c591 nt!IopDeleteFile+0x10c

    91aecb94 82a81d60 nt!ObpRemoveObjectRoutine+0x59

    91aecba8 82a81cd0 nt!ObfDereferenceObjectWithTag+0x88

    91aecbb0 82c4f308 nt!ObfDereferenceObject+0xd

    91aecbf0 82c7dba9 nt!ObpCloseHandleTableEntry+0x21d

    91aecc20 82c65f86 nt!ExSweepHandleTable+0x5f

    91aecc40 82c73666 nt!ObKillProcess+0x54

    91aeccb4 82c87051 nt!PspExitThread+0x5db

    91aecccc 82aba8c0 nt!PsExitSpecialApc+0x22

    91aecd1c 82a472a4 nt!KiDeliverApc+0x28b

    91aecd1c 775570b4 nt!KiServiceExit+0x64

    0147ff88 00000000 0x775570b4


    It’s very hard to get much further this stack since the exact registers are not known. In particular, it is hard to determine what function above nt!MiFreePoolPages the thread is executing at the moment.  As with the Processor #1 investigation, it is interesting to note that Processor #3 is also involved in a memory operation; specifically, it is doing a nt!ExFreePoolWithTag much like Processor #0 is.


    Inter-Processor Interrupts

    The only time an IPI (interrupt) is not processed immediately is when the target processor is at IPI IRQL or higher. The most common example being when it is already processing an IPI (interrupts of the same level cannot interrupt the handler for the same IRQL). In this case, the interrupt has to be queued until the interrupt mask allows its arrival. Usually, this design allows only one IPI to be processed at any one time.


    A deadlock (like) condition can arise though if an IPI is issued to a processor that is at a higher IRQL, and this processor (thread) attempts to send an IPI. The IPI logic blocks the send if there is an outstanding IPI to complete on the processor. The assumption being that interrupt queuing should be avoided as there is a probability of loss if the queue overflows.


    Looking in the Windows Internals book, there is a single sentence that says “Each interrupt level has a specific purpose. For example, the kernel issues an interprocessor interrupt (IPI) to request that another processor perform an action, such as dispatching a particular thread for execution or updating its translation look-aside buffer cache.”.  This is very interesting as the translation look-aside buffer is part of the memory manager, and the memory operations are being undertaken on all the processors.


    3rd Party Driver

    Instead of pulling my hair out combing through the threads on the system, the IPI code, the Memory Manager code or bugcheck code, I decided to look at the 3rd party driver in processor #3 to see if it was changing the IRQL.


    The first step was to find the bound of the Zyx+0x5c5 function.  The end address of the assembler is easy to determine, it is Zyx+0x5c5.  The question is, how big is the function? To work that out, you look at the assembler of the caller.  The caller’s assembler will point to the starting instruction in the Zyx+0x5c5 function.


    3: kd> k= 91aeca40 91aec954 82b29431

    ChildEBP RetAddr 

    91aeca40 82b2aef1 nt!MiFreePoolPages+0x42c

    91aecaa8 963a55c5 nt!ExFreePoolWithTag+0x436

    WARNING: Stack unwind information not available. Following frames may be wrong.

    91aecac8 963a61d4 Zyx+0x5c5

    91aecb00 82d3a6c3 Zyx+0x11d4

    91aecb24 82a4054a nt!IovCallDriver+0x258

    91aecb38 82c3b975 nt!IofCallDriver+0x1b

    91aecb7c 82c2c591 nt!IopDeleteFile+0x10c



    3: kd> ub Zyx+0x11d4
    963a61b7 ff7008          push    dword ptr [eax+8]
    963a61ba 52              push    edx
    963a61bb 6a01            push    1
    963a61bd ff7018          push    dword ptr [eax+18h]
    963a61c0 e827fdffff      call    Zyx+0xeec (963a5eec)
    963a61c5 e9f1000000      jmp     Zyx+0x12bb (963a62bb)
    963a61ca e8d3fcffff      call    Zyx+0xea2 (963a5ea2)
    963a61cf e89ef3ffff      call    Zyx+0x572 (963a5572)


    3: kd> u 963a5572 Zyx+0x5c5
    963a5572 8bff            mov     edi,edi
    963a5574 55              push    ebp
    963a5575 8bec            mov     ebp,esp
    963a5577 51              push    ecx
    963a5578 53              push    ebx
    963a5579 56              push    esi
    963a557a 57              push    edi
    963a557b b11f            mov     cl,1Fh <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0x1F = 0n31 (HIGH)
    963a557d ff15006b3a96    call    dword ptr [Zyx+0x1b00 (963a6b00)] <<<<<<<< hal!KfRaiseIrql (via lookup)
    963a5583 bb14813a96      mov     ebx,offset Zyx+0x3114 (963a8114)
    963a5588 8bcb            mov     ecx,ebx
    963a558a 8845ff          mov     byte ptr [ebp-1],al
    963a558d e8c2100000      call    Zyx+0x1654 (963a6654) <<<<<<<<<<<<<<<< [Unresolved]
    963a5592 a1746d3a96      mov     eax,dword ptr [Zyx+0x1d74 (963a6d74)]
    963a5597 85c0            test    eax,eax
    963a5599 8b35586b3a96    mov     esi,dword ptr [Zyx+0x1b58 (963a6b58)] <<<< nt!ExFreePoolWithTag (via lookup)
    963a559f 7413            je      Zyx+0x5b4 (963a55b4)
    963a55a1 8b7808          mov     edi,dword ptr [eax+8]
    963a55a4 6a00            push    0
    963a55a6 50              push    eax
    963a55a7 ffd6            call    esi  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599)
    963a55a9 85ff            test    edi,edi
    963a55ab 8bc7            mov     eax,edi
    963a55ad a3746d3a96      mov     dword ptr [Zyx+0x1d74 (963a6d74)],eax
    963a55b2 75ed            jne     Zyx+0x5a1 (963a55a1)
    963a55b4 a1a06d3a96      mov     eax,dword ptr [Zyx+0x1da0 (963a6da0)]
    963a55b9 85c0            test    eax,eax
    963a55bb 7413            je      Zyx+0x5d0 (963a55d0)
    963a55bd 8b7808          mov     edi,dword ptr [eax+8]
    963a55c0 6a00            push    0
    963a55c2 50              push    eax
    963a55c3 ffd6            call    esi <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< nt!ExFreePoolWithTag (via 963a5599)


    3: kd> ln poi(963a6b00)
    (82e20844)   hal!KfRaiseIrql   |  (82e208c6)   hal!HalpDispatchSoftwareInterrupt
    Exact matches:


    3: kd> ln poi(963a6b58)
    (82b2aaba)   nt!ExFreePoolWithTag   |  (82b2b660)   nt!ExDeferredFreePool
    Exact matches:


    The next step was to work out what functions the Zyx+0x5c5 function called, and with what parameters. The assembler didn’t make this immediately obvious. The address of the function calls were lookups. This is common when the address is imported by the code, instead of being bound. To get the function name of the call, I dereferenced the pointer (poi(<addr>) and passed the address to ln so that it listed the nearest symbols.


    The first function confirmed some the theory. A call to hal!KfRaiseIrql was being made with a IRQL of HIGH (31). And this was prior to a code path that called functions.


    The value of the @esi register was determined to be nt!ExFreePoolWithTag. This matched the function name in the reconstructed stack (nt!ExFreePoolWithTag+0x436) and confirmed that the stack was reconstructed in this area correctly.



    The Zyx function raised IRQL to HIGH_LEVEL before calling nt!ExFreePoolWithTag. This function caused a Translation Look-aside Buffer (TLB) flush via nt!KiFlushTargetMultipleRangeTb) to occur. This in turn caused an IPI notification to the other processors (via nt!KiIpiSendPacket) to indicate the release of the TLB memory.


    The Processor #0 IPI was stuck as incomplete because Processor #3 could not be interrupted, and Processor #3 did not lower its IRQL as it was waiting to send its own IPI upon completion of the Processor #0 IPI – thus a deadlock was formed.


    I contacted the driver’s author and organized a new version that used the SYNCH (27) IRQL (the level immediately under IPI). The new build was sent to the customer and the immediate bugcheck that they were observing in their test environment disappeared.  Although this is not an ideal solution, it was sufficient for the time being.  An ideal solution would be to follow the documentation for ExFreePoolWithTag and not make the call with an IRQL greater than DISPATCH_LEVEL.


    I hope this case has provided a good foundation in how to recover and navigate through stacks without symbols or context. Once you understand the mechanics of calling conventions, the values needed to bound a disassembling operation and/or a stack reassemble are quite easy to determine.


  • Ntdebugging Blog

    Bcdedit Tips and Tricks For Debugging Part 1


    Hello everyone, my name is Sean Walker, and I am on the Platforms OEM team in Washington.  This article is for those people who have had a hard time switching from the old boot.ini configuration to the new BCD store (myself included). Doing the simple tasks such as enabling kernel debugging over com1 are easy to do with bcdedit.exe or the msconfig GUI, you just enable them and reboot the computer. However, if you need to do something more advanced such as break into the early boot process during resume from hibernation, things get a lot more complicated.


    This article has some samples for enabling and disabling debug settings that you may not be familiar with, and a list of bcdedit debug settings for Windows Vista/Server 2008 and Windows 7/Server 2008 R2.  This information has been helpful to me for quickly and accurately getting to the debug at hand rather than fumbling around with bcdedit.  Much of the following information has been taken from various sources, including the windbg help files, the OEM team blog, the MSDN bcdedit reference, and the WHDC debugger site.


    NOTE: For the examples below, you will need to run bcdedit.exe from an administrator (UAC-elevated) command prompt.  To output a summary view of the current state of the BCD store, just run "bcdedit.exe" from the command prompt.  To get detailed information about all of the store(s) that Windows knows about, use the following command:

    bcdedit /enum all


    What is a BCD store?

    A BCD store is a binary file that contains boot configuration data for Windows, basically it is a small registry file.  Boot applications use the system BCD store, located on the system partition, during the boot process.  You can also create additional BCD stores in separate files but only one store at a time can be designated as the system store.


    NOTE: The "/store" switch can be used to specify a particular BCD store for bcdedit commands (instead of the default store).  To enumerate all the settings in another BCD store, in this case e:\bcd_store\BCD, use the following command:

    bcdedit /store e:\bcd_store\BCD /enum all


    This will show you which options are currently set, and what their values are.  When /store switch is omitted, the system store is used.


    Using bootdebug

    To enable debugging for early boot problems, you may need to enable the bootdebug switch.  This is easy to do with bcdedit:

    bcdedit /set bootdebug on


    However, this only sets bootdebug for the current "boot application", which is generally winload.exe, so it does not break into the very early boot process.  There are multiple applications used for booting, hibernating, and resuming (bootmgr.exe, winload.exe and winresume.exe are examples of these).  Each application (called BCD Objects) has its own settings (called BCD Elements) in the BCD store and each can be modified globally and/or individually.


    So, to deal with different (or multiple) debug scenarios, you just enable boot debugging based on the boot application you are concerned with.  For early debugging, you can enable bootdebug for bootmgr:

    bcdedit /set {bootmgr} bootdebug on


    To set bootdebug for winload.exe (which will most often be your current, and default, boot object) all three of the following will give you the same result:

    bcdedit /set bootdebug on

    bcdedit /set {current} bootdebug on

    bcdedit /set {default} bootdebug on


    If you are modifying the settings in another store, or are booted into another OS on the same computer (such as WinPE), you need to specify the location of the BCD store:

    bcdedit /store d:\Boot\BCD /set {default} bootdebug on


    Not all of the boot objects have "friendly" names, so you may need to specify the full GUID (Globally Unique ID) to modify it.  As an example, if you wanted to enable bootdebug on resume from hibernation, you would include the identifier (see figure 1) for the "Resume from Hibernate" object:

    bcdedit /set {89a932d0-d5bc-11e0-a0af-00215add5ebc} bootdebug on



    Figure 1: Color coded bcdedit output


    Why won't my USB or 1394 debug work?

    When there are multiple debug ports of a certain type in a computer Windows may not default to the correct one for your situation.  This happens most commonly when there are either multiple 1394 host controllers or USB EHCI controllers.  When this occurs it can range from a slight inconvenience (different port is used so the cable needs to be plugged into another port), to complete failure (internal port is used, which is not accessible).  In the case of USB debugging the Intel USB 2.0 specification only provides one debug port, so debugging is not possible if the wrong host controller is used.


    There are several caveats with USB debugging, not the least of which is that you need to buy a separate, expensive, debug cable.  Some of the difficulties and implementation details necessary to get USB debugging to work are encompassed in the WHDC USB FAQ and in Setting Up Kernel Debugging with USB 2.0.


    NOTE: A correction to the WHDC USB documentation for Windows 7/Windows 2008 R2 is that the busparams switch now takes decimal rather than hexadecimal values, and the "loadoptions" parameter is no longer required.  So, to enable the busparams element (for USB or 1394 debugging) in Vista/2008, you would use something like this:

    bcdedit /set {current} loadoptions busparams=0.1D.7


    And the Win7/2008 R2 example would be:

    bcdedit /set {current} busparams 0.29.7


    In the case of loadoptions or busparams, deleting the setting is not as easy as changing a flag from yes to no. You must specifically delete the value to get rid of it, and one of the examples below can be used:


    For Vista/2008:

    bcdedit /deletevalue {current} loadoptions


    And Windows 7/2008 R2:

    bcdedit /deletevalue {current} busparams


    Bcdedit settings and examples

    This is just scratching the surface of using bcdedit for your troubleshooting and/or debugging needs, so there are more articles to follow. Part 2 will include some more detailed debugging scenarios, such as Hyper-V guest and host debugging.  Below is a consolidated table with many of the debugging switches/settings as well as a number of different usage examples.


    Table of debug-related bcdedit settings




    Enables or disables the boot debugger for a specified boot entry. Although this command works for any boot entry, it is effective only for boot applications.

    Enable value(s): on, 1

    Disable value(s): off, 0

    Bcdedit /set bootdebug on


    Enables or disables the kernel debugger for a specified boot entry.

    Enable value(s): on, 1

    Disable value(s): off, 0


    Used to modify the global settings for the debug connection (does not include hypervisor).  Values:

    Can change all settings at once instead of using the /set command to change them individually. Usage example:

    bcdedit /dbgsettings 1394 channel:30


    Used to specify the debugger type.


    Serial port – com1, com2, comx

    1394 port – 1394

    USB port - USB


    Specifies 1394 channel used.


    Decimal integer between 0 and 62, inclusive.


    Used to specify the baud rate of a serial debug port.

    Values: 9600, 19200, 38400, 57600, 115200


    Specifies a string to use as the identification for the USB 2.0 connection. This string can be any value.

    Usage example:

    bcdedit /dbgsettings usb targetname:usbdebug


    Used the same way as /dbgsettings to configure all settings at once.

    Usage example:

    bcdedit /hypervisorsettings 1394 channel:10


    Enables or disables hypervisor debug mode. This is for debugging a Hyper-V host system.

    Enable value(s): on, 1

    Disable value(s): off, 0

    Usage example:

    bcdedit /set {current} hypervisordebug on


    Specifies that the kernel debugger ignores user-mode exceptions. By default, the kernel debugger breaks for certain user-mode exceptions, such as STATUS_BREAKPOINT and STATUS_SINGLE_STEP. The /noumex parameter is effective only when there is no user-mode debugger attached to the process.


    This option specifies the debugger start policy. If a start policy is not specified, ACTIVE is the default.

    Values: active, disable, autoenable


    Used to describe settings that are not covered by other types. One setting that is relevant here is busparams.

    Values: Any value followed by the setting.

    Usage example (Vista/2008):

    bcdedit /set {current} loadoptions busparams=0.1d.0


    A boot setting (specified with loadoptions key word) used to point to the PCI address of the debugger in use. The PCI bus, device, and function are used, in the format bb.dd.ff. This is generally used to identify the location of a 1394 or USB debug port. In Vista/2008, hexadecimal values are used, whereas decimal values are used for Win7.

    Values: Decimal values between 0 and 255.

    Usage example:

    In Win7 - bcdedit /set busparams 0.29.0

    In Vista - bcdedit /set loadoptions busparams=0.1d.0


    The loadoptions parameter used to point to a different kernel binary. This can be used to test with a checked or instrumented version of the kernel without replacing the existing one. The updated binary MUST be placed in the %windir%\system32 folder to be used

    Values: The 8.3 filename of the replacement kernel include the exe extension.

    Usage examples:

    In Win7 – bcdedit /set kernel kernchk.exe

    In Vista - bcdedit /set loadoptions kernel=kernchk.exe


    The loadoptions parameter used to point to a different hal binary. This can be used to test with a checked or instrumented version of the kernel without replacing the existing one. The updated binary MUST be placed in the %windir%\system32 folder to be used

    Values: the 8.3 filename of the replacement kernel include the .dll extension.

    Usage examples:

    In Win7 – bcdedit /set hal halchk.dll

    In Vista - bcdedit /set loadoptions hal=halchk.dll


    Controls whether Windows 7, Windows Server 2008, or Windows Vista will load any type of test-signed kernel-mode code. This option is not set by default, which means test-signed kernel-mode drivers on 64-bit versions of Windows 7, Windows Server 2008, and Windows Vista will not load without setting the testsigning switch

    Enable value(s): on, 1

    Disable value(s): off, 0

    Usage example:

    Bcdedit /set testsigning on


  • Ntdebugging Blog

    Determining The Interrupt Line For A Particular PCI-E Slot


    Hi debuggers, this is Graham McIntyre again. These days I’m working more closely with hardware so I thought I’d share some hardware related debugging tips.  I recently debugged an issue where a PCI-E storage device failed to work after hot swapping it from one slot to another slot on the system without rebooting.  We determined the issue was due to the device not receiving interrupts once it was moved.   So in the process I learned how line based interrupts are routed to a particular PCI slot.    Interrupt routing is quite a hefty subject, but here’s one example of how to determine what the expected interrupt line is for a particular PCI-E slot using a live kernel debug.


    There are two ways the routing can be defined in the ACPI tables:

    1. Static routing (most common for APIC systems)
    2. Link Node routing (most common for PIC systems)


    Since APIC is much more common, I am focusing on method 1 for static routing. Though, it is legal to use Link Node routing with IOAPICs, it’s not common, so I am omitting how to parse that.  This is also specifically for devices that use physical line based interrupts (LBI), not Message Signaled Interrupts (MSI).


    Here is the general method for determining the static routing IRQ for a particular device:

    1. Locate the devstack for the device, and determine its parent devices in the PCI hierarchy. (!pcitree)
    2. Determine the interrupt pin which the device uses
    3. Walk the parent devices to find the closest PCI Routing Table (_PRT) which will describe the mapping of interrupt pin to IRQ.
      1. If the parent device does not have a _PRT, then swizzle the pin, since the pin number can change when moving to the upstream side of the PCI bridge (you may end up swizzling the pin several times).  We will discuss how to swizzle the pin number later in this article.
      2. If the parent device has a _PRT, then move to the next step
    4. Convert the IntPin number from PCI to ACPI numbering
    5. Parse the _PRT method to find the static routing table
    6. Find the routing entry which represents our IntPin


    Here’s the in-depth steps, along with an example:


    Step 1:  Locate the devstack for the device, and determine its parent devices in the PCI hierarchy.


    To determine this, use !pcitree to dump the PCI hierarchy. Then locate your device by ven/dev ID.  You could also use !devnode to dump the hierarchy.


    The way !pcitree shows the hierarchy may be a little confusing.  When it encounters a PCI bridge, it dumps the child buses under the bridge. The indenting tells you what bus a device is on. A device is always indented one level from the entry of the parent bus.  In my case, I know the device I'm interested in is VEN FEFE DEV 1550.

    kd> !pcitree

    Bus 0x0 (FDO Ext fffffa80053efe00)

      (d=0,  f=0) 80863406 devext 0xfffffa80054d51b0 devstack 0xfffffa80054d5060 0600 Bridge/HOST to PCI

      (d=1,  f=0) 80863408 devext 0xfffffa80054d9b70 devstack 0xfffffa80054d9a20 0604 Bridge/PCI to PCI

      Bus 0x1 (FDO Ext fffffa80054e8680)

        (d=0,  f=0) 14e41639 devext 0xfffffa80051b91b0 devstack 0xfffffa80051b9060 0200 Network Controller/Ethernet

        (d=0,  f=1) 14e41639 devext 0xfffffa80051ba1b0 devstack 0xfffffa80051ba060 0200 Network Controller/Ethernet

      (d=3,  f=0) 8086340a devext 0xfffffa80054dab70 devstack 0xfffffa80054daa20 0604 Bridge/PCI to PCI

      Bus 0x2 (FDO Ext fffffa80054e9460)

        (d=0,  f=0) 14e41639 devext 0xfffffa80051bcb70 devstack 0xfffffa80051bca20 0200 Network Controller/Ethernet

        (d=0,  f=1) 14e41639 devext 0xfffffa80051cab70 devstack 0xfffffa80051caa20 0200 Network Controller/Ethernet

      (d=4,  f=0) 8086340b devext 0xfffffa80054dbb70 devstack 0xfffffa80054dba20 0604 Bridge/PCI to PCI

      Bus 0x3 (FDO Ext fffffa80054ec190)

        (d=0,  f=0) 10000079 devext 0xfffffa80051cd1b0 devstack 0xfffffa80051cd060 0104 Mass Storage Controller/RAID

      (d=5,  f=0) 8086340c devext 0xfffffa80054dcb70 devstack 0xfffffa80054dca20 0604 Bridge/PCI to PCI

      Bus 0x4 (FDO Ext fffffa80054ede00)

        No devices have been enumerated on this bus.

      (d=6,  f=0) 8086340d devext 0xfffffa80054ddb70 devstack 0xfffffa80054dda20 0604 Bridge/PCI to PCI

      Bus 0x5 (FDO Ext fffffa80054ee9c0)

        No devices have been enumerated on this bus.

      (d=7,  f=0) 8086340e devext 0xfffffa80054deb70 devstack 0xfffffa80054dea20 0604 Bridge/PCI to PCI << Root Port

      Bus 0x6 (FDO Ext fffffa80054f1190)

        (d=0,  f=0) abcd8632 devext 0xfffffa80051d91b0 devstack 0xfffffa80051d9060 0604 Bridge/PCI to PCI << Upstream switch port

        Bus 0x7 (FDO Ext fffffa80051cd850)

          (d=4,  f=0) abcd8632 devext 0xfffffa80051d71b0 devstack 0xfffffa80051d7060 0604 Bridge/PCI to PCI

          Bus 0x8 (FDO Ext fffffa8006f44ac0)

            No devices have been enumerated on this bus.

          (d=5,  f=0) abcd8632 devext 0xfffffa80058d6a10 devstack 0xfffffa80058d68c0 0604 Bridge/PCI to PCI

          Bus 0x9 (FDO Ext fffffa80051ba850)

            No devices have been enumerated on this bus.

          (d=6,  f=0) abcd8632 devext 0xfffffa8007075b70 devstack 0xfffffa8007075a20 0604 Bridge/PCI to PCI << Parent PDO (Downstream Switch Port)

          Bus 0xa (FDO Ext fffffa8007312b60)

            (d=0,  f=0) fefe1550 devext 0xfffffa8006f67b70 devstack 0xfffffa8006f67a20 0180 Mass Storage Controller/'Other' << Device

          (d=7,  f=0) abcd8632 devext 0xfffffa80051e5b70 devstack 0xfffffa80051e5a20 0604 Bridge/PCI to PCI

          Bus 0xb (FDO Ext fffffa80052d2e00)

            No devices have been enumerated on this bus.

      (d=14, f=0) 8086342e devext 0xfffffa80054dfb70 devstack 0xfffffa80054dfa20 0800 Base System Device/Interrupt Controller

      (d=14, f=1) 80863422 devext 0xfffffa80054e0b70 devstack 0xfffffa80054e0a20 0800 Base System Device/Interrupt Controller

      (d=14, f=2) 80863423 devext 0xfffffa80054e1b70 devstack 0xfffffa80054e1a20 0800 Base System Device/Interrupt Controller

      (d=1a, f=0) 80862937 devext 0xfffffa80054e2b70 devstack 0xfffffa80054e2a20 0c03 Serial Bus Controller/USB

      (d=1a, f=1) 80862938 devext 0xfffffa80054e31b0 devstack 0xfffffa80054e3060 0c03 Serial Bus Controller/USB

      (d=1a, f=7) 8086293c devext 0xfffffa80054e3b70 devstack 0xfffffa80054e3a20 0c03 Serial Bus Controller/USB

      (d=1d, f=0) 80862934 devext 0xfffffa80054e41b0 devstack 0xfffffa80054e4060 0c03 Serial Bus Controller/USB

      (d=1d, f=1) 80862935 devext 0xfffffa80054e4b70 devstack 0xfffffa80054e4a20 0c03 Serial Bus Controller/USB

      (d=1d, f=7) 8086293a devext 0xfffffa80054e51b0 devstack 0xfffffa80054e5060 0c03 Serial Bus Controller/USB

      (d=1e, f=0) 8086244e devext 0xfffffa80054e5b70 devstack 0xfffffa80054e5a20 0604 Bridge/PCI to PCI

      Bus 0xc (FDO Ext fffffa80054f2e00)

        (d=3,  f=0) 102b0532 devext 0xfffffa80051d51b0 devstack 0xfffffa80051d5060 0300 Display Controller/VGA

      (d=1f, f=0) 80862918 devext 0xfffffa80054e61b0 devstack 0xfffffa80054e6060 0601 Bridge/PCI to ISA

      (d=1f, f=2) 80862921 devext 0xfffffa80054e6b70 devstack 0xfffffa80054e6a20 0101 Mass Storage Controller/IDE

    Total PCI Root busses processed = 1

    Total PCI Segments processed = 1


    To recap the devices in the tree (Bus,Device,Function):

    (0,7,0) : Root Port, PCI-PCI Bridge (devstack 0xfffffa80054dea20)

        (6,0,0) : Upstream Switch Port (devstack 0xfffffa80051d9060)

            (7,6,0) : Downstream Switch Port (the PDO for the slot) (devstack 0xfffffa8007075a20)

                (a,0,0) : Device  (devstack 0xfffffa8006f67a20)


    I scanned the output looking for my ven/dev ID, and found it at Bus A, Device 0, Function 0.


    Step 2:  Determine which interrupt pin the device uses.


    For this step, you can use !pci to dump the PCI config space for the device. The output will show you the interrupt pin the device uses, labeled as IntPin.

    !pci 1 a 0 0

    PCI Bus 10

    00:0  FEFE:1550.01  Cmd[0007:imb...]  Sts[0018:c....]  Device  SubID:1344:1008  Other mass storage controller

          cf8:800a0000  IntPin:1  IntLine:2e  Rom:0  cis:0  cap:40

          MEM[2]:df5fd000  MEM[3]:df5fc000  IO[4]:cff1       MEM[5]:df5fe000 


    So our IntPin is 1.


    Step 3: Walk the parent devices to find the closest PCI Routing Table (_PRT) which will describe the mapping of interrupt pin to IRQ.


    Now, we will traverse the parent PCI devnodes until we find a PCI bridge which has an associated ACPI object with a _PRT method. This may be the root port, or an integrated bridge.

    1. Start by running !devstack on the parent.  We can determine the parent device using the indentations of the !pcitree output.
    2. If the devstack shows an ACPI filter driver, then dump the filter using !acpikd.acpiext to find the associated AcpiObject
    3. Dump the ACPI object and its children to see if it has a _PRT method defined
      1. If it does not have a _PRT, then you need to swizzle the Interrupt Pin to find what the pin number will be on the upstream side of the bridge
        1. We have to use a method called “swizzling” because the pin may become a different pin on the upstream side of the bridge. The way to calculate the pin is:
          1. IntPin = ((((IntPin -1) + DeviceNumber) % 4) +1)
        2. Where IntPin is the current IntPin value, and DeviceNumber the device number of the device you’re swizzling.
        3. You will start with the IntPin value from !pci output of the device itself. If you need to swizzle multiple times, you take the result of the previous swizzle as the input to the next swizzle
        4. The device number for the first time will be the device number of the target device, and subsequent times will be the device number of the parent device you’re swizzling.
      2. If it does have a _PRT, then move onto Step 4.



    First, we’ll swizzle the pin of the device itself (a,0,0).  The IntPin is 1 so:

                    IntPin = ((((1-1)+0) % 4) +1)    << The Swizzled Pin is still IntPin 1


    Next, I dumped the parent device (7,6,0), !devstack  0xfffffa8007075a20. It didn’t have an ACPI filter driver on the stack. So I need to swizzle the pin.

          IntPin = ((((1-1)+6) % 4) +1)    << The Swizzled Pin is now 3


    I now dump the next parent up, (6,0,0), !devstack 0xfffffa80051d9060. It also didn’t have an ACPI filter driver on the stack so I need to swizzle the pin again.

          IntPin = ((((3-1)+0) % 4) +1)    << The Swizzled Pin is still 3


    I am now at the root port. The first devstack which has a _PRT method in my case is the root port.

    kd> !devstack 0xfffffa80054dea20

      !DevObj   !DrvObj            !DevExt   ObjectName

      fffffa80054f1040  \Driver\pci        fffffa80054f1190 

      fffffa80054e5800  \Driver\ACPI       fffffa80051c1510  << Has an ACPI filter driver in the devstack

    > fffffa80054dea20  \Driver\pci        fffffa80054deb70  NTPNP_PCI0006

    !DevNode fffffa80054e1750 :

      DeviceInst is "PCI\VEN_8086&DEV_340E&SUBSYS_02351028&REV_13\3&33fd14ca&0&38"

      ServiceName is "pci"

    kd> !acpikd.acpiext fffffa80051c1510

    ACPI!DEVICE_EXTENSION fffffa80051c1510 - 70000

      DevObject  fffffa80054e5800  PhysObject  fffffa80054dea20  NextObject   fffffa80054dea20

      AcpiObject fffffa80052e3890  ParentExt   fffffa80051c07d0

      PnpState   Started   OldPnpState Stopped

      Dispatch   fffff880011cbb50

      RefCounts  4-Device 1-Irp 0-Hiber 0-Wake

      State      D0       

      SxD Table  S0->D0 S4->D3 S5->D3

      Flags      0540100002000240

        Types    Filter Enumerated ValidPnP

        Caps     PCIBus

        Props    HasAddress Enabled AcpiPower

    Dump the namespace object. Use /s to display the subtree under this object and look for a _PRT method.

    kd> !amli dns /s fffffa80052e3890


    ACPI Name Space: \_SB.PCI0.PEX7 (fffffa80052e3890)


    | Integer(_ADR:Value=0x0000000000070000[458752])

    | Integer(_STA:Value=0x000000000000000f[15])

    | Method(_PRT:Flags=0x0,CodeBuff=fffffa80052e3aa9,Len=144) << A _PRT method exists for this object


    Now, we have a swizzled IntPin value of 3, and a pointer to the _PRT method.  We can move on to the next step.


    Step 4: Convert the pin number from PCI to ACPI numbering


    The !pci or !devext output, and subsequent swizzling will show pin numbering in PCI format where 1 = INTA. But the ACPI table uses 0 for INTA. So you need to subtract one from the PCI pin number to get the ACPI pin number.


    PCI pin number

    ACPI pin number














    Once you’ve converted to ACPI pin numbering, you have to dump the _PRT method to find the package which maps to that pin number.


    For my example since the PCI IntPin value is 3, which corresponds to INTC, the ACPI pin number is 2


    Step 5: Parse the _PRT method to find the static routing table


    Now that we located the correct _PRT entry, we need to use the AMLI debugger extension to parse the method and find the static routing table.  The command !amli u will unassemble an ACPI method

    kd> !amli u \_SB.PCI0.PEX7._PRT

    AMLI_DBGERR: Failed to get address of ACPI!gDebugger


    fffffa80052e3aa9 : If(LNot(PICF))

    fffffa80052e3ab1 : {

    fffffa80052e3ab1 : | Name(P10B, Package(0x4)

    fffffa80052e3ab9 : | {

    fffffa80052e3ab9 : | | Package(0x4)

    fffffa80052e3abc : | | {

    fffffa80052e3abc : | | | 0xffff,

    fffffa80052e3abf : | | | 0x0,

    fffffa80052e3ac1 : | | | LK00,

    fffffa80052e3ac5 : | | | 0x0

    fffffa80052e3ac7 : | | },

    fffffa80052e3ac7 : | | Package(0x4)

    fffffa80052e3aca : | | {

    fffffa80052e3aca : | | | 0xffff,

    fffffa80052e3acd : | | | 0x1,

    fffffa80052e3acf : | | | LK01,

    fffffa80052e3ad3 : | | | 0x0

    fffffa80052e3ad5 : | | },

    fffffa80052e3ad5 : | | Package(0x4)

    fffffa80052e3ad8 : | | {

    fffffa80052e3ad8 : | | | 0xffff,

    fffffa80052e3adb : | | | 0x2,

    fffffa80052e3add : | | | LK02,

    fffffa80052e3ae1 : | | | 0x0

    fffffa80052e3ae3 : | | },

    fffffa80052e3ae3 : | | Package(0x4)

    fffffa80052e3ae6 : | | {

    fffffa80052e3ae6 : | | | 0xffff,

    fffffa80052e3ae9 : | | | 0x3,

    fffffa80052e3aeb : | | | LK03,

    fffffa80052e3aef : | | | 0x0

    fffffa80052e3af1 : | | }

    fffffa80052e3af1 : | })

    fffffa80052e3af1 : | Store(P10B, Local0)

    fffffa80052e3af7 : }

    fffffa80052e3af7 : Else

    fffffa80052e3af9 : {

    fffffa80052e3af9 : | Name(A10B, Package(0x4)

    fffffa80052e3b01 : | {

    fffffa80052e3b01 : | | Package(0x4)

    fffffa80052e3b04 : | | {

    fffffa80052e3b04 : | | | 0xffff,

    fffffa80052e3b07 : | | | 0x0,

    fffffa80052e3b09 : | | | 0x0,

    fffffa80052e3b0b : | | | 0x26

    fffffa80052e3b0d : | | },

    fffffa80052e3b0d : | | Package(0x4)

    fffffa80052e3b10 : | | {

    fffffa80052e3b10 : | | | 0xffff,

    fffffa80052e3b13 : | | | 0x1,

    fffffa80052e3b15 : | | | 0x0,

    fffffa80052e3b17 : | | | 0x2d

    fffffa80052e3b19 : | | },

    fffffa80052e3b19 : | | Package(0x4)

    fffffa80052e3b1c : | | {

    fffffa80052e3b1c : | | | 0xffff,

    fffffa80052e3b1f : | | | 0x2,

    fffffa80052e3b21 : | | | 0x0,

    fffffa80052e3b23 : | | | 0x2f

    fffffa80052e3b25 : | | },

    fffffa80052e3b25 : | | Package(0x4)

    fffffa80052e3b28 : | | {

    fffffa80052e3b28 : | | | 0xffff,

    fffffa80052e3b2b : | | | 0x3,

    fffffa80052e3b2d : | | | 0x0,

    fffffa80052e3b2f : | | | 0x2e

    fffffa80052e3b31 : | | }

    fffffa80052e3b31 : | })

    fffffa80052e3b31 : | Store(A10B, Local0)

    fffffa80052e3b37 : }

    fffffa80052e3b37 : Return(Local0)

    fffffa80052e3b39 : Zero

    fffffa80052e3b3a : Zero

    fffffa80052e3b3b : Zero

    fffffa80052e3b3c : Zero

    fffffa80052e3b3d : Zero

    fffffa80052e3b3e : Zero

    fffffa80052e3b3f : Zero

    fffffa80052e3b40 : HNSO

    fffffa80052e3b44 : Not(Zero, )

    fffffa80052e3b47 : Zero

    fffffa80052e3b48 : Zero

    fffffa80052e3b49 : AMLI_DBGERR: UnAsmOpcode: invalid opcode class 0

    AMLI_DBGERR: Failed to unassemble scope at 382d4a0 (size=4096)


    There are 2 different _PRT tables here, each with 4 packages (think of it as 2 arrays, each containing 4 structures). The first is using link nodes, the second is using static interrupts.  The first list is used if we are in PIC mode, the second if we are in APIC mode.


    We can check the value of PICF to determine the mode. (I expect it to be APIC but let’s check)

    kd> !amli dns \PICF


    ACPI Name Space: \PICF (fffffa80052ded18)



    So we’re in APIC mode (PICF != 0), we use the static routing mode. So we will use the 2nd table.  What does each package represent?  Each is a PCI Routing Table. From ACPI spec section 6.2.12, which describes the _PRT:

    Table 6-14   Mapping Fields






    The address of the device (uses the same format as _ADR).



    The PCI pin number of the device (0–INTA, 1–INTB, 2–INTC, 3–INTD).





    Name of the device that allocates the interrupt to which the above pin is connected. The name can be a fully qualified path, a relative path, or a simple name segment that utilizes the namespace search rules. Note: This field is a NamePath and not a String literal, meaning that it should not be surrounded by quotes. If this field is the integer constant Zero (or a BYTE value of 0), then the interrupt is allocated from the global interrupt pool.

    Source Index


    Index that indicates which resource descriptor in the resource template of the device pointed to in the Source field this interrupt is allocated from. If the Source field is the BYTE value zero, then this field is the global system interrupt number to which the pin is connected.


    fffffa80052e3af9 : | Name(A10B, Package(0x4)

    fffffa80052e3b01 : | {

    fffffa80052e3b01 : | | Package(0x4)

    fffffa80052e3b04 : | | {

    fffffa80052e3b04 : | | | 0xffff,

    fffffa80052e3b07 : | | | 0x0,          << INTA

    fffffa80052e3b09 : | | | 0x0,

    fffffa80052e3b0b : | | | 0x26  << Interrupt line

    fffffa80052e3b0d : | | },

    fffffa80052e3b0d : | | Package(0x4)

    fffffa80052e3b10 : | | {

    fffffa80052e3b10 : | | | 0xffff,

    fffffa80052e3b13 : | | | 0x1,          << INTB

    fffffa80052e3b15 : | | | 0x0,

    fffffa80052e3b17 : | | | 0x2d

    fffffa80052e3b19 : | | },

    fffffa80052e3b19 : | | Package(0x4)

    fffffa80052e3b1c : | | {

    fffffa80052e3b1c : | | | 0xffff,

    fffffa80052e3b1f : | | | 0x2,          << INTC

    fffffa80052e3b21 : | | | 0x0,

    fffffa80052e3b23 : | | | 0x2f

    fffffa80052e3b25 : | | },

    fffffa80052e3b25 : | | Package(0x4)

    fffffa80052e3b28 : | | {

    fffffa80052e3b28 : | | | 0xffff,

    fffffa80052e3b2b : | | | 0x3,          << INTD

    fffffa80052e3b2d : | | | 0x0,

    fffffa80052e3b2f : | | | 0x2e

    fffffa80052e3b31 : | | }

    fffffa80052e3b31 : | })


    Step 6 - Find the routing entry which represents our IntPin


    Now, we just have to locate the entry in the routing table with a IntPin value of 2.


    fffffa80052e3b19 : | | Package(0x4)

    fffffa80052e3b1c : | | {

    fffffa80052e3b1c : | | | 0xffff,

    fffffa80052e3b1f : | | | 0x2, <<< IntPin 2 (INTC)

    fffffa80052e3b21 : | | | 0x0,

    fffffa80052e3b23 : | | | 0x2f << IRQ is 0x2f


    So the device should be assigned IRQ 0x2F.   However, you may have noticed from the !pci output above that in this case the device was actually assigned IntLine (IRQ) 0x2e!  Since the wrong interrupt line was assigned after the device changed slots in the system, the device did not receive interrupts and hence was not functional.


    I hope this was useful to help understand how interrupts are assigned to LBI devices.


    More reading / references:


    PCI IRQ Routing on a Multiprocessor ACPI System:



    ACPI 4.0 spec


  • Ntdebugging Blog

    For testing purposes only, and other desktop watermarks


    Hi all, Matt here again.  One of our team’s main functions is to work with our development teams to create hotfixes when customers run into issues that can only be resolved through a code change.  The developers will often prepare a private test fix that either tests the proposed change, or adds additional instrumentation to help pinpoint the issue. The private test fix is sent to the customer reporting the problem so they can confirm that it does indeed correct (or identify) the flaw.


    When testing a private test fix, customers frequently ask, why does my desktop now show a message on the lower right corner of the desktop, and what does it mean?  The message reads “For testing purposes only”, and looks like this:

    For testing purposes only


    Often, users are concerned that his message means that they aren’t allowed to use the server in production, or maybe that it is now “unsupported.”  These aren’t the case!  Since this message appears as a result of installing a fix during the course of a Microsoft Support case, the servers are, by definition, being supported. 


    The purpose of this message is simply to remind users that code that Microsoft Support has asked them to test has been installed on the system, and this code may not have yet undergone the full suite of quality assurance testing that fixes that are made public usually do.   


    For comparison, let’s look at some of the other watermarks you may find in the lower corner of the desktop – as these can often be confused for the above message, and may explain some of the customer concerns around these messages.


    First up is the old trusty text you see when a box is booted into ‘Safe Mode’. I’m sure every IT Pro has seen this at one time or another, so I won’t go into detail, but rest assured, the testing purposes text is completely unrelated to booting in safe mode or having a subset of services running.

    Safe Mode


    Next up is our ‘Evaluation copy’ watermark.  This message is shown on the desktops of copies of Windows that have a “time bomb” (ones that will cease to function after a certain date.)  This message is typically seen on beta versions of Windows which are designed to stop functioning sometime after the desired beta testing period ends. 

    Evaluation copy


    Third, we have our Windows is not genuine message.  This is shown if, for example, a copy of Windows is not activated during the grace period after the installation process, or if a number of hardware changes have been made and Windows needs to be re-activated.  This has nothing to do with the ‘testing purposes’ message.  See http://technet.microsoft.com/en-us/library/dd979803.aspx for more information about this message.

    This copy of Windows is not genuine


    Fourth, we have the general Windows build stamp.  This is enabled via the registry using the PaintDesktopVersion DWORD (http://technet.microsoft.com/en-us/library/cc782257(WS.10).aspx).  Some administrators like to enable this option so they always know what version of Windows they are using, sort of like a mini-bginfo.  Unlike the others, this message does not indicate anything else about a server’s state.

    Windows 7


    Finally, we have ‘Test Mode’.  This is actually somewhat related to the testing purposes message.  This ‘Test Mode’ text is shown when test signing is enabled on a PC.  This is done by running “bcdedit /set testsigning on” from an UAC-elevated command prompt.  Test signing is used to allow developers to load drivers they are still working on that have not yet been code signed with an official certificate.  This is actually one of the steps we need to do when loading our test fixes.  For more information on Test Signing, see http://msdn.microsoft.com/en-us/library/ff553484%28v=VS.85%29.aspx.

    Test Mode


    So now that you know what causes these various watermarks to appear, perhaps you’re wondering how to make the “For testing purposes only” message disappear.  This is a question we are frequently asked.  While you are running a private test fix, there is no way to disable this message.  Your only option is to remove the private test fix from your system.  This is something your engineer will ask you to do before you install the final, public version of a hotfix.  You can easily identify and uninstall private test fixes by going into Control Panel, Programs and Features, View Installed Updates, then look for hotfixes with the words FOR TESTING PURPOSES ONLY in their name, like the one shown in the image below.  You may also notice that the KB number listed for these fixes is often a place holder, and not a real KB article ID.

    Installed Updates


    If the ‘For testing purposes only’ message is still displayed even after uninstalling the private test fix, there is one more place to check.  If a system has the Microsoft Test Root Authority certificate installed into its Trusted Root Certification Authorities store, the text will be displayed.  We use this certificate to allow a PC to run test code that has been signed by our development team, but not yet fully tested and signed off with the official Microsoft certificate.  To remove this certificate from your system, go to Start -> Run, and enter certmgr.msc and hit enter.  In the Certificate Manager MMC, browse to Trusted Root Certification Authorities, then into Certificates.  You should see one labeled Microsoft Test Root Authority, as shown below.  This will need to be deleted and the system rebooted to remove the testing purposes message.  Do not do this if you still have a private test fix installed though, as it would prevent that binary from continuing to function and may mean you can no longer boot in to Normal or Safe mode.



    If you reboot and find that ‘Test Mode’ has replaced the ‘For testing purposes only’ text, you’ll need to launch a command prompt with administrative privileges, then run “bcdedit /set testsigning off” and reboot.  You can always check if test signing is enabled by running “bcdedit /enum” and looking for this line:

    bcdedit /enum


    That’s all for today.  Hopefully this post helped clear up any confusion about our different desktop watermarks.

  • Ntdebugging Blog

    LUN Discovery in a Nutshell


    The way the disk configuration is determined on a Windows system is by a process called “discovery”.  If the disk configuration is ISCSI or Fibre the process is very similar.  I will describe the process on each and you will see they are very similar.  To simplify things this blog will not describe multipath facilities, the availability of multiple paths to a LUN/Target can cause storport to create more device objects than number of LUNs available.


    First, let me explain some SCSI terminologies I will be using to describe this process:

    Target - A SCSI target is an addressable unit that contains logical units (see LUN). A target can be described as a container of LUNs.  The address can be an IP address as in the case of ISCSI or a 16 bit node address if fibre. Commands are sent to a target using its address.

    LUN - A LUN is a logical unit that is in a target. A LUN most often describes a disk or tape unit inside a target.  The LUN that is the target of a command is specified in the command packet.

    Check Condition - A check condition is a status response to a SCSI command.  There are a number of status codes a target can return, one of them is a check condition.  A check condition requires the host to send a command called “request sense”.  The target will respond with data containing sense codes defining the event that caused the check condition to occur.


    Now let’s talk about the discovery process on the different mediums and how it is accomplished:



    Whenever a scan is requested for a bus controlled by an iSCSI or Fibre Channel adapter, the Microsoft storport driver will attempt to send to all targets a “report LUNS” command. The response to a report LUNS command will contain information regarding all the LUNS on the target. Then storport will send “inquiry” commands to all LUNS reported to get specific LUN information.  Even though the number of targets can be as high as 128, the time this takes is usually very short.  The adapter keeps a table to translate OS Target ID to the Destination ID which is the actual address on the fabric (or SessionID in case of iscsi).  The Adapter miniport driver creates this table from data it receives from the name server. No I/O is done if the command fails due to a nonexistent target.


    The device object representing the LUN is built from the inquiry data received from the LUN.  One device object will be built in storport for each LUN discovered.



    The iSCSI discovery occurs the same way as far as sending the report LUN command to each target as done in Fibre.  However, the discovery will happen when a connection is made to a target.  When the iSCSI miniport driver successfully logs into a target it will notify the port driver to do a bus scan of the ISCSI adapter.  The enumeration is identical as far as the scan itself.


    Common question are:


    What happens when I add a LUN to an existing target? 

    When a LUN is added, the target does not immediately notify the host.  When an I/O is sent to an existing LUN, the I/O will get a “check condition” status.  When the request sense is issued the sense data will indicate that the “report LUNS data has changed”.  The port driver (usually storport) picks this up and the port driver will scan the bus and discover the new LUN.  This would be the same for iSCSI and fibre.  This means that if the target does not issue a check condition to notify the OS of a new LUN the LUN will not appear on the system and the user will need to manually scan for the new disk.


    How does the operating system know that a new target was added?

    If a new target is added on an iSCSI bus, the scan will be initiated after the miniport driver successfully logs in to the target.  So what about fiber?  When a new target is added to a fibre channel, the adapter will receive a RSCN (registered state change notification) packet from the fabric.  This indication will be passed on to the port driver by the miniport (depending on adapter driver) to scan the bus.  Then the new target will be picked up by the scan and the target will be sent a report LUNS to get the LUNS on the target.  Again, this means if the miniport or the fibre does not notify the OS of a new target its devices will not appear on the system and the user will need to manually scan for the new disks.


    Editor's Note: The external links provided in this article contained accurate information at the time of publication.  For official information regarding the SCSI specification, please see the latest revision of the SAM working draft at http://www.t10.org/drafts.htm#sam4.  The T10 webpage may ask for personal information and agreements before giving access to the drafts, this is outside of the control of The NtDebugging Blog.

  • Ntdebugging Blog

    Updated Archive of the Debug Ninja’s Twitter Debug Tips


    Every Wednesday (usually) I post a debug tip to our twitter page at https://twitter.com/#!/ntdebugging. This blog is an archive of these tips to allow our readers to find this information easily. I will update this blog periodically with the new tips; follow us on twitter if you want to see the tips as I post them.

    The goal of these tips is to share debug commands, and forms of commands (parameters, flags, etc) that my colleagues and I find useful. I hope you can add these commands to your toolkit and they will help you debug more efficiently.


    !thread/!process [address] e - on x64 will not show you the meaningless Args to Child information.

    .frame /c [FrameNumber] - sets context to specified stack frame. Provides more reliable information than .trap on x64.

    kn - Dumps call stack with frame numbers, easier than counting stacks for .frame.

    .frame /r [FrameNumber] - same as .frame /c, but shows registers without changing context.

    Note: With .frame /c or /r you can only trust the nonvolatile registers. See http://msdn.microsoft.com/en-us/library/9z1stfyw(VS.80).aspx for vol/nonvol regs.

    k=rbp rip FrameCount - Dumps call stack starting at rbp/rip on x64. Useful when the stack is corrupt. #debug ^DN

    .process/.thread /p /r [address] - sets new process context, sets .cache forcedecodeuser, and reloads user symbols. #debug ^DebugNinja

    !process [address] 17 - Sets the context for this command, avoids the need for .process to see user stacks. Try !process 0 17 #debug ^DN

    ~~[ThreadID]s - Changes threads in user mode. Use Thread ID number from output such as !locks. Ex: ~~[1bd4]s #debug ^DN

    runas /netonly /u:<account> windbg.exe - Launch windbg with domain account. Use when dbg computer isn't in domain and symbol server is. ^DN

    !heap -p -a <address> - Shows information about the heap block containing <address>, even if you aren't using pageheap. #debug ^DN

    ub - Unassembles starting at a location prior to your address. Accepts l<number> to specify how many instructions to go back. ub . l20 ^DN

    !stacks 2 [FilterString] - Finds kernel mode call stacks that contain the FilterString in a symbol. #debug ^DN

    !thread [address] 17 (or 1e on x64) - Sets context for this command, avoids the need for .thread/.process for user stacks. #debug ^DN

    .hh [Text] - Opens the debugger help. [Text] is the topic to lookup in the index. Example: .hh !pte   #debug ^DN

    ?? can dump structs using C++ style expressions. Ex: ??((nt!_KTHREAD*)(0xfffffa800ea43bb0))->ApcState #debug ^DN

    bp /t EThread - Sets a kernel mode breakpoint that only triggers when hit in the context of this thread. #debug ^DN

    bp /p EProcess - Sets a kernel mode breakpoint that only triggers when hit in the context of this process. #debug ^DN

    gc - If you run 'p' and hit a breakpoint, gc takes you where p would have gone if you had not hit the bp.  #debug ^DN

    gu - Go until the current function returns.  Effectively this unwinds one stack frame.  #debug #windbg ^DN

    pc - Steps through until the next 'call' instruction. Combine with other commands to find who returned your error> pc;p;r eax #debug ^DN

    pt - Steps through until the next 'ret' instruction. Similar to gu, but pt stops on the ret and gu stops after the ret. #debug ^DN

    .ignore_missing_pages 1 - supresses the error: "Page 2a49 not present in the dump file. Type ".hh dbgerr004" for details" #debug ^DN

    .exr -1 shows the most recent exception.  Useful in user dumps of crashes, especially for no execute crashes (NX/DEP). #debug ^DN

    wt - Trace calls until they return to the current address. More useful with -or to get return values. Use -l for depth. ^DN #debug

    .thread /w - Changes to the WOW64 32-bit context from 64-bit kernel mode. Wow64exts doesn't work in kernel mode. #debug ^DN

    ??sizeof(structure) - Gets the size of a structure, it's easier than counting. #debug ^DN

    sxe ld:module.dll - Enables an exception which will break into the debugger when module.dll is loaded. #debug ^DN

    vertarget - Shows OS version of the debug target. Also shows machine name, uptime, and session time (when the dump was taken). #debug ^DN

    !vm 1 - In a kernel debugger, shows basic information about memory usage. Available, committed, pagefile, pool, sysptes, etc. #debug ^DN

    .time - Shows session time (when dump was taken) and system uptime. In user mode shows process uptime, kernel/user time. #debug ^DN

    ba w size [address] - Break on write access only. Replace size with the num bytes you want to watch. Ex: ba w 4 005d5f10 #debug ^DN

    .bugcheck - Displays the bugcheck code of a blue screen crash. The format is more concise than !analyze.  #debug ^DN

    .process -i <address> - Make the process active and break into the debugger. Use in live kernel debugs to get into process context. ^DN

    .reload /f /o - Overwrites cached files in your downstream symbol store.  Useful when your store has corrupt pdbs. #debug ^DN

    ->* - Use with dt to dump pointers. Example: dt _EPROCESS [Address] ObjectTable->*

    !for_each_module s -a @#Base @#End "PTag" - Find the drivers using pool tag "PTag". #debug ^DN

    .unload [DllName] - Unloads the debug extension you didn't intend to load. Omit DllName to unload the last dll loaded. #debug ^DN

    !exqueue dumps the executive worker queues.  Use flags 7f to dump the worker threads and the queues. #debug ^DN

    lmvm <module> - Dumps information about the module.  Remember to use <module> and not <module.dll>. #debug ^DN

  • Ntdebugging Blog

    A Classic Case of Whodunit


    Sometimes we encounter problems that just don't make sense.  I don't mean a high powered lawyer talking about Chewbacca, I mean sometimes computers do things that defy logic.


    The below bugcheck is one example.  At first glance, some people will blame the first third party code they see and declare "It's the anti-virus!"  That is a classic example of people defying logic, but this article is about computers defying logic, so there must be something else going on here.

    3: kd> .bugcheck

    Bugcheck code 00000050

    Arguments c73fdb0b 00000001 809327b8 00000000

    3: kd> k

    ChildEBP RetAddr

    b7b4d5a8 8085e6cd nt!KeBugCheckEx+0x1b

    b7b4d620 8088bc18 nt!MmAccessFault+0xb25

    b7b4d620 809327b8 nt!_KiTrap0E+0xdc

    b7b4d6bc 808ef973 nt!ObReferenceObjectByHandle+0x16e

    b7b4d75c 80888c7c nt!NtQueryInformationFile+0xcd

    b7b4d75c 8082ea49 nt!KiFastCallEntry+0xfc

    b7b4d7e8 b88db606 nt!ZwQueryInformationFile+0x11

    b7b4d864 b88db6c3 NAVAP+0x2e606

    b7b4d88c b88b30f6 NAVAP+0x2e6c3

    b7b4d8e0 b88b3338 NAVAP+0x60f6

    b7b4d900 b88b6a37 NAVAP+0x6338

    b7b4d948 b8993348 NAVAP+0x9a37

    b7b4d96c b8995af8 SYMEVENT!SYMEvent_GetSubTask+0x1438

    b7b4d9e8 b898fe32 SYMEVENT!EventObjectDestroy+0x338

    b7b4d9f8 b89963e8 SYMEVENT+0x4e32

    b7b4da48 8081dcdf SYMEVENT!EventObjectCreate+0x3e8

    b7b4da5c 808f8275 nt!IofCallDriver+0x45

    b7b4db44 80936b13 nt!IopParseDevice+0xa35

    b7b4dbc4 80932e04 nt!ObpLookupObjectName+0x5a9

    b7b4dc18 808ea231 nt!ObOpenObjectByName+0xea

    b7b4dc94 808eb4cb nt!IopCreateFile+0x447

    b7b4dcf0 808edf4a nt!IoCreateFile+0xa3

    b7b4dd30 80888c7c nt!NtCreateFile+0x30

    b7b4dd30 7c82ed54 nt!KiFastCallEntry+0xfc

    059293e0 00000000 0x7c82ed54


    The basic premise of all troubleshooting is logic.  I often use a series of questions to shape the logic for the problem I am investigating.  I start all blue screen debugs with the same question, "Why did the system crash?".  The answer to this question is usually in the bugcheck code.

    3: kd> .bugcheck

    Bugcheck code 00000050

    Arguments c73fdb0b 00000001 809327b8 00000000


    The debugger.chm help file has a description of this error under the topic "Bug Check 0x50: PAGE_FAULT_IN_NONPAGED_AREA".  It explains that this error happens when invalid memory is accessed, and it shows what the four bugcheck parameters mean.

    Parameter Description
    1 Memory address referenced
    2 0: Read operation
    1: Write operation
    3 Address that referenced memory (if known)
    4 reserved


    Interpreting our bugcheck code, address c73fdb0b was written to by the instruction at address 809327b8


    If you refer to the documentation on page fault handling in x86 you will see that the processor stores the address being faulted on in cr2 prior to calling the page fault handler.  We can use this to reconfirm the data in the bugcheck code.

    3: kd> r cr2

    Last set context:



    We can confirm that virtual address c73fdb0b really is invalid by looking at the PTE.

    3: kd> !pte c73fdb0b

                        VA c73fdb0b

    PDE at C06031C8            PTE at C0639FE8

    contains 000000021BB36963  contains 0000000000000000

    pfn 21bb36    -G-DA--KWEV  not valid


    The next question is why did the instruction at address 809327b8 attempt to write to c73fdb0b?  The call stack and trap frame can answer this question.

    3: kd> kv

    ChildEBP RetAddr  Args to Child

    b7b4d5a8 8085e6cd 00000050 c73fdb0b 00000001 nt!KeBugCheckEx+0x1b (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d620 8088bc18 00000001 c73fdb0b 00000000 nt!MmAccessFault+0xb25 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d620 809327b8 00000001 c73fdb0b 00000000 nt!_KiTrap0E+0xdc (FPO: [0,0] TrapFrame @ b7b4d638)

    b7b4d6bc 808ef973 000012e4 00000080 00000180 nt!ObReferenceObjectByHandle+0x16e (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d75c 80888c7c 000012e4 b7b4d858 b7b4d810 nt!NtQueryInformationFile+0xcd (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4d75c 8082ea49 000012e4 b7b4d858 b7b4d810 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b7b4d778)

    b7b4d7e8 b88db606 000012e4 b7b4d858 b7b4d810 nt!ZwQueryInformationFile+0x11 (FPO: [5,0,0])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    b7b4d864 b88db6c3 000012e4 e1e5ba38 00000000 NAVAP+0x2e606

    b7b4d88c b88b30f6 8b0bff88 b88b3040 b7b4d8fc NAVAP+0x2e6c3

    b7b4d8e0 b88b3338 8b0bff88 00003f80 b88b3040 NAVAP+0x60f6

    b7b4d900 b88b6a37 00000000 b7b4d9ac b88b6a42 NAVAP+0x6338

    b7b4d948 b8993348 e11998b8 b7b4d9a4 00000001 NAVAP+0x9a37

    b7b4d96c b8995af8 00000000 b7b4d9ac b8997526 SYMEVENT!SYMEvent_GetSubTask+0x1438

    b7b4d9e8 b898fe32 b7b4da2c e162be44 b7b4da2c SYMEVENT!EventObjectDestroy+0x338

    b7b4d9f8 b89963e8 b7b4da2c 8b60cc50 b7b4da2c SYMEVENT+0x4e32

    b7b4da48 8081dcdf 8b916f10 8b3557c8 8b3557c8 SYMEVENT!EventObjectCreate+0x3e8

    b7b4da5c 808f8275 b7b4dc04 8cb7dcb0 00000000 nt!IofCallDriver+0x45 (FPO: [Non-Fpo]) (CONV: fastcall)

    b7b4db44 80936b13 8cb7dcc8 00000000 8b452008 nt!IopParseDevice+0xa35 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dbc4 80932e04 00000000 b7b4dc04 00000040 nt!ObpLookupObjectName+0x5a9 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dc18 808ea231 00000000 00000000 2775a801 nt!ObOpenObjectByName+0xea (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dc94 808eb4cb 059293e8 80100080 05929384 nt!IopCreateFile+0x447 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dcf0 808edf4a 059293e8 80100080 05929384 nt!IoCreateFile+0xa3 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dd30 80888c7c 059293e8 80100080 05929384 nt!NtCreateFile+0x30 (FPO: [Non-Fpo]) (CONV: stdcall)

    b7b4dd30 7c82ed54 059293e8 80100080 05929384 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b7b4dd64)

    059293e0 00000000 00000000 00000000 00000000 0x7c82ed54

    3: kd> .trap b7b4d638

    ErrCode = 00000002

    eax=00000000 ebx=8b40af78 ecx=00000180 edx=b7b4d6e8 esi=8b81bb40 edi=e5c265c8

    eip=809327b8 esp=b7b4d6ac ebp=b7b4d6bc iopl=0         nv up ei ng nz na pe nc

    cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010286


    809327b8 894a04          mov     dword ptr [edx+4],ecx ds:0023:b7b4d6ec=00000000


    Reading the instruction that we failed on, at address 809327b8, we can see that it dereferenced edx+4 where edx is b7b4d6e8.  This would result in a write to the memory at address b7b4d6ec.

    3: kd> !pte b7b4d6ec

                        VA b7b4d6ec

    PDE at C0602DE8            PTE at C05BDA68

    contains 000000010BA77863  contains 00000001BB583963

    pfn 10ba77    ---DA--KWEV  pfn 1bb583    -G-DA--KWEV


    3: kd> dd b7b4d6ec

    b7b4d6ec  00000000 00000000 00000000 00000000

    b7b4d6fc  00000000 b7b4d804 808ef01e 00000098

    b7b4d70c  00000005 00000000 8b81bb40 808ef045

    b7b4d71c  b7b4d86c 00000000 b7b4d838 b7b4d800

    b7b4d72c  00000000 00000000 00000007 00000001

    b7b4d73c  00000000 00000000 b7b4d6dc 00000000

    b7b4d74c  ffffffff 80880c90 80802b70 ffffffff

    b7b4d75c  b7b4d778 80888c7c 000012e4 b7b4d858


    This is where the logic starts to break down.  The code wanted to write to b7b4d6ec, which is a valid address.  The bugcheck code and cr2 say we failed writing to address c73fdb0b.  This does not make sense.


    The analogy I often use for scenarios such as this one is: If I ask my intern to get me a Mountain Dew from the break room, and he comes back to say we are out of coffee, am I at fault or is my intern broken?  Applying the same logic to this crash, if ObReferenceObjectByHandle asks the hardware to write to address b7b4d6ec and the hardware came back saying it cannot write to address c73fdb0b, is the software at fault or is the hardware broken?  Clearly the hardware is broken if it does not do what the software asks of it.


    In this instance, the customer replaced the processor and afterwards the system was stable.

  • Ntdebugging Blog

    NTFS And 4K Disks


    Since the 1960’s, hard disks have always used a block size of 512 bytes for the default read/write block size.  Recently drive manufacturers have been moving toward a larger block size to improve performance and reliability.  Currently there are two types of disks available with a 4KB sector size: 512 byte emulated, and 4KB block sized disks.


    Disks with 4KB block size and 512 bytes per sector emulation

    For performance reasons, drive manufacturers have already produced disks with 4KB native block size, which use firmware to emulate 512 bytes per sector.  Because of the emulated 512 byte sector size, the file system and most disk utilities will be blissfully unaware that they are running on a 4KB disk.  As a result, the on-disk structures will be completely unaffected by the underlying 4KB block size.  This allows for improved performance without altering the bytes per sector presented to the file system.  These disks are referred to as 512e (pronounced “five-twelve-eee”) disks.


    Disks with 4KB block size without emulation

    When the logical bytes per sector value is extended to 4KB without emulation, the actual file system will have to adjust to this new environment.  Actually, NTFS is already capable of functioning in this environment provided that no attached FS filter drivers make false assumptions about sector size.  Below are the highlights of what you should expect to see on a disk with a 4KB logical sector size.


    1. It will not be possible to format with a cluster size that is smaller than the 4KB native block size.  This is because cluster size is defined as a multiple of sector size.  This multiple will always be expressed as 2n .

    2. File records will assume the size of the logical block size of 4KB, rather than the previous size of 1KB.  This actually improves scalability to some degree, but the down-side is that each NTFS file record will require 4KB or more in the MFT.

    3. Sparse and compressed files will continue to have 16 clusters per compression unit.

    4. Since file records are 4 times their normal size, it will be possible to encode more mapping pairs per file record.  As a result, larger files can be compressed with NTFS compression without running into file system limitations.

    5. Since the smallest allowable cluster size is 4KB, NTFS compression will only work on volumes with a 4KB cluster size.

    6. Bytes per index record will be unaffected by the 4K block size since all index records are 4KB in size.  The on-disk folder directory structures will be completely unaffected by the new block size, but a performance increase may be seen while accessing folder structure metadata.

    7. The BIOS Parameter Block (BPB) will continue to have the same format as before, but the only positive value for clusters per File Record Segment (FRS) will be 1.  In the case where clusters per FRS is 1, the FRS byte size is computed by the following equation:



    NTFS BIOS Parameter Block Information


      BytesPerSector      :        4096

      Sectors Per Cluster :           1

      ReservedSectors     :           0

      Fats                :           0

      RootEntries         :           0

      Small Sectors       :           0 ( 0 MB )

      Media Type          :         248 ( 0xf8 )

      SectorsPerFat       :           0

      SectorsPerTrack     :          63

      Heads               :         255

      Hidden Sectors      :          64

      Large Sectors       :           0 ( 0 MB )


      ClustersPerFRS      :           1

      Clust/IndxAllocBuf  :           1

      NumberSectors       :                50431 ( 196.996 MB )

      MftStartLcn         :                16810

      Mft2StartLcn        :                    2

      SerialNumber        :  8406742282501311868

      Checksum            :                    0 (0x0)


    If the cluster size is larger than the FRS size, then ClustersPerFrs will be a negative number as shown in the example below (0xf4 is -12 decimal).  In this case, the record size is computed with the equation:



    In short, NTFS will always force a 4096 byte cluster size on disk with a 4KB sector size regardless of the cluster size.


    NTFS BIOS Parameter Block Information


      BytesPerSector      :        4096

      Sectors Per Cluster :           4

      ReservedSectors     :           0

      Fats                :           0

      RootEntries         :           0

      Small Sectors       :           0 ( 0 MB )

      Media Type          :         248 ( 0xf8 )

      SectorsPerFat       :           0

      SectorsPerTrack     :          63

      Heads               :         255

      Hidden Sectors      :          64

      Large Sectors       :           0 ( 0 MB )


      ClustersPerFRS      :         f4

      Clust/IndxAllocBuf  :         f4

      NumberSectors       :                50431 ( 196.996 MB )

      MftStartLcn         :                 4202

      Mft2StartLcn        :                    1

      SerialNumber        :  7270585088516976380

      Checksum            :                    0 (0x0)


    8. Aside from the 4KB file record size, there are a few other things to know about 4KB drives.  The code for implementing update sequence arrays (USA’s) has always worked on a 512 byte assumed sector size and it will continue to do so.  Since file records are 4 times their normal size, the update sequence arrays for file records now contain 9 entries instead of 3.  One array entry is required for the sequence number (blue) and eight array entries for the trailing bytes (red).  The original purpose of USA is to allow NTFS to detect torn writes.  Since the file record size is now equal to the block size, the hardware is capable of writing the entire file record at once, rather than in two parts.


        _MULTI_SECTOR_HEADER MultiSectorHeader {

                ULONG      Signature             : 0x454c4946 "FILE"

                USHORT     SequenceArrayOffset   : 0x0030

                USHORT     SequenceArraySize     : 0x0009




    0x0000   46 49 4c 45 30 00 09 00-dd 24 10 00 00 00 00 00   FILE0...Ý$.....

    0x0010   01 00 01 00 48 00 01 00-b0 01 00 00 00 10 00 00   ....H...°......

    0x0020   00 00 00 00 00 00 00 00-06 00 00 00 00 00 00 00   ................

    0x0030   02 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0040   00 00 00 00 00 00 00 00-10 00 00 00 60 00 00 00   ...........`...

    0x0050   00 00 18 00 00 00 00 00-48 00 00 00 18 00 00 00   .......H......

    0x0060   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x0070   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x0080   06 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0090   00 00 00 00 00 01 00 00-00 00 00 00 00 00 00 00   ................

    0x00a0   00 00 00 00 00 00 00 00-30 00 00 00 68 00 00 00   ........0...h...

    0x00b0   00 00 18 00 00 00 03 00-4a 00 00 00 18 00 01 00   .......J......

    0x00c0   05 00 00 00 00 00 05 00-f8 f1 5b 89 36 d2 cb 01   ........øñ[‰6ÒË.

    0x00d0   f8 f1 5b 89 36 d2 cb 01-f8 f1 5b 89 36 d2 cb 01   øñ[‰6ÒË.øñ[‰6ÒË.

    0x00e0   f8 f1 5b 89 36 d2 cb 01-00 00 01 00 00 00 00 00   øñ[‰6ÒË.........

    0x00f0   00 00 01 00 00 00 00 00-06 00 00 00 00 00 00 00   ................

    0x0100   04 03 24 00 4d 00 46 00-54 00 00 00 00 00 00 00   ..$.M.F.T.......

    0x0110   80 00 00 00 48 00 00 00-01 00 40 00 00 00 01 00   €...H.....@.....

    0x0120   00 00 00 00 00 00 00 00-ff 00 00 00 00 00 00 00   ........ÿ.......

    0x0130   40 00 00 00 00 00 00 00-00 00 10 00 00 00 00 00   @..............

    0x0140   00 00 10 00 00 00 00 00-00 00 10 00 00 00 00 00   ..............

    0x0150   22 00 01 aa 41 00 ff ff-b0 00 00 00 50 00 00 00   "..ªA.ÿÿ°...P...

    0x0160   01 00 40 00 00 00 05 00-00 00 00 00 00 00 00 00   ..@.............

    0x0170   01 00 00 00 00 00 00 00-40 00 00 00 00 00 00 00   ........@.......

    0x0180   00 20 00 00 00 00 00 00-08 10 00 00 00 00 00 00   . .............

    0x0190   08 10 00 00 00 00 00 00-21 01 a9 41 21 01 fd fd   .......!.©A!.ýý

    0x01a0   00 69 b4 05 80 fa ff ff-ff ff ff ff 00 00 00 00   .i´.€úÿÿÿÿÿÿ....

    0x01b0   00 00 10 00 00 00 00 00-22 00 01 aa 41 00 ff ff   ......."..ªA.ÿÿ

    0x01c0   b0 00 00 00 50 00 00 00-01 00 40 00 00 00 05 00   °...P.....@.....

    0x01d0   00 00 00 00 00 00 00 00-01 00 00 00 00 00 00 00   ................

    0x01e0   40 00 00 00 00 00 00 00-00 20 00 00 00 00 00 00   @........ ......

    0x01f0   08 10 00 00 00 00 00 00-08 10 00 00 00 00 02 00   ..............




    0x03c0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x03f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x05d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x05e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x05f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x07d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x07e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x07f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x09d0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x09e0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x09f0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x0bd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0be0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0bf0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x0dd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0de0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0df0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................




    0x0fd0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0fe0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................

    0x0ff0   00 00 00 00 00 00 00 00-00 00 00 00 00 00 02 00   ................


    I don’t actually own a 4KB disk, but I was able to give you this preview thanks to a nifty tool called VStorControl.  Vstor is a tool which allows you to create virtualized SCSI disks with arbitrary block sizes and is available for download with the Windows 7 SDK.


    That’s all for now,

    Dennis Middleton “The NTFS Doctor”

  • Ntdebugging Blog

    What Is In A RHS Dump File Created By Windows Error Reporting


    Hello all, East here.  I wanted to give you a hint on how to use a RHS dump to find what thread was part of the Windows Server 2008 R2 cluster RHS recovery deadlock.


    First let me start off with letting you know that Windows Server 2008 R2 will create two types of user-mode dumps:

    1 - A heap dump (.hdmp) - which is an extended mini-dump that contains additional data such as the process memory.

    2 - A mini dump (.mdmp) - which will only contain mainly stack information and not enough info to help you get further.


    When a Windows Server 2008 R2 cluster has a resource that does not respond to one of the main cluster queries in a timely fashion, RHS will try to recover that resource by restarting it in an isolated process as well as creating a WER report and a user-mode dump of the deadlock.


    NOTE: More info about RHS and WER reports can be seen in the following blog - http://blogs.msdn.com/b/clustering/archive/2009/06/27/9806160.aspx


    When a deadlock is detected the cluster log will show:

    "00000dd4.00000aac::2011/05/13-16:12:39.559 ERR   [RHS] WER report is submitted. Result : WerReportQueued."


    You should find the WER report file and dumps located at the default location:

    <SystemDrive>\Users\All Users\Microsoft\Windows\WER\ReportQueue


    Here is what a cluster created Windows Error Report looks like:


    EventType=WSFC Resource Deadlock






    Sig[0].Name=Resource Name

    Sig[0].Value=New flexdisk

    Sig[1].Name=Resource Type


    Sig[2].Name=Call Type


    DynamicSig[1].Name=OS Version


    DynamicSig[2].Name=Locale ID



    File[0].Path=WERC1F2.tmp.hdmp <----------------------------- Name of the dump file   









    FriendlyEventName=WSFC Resource Deadlock

    ConsentKey=WSFC Resource Host Monitor

    AppName=Failover Cluster Resource Host Subsystem <--------- The application that had an issue - RHS


    ReportDescription=WSFC Resource Deadlock


    Now you have the area to look for in the cluster log, and an idea of which resource caused RHS to deadlock, and what it was doing at the time:


    The Resource Name = New flexdisk

    The Resource Type = flexdisk

    The type of call the resource was engaged in = ISALIVE


    You have all this information in the cluster log as well:

    00000dd4.00000aac::2011/05/13-16:12:39.136 ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'New flexdisk'.  RHS deadlock during an Isalive check for the New flexdisk resource.

    00000dd4.00000aac::2011/05/13-16:12:39.136 INFO  [RHS] Enabling RHS termination watchdog with timeout 1200000 and recovery action 3.

    00000dd4.00000aac::2011/05/13-16:12:39.136 ERR   [RHS] Resource New flexdisk handling deadlock. Cleaning current operation and terminating RHS process.

    00000dd4.00000aac::2011/05/13-16:12:39.136 ERR   [RHS] About to send WER report.

    000004dc.00000fb4::2011/05/13-16:12:39.136 WARN  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'New flexdisk', gen(3) result 4.

    000004dc.00000fb4::2011/05/13-16:12:39.136 INFO  [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'New flexdisk' consecutive failure count 1.

    00000dd4.00000aac::2011/05/13-16:12:39.559 ERR   [RHS] WER report is submitted. Result : WerReportQueued.  WER report and user-mode dump created

    000004dc.00000fb4::2011/05/13-16:12:39.559 ERR   [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 3540 / 0xdd4

    000004dc.00000fb4::2011/05/13-16:12:39.575 INFO  [RCM] Created monitor process 3280 / 0xcd0

    00000cd0.00000200::2011/05/13-16:12:39.575 INFO  [RHS] Initializing.

    000004dc.00000fb4::2011/05/13-16:12:39.591 INFO  [RCM] rcm::RcmResource::ReattachToMonitorProcess: (New flexdisk, OfflineCallIssued)

    000004dc.00000fb4::2011/05/13-16:12:39.591 WARN  [RCM] Canceling pending control GET_RO_PRIVATE_PROPERTIES for resource 'New flexdisk' due to monitor crash.


    Now if you want to get adventurous and review the dump file to see the call stacks of RHS and what the resource was doing, you will look at the WER dump with the extension “.hdmp”.  Since we are looking at a process with multiple threads, we need to find the one that is showing the WER reporting calls and the one showing the RHS deadlock.


    Run the following command after opening the dump under your debugger “~*k”.  In the below example, thread 3 is the one we are interested in. You can tell this because the process and thread ID matches the cluster log above.  If you don’t have a cluster log, this is the thread with wer.dll on the stack.

    0:000> ~*k


    .  0  Id: dd4.9d4 Suspend: 0 Teb: 000007ff`fffdd000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`001cf128 000007fe`fd611420 ntdll!ZwWaitForMultipleObjects+0xa

    00000000`001cf130 00000000`772c1220 KERNELBASE!WaitForMultipleObjectsEx+0xe8

    00000000`001cf230 00000000`ff763d53 kernel32!WaitForMultipleObjects+0xb0

    00000000`001cf2c0 00000000`ff74802c rhs!cxl::WaitHandle::WaitAny+0x3f

    00000000`001cf3b0 00000000`ff7458a1 rhs!Rhs::WaitForTermination+0x74

    00000000`001cf450 00000000`ff769099 rhs!wmain+0x4b1

    00000000`001cfbd0 00000000`772c652d rhs!__wmainCRTStartup+0x13d

    00000000`001cfc10 00000000`774fc521 kernel32!BaseThreadInitThunk+0xd

    00000000`001cfc40 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


       1  Id: dd4.d0c Suspend: 0 Teb: 000007ff`fffdb000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`00c3fa98 000007fe`fd611420 ntdll!ZwWaitForMultipleObjects+0xa

    00000000`00c3faa0 00000000`772c1220 KERNELBASE!WaitForMultipleObjectsEx+0xe8

    00000000`00c3fba0 00000000`ff763d53 kernel32!WaitForMultipleObjects+0xb0

    00000000`00c3fc30 00000000`ff75785c rhs!cxl::WaitHandle::WaitAny+0x3f

    00000000`00c3fd20 00000000`ff766988 rhs!PingAliveCallScheduler::Run+0x184

    00000000`00c3fda0 00000000`772c652d rhs!cxl::Thread::ThreadStartRoutine+0x18

    00000000`00c3fdf0 00000000`774fc521 kernel32!BaseThreadInitThunk+0xd

    00000000`00c3fe20 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


       2  Id: dd4.e78 Suspend: 0 Teb: 000007ff`fffd9000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`00baf998 00000000`774eb007 ntdll!ZwWaitForMultipleObjects+0xa

    00000000`00baf9a0 00000000`772c652d ntdll!TppWaiterpThread+0x14d

    00000000`00bafc40 00000000`774fc521 kernel32!BaseThreadInitThunk+0xd

    00000000`00bafc70 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


       3  Id: dd4.aac Suspend: 0 Teb: 000007ff`fffd7000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`00cbe5e8 000007fe`fd611420 ntdll!ZwWaitForMultipleObjects+0xa

    00000000`00cbe5f0 00000000`772d2d53 KERNELBASE!WaitForMultipleObjectsEx+0xe8

    00000000`00cbe6f0 00000000`773e8f7d kernel32!WaitForMultipleObjectsExImplementation+0xb3

    00000000`00cbe780 00000000`773e62b2 user32!RealMsgWaitForMultipleObjectsEx+0x12a

    00000000`00cbe820 00000000`773e62e0 user32!MsgWaitForMultipleObjectsEx+0x46

    00000000`00cbe860 000007fe`f6f3c35c user32!MsgWaitForMultipleObjects+0x20

    00000000`00cbe8a0 000007fe`f6f6dbd0 wer!UtilMsgWaitForMultipleObjects+0x54

    00000000`00cbe930 000007fe`f6f54158 wer!CReportManager::ReportProblemOutOfProcess+0x5c4

    00000000`00cbecc0 000007fe`f6f3b99b wer!CReportManager::ReportProblem+0x1eb

    00000000`00cbede0 000007fe`f6f61cbe wer!CReportHandleInstance::SubmitReport+0x386

    00000000`00cbee50 00000000`ff750dcf wer!WerReportSubmit+0x16a

    00000000`00cbeef0 00000000`ff75117b rhs!RhsResource::WERReportDeadlock+0x43f

    00000000`00cbfaf0 00000000`ff74b270 rhs!RhsResource::HandleDeadlock+0x187

    00000000`00cbfb60 00000000`772c652d rhs!RhsCall::DeadlockMonitor+0x324

    00000000`00cbfc60 00000000`774fc521 kernel32!BaseThreadInitThunk+0xd

    00000000`00cbfc90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


       4  Id: dd4.e48 Suspend: 0 Teb: 000007ff`fffae000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`00e2f0f8 000007fe`fd611203 ntdll!ZwDelayExecution+0xa

    00000000`00e2f100 000007fe`f78f1ad9 KERNELBASE!SleepEx+0xab

    00000000`00e2f1a0 00000000`00326aa0 FlexRes+0x11ad9

    00000000`00e2f1a8 00000000`00000000 0x326aa0


    <snipped to save space>


    We will switch context to thread 3 by using the “ ~3s ” command, and follow that with “ kn “ to show the stack with frame numbers.  Debug Tip: Since we already know that its thread id is “aac” we can also switch directly to thread 3 by using its thread id in the following command syntax:  “ ~~[aac]s ”

    0:003> kn

    # Child-SP          RetAddr           Call Site

    00 00000000`00cbe5e8 000007fe`fd611420 ntdll!NtWaitForMultipleObjects+0xa

    01 00000000`00cbe5f0 00000000`772d2d53 KERNELBASE!WaitForMultipleObjectsEx+0xe8

    02 00000000`00cbe6f0 00000000`773e8f7d kernel32!WaitForMultipleObjectsExImplementation+0xb3

    03 00000000`00cbe780 00000000`773e62b2 user32!RealMsgWaitForMultipleObjectsEx+0x12a

    04 00000000`00cbe820 00000000`773e62e0 user32!MsgWaitForMultipleObjectsEx+0x46

    05 00000000`00cbe860 000007fe`f6f3c35c user32!MsgWaitForMultipleObjects+0x20

    06 00000000`00cbe8a0 000007fe`f6f6dbd0 wer!UtilMsgWaitForMultipleObjects+0x54

    07 00000000`00cbe930 000007fe`f6f54158 wer!CReportManager::ReportProblemOutOfProcess+0x5c4

    08 00000000`00cbecc0 000007fe`f6f3b99b wer!CReportManager::ReportProblem+0x1eb

    09 00000000`00cbede0 000007fe`f6f61cbe wer!CReportHandleInstance::SubmitReport+0x386

    0a 00000000`00cbee50 00000000`ff750dcf wer!WerReportSubmit+0x16a

    0b 00000000`00cbeef0 00000000`ff75117b rhs!RhsResource::WERReportDeadlock+0x43f

    0c 00000000`00cbfaf0 00000000`ff74b270 rhs!RhsResource::HandleDeadlock+0x187

    0d 00000000`00cbfb60 00000000`772c652d rhs!RhsCall::DeadlockMonitor+0x324

    0e 00000000`00cbfc60 00000000`774fc521 kernel32!BaseThreadInitThunk+0xd

    0f 00000000`00cbfc90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d


    The frame of our concern will be frame d. You can either go the context ( /c ) or just display the registers  ( /r )  for that frame context:

    0:003> .frame /r d

    0d 00000000`00cbfb60 00000000`772c652d rhs!RhsCall::DeadlockMonitor+0x324

    rax=000007fffffd7000 rbx=0000000000346920 rcx=00000000007316c0

    rdx=0000000000000000 rsi=000000000035f8b0 rdi=000000000035f8b0

    rip=00000000ff74b270 rsp=0000000000cbfb60 rbp=0000000000000000

    r8=00000000ffffffff  r9=00000000000004ff r10=0000000000000000

    r11=0000000000cbe810 r12=0000000000000000 r13=0000000000000000

    r14=0000000000000000 r15=000000004dcd5877

    iopl=0         nv up ei pl zr na po nc

    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246


    00000000`ff74b270 448b442430      mov     r8d,dword ptr [rsp+30h] ss:00000000`00cbfb90=ffffffff


    Here is where we get into tricky territory about where to look. I know from experience that the RDI register from above is going to end up containing my information for this resource. Another way to verify you are in the right area, dumping RDI will show the CallType.


    Dump that out to find the thread containing the stack with:

    0:003> dqs 000000000035f8b0

    00000000`0035f8b0  00000000`ff704eb8 rhs!IsAliveCall::`vftable' == CallType

    00000000`0035f8b8  00000000`00000004

    00000000`0035f8c0  00000000`001e7da0

    00000000`0035f8c8  00000000`00000e48 <-- Thread id of concerning resource

    00000000`0035f8d0  00000000`00346920

    00000000`0035f8d8  0000000c`00000001

    00000000`0035f8e0  00000000`00000003

    00000000`0035f8e8  00000000`4dcd5876

    00000000`0035f8f0  00000000`048d2dd3

    00000000`0035f8f8  00000000`00000100

    00000000`0035f900  00000001`00000100

    00000000`0035f908  00000000`00000000

    00000000`0035f910  00430000`00000031

    00000000`0035f918  8a0000a4`52e48988

    00000000`0035f920  004c0041`00530049

    00000000`0035f928  00450045`00560049


    If we scan our previous “~*k” output, we should find the thread id above to match our thread with the problem resource, FlexRes:

       4  Id: dd4.e48 Suspend: 0 Teb: 000007ff`fffae000 Unfrozen

    Child-SP          RetAddr           Call Site

    00000000`00e2f0f8 000007fe`fd611203 ntdll!ZwDelayExecution+0xa

    00000000`00e2f100 000007fe`f78f1ad9 KERNELBASE!SleepEx+0xab

    00000000`00e2f1a0 00000000`00326aa0 FlexRes+0x11ad9

    00000000`00e2f1a8 00000000`00000000 0x326aa0


    In this case I was using an in house test app that just went to sleep instead of responding to the LooksAlive() check, which caused IsAlive() to fail.


    Also if you check the cluster log 5 minutes prior to the RhsCall::DeadlockMonitor error you will see the resource and thread id as well, showing what was going on prior to the Deadlock:

    00000dd4.00000e48::2011/05/13-16:07:37.270 INFO  [RES] flexdisk <New flexdisk>: Set new entry point  = ISALIVEERROR:1722,ISALIVEFAILTYPE:1,ISALIVEHANG:1 Error = 0.

    000004dc.00000fd4::2011/05/13-16:07:37.286 WARN  [RCM] ResourceControl(SET_PRIVATE_PROPERTIES) to New flexdisk returned 5024.

    00000dd4.00000e48::2011/05/13-16:07:38.023 INFO  [RES] flexdisk <New flexdisk>: Set new entry point  = ISALIVEERROR:1722,ISALIVEFAILTYPE:1,ISALIVEHANG:0 Error = 0.

    00000dd4.00000e48::2011/05/13-16:07:38.023 INFO  [RES] flexdisk <New flexdisk>: Going to Hang in IsAlive.


    In most cases the cluster log can provide enough information to help you narrow down possible cluster resource issues.  If not, the user-mode dump may be able to help verify additional data, especially if you are debugging a third party resource dll that is deadlocking RHS.

  • Ntdebugging Blog

    Understanding Storage Timeouts and Event 129 Errors


    Greetings fellow debuggers, today I will be blogging about Event ID 129 messages.  These warning events are logged to the system event log with the storage adapter (HBA) driver’s name as the source.  Windows’ STORPORT.SYS driver logs this message when it detects that a request has timed out, the HBA driver’s name is used in the error because it is the miniport associated with storport.


    Below is an example 129 event:


    Event Type:       Warning

    Event Source:     <HBA_Name>

    Event Category:   None

    Event ID:         129

    Date:             4/9/2009

    Time:             1:15:45 AM

    User:             N/A

    Computer:         <Computer_Name>


    Reset to device, \Device\RaidPort1, was issued.


    So what does this mean?  Let’s discuss the Windows I/O stack architecture to answer this.


    Windows I/O uses a layered architecture where device drivers are on a “device stack.”  In a basic model, the top of the stack is the file system.  Next is the volume manager, followed by the disk driver.  At the bottom of the device stack are the port and miniport drivers.  When an I/O request reaches the file system, it takes the block number of the file and translates it to an offset in the volume. The volume manager then translates the volume offset to a block number on the disk and passes the request to the disk driver.  When the request reaches the disk driver it will create a Command Descriptor Block (CDB) that will be sent to the SCSI device.  The disk driver imbeds the CDB into a structure called the SCSI_REQUEST_BLOCK (SRB).  This SRB is sent to the port driver as part of the I/O request packet (IRP).


    The port driver does most of the request processing.  There are different port drivers depending on the architecture.  For example, ATAPORT.SYS is the port driver for ATA devices, and STORPORT.SYS is the port driver for SCSI devices.  Some of the responsibilities for a port driver are:

    • Providing timing services for requests.
    • Enforcing queue depth (making sure that a device does not get more requests that it can handle).
    • Building scatter gather arrays for data buffers.

    The port driver interfaces with a driver called the “miniport”.  The miniport driver is designed by the hardware manufacturer to work with a specific adapter and is responsible for taking requests from the port driver and sending them to the target LUN.  The port driver calls the HwStorStartIo() function to send requests to the miniport, and the miniport will send the requests to the HBA so they can be sent over the physical medium (fibre, ethernet, etc) to the LUN.  When the request is complete, the miniport will call StorPortNotification() with the NotificationType parameter value set to RequestComplete, along with a pointer to the completed SRB.


    When a request is sent to the miniport, STORPORT will put the request in a pending queue.  When the request is completed, it is removed from this queue.  While requests are in the pending queue they are timed. 


    The timing mechanism is simple.  There is one timer per logical unit and it is initialized to -1.  When the first request is sent to the miniport the timer is set to the timeout value in the SRB.  The disk timeout value is a tunable parameter in the registry at: HKLM\System\CurrentControlSet\Services\Disk\TimeOutValue.  Some vendors will tune this value to best match their hardware, we do not advise changing this value without guidance from your storage vendor.


    The timer is decremented once per second.  When a request completes, the timer is refreshed with the timeout value of the head request in the pending queue.  So, as long as requests complete the timer will never go to zero.  If the timer does go to zero, it means the device has stopped responding.  That is when the STORPORT driver logs the Event ID 129 error.  STORPORT then has to take corrective action by trying to reset the unit.  When the unit is reset, all outstanding requests are completed with an error and they are retried.  When the pending queue empties, the timer is set to -1 which is its initial value.


    Each SRB has a timer value set.  As requests are completed the queue timer is refreshed with the timeout value of the SRB at the head of the list.


    The most common causes of the Event ID 129 errors are unresponsive LUNs or a dropped request.  Dropped requests can be caused by faulty routers or other hardware problems on the SAN.


    I have never seen software cause an Event ID 129 error.  If you are seeing Event ID 129 errors in your event logs, then you should start investigating the storage and fibre network.

Page 7 of 24 (238 items) «56789»