• Ntdebugging Blog

    Leaving the Do Not Disturb Sign on the Door Will Cause the KERNEL_APC_PENDING_DURING_EXIT Bugcheck

    • 1 Comments

    This is Ron Stock from the Global Escalation Services team and I recently worked with a customer to determine which misbehaving driver was crashing their critical server. This particular crash was a STOP 0x00000020 which maps to KERNEL_APC_PENDING_DURING_EXIT.

     

    The KERNEL_APC_PENDING_DURING_EXIT bugcheck type indicates the APC disable count for a thread was not equal to zero when the thread exited. The APC disable count is a field in the _KTHREAD structure and it is decremented when drivers disable APCs by calling functions such as KeEnterCriticalRegion, FsRtlEnterFileSystem or by acquiring a mutex. Disabling APC delivery to a thread is the equivalent of hanging the “Do Not Disturb” sign on your door. When drivers need to perform a critical operation they ‘hang the sign on the door’ to prevent interruption from APCs. When the same driver fails to ‘take the sign off the door’ by calling KeLeaveCriticalRegion, FsRtlExitFileSystem or KeReleaseMutex, the APC disable count is never incremented back to its original value. This forgetful behavior causes a bugcheck because the APC disable count is checked when the thread is exiting. The OS expects this value to be zero on thread exit.

     

    In my case the value was 0xffff (negative 1) indicating a driver had forgot to remove the ‘Do Not Disturb’ sign.

     

    kd> !analyze –v

     

    KERNEL_APC_PENDING_DURING_EXIT (20)

    The key data item is the thread's APC disable count.

    If this is non-zero, then this is the source of the problem.

    Arguments:

    Arg1: 0000000000000000, The address of the APC found pending during exit.

    Arg2: 000000000000ffff, The thread's APC disable count

    Arg3: 0000000000000000, The current IRQL

    Arg4: 0000000000000001

     

    Because the value is decremented earlier in time the current call stack is not particularly useful. It merely shows the thread exiting under normal conditions.

     

    0: kd> !thread -1 e

    THREAD fffffa8049f04b50  Cid 0004.0998  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 fffff8a000007ee0

    Owning Process            fffffa8048cad9e0       Image:         System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      11503325       Ticks: 0

    Context Switch Count      185715         IdealProcessor: 0

    UserTime                  00:00:00.000

    KernelTime                00:00:06.078

    Win32 Start Address srv2!SrvProcWorkerThread(0xfffff88003c4b400)

    Stack Init fffff88005078db0 Current fffff880050789b0

    Base fffff88005079000 Limit fffff88005073000 Call 0

    Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`05078b08 fffff800`01984bd9 nt!KeBugCheckEx

    fffff880`05078b10 fffff800`019a1a3d nt!PspExitThread+0xffffffff`fffe3ae9

    fffff880`05078c10 fffff800`0195bc8a nt!PspTerminateThreadByPointer+0x4d

    fffff880`05078c60 fffff880`03c56769 nt!PsTerminateSystemThread+0x22

    fffff880`05078c90 fffff880`03c4b5b6 srv2!SrvProcTerminateWorkerThreadInternal+0x99

    fffff880`05078cc0 fffff800`01966e5a srv2!SrvProcWorkerThread+0x1b6

    fffff880`05078d40 fffff800`016c0d26 nt!PspSystemThreadStartup+0x5a

    fffff880`05078d80 00000000`00000000 nt!KxStartSystemThread+0x16

     

    Driver Verifier is the ideal tool for this type of bugcheck. It has a feature called Critical Region logging which tracks the call stack and KTRHEAD value for each call to either KeEnterCriticalRegion() and KeLeaveCriticalRegion(). I had the customer enable this logging by selecting the “Miscellaneous checks” option in Driver Verifier using these steps-

    • Run Verifier.exe
    • Select “Create custom settings (For code developers)”
    • Select individual settings from a full list
    • Select Miscellaneous checks
    • Select Driver Names from a list
    • Manually choose all of the third-party drivers.
    • Reboot after making the changes. 

     

    After running through the steps above, we gathered another STOP 0x00000020 dump. I confirmed the “Miscellaneous checks” option was enabled by using the !verifier command

     

    0: kd> !verifier

     

    Verify Level 800 ... enabled options are:

          Miscellaneous checks enabled

     

    The stack in this new dump was in the same SrvProcWorkerThread thread exit path so we had a consistent pattern. The thread with the negative APC Disable count was fffffa804b5be040.

     

    0: kd> !thread -1 e

    THREAD fffffa804b5be040 Cid 0004.082c  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 fffff8a000007ee0

    Owning Process            fffffa8048cad9e0       Image:         System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      4458237        Ticks: 0

    Context Switch Count      36067          IdealProcessor: 0            

    UserTime                  00:00:00.000

    KernelTime                00:00:01.218

    Win32 Start Address srv2!SrvProcWorkerThread(0xfffff88004827400)

    Stack Init fffff88005cc6db0 Current fffff88005cc69b0

    Base fffff88005cc7000 Limit fffff88005cc1000 Call 0

    Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`05cc6b08 fffff800`0198dbd9 nt!KeBugCheckEx

    fffff880`05cc6b10 fffff800`019aaa3d nt!PspExitThread+0xffffffff`fffe3ae9

    fffff880`05cc6c10 fffff800`01964c8a nt!PspTerminateThreadByPointer+0x4d

    fffff880`05cc6c60 fffff880`048326d9 nt!PsTerminateSystemThread+0x22

    fffff880`05cc6c90 fffff880`048275b6 srv2!SrvProcTerminateWorkerThreadInternal+0x99

    fffff880`05cc6cc0 fffff800`0196fe5a srv2!SrvProcWorkerThread+0x1b6

    fffff880`05cc6d40 fffff800`016c9d26 nt!PspSystemThreadStartup+0x5a

    fffff880`05cc6d80 00000000`00000000 nt!KxStartSystemThread+0x16

     

    I dumped the Critical Region log by using the !verifier 200 command. The Critical Region log has enough room for 128 stacks. After dumping the log, the first thing to do is to find the KTHREAD value of the thread with the non-zero APC disable count. Unfortunately in my case thread fffffa804b5be040 didn’t appear in the log. In fact all 128 stacks had a driver named Suspect.sys calling KeEnterCriticalRegion or KeLeaveCriticalRegion.  Note: To protect our vendor friends, I renamed the actual sys file in this article to suspect.sys.

     

    The customer disabled the suspect.sys driver hoping this was the driver forgetting to re-enable APCs. If nothing else, this would perhaps remove the noisy suspect.sys from the log in the next dump.

     

    0: kd> !verifier 200

     

    Enter/Leave Critical Region log:

    There are up to 0x80 entries in the log.

     

    Displaying all the log entries.

     

    ======================================================================

    Thread fffffa8048ce4b50

    fffff80001b74293 nt!VerifierKeLeaveCriticalRegion+0xc3

    fffff8800100aafa Suspect.sys+0xaafa

    fffff88001001e30 Suspect.sys+0x1e30

    fffff80001abc68c nt!IopLoadUnloadDriver+0x1c

    fffff800016e1641 nt!ExpWorkerThread+0x111

    fffff8000196ee5a nt!PspSystemThreadStartup+0x5a

    fffff800016c8d26 nt!KiStartSystemThread+0x16

    ======================================================================

    Thread fffffa8048ce4b50

    fffff80001b6b0a2 nt!VerifierKeEnterCriticalRegion+0x92

    fffff880010062a3 Suspect.sys+0x62a3

    fffff8800100a7e2 Suspect.sys+0xa7e2

    fffff88001001e30 Suspect.sys+0x1e30

    fffff80001abc68c nt!IopLoadUnloadDriver+0x1c

    fffff800016e1641 nt!ExpWorkerThread+0x111

    fffff8000196ee5a nt!PspSystemThreadStartup+0x5a

     

    Unfortunately, the system continued to crash and in the next dump the critical region log was completely empty. My guess is the complier was optimizing the KeEnterCriticalRegion and KeLeaveCriticalRegion calls in the driver, causing them to be inlined and skipping the call to VerifierKeLeaveCriticalRegion/VerifierKeEnterCriticalRegion. I needed another attack plan.

     

    There is another Verifier option called I/O Verification and it works in a similar way to the steps below.  Please note that this functionality is not documented and may be subject to change at any time.

    1. A call to IoCallDriver() is made to send an IO packet to a driver associated with a device.
    2. Verifier hooks the call.
    3. Verifier creates a structure to record state info.
    4. Verifier fills in the structure with data including the thread’s APC Disable Count.
    5. Next Verifier calls the normal IoCallDriver() routine to “continue” the call made in step 1.
    6. The driver does its work (disables and re-enables APCs as needed)
    7. The call to IoCallDriver() returns when the driver is finished.
    8. Verifier checks the real APC count in the thread. Next it compares the value to the recorded value in the structure from step 4. If the two values do not match, then Verifier crashes the machine so we can pull the bad driver out of the dump.

     

    I had the customer enable I/O Verification using these steps-

    • Run Verifier.exe
    • Select “Create custom settings (For code developers)”
    • Select individual settings from a full list
    • Select I/O Verification
    • Select Driver Names from a list
    • Manually choose all of the third-party drivers.
    • Reboot after making the changes. 

     

    As we expected, the machine crashed again because of the APC Disable issue.  Because we enabled I/O Verification, the bugcheck type changed to DRIVER_VERIFIER_DETECTED_VIOLATION and now we have a smoking gun.

     

    I used the !verifier command to review the Verifier settings. The output below shows “I\O subsystem checking enabled” which confirms I/O Verification was been enabled.

     

    0: kd> !verifier

     

    Verify Level 810 ... enabled options are:

          Io subsystem checking enabled

     

    The parameters to KeBugcheck reconfirmed the APC disable count was -1 (ffff). And this time we have an additional breadcrumb, the driver dispatch routine address.

     

    DRIVER_VERIFIER_DETECTED_VIOLATION (c4)

    A device driver attempting to corrupt the system has been caught.  This is

    because the driver was specified in the registry as being suspect (by the

    administrator) and the kernel has enabled substantial checking of this driver.

    If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will

    be among the most commonly seen crashes.

    Arguments:

    Arg1: 00000000000000c5, Thread APC disable count changed by driver dispatch routine.

    Arg2: fffff88001345610, Driver dispatch routine address.

    Arg3: 000000000000ffff, Current thread APC disable count.

    Arg4: 0000000000000000, Thread APC disable count before calling driver dispatch routine.

          The APC disable count is decremented each time a driver calls

          KeEnterCriticalRegion, FsRtlEnterFileSystem, or acquires a mutex. The APC

          disable count is incremented each time a driver calls KeLeaveCriticalRegion,

          FsRtlExitFileSystem, or KeReleaseMutex. Since these calls should always be in

          pairs, this value should be zero when a thread exits. A negative value

          indicates that a driver has disabled APC calls without re-enabling them. A

          positive value indicates that the reverse is true.

     

    Notice the Verifier functions on the call stack which we I leveraged for the “saved state” information I discussed above in the I/O Verification architecture.

     

    0: kd> kn

    # Child-SP          RetAddr           Call Site

    00 fffff880`05db1b08 fffff800`0174a9c0 nt!KeBugCheckEx

    01 fffff880`05db1b10 fffff800`01b66b4ent!VfBugCheckNoStackUsage+0x30

    02 fffff880`05db1b50 fffff800`01b6cc2e nt!VfAfterCallDriver+0x33e

    03 fffff880`05db1ba0 fffff880`04054756 nt!IovCallDriver+0x57e

    04 fffff880`05db1c00 fffff880`0404b7b0 srv2!Smb2ExecuteRead+0x9a6

    05 fffff880`05db1c80 fffff880`0404b6fb srv2!SrvProcessPacket+0xa0

    06 fffff880`05db1cc0 fffff800`01960e5a srv2!SrvProcWorkerThread+0x2fb

    07 fffff880`05db1d40 fffff800`016bad26nt!PspSystemThreadStartup+0x5a

    08 fffff880`05db1d80 00000000`00000000nt!KiStartSystemThread+0x16

     

    Next I dumped the driver dispatch routine noted in the KeBugCheckoutput above using the ln command. This points to fltmgr!FltpDispatch which tells me we have a filter manager minifilter driver making calls to disable APCs but rudely forgetting to re-enable them. As I noted above we save the state info before the call to IoCallDriver().

     

    0: kd> ln fffff88001345610

    (fffff880`01345610)   fltmgr!FltpDispatch   |  (fffff880`01345710)   fltmgr!FltReleasePushLock

    Exact matches:

        fltmgr!FltpDispatch (<no parameter info>)

     

    Now the goal was to determine which minifilter is leaving the “Do Not Disturb” sign on the door and forgetting to remove it. We can find this using the fltmgr device object.  The “saved state” structure is passed to VfAfterCallDriver as the first parameter so I switched to the VfAfterCallDriverframe (second frame) to dig it out. I used the /r flag to show the original values of the registers for this frame.

     

    0: kd> .frame /r 2

    02 fffff880`05db1b50 fffff800`01b6cc2e nt!VfAfterCallDriver+0x33e

    rax=0000000000000000 rbx=fffffa804b729790 rcx=00000000000000c4

    rdx=00000000000000c5 rsi=fffffa804a3b0000 rdi=fffff8000183ce80

    rip=fffff80001b66b4e rsp=fffff88005db1b50 rbp=fffffa804de8c290

    r8=fffff88001345610   r9=000000000000ffff r10=fffff80001b7a640

    r11=0000000000000000 r12=000000004de8c290 r13=0000000000000000

    r14=0000000000000000 r15=0000000000000000

    iopl=0         nv up ei ng nz na pe nc

    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282

    nt!VfAfterCallDriver+0x33e:

    fffff800`01b66b4e cc              int     3

     

    Parameter 1 on x64 is always passed via rcx. I dumped the assembly for VfAfterCallDriver and confirmed the value in rcx (the base of the saved state structure) is moved to rbx.

     

    0: kd> u nt!VfAfterCallDriver

    nt!VfAfterCallDriver:

    fffff800`01b66810 48895c2410      mov     qword ptr[rsp+10h],rbx

    fffff800`01b66815 48896c2418      mov     qword ptr[rsp+18h],rbp

    fffff800`01b6681a 4889742420      mov     qword ptr[rsp+20h],rsi

    fffff800`01b6681f 57              push    rdi

    fffff800`01b66820 4154            push    r12

    fffff800`01b66822 4155            push    r13

    fffff800`01b66824 4883ec30        sub     rsp,30h

    fffff800`01b66828 488bfa          mov     rdi,rdx

    fffff800`01b6682b 488bd9          mov     rbx,rcx

     

    The device object is stored in the save state information at offset 0xa0.

     

    0: kd> ? fffffa804b729790 + 0xa0

    Evaluate expression: -6046048151504 = fffffa80`4b729830

     

    0: kd> dq fffffa80`4b729830 l1

    fffffa80`4b729830  fffffa80`4a3b0060

     

    0: kd> !devobj fffffa80`4a3b0060

    Device object (fffffa804a3b0060) is for:

      \FileSystem\FltMgr DriverObject fffffa80491fb7c0

    Current Irp 00000000 RefCount 0 Type 00000008 Flags 00040000

    DevExt fffffa804a3b01b0 DevObjExt fffffa804a3b0208

    ExtensionFlags (0x80000800)  DOE_DEFAULT_SD_PRESENT, DOE_DESIGNATED_FDO

    Characteristics (0000000000) 

    AttachedTo (Lower) fffffa804a3b1030 \FileSystem\Ntfs

    Device queue is not busy.

     

    0: kd> !devstack fffffa80`4a3b0060

      !DevObj   !DrvObj            !DevExt   ObjectName

    > fffffa804a3b0060  \FileSystem\FltMgr fffffa804a3b01b0 

      fffffa804a3b1030  \FileSystem\Ntfs   fffffa804a3b1180 

     

    As http://msdn.microsoft.com/en-us/library/ff541610(v=vs.85).aspxexplains – “The filter manager is installed with Windows, but it becomes active only when a minifilter driver is loaded. The filter manager attaches to the file system stack for a target volume. A minifilter driver attaches to the file system stack indirectly, by registering with the filter manager for the I/O operations the minifilter driver chooses to filter.”

     

    Using the power of the fltkd extension, I dumped the volume information associated with this device object. From the output below, we can extract the name of the filter attached to the volume. The culprit is named BadDriver.sys. The customer removed the driver and the problem went away long enough for the vendor to create an update for BadDriver.sys. Happy Ending!

     

    0: kd> !fltkd.volume fffffa80`4a3b0060

     

    FLT_VOLUME: fffffa804a3b0800 "\Device\HarddiskVolume3"

       FLT_OBJECT: fffffa804a3b0800  [04000000] Volume

          RundownRef               : 0x000000000000020a (261)

          PointerCount             : 0x00000001

          PrimaryLink              : [fffffa804ae06810-fffffa804a2b16f0]

       Frame                    : fffffa8049fcd420 "Frame 0"

       Flags                    : [00000064] SetupNotifyCalledEnableNameCaching FilterAttached

       FileSystemType           : [00000002] FLT_FSTYPE_NTFS

       VolumeLink               : [fffffa804ae06810-fffffa804a2b16f0]

       DeviceObject             : fffffa804a3b0060

       DiskDeviceObject         : fffffa804a1c0350

       FrameZeroVolume          : fffffa804a3b0800

       VolumeInNextFrame        : 0000000000000000

       Guid                     : "\??\Volume{552791b0-455d-11de-b7b9-00145eed6acc}"

       CDODeviceName            : "\Ntfs"

       CDODriverName            : "\FileSystem\Ntfs"

       TargetedOpenCount        : 258

       Callbacks                : (fffffa804a3b0910)

       ContextLock              : (fffffa804a3b0cf8)

       VolumeContexts           : (fffffa804a3b0d00)  Count=0

       StreamListCtrls          : (fffffa804a3b0d08)  rCount=2871

       FileListCtrls            : (fffffa804a3b0d88)  rCount=0

       NameCacheCtrl            : (fffffa804a3b0e08)

       InstanceList             : (fffffa804a3b0890)

          FLT_INSTANCE: fffffa804b5b1010 "BadDriver.sys Instance" "189600"

  • Ntdebugging Blog

    Debugging a Debugger to Debug a Dump

    • 1 Comments

    Recently I came across an instance where my debugger did not do what I wanted.  Rarely do computers disobey me, but this one was unusually stubborn.  There was no other option; I had to bend the debugger to my will.

     

    There are many ways to make a computer program do what you want.  If you have the source code you can rewrite and recompile the program.  If you have a hex editor you can edit the code of the binary.  A shim can be used to modify a program at runtime.  In this instance I was in a hurry and I was ok with a temporary solution, so I used a debugger to change the execution of the debugger while it ran.

     

    Debug a debugger? Can you do such a thing?  Of course you can.

     

    In this example a memory dump was captured of a system and I was asked to determine if the system had run out of desktop heap.  Usually the !dskheap command can be used to determine how much heap has been used by each desktop. Unfortunately, this command failed me.

     

    23: kd> !dskheap

    Error Reading TotalFreeSize from nt!_HEAP @ fffffa8019c65c00

    Failed to GetHeapInfo for desktop @fffffa8019c65c00

    EnumDsktps failed on Winsta: 19c4f090FillWinstaArray failed

     

    The error indicates that the command couldn’t read from the _HEAP structure at fffffa8019c65c00 for desktop fffffa8019c65c00.  Further investigation found that reason I got this error is that the heap for the desktop in question is not valid memory.  Because the memory is described by a prototype PTE I assume that the heap has not been initialized (Note: See Windows Internals’ Memory Management chapter for more information about proto PTEs).

     

    23: kd> dt win32k!tagDESKTOP fffffa8019c65c00

       +0x000 dwSessionId      : 0

       +0x008 pDeskInfo        : 0xfffff900`c05e0a70 tagDESKTOPINFO

       +0x010 pDispInfo        : 0xfffff900`c0581e50 tagDISPLAYINFO

       +0x018 rpdeskNext       : 0xfffffa80`19c6ef20 tagDESKTOP

       +0x020 rpwinstaParent   : 0xfffffa80`19c4f090 tagWINDOWSTATION

       +0x028 dwDTFlags        : 0x110

       +0x030 dwDesktopId      : 0x19c65c00`00000003

       +0x038 spmenuSys        : (null)

       +0x040 spmenuDialogSys  : (null)

       +0x048 spmenuHScroll    : (null)

       +0x050 spmenuVScroll    : (null)

       +0x058 spwndForeground  : (null)

       +0x060 spwndTray        : (null)

       +0x068 spwndMessage     : 0xfffff900`c05e0d90 tagWND

       +0x070 spwndTooltip     : 0xfffff900`c05e0fa0 tagWND

       +0x078 hsectionDesktop  : 0xfffff8a0`00ef08e0 Void

       +0x080 pheapDesktop     : 0xfffff900`c05e0000 tagWIN32HEAP

       +0x088 ulHeapSize       : 0x18000

       +0x090 cciConsole       : _CONSOLE_CARET_INFO

       +0x0a8 PtiList          : _LIST_ENTRY [ 0xfffffa80`19c65ca8 - 0xfffffa80`19c65ca8 ]

       +0x0b8 spwndTrack       : (null)

       +0x0c0 htEx             : 0n0

       +0x0c4 rcMouseHover     : tagRECT

       +0x0d4 dwMouseHoverTime : 0

       +0x0d8 pMagInputTransform : (null)

    23: kd> dd 0xfffff900`c05e0000

    fffff900`c05e0000  ???????? ???????? ???????? ????????

    fffff900`c05e0010  ???????? ???????? ???????? ????????

    fffff900`c05e0020  ???????? ???????? ???????? ????????

    fffff900`c05e0030  ???????? ???????? ???????? ????????

    fffff900`c05e0040  ???????? ???????? ???????? ????????

    fffff900`c05e0050  ???????? ???????? ???????? ????????

    fffff900`c05e0060  ???????? ???????? ???????? ????????

    fffff900`c05e0070  ???????? ???????? ???????? ????????

    23: kd> !pte fffff900`c05e0000

                                               VA fffff900c05e0000

    PXE at FFFFF6FB7DBEDF90    PPE at FFFFF6FB7DBF2018    PDE at FFFFF6FB7E403010    PTE at FFFFF6FC80602F00

    contains 000000076245C863  contains 0000000762569863  contains 000000045FA17863  contains F8A000F4F9780400

    pfn 76245c    ---DA--KWEV  pfn 762569    ---DA--KWEV  pfn45fa17    ---DA--KWEV   not valid

                                                                                     Proto: FFFFF8A000F4F978

     

    There are many desktops in this session and I wanted to know about the usage of the other desktops, but the !dskheap command stopped after just one error.  I needed to force it to continue after this error, so I launched a debugger to debug my debugger.  There is a command to do this, just run .dbgdbg.

     

    23: kd> .dbgdbg

    Debugger spawned, connect with

        "-remotenpipe:icfenable,pipe=cdb_pipe,server=NINJA007"

     

    For clarity I will call the original debugger where I ran !dskheap debugger1, and the new debugger spawned by .dbgdbg debugger2 .

     

    Before switching to debugger2 I need to know what I am going to debug.  The error message gives a hint about where to set a breakpoint, I am looking for a failure from GetHeapInfo.

     

    23: kd> !dskheap

    Error Reading TotalFreeSize from nt!_HEAP @ fffffa8019c65c00

    Failed to GetHeapInfo for desktop @fffffa8019c65c00

    EnumDsktps failed on Winsta: 19c4f090FillWinstaArray failed

     

    I need to know which module GetHeapInfo is in, the .extmatch match command indicates which module contains the !dskheap command.

     

    23: kd> .extmatch dskheap

    !kdexts.dskheap

     

    Switching to debugger2 I set a breakpoint on kdexts!GetHeapInfo.  Use Ctrl+C to trigger a debug break in cdb (this is the same as a Ctrl+Break in windbg).

     

    0:004> bp kdexts!GetHeapInfo

    0:004> g

     

    Switch back to debugger1 and run the !dskheap command.

     

    23: kd> !dskheap

     

    In debugger2 I have hit the breakpoint.

     

    Breakpoint 0 hit

    kdexts!GetHeapInfo:

    000007f9`4237b9b0 4055            push    rbp

     

    The error says GetHeapInfo failed, so I am interested in what this function returns.  To see what GetHeapInfo returns I go up one level in the stack and set a new breakpoint on the code just after it returns.  This new breakpoint will also dump the return value of GetHeapInfo (return values are always in the rax register).

     

    0:000> gu

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> r rax

    rax=0000000000000000

    0:000> bc *

    0:000> bp 000007f9`4237b483 "r rax"

    0:000> g

     

    The next time the breakpoint hit the return value was 1, which in this instance means GetHeapInfo failed.  This is where I exerted my control over the computer: I forced the return value to 0.

     

    rax=0000000000000001

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> r rax=0

     

    I ran through the other breakpoints and changed rax as necessary.

     

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000001

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> r rax=0

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

     

    Everything was going well, until the computer defied me again.  The !dskheap output computes the percentage of heap usage by dividing the bytes used by the size of the heap.  Unfortunately, the size of the heap was left at 0 for the two heaps where I changed the return value.  It is well known that only Chuck Norris can divide by zero; to prevent a roundhouse kick to your computer the processor generates an exception.

     

    (2d0.928): Integer divide-by-zero - code c0000094 (first chance)

    First chance exceptions are reported before any exception handling.

    This exception may be expected and handled.

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> r r11

    r11=0000000000000000

    0:000> r rax

    rax=0000000000000000

    0:000> g

     

    Fortunately debugger1 handles the divide by zero exception and it is easy to run !dskheap again.

     

    23: kd> !dskheap

     

    Back in debugger2 I set a new breakpoint on the div instruction that outputs the divisor.  When the divisor (r11) is 0 I changed it to a non-zero value to avoid the wrath of Mr. Norris.

     

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> bp 000007f9`4237b90e

    0:000> bp 000007f9`4237b90e "r r11"

    breakpoint 1 redefined

    0:000> g

    rax=0000000000000001

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> r rax=0

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    rax=0000000000000001

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> r rax=0

    0:000> g

    rax=0000000000000000

    kdexts!EnumDsktps+0x197:

    000007f9`4237b483 4885c0          test    rax,rax

    0:000> g

    r11=0000000000033333

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

    r11=0000000000000000

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> r r11=1

    0:000> g

    r11=00000000000007ae

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

    r11=0000000000013333

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

    r11=0000000000013333

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

    r11=0000000000013333

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

    r11=0000000000000000

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> r r11=1

    0:000> g

    r11=0000000000013333

    kdexts!DisplayInfo+0x2ee:

    000007f9`4237b90e 49f7f3          div     rax,r11

    0:000> g

     

    Finally, back in debugger1 I now have complete output for !dskheap.  After a few strategic modifications of the program’s execution I got it to output the data I wanted.  As it turns out we aren’t out of desktop heap after all.

     

    23: kd> !dskheap

    Error Reading TotalFreeSize from nt!_HEAP @ fffffa8019c65c00

    Error Reading TotalFreeSize from nt!_HEAP @ fffffa801a53ea30

      Winstation\Desktop            Heap Size(KB)   Used Rate(%)

    ------------------------------------------------------------

      WinSta0\Default                  20480                 0%

      WinSta0\Disconnect                   0                 0%

      WinSta0\Winlogon                   192                 2%

      Service-0x0-3e7$\Default          7680                 1%

      Service-0x0-3e4$\Default          7680                 0%

      Service-0x0-3e5$\Default          7680                 0%

      Service-0x0-26f46a$\Default          0                 0%

      Service-0x0-2706f2$\Default       7680                 0%

    ------------------------------------------------------

                    Total Desktop: (   51392 KB -   8 desktops)

                    Session ID:  0

    ============================================================

Page 1 of 1 (2 items)