• Ntdebugging Blog

    Leaving the Do Not Disturb Sign on the Door Will Cause the KERNEL_APC_PENDING_DURING_EXIT Bugcheck


    This is Ron Stock from the Global Escalation Services team and I recently worked with a customer to determine which misbehaving driver was crashing their critical server. This particular crash was a STOP 0x00000020 which maps to KERNEL_APC_PENDING_DURING_EXIT.


    The KERNEL_APC_PENDING_DURING_EXIT bugcheck type indicates the APC disable count for a thread was not equal to zero when the thread exited. The APC disable count is a field in the _KTHREAD structure and it is decremented when drivers disable APCs by calling functions such as KeEnterCriticalRegion, FsRtlEnterFileSystem or by acquiring a mutex. Disabling APC delivery to a thread is the equivalent of hanging the “Do Not Disturb” sign on your door. When drivers need to perform a critical operation they ‘hang the sign on the door’ to prevent interruption from APCs. When the same driver fails to ‘take the sign off the door’ by calling KeLeaveCriticalRegion, FsRtlExitFileSystem or KeReleaseMutex, the APC disable count is never incremented back to its original value. This forgetful behavior causes a bugcheck because the APC disable count is checked when the thread is exiting. The OS expects this value to be zero on thread exit.


    In my case the value was 0xffff (negative 1) indicating a driver had forgot to remove the ‘Do Not Disturb’ sign.


    kd> !analyze –v



    The key data item is the thread's APC disable count.

    If this is non-zero, then this is the source of the problem.


    Arg1: 0000000000000000, The address of the APC found pending during exit.

    Arg2: 000000000000ffff, The thread's APC disable count

    Arg3: 0000000000000000, The current IRQL

    Arg4: 0000000000000001


    Because the value is decremented earlier in time the current call stack is not particularly useful. It merely shows the thread exiting under normal conditions.


    0: kd> !thread -1 e

    THREAD fffffa8049f04b50  Cid 0004.0998  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 fffff8a000007ee0

    Owning Process            fffffa8048cad9e0       Image:         System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      11503325       Ticks: 0

    Context Switch Count      185715         IdealProcessor: 0

    UserTime                  00:00:00.000

    KernelTime                00:00:06.078

    Win32 Start Address srv2!SrvProcWorkerThread(0xfffff88003c4b400)

    Stack Init fffff88005078db0 Current fffff880050789b0

    Base fffff88005079000 Limit fffff88005073000 Call 0

    Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`05078b08 fffff800`01984bd9 nt!KeBugCheckEx

    fffff880`05078b10 fffff800`019a1a3d nt!PspExitThread+0xffffffff`fffe3ae9

    fffff880`05078c10 fffff800`0195bc8a nt!PspTerminateThreadByPointer+0x4d

    fffff880`05078c60 fffff880`03c56769 nt!PsTerminateSystemThread+0x22

    fffff880`05078c90 fffff880`03c4b5b6 srv2!SrvProcTerminateWorkerThreadInternal+0x99

    fffff880`05078cc0 fffff800`01966e5a srv2!SrvProcWorkerThread+0x1b6

    fffff880`05078d40 fffff800`016c0d26 nt!PspSystemThreadStartup+0x5a

    fffff880`05078d80 00000000`00000000 nt!KxStartSystemThread+0x16


    Driver Verifier is the ideal tool for this type of bugcheck. It has a feature called Critical Region logging which tracks the call stack and KTRHEAD value for each call to either KeEnterCriticalRegion() and KeLeaveCriticalRegion(). I had the customer enable this logging by selecting the “Miscellaneous checks” option in Driver Verifier using these steps-

    • Run Verifier.exe
    • Select “Create custom settings (For code developers)”
    • Select individual settings from a full list
    • Select Miscellaneous checks
    • Select Driver Names from a list
    • Manually choose all of the third-party drivers.
    • Reboot after making the changes. 


    After running through the steps above, we gathered another STOP 0x00000020 dump. I confirmed the “Miscellaneous checks” option was enabled by using the !verifier command


    0: kd> !verifier


    Verify Level 800 ... enabled options are:

          Miscellaneous checks enabled


    The stack in this new dump was in the same SrvProcWorkerThread thread exit path so we had a consistent pattern. The thread with the negative APC Disable count was fffffa804b5be040.


    0: kd> !thread -1 e

    THREAD fffffa804b5be040 Cid 0004.082c  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 fffff8a000007ee0

    Owning Process            fffffa8048cad9e0       Image:         System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      4458237        Ticks: 0

    Context Switch Count      36067          IdealProcessor: 0            

    UserTime                  00:00:00.000

    KernelTime                00:00:01.218

    Win32 Start Address srv2!SrvProcWorkerThread(0xfffff88004827400)

    Stack Init fffff88005cc6db0 Current fffff88005cc69b0

    Base fffff88005cc7000 Limit fffff88005cc1000 Call 0

    Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`05cc6b08 fffff800`0198dbd9 nt!KeBugCheckEx

    fffff880`05cc6b10 fffff800`019aaa3d nt!PspExitThread+0xffffffff`fffe3ae9

    fffff880`05cc6c10 fffff800`01964c8a nt!PspTerminateThreadByPointer+0x4d

    fffff880`05cc6c60 fffff880`048326d9 nt!PsTerminateSystemThread+0x22

    fffff880`05cc6c90 fffff880`048275b6 srv2!SrvProcTerminateWorkerThreadInternal+0x99

    fffff880`05cc6cc0 fffff800`0196fe5a srv2!SrvProcWorkerThread+0x1b6

    fffff880`05cc6d40 fffff800`016c9d26 nt!PspSystemThreadStartup+0x5a

    fffff880`05cc6d80 00000000`00000000 nt!KxStartSystemThread+0x16


    I dumped the Critical Region log by using the !verifier 200 command. The Critical Region log has enough room for 128 stacks. After dumping the log, the first thing to do is to find the KTHREAD value of the thread with the non-zero APC disable count. Unfortunately in my case thread fffffa804b5be040 didn’t appear in the log. In fact all 128 stacks had a driver named Suspect.sys calling KeEnterCriticalRegion or KeLeaveCriticalRegion.  Note: To protect our vendor friends, I renamed the actual sys file in this article to suspect.sys.


    The customer disabled the suspect.sys driver hoping this was the driver forgetting to re-enable APCs. If nothing else, this would perhaps remove the noisy suspect.sys from the log in the next dump.


    0: kd> !verifier 200


    Enter/Leave Critical Region log:

    There are up to 0x80 entries in the log.


    Displaying all the log entries.



    Thread fffffa8048ce4b50

    fffff80001b74293 nt!VerifierKeLeaveCriticalRegion+0xc3

    fffff8800100aafa Suspect.sys+0xaafa

    fffff88001001e30 Suspect.sys+0x1e30

    fffff80001abc68c nt!IopLoadUnloadDriver+0x1c

    fffff800016e1641 nt!ExpWorkerThread+0x111

    fffff8000196ee5a nt!PspSystemThreadStartup+0x5a

    fffff800016c8d26 nt!KiStartSystemThread+0x16


    Thread fffffa8048ce4b50

    fffff80001b6b0a2 nt!VerifierKeEnterCriticalRegion+0x92

    fffff880010062a3 Suspect.sys+0x62a3

    fffff8800100a7e2 Suspect.sys+0xa7e2

    fffff88001001e30 Suspect.sys+0x1e30

    fffff80001abc68c nt!IopLoadUnloadDriver+0x1c

    fffff800016e1641 nt!ExpWorkerThread+0x111

    fffff8000196ee5a nt!PspSystemThreadStartup+0x5a


    Unfortunately, the system continued to crash and in the next dump the critical region log was completely empty. My guess is the complier was optimizing the KeEnterCriticalRegion and KeLeaveCriticalRegion calls in the driver, causing them to be inlined and skipping the call to VerifierKeLeaveCriticalRegion/VerifierKeEnterCriticalRegion. I needed another attack plan.


    There is another Verifier option called I/O Verification and it works in a similar way to the steps below.  Please note that this functionality is not documented and may be subject to change at any time.

    1. A call to IoCallDriver() is made to send an IO packet to a driver associated with a device.
    2. Verifier hooks the call.
    3. Verifier creates a structure to record state info.
    4. Verifier fills in the structure with data including the thread’s APC Disable Count.
    5. Next Verifier calls the normal IoCallDriver() routine to “continue” the call made in step 1.
    6. The driver does its work (disables and re-enables APCs as needed)
    7. The call to IoCallDriver() returns when the driver is finished.
    8. Verifier checks the real APC count in the thread. Next it compares the value to the recorded value in the structure from step 4. If the two values do not match, then Verifier crashes the machine so we can pull the bad driver out of the dump.


    I had the customer enable I/O Verification using these steps-

    • Run Verifier.exe
    • Select “Create custom settings (For code developers)”
    • Select individual settings from a full list
    • Select I/O Verification
    • Select Driver Names from a list
    • Manually choose all of the third-party drivers.
    • Reboot after making the changes. 


    As we expected, the machine crashed again because of the APC Disable issue.  Because we enabled I/O Verification, the bugcheck type changed to DRIVER_VERIFIER_DETECTED_VIOLATION and now we have a smoking gun.


    I used the !verifier command to review the Verifier settings. The output below shows “I\O subsystem checking enabled” which confirms I/O Verification was been enabled.


    0: kd> !verifier


    Verify Level 810 ... enabled options are:

          Io subsystem checking enabled


    The parameters to KeBugcheck reconfirmed the APC disable count was -1 (ffff). And this time we have an additional breadcrumb, the driver dispatch routine address.



    A device driver attempting to corrupt the system has been caught.  This is

    because the driver was specified in the registry as being suspect (by the

    administrator) and the kernel has enabled substantial checking of this driver.

    If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will

    be among the most commonly seen crashes.


    Arg1: 00000000000000c5, Thread APC disable count changed by driver dispatch routine.

    Arg2: fffff88001345610, Driver dispatch routine address.

    Arg3: 000000000000ffff, Current thread APC disable count.

    Arg4: 0000000000000000, Thread APC disable count before calling driver dispatch routine.

          The APC disable count is decremented each time a driver calls

          KeEnterCriticalRegion, FsRtlEnterFileSystem, or acquires a mutex. The APC

          disable count is incremented each time a driver calls KeLeaveCriticalRegion,

          FsRtlExitFileSystem, or KeReleaseMutex. Since these calls should always be in

          pairs, this value should be zero when a thread exits. A negative value

          indicates that a driver has disabled APC calls without re-enabling them. A

          positive value indicates that the reverse is true.


    Notice the Verifier functions on the call stack which we I leveraged for the “saved state” information I discussed above in the I/O Verification architecture.


    0: kd> kn

    # Child-SP          RetAddr           Call Site

    00 fffff880`05db1b08 fffff800`0174a9c0 nt!KeBugCheckEx

    01 fffff880`05db1b10 fffff800`01b66b4ent!VfBugCheckNoStackUsage+0x30

    02 fffff880`05db1b50 fffff800`01b6cc2e nt!VfAfterCallDriver+0x33e

    03 fffff880`05db1ba0 fffff880`04054756 nt!IovCallDriver+0x57e

    04 fffff880`05db1c00 fffff880`0404b7b0 srv2!Smb2ExecuteRead+0x9a6

    05 fffff880`05db1c80 fffff880`0404b6fb srv2!SrvProcessPacket+0xa0

    06 fffff880`05db1cc0 fffff800`01960e5a srv2!SrvProcWorkerThread+0x2fb

    07 fffff880`05db1d40 fffff800`016bad26nt!PspSystemThreadStartup+0x5a

    08 fffff880`05db1d80 00000000`00000000nt!KiStartSystemThread+0x16


    Next I dumped the driver dispatch routine noted in the KeBugCheckoutput above using the ln command. This points to fltmgr!FltpDispatch which tells me we have a filter manager minifilter driver making calls to disable APCs but rudely forgetting to re-enable them. As I noted above we save the state info before the call to IoCallDriver().


    0: kd> ln fffff88001345610

    (fffff880`01345610)   fltmgr!FltpDispatch   |  (fffff880`01345710)   fltmgr!FltReleasePushLock

    Exact matches:

        fltmgr!FltpDispatch (<no parameter info>)


    Now the goal was to determine which minifilter is leaving the “Do Not Disturb” sign on the door and forgetting to remove it. We can find this using the fltmgr device object.  The “saved state” structure is passed to VfAfterCallDriver as the first parameter so I switched to the VfAfterCallDriverframe (second frame) to dig it out. I used the /r flag to show the original values of the registers for this frame.


    0: kd> .frame /r 2

    02 fffff880`05db1b50 fffff800`01b6cc2e nt!VfAfterCallDriver+0x33e

    rax=0000000000000000 rbx=fffffa804b729790 rcx=00000000000000c4

    rdx=00000000000000c5 rsi=fffffa804a3b0000 rdi=fffff8000183ce80

    rip=fffff80001b66b4e rsp=fffff88005db1b50 rbp=fffffa804de8c290

    r8=fffff88001345610   r9=000000000000ffff r10=fffff80001b7a640

    r11=0000000000000000 r12=000000004de8c290 r13=0000000000000000

    r14=0000000000000000 r15=0000000000000000

    iopl=0         nv up ei ng nz na pe nc

    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282


    fffff800`01b66b4e cc              int     3


    Parameter 1 on x64 is always passed via rcx. I dumped the assembly for VfAfterCallDriver and confirmed the value in rcx (the base of the saved state structure) is moved to rbx.


    0: kd> u nt!VfAfterCallDriver


    fffff800`01b66810 48895c2410      mov     qword ptr[rsp+10h],rbx

    fffff800`01b66815 48896c2418      mov     qword ptr[rsp+18h],rbp

    fffff800`01b6681a 4889742420      mov     qword ptr[rsp+20h],rsi

    fffff800`01b6681f 57              push    rdi

    fffff800`01b66820 4154            push    r12

    fffff800`01b66822 4155            push    r13

    fffff800`01b66824 4883ec30        sub     rsp,30h

    fffff800`01b66828 488bfa          mov     rdi,rdx

    fffff800`01b6682b 488bd9          mov     rbx,rcx


    The device object is stored in the save state information at offset 0xa0.


    0: kd> ? fffffa804b729790 + 0xa0

    Evaluate expression: -6046048151504 = fffffa80`4b729830


    0: kd> dq fffffa80`4b729830 l1

    fffffa80`4b729830  fffffa80`4a3b0060


    0: kd> !devobj fffffa80`4a3b0060

    Device object (fffffa804a3b0060) is for:

      \FileSystem\FltMgr DriverObject fffffa80491fb7c0

    Current Irp 00000000 RefCount 0 Type 00000008 Flags 00040000

    DevExt fffffa804a3b01b0 DevObjExt fffffa804a3b0208

    ExtensionFlags (0x80000800)  DOE_DEFAULT_SD_PRESENT, DOE_DESIGNATED_FDO

    Characteristics (0000000000) 

    AttachedTo (Lower) fffffa804a3b1030 \FileSystem\Ntfs

    Device queue is not busy.


    0: kd> !devstack fffffa80`4a3b0060

      !DevObj   !DrvObj            !DevExt   ObjectName

    > fffffa804a3b0060  \FileSystem\FltMgr fffffa804a3b01b0 

      fffffa804a3b1030  \FileSystem\Ntfs   fffffa804a3b1180 


    As http://msdn.microsoft.com/en-us/library/ff541610(v=vs.85).aspxexplains – “The filter manager is installed with Windows, but it becomes active only when a minifilter driver is loaded. The filter manager attaches to the file system stack for a target volume. A minifilter driver attaches to the file system stack indirectly, by registering with the filter manager for the I/O operations the minifilter driver chooses to filter.”


    Using the power of the fltkd extension, I dumped the volume information associated with this device object. From the output below, we can extract the name of the filter attached to the volume. The culprit is named BadDriver.sys. The customer removed the driver and the problem went away long enough for the vendor to create an update for BadDriver.sys. Happy Ending!


    0: kd> !fltkd.volume fffffa80`4a3b0060


    FLT_VOLUME: fffffa804a3b0800 "\Device\HarddiskVolume3"

       FLT_OBJECT: fffffa804a3b0800  [04000000] Volume

          RundownRef               : 0x000000000000020a (261)

          PointerCount             : 0x00000001

          PrimaryLink              : [fffffa804ae06810-fffffa804a2b16f0]

       Frame                    : fffffa8049fcd420 "Frame 0"

       Flags                    : [00000064] SetupNotifyCalledEnableNameCaching FilterAttached

       FileSystemType           : [00000002] FLT_FSTYPE_NTFS

       VolumeLink               : [fffffa804ae06810-fffffa804a2b16f0]

       DeviceObject             : fffffa804a3b0060

       DiskDeviceObject         : fffffa804a1c0350

       FrameZeroVolume          : fffffa804a3b0800

       VolumeInNextFrame        : 0000000000000000

       Guid                     : "\??\Volume{552791b0-455d-11de-b7b9-00145eed6acc}"

       CDODeviceName            : "\Ntfs"

       CDODriverName            : "\FileSystem\Ntfs"

       TargetedOpenCount        : 258

       Callbacks                : (fffffa804a3b0910)

       ContextLock              : (fffffa804a3b0cf8)

       VolumeContexts           : (fffffa804a3b0d00)  Count=0

       StreamListCtrls          : (fffffa804a3b0d08)  rCount=2871

       FileListCtrls            : (fffffa804a3b0d88)  rCount=0

       NameCacheCtrl            : (fffffa804a3b0e08)

       InstanceList             : (fffffa804a3b0890)

          FLT_INSTANCE: fffffa804b5b1010 "BadDriver.sys Instance" "189600"

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 6 – Driver Verifier


    In part 5 we used poolhittag to get call stacks of pool being allocated and freed.  This information is often essential to identifying the cause of a memory leak; however it is not always feasible to configure a live kernel debug to obtain this information.  Fortunately there are alternative methods to get such call stacks.


    Driver verifier has an option to enable pool tracking for a specific driver, or for multiple drivers.  This functionality was first introduced in Windows Vista and Windows Server 2008.  This information is also captured when driver verifier is used to enable special pool, however for the purposes of this article we will focus on using pool tracking.


    The data stored by driver verifier requires a debugger to view.  Any method of debugging can be used for this.  You can use a live kernel debug as we described in part 4, you can get a memory dump (kernel or complete, a small dump is insufficient), or you can use livekd.


    If you have used the steps from Part 1, Part 2, or Part 3, you likely have an idea which drivers are likely involved in creating the pool leak.  In this example we are generating the leak using notmyfault, the same tool we have been using in prior examples.  As seen in Part 2, the relevant driver is myfault.sys.


    Although driver verifier has GUI, the easiest way to enable this functionality is with the below command from an elevated command prompt:


    Verifier /flags 8 /driver myfault.sys


    The above command will provide the following output, allowing you to confirm that the expected settings are enabled:


    New verifier settings:


    Special pool: Disabled

    Pool tracking: Enabled

    Force IRQL checking: Disabled

    I/O verification: Disabled

    Deadlock detection: Disabled

    DMA checking: Disabled

    Security checks: Disabled

    Force pending I/O requests: Disabled

    Low resources simulation: Disabled

    IRP Logging: Disabled

    Miscellaneous checks: Disabled


    Verified drivers:





    You must restart this computer for the changes to take effect.


    After rebooting the system, reproduce the memory leak and attach a debugger or generate a memory dump after the memory has been leaked.


    Break in with the debugger (Ctrl+Break or Ctrl+C) or load the dump in windbg (File – Open Crash Dump).


    Set the symbol path and reload symbols.


    1: kd> .symfix c:\symbols

    1: kd> .reload

    Loading Kernel Symbols



    The !verifier command has various options to view information about driver verifier.  To view the pool allocations which have been tracked by verifier for notmyfault.sys, use the following:


    0: kd> !verifier 3 myfault.sys


    Verify Level 8 ... enabled options are:

          All pool allocations checked on unload


    Summary of All Verifier Statistics


    RaiseIrqls                             0x0

    AcquireSpinLocks                       0x0

    Synch Executions                       0x0

    Trims                                  0x0


    Pool Allocations Attempted             0xb

    Pool Allocations Succeeded             0xb

    Pool Allocations Succeeded SpecialPool 0xa

    Pool Allocations With NO TAG           0x1

    Pool Allocations Failed                0x0

    Resource Allocations Failed Deliberately   0x0


    Current paged pool allocations         0x0 for 00000000 bytes

    Peak paged pool allocations            0x1 for 00000080 bytes

    Current nonpaged pool allocations      0xa for 009CE000 bytes

    Peak nonpaged pool allocations         0xa for 009CE000 bytes


    Driver Verification List


    Entry     State           NonPagedPool   PagedPool   Module


    fffffa80031b5830 Loaded           009ce000       00000000    myfault.sys


    Current Pool Allocations  0000000a    00000000

    Current Pool Bytes        009ce000    00000000

    Peak Pool Allocations     0000000a    00000001

    Peak Pool Bytes           009ce000    00000080


    PoolAddress  SizeInBytes    Tag       CallersAddress

    fffffa8005400000     0x000fb000     Leak      fffff8800447d634

    fffffa80052fb000     0x000fb000     Leak      fffff8800447d634

    fffffa8005200000     0x000fb000     Leak      fffff8800447d634

    fffffa80050fb000     0x000fb000     Leak      fffff8800447d634

    fffffa8005000000     0x000fb000     Leak      fffff8800447d634

    fffffa8004efb000     0x000fb000     Leak      fffff8800447d634

    fffffa8004e00000     0x000fb000     Leak      fffff8800447d634

    fffffa8004cfb000     0x000fb000     Leak      fffff8800447d634

    fffffa8004c00000     0x000fb000     Leak      fffff8800447d634

    fffffa8004a66000     0x000fb000     Leak      fffff8800447d634


    At the bottom of the above output is the list of allocations made by notmyfault.sys.  For our purposes we are going to assume that these allocations have been leaked, as opposed to just being normal allocations that were not yet freed when the debugger was attached.


    The !verifier command has an option to view call stacks for one of the tracked allocations.  Keep in mind that the size of the database is limited and only more recent allocations will be kept in the database.


    0: kd> !verifier 80 fffffa8005400000


    Log of recent kernel pool Allocate and Free operations:


    There are up to 0x10000 entries in the log.


    Parsing 0x0000000000010000 log entries, searching for address 0xfffffa8005400000.




    Pool block fffffa8005400000, Size 00000000000fa000, Thread fffffa80044ceb60

    fffff80001927cc6 nt!VeAllocatePoolWithTagPriority+0x2b6

    fffff80001927d3d nt!VerifierExAllocatePoolEx+0x1d

    fffff8800447d634 myfault!MyfaultDeviceControl+0x358

    fffff8800447d727 myfault!MyfaultDispatch+0xb7

    fffff80001932750 nt!IovCallDriver+0xa0

    fffff800017a7a97 nt!IopXxxControlFile+0x607

    fffff800017a82f6 nt!NtDeviceIoControlFile+0x56

    fffff8000148bed3 nt!KiSystemServiceCopyEnd+0x13

    Parsed entry 0000000000010000/0000000000010000...

    Finished parsing all pool tracking information.


    The above output shows the call stack leading to the pool allocation.  This is the same information we had seen in Part 5, however we are able to obtain this information using a dump or livekd, whereas the steps from Part 5 required an invasive debug and extended system downtime.


    When you have completed troubleshooting, disable driver verifier with the following command and reboot:

    verifier /reset

  • Ntdebugging Blog

    Another Who Done It


    Hi my name is Bob Golding, I am an EE in GES. I want to share an interesting problem I recently worked on.  The initial symptom was the system bugchecked with a Stop 0xA which means there was an invalid memory reference.  The cause of the crash was a driver making I/O requests while Asynchronous Procedure Calls (APCs) were disabled.  The bugcheck caused by an invalid memory reference was the result of the problem and not the cause.


    An APC is queued to a thread during I/O completion. This is to guarantee the last phase of the I/O completion occurs in the same context as the process that issued the request.


    The stack of the trap is presented below.  The call stack shows that APCs are being enabled allowing queued APCs to run.


    Child-SP          RetAddr           Call Site

    fffff880`07bf3598 fffff800`030b85a9 nt!KeBugCheckEx

    fffff880`07bf35a0 fffff800`030b7220 nt!KiBugCheckDispatch+0x69

    fffff880`07bf36e0 fffff800`030d8b56 nt!KiPageFault+0x260

    fffff880`07bf3870 fffff800`030959ff nt!IopCompleteRequest+0xc73

    fffff880`07bf3940 fffff800`0306c0d9 nt!KiDeliverApc+0x1d7

    fffff880`07bf39c0 fffff800`033f8a1a nt!KiCheckForKernelApcDelivery+0x25

    fffff880`07bf39f0 fffff800`033cce2f nt!MiMapViewOfSection+0x2bafa

    fffff880`07bf3ae0 fffff800`030b8293 nt!NtMapViewOfSection+0x2be

    fffff880`07bf3bb0 00000000`772df93a nt!KiSystemServiceCopyEnd+0x13

    00000000`0015dea8 00000000`00000000 0x772df93a


    The reason the trap occurred is because when issuing requests to lower drivers it is common practice in drivers to implement code similar to:


    KEVENT event;


    status = IoCallDriver( DeviceObject, irp );



    //  Wait for the event to be signaled if STATUS_PENDING is returned.


    if (status == STATUS_PENDING) {

       (VOID)KeWaitForSingleObject( &event, // event is a local which is declared on the stack




                                    NULL );



    As you can see in the above code, if the return from IoCallDriver does not return pending the code continues and exits. Part of the last phase of I/O processing that takes place in the APC is signaling the event. If the call to IoCallDriver returns success, because the event is on the stack it is critical that the APC execute immediately before the stack unwinds. Since APCs where disabled, the execution of the APC was delayed and during this time the event became invalid. The APCs were delayed because the memory manager was in a critical area and APCs could not run.


    I needed to determine which driver did this so I enabled IRP logging in Driver Verifier to trace I/O requests.  With this enabled the next dump should contain a transaction log that will help identify what driver is performing I/O while APCs are disabled.  The command line to enable this is:

    verifier /flags 0x410 /all


    The new dump with verifier enabled also crashed after delivering an APC to the thread and completing the IRP.  From the debug output below I can find the IRPs that were issued and the thread that issued them, this is what I need to look for them in the log.


    1: kd> !thread

    THREAD fffffa80064c9b50 Cid 0200.0204  Teb: 000007fffffde000 Win32Thread: 0000000000000000 RUNNING on processor 1

    IRP List:

        fffff9800a33ec60: (0006,03a0) Flags: 40060070  Mdl: 00000000

        fffff9800a250c60: (0006,03a0) Flags: 40060070  Mdl: 00000000

        fffff9800a3f4ee0: (0006,0118) Flags: 40060070  Mdl: 00000000

    Not impersonating

    DeviceMap                 fffff8a000007890

    Owning Process            fffffa80064bbb30       Image:         csrss.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      1656           Ticks: 0

    Context Switch Count      25             IdealProcessor: 0

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address 0x000000004a061540

    Stack Init fffff88003b21c70 Current fffff88003b20890

    Base fffff88003b22000 Limit fffff88003b1c000 Call 0

    Priority 14 BasePriority 13 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`03b21428 fffff800`0307a54c nt!KeBugCheckEx

    fffff880`03b21430 fffff800`030d02ee nt!MmAccessFault+0xffffffff`fff9c15c

    fffff880`03b21590 fffff800`030c8db9 nt!KiPageFault+0x16e

    fffff880`03b21728 fffff800`030e6ab3 nt!memcpy+0x229

    fffff880`03b21730 fffff800`030c4bd7 nt!IopCompleteRequest+0x5a3

    fffff880`03b21800 fffff800`0307ba85 nt!KiDeliverApc+0x1c7

    fffff880`03b21880 fffff800`0331d96a nt!KiCheckForKernelApcDelivery+0x25

    fffff880`03b218b0 fffff800`033e742e nt!MiMapViewOfSection+0xffffffff`fff36baa

    fffff880`03b219a0 fffff800`030d1453 nt!NtMapViewOfSection+0x2bd

    fffff880`03b21a70 00000000`7761159a nt!KiSystemServiceCopyEnd+0x13

    00000000`0025f078 00000000`00000000 0x7761159a


    The command “!verifier 100” will dump the transaction log.  Below is the relevant portion of the log containing the IRPs for our thread.


    IRP fffff9800a3f4ee0, Thread fffffa80064c9b50, IRQL = 0, KernelApcDisable = -4, SpecialApcDisable = -1

    fffff80003573a68 nt!IovAllocateIrp+0x28

    fffff800033b20e2 nt!IoBuildDeviceIoControlRequest+0x32

    fffff8000356d72e nt!IovBuildDeviceIoControlRequest+0x4e

    fffff880010f8bcc fltmgr!FltGetVolumeGuidName+0x18c

    fffff88004e4fbe1 baddriver+0x12be1

    fffff88004e73523 baddriver +0x36523

    fffff88004e7300c baddriver +0x3600c

    fffff88004e72cce baddriver +0x35cce

    fffff88004e5f715 baddriver +0x22715

    fffff88004e4c6c7 baddriver +0xf6c7

    fffff88004e48342 baddriver +0xb342

    fffff88004e5e44e baddriver +0x2144e

    fffff88004e5e638 baddriver +0x21638


    IRP fffff9800a250c60, Thread fffffa80064c9b50, IRQL = 0, KernelApcDisable = -5, SpecialApcDisable = -1

    fffff80003573a68 nt!IovAllocateIrp+0x28

    fffff800033b20e2 nt!IoBuildDeviceIoControlRequest+0x32

    fffff8000356d72e nt!IovBuildDeviceIoControlRequest+0x4e

    fffff8800101eec7 mountmgr!MountMgrSendDeviceControl+0x73

    fffff88001010a6b mountmgr!QueryDeviceInformation+0x207

    fffff8800101986b mountmgr!QueryPointsFromMemory+0x57

    fffff88001019f86 mountmgr!MountMgrQueryPoints+0x36a

    fffff8800101ea71 mountmgr!MountMgrDeviceControl+0xe9

    fffff80003574c16 nt!IovCallDriver+0x566

    fffff880010f8bec fltmgr!FltGetVolumeGuidName+0x1ac

    fffff88004e4fbe1 baddriver +0x12be1

    fffff88004e73523 baddriver +0x36523

    fffff88004e7300c baddriver +0x3600c


    IRP fffff9800a33ec60, Thread fffffa80064c9b50, IRQL = 0, KernelApcDisable = -5, SpecialApcDisable = -1

    fffff80003573a68 nt!IovAllocateIrp+0x28

    fffff800033b20e2 nt!IoBuildDeviceIoControlRequest+0x32

    fffff8000356d72e nt!IovBuildDeviceIoControlRequest+0x4e

    fffff8800101eec7 mountmgr!MountMgrSendDeviceControl+0x73

    fffff88001010afd mountmgr!QueryDeviceInformation+0x299

    fffff8800101986b mountmgr!QueryPointsFromMemory+0x57

    fffff88001019f86 mountmgr!MountMgrQueryPoints+0x36a

    fffff8800101ea71 mountmgr!MountMgrDeviceControl+0xe9

    fffff80003574c16 nt!IovCallDriver+0x566

    fffff880010f8bec fltmgr!FltGetVolumeGuidName+0x1ac

    fffff88004e4fbe1 baddriver +0x12be1

    fffff88004e73523 baddriver +0x36523

    fffff88004e7300c baddriver +0x3600c


    From the IRP log in verifier I can see that baddriver.sys is calling FltGetVolumeGuidName while APCs are disabled. Further investigation found that baddriver.sys had registered a function for image load notification, and the memory manager has APCs disabled when it calls the image notification routine. The image notification routine in baddriver.sys called FltGetVolumeGuidName which issued the I/O.  From the log output I see KernelApcDisable and SpecialApcDisable, the issue is SpecialApcDisable being –1.  The I/O completion APCs are considered special APCs, so kernel APC disable would not affect them.


    The solution was for the driver to check for APCs disabled before issuing the FltGetVolumeGuidName and not make this call if APCs are disabled.

  • Ntdebugging Blog

    Our Bangalore Team is Hiring - Windows Server Escalation Engineer


    Would you like to join the world’s best and most elite debuggers to enable the success of Microsoft solutions?


    As a trusted advisor to our top customers you will be working with to the most experienced IT professionals and developers in the industry. You will influence our product teams in sustained engineering efforts to drive improvements in our products.


    This role involves deep analysis of product source code and debugging to solve problems in multi-million dollar configurations and will give you an opportunity to stretch your critical thinking skills. During the course of debugging, you will uncover opportunities to improve the customer experience while influencing the current and future design of our products.


    In addition to providing support to customers while being the primary interface to our sustained engineering teams, you will also have the opportunity to work with new technologies and unreleased software. Through our continuous investment in depth training and hands-on experience with tough customer challenges you will become the world’s best in this area. Expect to partner with many various roles at Microsoft launching a very successful career!


    This position is located is at the Microsoft Global Technical Support Center in Bangalore, India.


    Learn more about what an Escalation Engineer does at:

    Profile: Ron Stock, CTS Escalation Engineer - Microsoft Customer Service & Support - What is CSS?

    Microsoft JobsBlog JobCast with Escalation Engineer Jeff Dailey

    Microsoft JobsBlog JobCast with Escalation Engineer Scott Oseychik


    Apply here:


  • Ntdebugging Blog

    Use Caution When Implementing IPC for Performance Counters



    Recently I was working with a developer who had created performance counters that work in Performance Monitor but are never collected in a user defined data collector set.  The customer explained that their counters update named shared memory inside the application which should be read by perfmon or the data collector set.


    Putting counter data in shared memory is a common technique for performance counter developers.  A programmer can update performance data in a block of shared memory in their application and then use a performance extension dll (aka an “extensible counter”) to read from the shared memory.


    Shared memory is created by calling CreateFileMapping and MapViewOfFile.  This memory is then accessed by another application by calling OpenFileMapping.  All applications which use this shared memory must pass the same lpName to CreateFileMapping or OpenFileMapping.  An example of using these APIs to implement shared memory is available on MSDN.


    Based on the customer’s explanation that they are populating shared memory in their application, and their counters work in Performance Monitor but do not work in a user defined data collector set, I suspected that OpenFileMapping was failing for the data collector set.


    User defined data collector sets run in a rundll32.exe process.  If you have multiple rundll32.exe processes you may need to identify which one is related to your data collector set.  The relevant process has a command line similar to “rundll32.exe C:\Windows\system32\pla.dll,PlaHost”.  There are several tools that can be used to identify the command line of the process such as tlist.exe, which is included with the Debugging Tools for Windows.


    After attaching a debugger to rundll32.exe, I wanted to break on the ret instruction at the end of the OpenFileMappingW function.  This would allow me to determine if the function succeeds or fails.  According to MSDN “If the function fails, the return value is NULL. To get extended error information, call GetLastError.”


    The uf command is an easy way to unassemble a function and find the ret instruction to break on.


    0:001> uf kernelbase!OpenFileMappingW


    75b88e0d 8bff            mov     edi,edi



    75b88e79 c9              leave

    75b88e7a c20c00          ret     0Ch

    0:001> bp 75b88e7a

    0:001> g

    Breakpoint 0 hit

    eax=00000000 ebx=00008022 ecx=7ffd8000 edx=00000002 esi=05abf03c edi=00000000

    eip=75b88e7a esp=05abeb20 ebp=05abeb3c iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246


    75b88e7a c20c00          ret     0Ch


    In the above output we can see that the eax register is NULL, indicating that the call to OpenFileMapping failed.  The !gle command will show the last error and last status.


    0:008> !gle

    LastErrorValue: (Win32) 0x2 (2) - The system cannot find the file specified.

    LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.


    The failure is that OpenFileMapping cannot find the file.  The file name is the third parameter to OpenFileMapping.  We can get the first three parameters from the kb command.


    0:008> kb

    ChildEBP RetAddr  Args to Child             

    05abf0d0 6abae355 0002001f 00000000 05abeb7c kernelbase!OpenFileMappingW+0x90

    WARNING: Stack unwind information not available. Following frames may be wrong.

    05abf0f8 7784fe67 02a7ae90 05abf224 05abf254 ninjaprf+0x10edb

    05abf110 7784fc97 00472158 02a7ae90 05abf224 advapi32!CallExtObj+0x17

    05abf270 7784efaf 05abf2bc 60fcfa02 05abf778 advapi32!QueryExtensibleData+0x735

    05abf654 75ff0468 80000004 05abf778 00000000 advapi32!PerfRegQueryValue+0x5da

    05abf748 75ffd505 80000004 05abf778 05abf790 kernel32!LocalBaseRegQueryValue+0x366

    05abf7b4 61247dc5 80000004 02a7ae90 00000000 kernel32!RegQueryValueExW+0xb7

    05abf830 61250595 80000004 02a7ae58 02a7ae90 pdh!GetSystemPerfData+0x92

    05abf89c 6124c753 02a407d0 05abf8e8 61241928 pdh!GetQueryPerfData+0xa4

    05abf8b8 61254463 02a407d0 05abf8e8 60fcf32f pdh!PdhiCollectQueryData+0x32

    05abf90c 611c6d04 02a58f08 00000000 75ffc3e0 pdh!PdhUpdateLogW+0xa2

    05abf9bc 611be128 0045c968 00000000 00000000 pla!HPerformanceCounterDataCollector::Cycle+0x48

    05abf9bc 00000000 0045c968 00000000 00000000 pla!PlaiCollectorControl+0x3b7

    0:008> da 05abeb7c

    05abeb7c  "Local\NINJAPERF_S-1-5-18"


    The user defined data collector set is failing to open the file "Local\NINJAPERF_S-1-5-18".  This is the name that the performance extension dll ninjaprf.dll has given to its shared memory.


    Based on the customer’s description this operation works in Performance Monitor.  Next, I attached a debugger to perfmon and set the same breakpoint.


    Breakpoint 0 hit

    eax=000009f8 ebx=00008022 ecx=a7330000 edx=080ee678 esi=06798070 edi=00000000

    eip=760be9bb esp=0a84e564 ebp=0a84e580 iopl=0         nv up ei pl zr na pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246


    760be9bb c20c00          ret     0Ch


    In the above output we can see that eax is a handle number, indicating that the function succeeded.


    Looking at the file being opened we can see why this works in one scenario and not in another.


    0:016> kb 1

    ChildEBP RetAddr  Args to Child             

    0a84e580 698e4ab9 0002001f 00000000 0a84e5c0 kernelbase!OpenFileMappingW+0x90

    0:016> da 0a84e5c0

    0a84e5c0  "Local\NINJAPERF_S-1-5-21-123578"

    0a84e5e0  "095-571698237-1598563147-18961"


    The file name used is unique for each user.  The ninjaprf dll has chosen a file name which includes the SID of the current user.  This works for Performance Monitor because the user who starts the application is the same as the user who runs Performance Monitor.  However, a user defined data collector set runs in rundll32.exe as the Local System account.


    It is possible to run the user defined data collector set as a different user, however the file name being used will not work in that scenario either.  By using the “Local\” prefix, the file is created in the local session namespace.  The application runs in the user’s session, while rundll32.exe is started by the Task Scheduler service and runs in session 0.  This prevents the user defined data collector set from seeing the file created by the application.


    If your performance counter uses shared memory to communicate, be aware that your performance extension dll may be run in the context of a different user account and a different user session.  Your inter process communication techniques must account for this, or your customers will ask you why your counters do not work outside of Performance Monitor.

  • Ntdebugging Blog

    Debugging a Network Connectivity Issue - TrackNblOwner to the Rescue


    Hello Debug community this is Karim Elsaid again.  Today I’m going to discuss a recent interesting case where intermittently the server is losing access to the network.  No communication (even pings) can be done from / to the server when the issue hits.


    We went through the normal exercise and asked the customer to obtain a Kernel memory dump from the machine while it was in the problematic state, hoping that we will find some data to help us to demystify the issue.


    One of the very first commands we run upon receiving a hang dump is the very famous “!locks” command.  This yielded the following:


    8: kd> !locks


    KD: Scanning for held locks..


    Resource @ nt!IopDeviceTreeLock (0xfffff80001a81c80)    Shared 1 owning threads

         Threads: fffffa800cd8a040-01<*>

    KD: Scanning for held locks.


    Resource @ nt!PiEngineLock (0xfffff80001a81b80)    Exclusively owned

        Contention Count = 6

         Threads: fffffa800cd8a040-01<*>

    KD: Scanning for held locks

    84372 total locks, 2 locks currently held


    What I’m looking for is Locks with exclusive owners and waiters.  From the above output we can see that thread fffffa800cd8a040 exclusively owns a Plug and Play (Pi prefix) lock and shared owns an I/O Manager (Io prefix) device tree lock.


    There are no waiters for the exclusive lock, however PnP locks always worth investigating.  While debugging I always treat everything a possible suspect unless proven otherwise, so let’s dump this thread:


    8: kd> !thread fffffa800cd8a040 e

    THREAD fffffa800cd8a040  Cid 0004.005c  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

        fffff88002b0f118  SynchronizationEvent

    IRP List:

        fffffa8016527510: (0006,0310) Flags: 00000000  Mdl: 00000000

    Not impersonating

    DeviceMap                 fffff8a000006100

    Owning Process            fffffa800cd56040       Image:        System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      14791337       Ticks: 15577 (0:00:04:03.002)

    Context Switch Count      835317         IdealProcessor: 2            

    UserTime                  00:00:00.000

    KernelTime                00:00:26.863

    Win32 Start Address nt!ExpWorkerThread (0xfffff8000188f530)

    Stack Init fffff88002b0fc70 Current fffff88002b0ee30

    Base fffff88002b10000 Limit fffff88002b0a000 Call 0

    Priority 12 BasePriority 12 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    *** ERROR: Module load completed but symbols could not be loaded for myfault.sys

    Child-SP          RetAddr            Call Site

    fffff880`02b0ee70 fffff800`0187ba32 nt!KiSwapContext+0x7a

    fffff880`02b0efb0 fffff800`0188cd8f nt!KiCommitThreadWait+0x1d2

    fffff880`02b0f040 fffff800`018e1816 nt!KeWaitForSingleObject+0x19f

    fffff880`02b0f0e0 fffff880`01618fcd nt! ??::FNODOBFM::`string'+0x12ff6

    fffff880`02b0f150 fffff880`0173f54e tcpip!FlPnpEvent+0x17d

    fffff880`02b0f1c0 fffff880`00f87b2f tcpip!Fl48PnpEvent+0xe

    fffff880`02b0f1f0 fffff880`00f884b7 NDIS!ndisPnPNotifyBinding+0xbf

    fffff880`02b0f280 fffff880`00fa1911 NDIS!ndisPnPNotifyAllTransports+0x377

    fffff880`02b0f3f0 fffff880`00fa2c5b NDIS!ndisCloseMiniportBindings+0x111

    fffff880`02b0f500 fffff880`00f3bbc2 NDIS!ndisPnPRemoveDevice+0x25b

    fffff880`02b0f6a0 fffff880`00fa5b69 NDIS!ndisPnPRemoveDeviceEx+0xa2

    fffff880`02b0f6e0 fffff800`01aec8d9 NDIS!ndisPnPDispatch+0x609

    fffff880`02b0f780 fffff800`01c6c1e1 nt!IopSynchronousCall+0xc5

    fffff880`02b0f7f0 fffff800`0197f733 nt!IopRemoveDevice+0x101

    fffff880`02b0f8b0 fffff800`01c6bd34 nt!PnpRemoveLockedDeviceNode+0x1a3

    fffff880`02b0f900 fffff800`01c6be40 nt!PnpDeleteLockedDeviceNode+0x44

    fffff880`02b0f930 fffff800`01cfcd04 nt!PnpDeleteLockedDeviceNodes+0xa0

    fffff880`02b0f9a0 fffff800`01cfd35c nt!PnpProcessQueryRemoveAndEject+0xc34

    fffff880`02b0fae0 fffff800`01be65ce nt!PnpProcessTargetDeviceEvent+0x4c

    fffff880`02b0fb10 fffff800`0188f641 nt! ?? ::NNGAKEGL::`string'+0x5ab9b

    fffff880`02b0fb70 fffff800`01b1ce5a nt!ExpWorkerThread+0x111

    fffff880`02b0fc00 fffff800`01876d26 nt!PspSystemThreadStartup+0x5a

    fffff880`02b0fc40 00000000`00000000 nt!KiStartSystemThread+0x16


    Interesting, by looking at the stack above we can see that thread is doing some NDIS PnP stuff.  This thread has been waiting for more than 4 minutes, but hold on,  what is “ nt! ?? ::FNODOBFM::`string”?  This doesn’t seem to be a useful function name, no its not!  This is a side effect of Basic Block Tools optimization (BBT).  Using public symbols the debugger will find it hard to get to the right symbol, there is a nice a trick you can use in order to get to the right function.


    P.S for a nice x64 Deep Dive please refer to our archive.


    Let’s display the function data for the return address fffff800`018e1816:


    8: kd> .fnent fffff800`018e1816

    Debugger function entry 000000e8`f28f14f8 for:

    (fffff800`018c4790)   nt! ?? ::FNODOBFM::`string'+0x12ff6   |  (fffff800`018c47c8)   nt!vDbgPrintExWithPrefixInternal


    BeginAddress      = 00000000`000da7d0

    EndAddress        = 00000000`000da81c

    UnwindInfoAddress = 00000000`001c8a54


    Unwind info at fffff800`019cfa54, 10 bytes

      version 1, flags 4, prolog 0, codes 0


    Chained info:

    BeginAddress      = 00000000`000182f0

    EndAddress        = 00000000`00018358

    UnwindInfoAddress = 00000000`001bf910


    Unwind info at fffff800`019c6910, 6 bytes

      version 1, flags 0, prolog 4, codes 1

      00: offs 4, unwind op 2, op info c      UWOP_ALLOC_SMALL. 


    For optimized binaries, you will find a section “Chained Info”.  Add the BeginAddress to the start address of the module and you should hit the correct function so:


    8: kd> ln nt+000182f0

    (fffff800`0181f2f0)   nt!ExWaitForRundownProtectionReleaseCacheAware  |  (fffff800`0181f358)   nt!KeGetRecommendedSharedDataAlignment

    Exact matches:

        nt!ExWaitForRundownProtectionReleaseCacheAware (<no parameter info>)


    Bingo!  You got the function.  So tcpip!FlPnpEvent was calling ExWaitForRundownProtectionReleaseCacheAware.  This function will basically wait for the rundown protection to drop down to 0.


    A thread can call ExAcquireRundownProtectionEx against a shared object for safe access.  Rundown Protection provides a way to protect an object from being deleted unless all outstanding access has been finished (Run Down).  The “ExWaitForRundownProtectionReleaseCacheAware” will do exactly the same; it will wait for all rundown protection calls to be completed.


    The question is which structure are we waiting for its rundown to drain, that will depend on what we are dealing with.  Because of code optimization the debugger is not showing you the full picture.  Through code review I found that in this particular dump there is an inline call to function “FlpUninitializePacketProviderInterface”.


    So the stack in reality should look like this:


    Child-SP          RetAddr           Call Site

    fffff880`02b0ee70 fffff800`0187ba32 nt!KiSwapContext+0x7a

    fffff880`02b0efb0 fffff800`0188cd8f nt!KiCommitThreadWait+0x1d2

    fffff880`02b0f040 fffff800`018e1816 nt!KeWaitForSingleObject+0x19f

    fffff880`02b0f0e0 fffff880`01618fcd nt!ExWaitForRundownProtectionReleaseCacheAware

    ----inline function----             tcpip!FlpUninitializePacketProviderInterface

    fffff880`02b0f150 fffff880`0173f54e tcpip!FlPnpEvent+0x17d

    fffff880`02b0f1c0 fffff880`00f87b2f tcpip!Fl48PnpEvent+0xe


    So we need to un-initialize a network interface but before doing that we need to make sure that there are no outstanding references to packets and that there are no outstanding packets pending.  When we say packets, starting in NDIS 6 we basically mean “NET_BUFFER” and “Net_Buffer_List” structures.  So we need to check for any outstanding Net_Buffer_Lists (NBLs) that are pending, one reference will correspond to one pending NBL.


    To the rescue, the “NDISKD” debugger extension has a very nice and handy command to display all pending NBLS and their owners, it is “!pendingnbls”.  For the command to work it you must first enable “TrackNblOwner” through the registry.  By default, this registry key is not enabled on server SKUs as it may cause a performance hit.  On client SKUs this is enabled by default.


    When you run !pendingnbls on a clean Windows 2008 R2 install you get:


    8: kd> !ndiskd.pendingnbls

        This command requires NBL tracking to be enabled on the debugee target

        machine.  (By default, client operating systems have level 1, and servers

        have level 0).  To enable, set this REG_DWORD value to a nonzero value on

        the target machine and reboot the target machine:


        HKLM\SYSTEM\CurrentControlSet\Services\NDIS\Parameters ! TrackNblOwner

        Possible Values (features are cumulative)

        * 0:  Disable all tracking.

        * 1:  Track the most recent owner of each NBL (enables !ndiskd.pendingnbls)


        Show me all allocated NBLs so I can manually find the one I want


    You can find all allocated NBLs with the command “!ndiskd.nblpool -force -find ((@$extin.Flags)&0x108)==0x100)”, but still you don’t get any owner.


    So I asked the customer to turn on “TrackNblOwner” and reboot, wait for the next occurrence of the issue and get a new memory dump.


    Two days later we received the memory dump file.  I verified that they are having the same issue I found in the last dump and that TrackNblOwner is configured correctly:


    23: kd> dp NDIS!ndisTrackNblOwner L1

    fffff880`00ef1a30  00000000`00000001


    Then I immediately checked all pending NBLs to claim the prize, and it was not surprising to see why the NIC card was not un-initializing:


    23: kd> !ndiskd.pendingnbls


    PHASE 1/3: Found 20 NBL pool(s).                

    PHASE 2/3: Found 550 freed NBL(s).                                   


        Pending Nbl        Currently held by                                       

        fffffa801dc559f0   fffffa80142d31a0 - My Ethernet 1Gb 4-port Adapter  [Miniport]                   

        fffffa801dc81680   fffffa80142d31a0 - My Ethernet 1Gb 4-port Adapter  [Miniport]                   

        fffffa80131d2aa0   fffffa80142d31a0 - My Ethernet 1Gb 4-port Adapter  [Miniport]


    Ret of the repeated output omitted


    PHASE 3/3: Found 1854 pending NBL(s) of 3005 total NBL(s).                     

    Search complete.


    So we currently have 1854 NBLs pending on the NIC miniport driver “fffffa80142d31a0”.  This is the Miniport that currently holding all NBLs:


    23: kd> !ndiskd.miniport fffffa80142d31a0





        My Ethernet 1Gb 4-port Adapter 


        Ndis handle        fffffa80142d31a0

        Ndis API version   v6.20

        Adapter context    fffffa80138cc000

        Miniport driver    fffffa800d4f7530 - MyMiniPortDriver  v1.0

        Network interface  fffffa800d25e870


        Media type         802.3

        Device instance    PCI\VEN_1111&DEV_1111&SUBSYS_169D103C&REV_01\4&2263a140&0&0010

        Device object      fffffa80142d3050    More information

        MAC address        xx-xx-xx-xx-xx-xx





        Miniport           Running

        Device PnP         QUERY_REMOVED

        Datapath           Normal

        Operational status DORMANT

        Operational flags  DORMANT_PAUSED

        Admin status       ADMIN_UP

        Media              Connected

        Power              D0

        References         9

        Total resets       0

        Pending OID        None


                           SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,





    What you notice from the above that the device received a “Query_Removed” PNP and is currently in a Dormant_Paused state.


    From: http://msdn.microsoft.com/en-us/library/ff566737.aspx:


    The operational status is set to NET_IF_OPER_STATUS_DORMANT because the miniport adapter is in the paused or pausing state.


    NDIS 6.0 and up allow miniport adapters to be paused and the documentation here shows what the miniport driver should do when it receives a pause request.


    Because the adapter was in a pause state, basic network commads like “ping” ceased to work as described earlier in the symptoms.  The next action is definitely to involve the miniport adapter vendor to trace this further and find out why all these pending NBLs were not completed.


    Until a next adventure!

    Best Regards,


  • Ntdebugging Blog

    Updated Archive of the NtDebugging Twitter Debug Tips


    Every Wednesday (usually) we post a debug tip to our twitter page at https://twitter.com/#!/ntdebugging. This blog is an archive of these tips to allow our readers to find this information easily. Periodically we post an updated blog with the current archive. Follow us on twitter if you want to see the new tips as we post them.

    The goal of these tips is to share debug commands, and forms of commands (parameters, flags, etc) that we in Platforms Global Escalation Services find useful. I hope you can add these commands to your toolkit and they will help you debug more efficiently.


    !thread/!process [address] e - on x64 will not show you the meaningless Args to Child information.

    .frame /c [FrameNumber] - sets context to specificied stack frame. On x64 provides more reliable register information than .trap.

    kn - Dumps call stack with frame numbers, easier than counting stacks for .frame.

    .frame /r [FrameNumber] - same as .frame /c, but shows registers without changing context.

    Note: With .frame /c or /r you can only trust the nonvolatile registers. See http://bit.ly/dik4OR for vol/nonvol regs.

    k=rbp rip FrameCount - Dumps call stack starting at rbp/rip on x64. Useful when the stack is corrupt.

    .process/.thread /p /r [address] - sets new process context, sets .cache forcedecodeuser, and reloads user symbols.

    !process [address] 17 - Sets the context for this command, avoids the need for .process to see user stacks. Try !process 0 17

    ~~[ThreadID]s - Changes threads in user mode. Use Thread ID number from output such as !locks. Ex: ~~[1bd4]s

    runas /netonly /u:<account> windbg.exe - Launch windbg with domain account. Use when dbg computer isn't in domain and symbol server is.

    !heap -p -a <address> - Shows information about the heap block containing <address>, even if you aren't using pageheap.

    ub - Unassembles starting at a location prior to your address. Accepts l<number> to specify how many instructions to go back. ub . l20

    !stacks 2 [FilterString] - Finds kernel mode call stacks that contain the FilterString in a symbol.

    !thread [address] 17 (or 1e on x64) - Sets context for this command, avoids the need for .thread/.process for user stacks.

    .hh [Text] - Opens the debugger help. [Text] is the topic to lookup in the index. Example: .hh !pte

    ?? can dump structs using C++ style expressions. Ex: ??((nt!_KTHREAD*)(0xfffffa800ea43bb0))->ApcState

    bp /t EThread - Sets a kernel mode breakpoint that only triggers when hit in the context of this thread.

    bp /p EProcess - Sets a kernel mode breakpoint that only triggers when hit in the context of this process.

    gc - If you run 'p' and hit a breakpoint, gc takes you where p would have gone if you had not hit the bp.

    gu - Go until the current function returns. Effectively this unwinds one stack frame. #windbg

    pc - Steps through until the next 'call' instruction. Combine with other commands to find who returned your error> pc;p;r eax

    pt - Steps through until the next 'ret' instruction. Similar to gu, but pt stops on the ret and gu stops after the ret.

    .ignore_missing_pages 1 - supresses the error: "Page 2a49 not present in the dump file. Type ".hh dbgerr004" for details"

    .exr -1 shows the most recent exception. Useful in user dumps of crashes, especially for no execute crashes (NX/DEP).

    wt - Trace calls until they return to the current address. More useful with -or to get return values. Use -l for depth.

    .thread /w - Changes to the WOW64 32-bit context from 64-bit kernel mode. Wow64exts doesn't work in kernel mode.

    ??sizeof(structure) - Gets the size of a structure, it's easier than counting.

    sxe ld:module.dll - Enables an exception which will break into the debugger when module.dll is loaded.

    vertarget - Shows OS version of the debug target. Also shows machine name, uptime, and session time (when the dump was taken).

    !vm 1 - In a kernel debugger, shows basic information about memory usage. Available, committed, pagefile, pool, sysptes, etc.

    .time - Shows session time (when dump was taken) and system uptime. In user mode shows process uptime, kernel/user time.

    ba w size [address] - Break on write access only. Replace size with the num bytes you want to watch. Ex: ba w 4 005d5f10

    .process -i <address> - Make the process active and break into the debugger. Use in live kernel debugs to get into process context.

    .reload /f /o - Overwrites cached files in your downstream symbol store. Useful when your store has corrupt pdbs.

    ->* - Use with dt to dump pointers. Example: dt _EPROCESS [Address] ObjectTable->*

    !for_each_module s -a @#Base @#End "PTag" - Find the drivers using pool tag "PTag".

    .unload [DllName] - Unloads the debug extension you didn't intend to load. Omit DllName to unload the last dll loaded.

    !exqueue dumps the executive worker queues. Use flags 7f to dump the worker threads and the queues.

    lmvm <module> - Dumps information about the module. Remember to use <module> and not <module.dll>.

    !thread -t TID - Dump a thread using thread ID rather than thread address. Useful when working with a critical section.

    !list - Walks a linked list and displays informatino for each element in a list. See blog later today for an example.

    .time -h # - Shows the debug session time using the timezone offset of #. Ex: .time -h 0 shows when a dump was taken in UTC.

    !session - Lists all of the user session IDs. A quick way to list the active sessions from a dump of a terminal server.

    !session -s SessionID - Changes the current session context to SessionID. Useful when looking at GDI, or other per session data.

    | ProcNum s - Switches to process number ProcNum. Use when debugging multiple dumps, or processes, in one windbg.

    !! - Launches a shell process and redirects its output to the debugger. The same as .shell, but "bang bang" sounds cooler.

    uf Function - Dumps the assembly for Function (name or address). Useful for optimized code that is not contiguous in memory.

    uf /c Function - Shows all of the calls made by Function (can be function name or address).

    !wow64exts.sw - switches between x64 and x86 contexts. Often used to reverse .thread /w context switch.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 4 – Debugging Multiple Users for a Tag


    In our previous articles we discussed various techniques for identifying a pool memory leak and narrowing the scope of the leak to an individual pool tag.  Knowing the leaking pool tag is often sufficient to identify the cause of the problem and find a solution.  However, there may be a scenario where multiple drivers use the same pool tag (such as DDK) or when one driver uses the same tag in multiple places.  In this scenario you will need more information to identify the source of the leak.  In our next several articles we will present techniques to get this information.


    This article will present a basic technique where we modify each pool tag to identify what code in which driver is allocating the memory that gets leaked.


    This technique requires a live debug of the problematic system.  There are many resources with steps for how to configure a system for a live debug.  The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup.




    Using the same technique as in Part 3, identify where the tag in question is used.


    0: kd> !for_each_module s -a @#Base @#End "Leak"

    fffff880`0496e3aa  4c 65 61 6b 3b c1 0f 42-c1 41 8d 49 fd 8b d0 ff  Leak;..B.A.I....

    fffff880`0496e621  4c 65 61 6b 3b c1 0f 42-c1 33 c9 8b d0 ff 15 cc  Leak;..B.3......


    Next, edit each instance so that they are unique.  The ASCII code for numeral 1 is 0x31, and the codes for each numeral increase sequentially.  Using this information edit each tag to be Lea1, Lea2, etc.


    0: kd> eb fffff880`0496e3aa+3 31

    0: kd> eb fffff880`0496e621+3 32


    Confirm your edits resulted in the expected tags using the dc command.


    0: kd> dc fffff880`0496e3aa l1

    fffff880`0496e3aa  3161654c                             Lea1

    0: kd> dc fffff880`0496e621 l1

    fffff880`0496e621  3261654c                             Lea2


    Now wait for the leak to happen and repeat the steps from Part 3 to identify which of the tags is leaked.  This tells you what code allocates the memory that gets leaked.  Below we can see that Lea2 is the tag being leaked.


    0: kd> !poolused /t5 2


    Sorting by NonPaged Pool Consumed


                   NonPaged                  Paged

    Tag     Allocs         Used     Allocs         Used


    Lea2       257    263168000          0            0  UNKNOWN pooltag 'Lea2', please update pooltag.txt

    nVsC       664      1531552          0            0  UNKNOWN pooltag 'nVsC', please update pooltag.txt

    netv      4369      1172224          1          144  UNKNOWN pooltag 'netv', please update pooltag.txt

    Leak         1      1024000          0            0  UNKNOWN pooltag 'Leak', please update pooltag.txt

    EtwB        94       945136          4       163840  Etw Buffer, Binary: nt!etw


    TOTAL     41296    281814544      44077     68102368


    Knowing what code allocates the leaked pool may be very valuable to a driver developer who needs to narrow the scope of the problem.  Often this information is sufficient for a developer to code review the use of this memory and identify why it would be leaked.


    There are times when more information is needed to determine the cause of the leak.  A developer may need the call stacks of memory being allocated and freed.  We will capture this information in Part 5.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 5 – PoolHitTag


    In Part 4 we narrowed the source of the leaked pool memory to the specific driver which is allocating it, and we identified where in the driver this allocation was taking place.  However, we did not capture contextual information such as the call stack leading up to this code.  Also, we didn’t capture information about when this allocated pool is freed.  In this article we will use the PoolHitTag feature to break into the debugger when a specific tag is used.


    As in Part 4, a live debug must be configured to use this feature.  The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup (see screenshot in part 4).


    These steps are typically only effective if you are able to perform them while the leak is happening.  There may be a scenario in which a developer wants to know what “normal” looks like, but most often the steps in this article are used to investigate “broken”.


    The PoolHitTag is a global in the kernel binary.  When this global is set to a pool tag, the system will break into the debugger whenever pool with this tag is allocated or freed.  By default the PoolHitTag is set to ffffff0f.


    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  ffffff0f                             ....


    To turn on this feature, edit the PoolHitTag to the tag that is known to leak.  The value 3261654c is little endian ASCII for the string ‘Lea2’.  I found this value in the “Confirm your edits” step in Part 4. 


    1: kd> ed nt!PoolHitTag 3261654c

    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  3261654c                             Lea2


    With PoolHitTag now set to the leaking tag, issue the ‘g’ command to release debugger and it will automatically break in whenever the Lea2 tag is used.


    1: kd> g

    Break instruction exception - code 80000003 (first chance)

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3


    In the above example the debugger broke in because the ‘int 3’ instruction triggered a breakpoint.  The symbols seem to indicate that we are in a function named “?? ::FNODOBFM::`string'”, but this is simply a lack of symbolic information for this optimized code.  Unassembling the surrounding code shows that this code is a piece of ExpAllocateBigPool, one of the functions used in the kernel to allocate pool allocations larger than 4096 bytes.


    1: kd> u fffff800`014798f6

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3

    fffff800`014798f7 e9ee8e0800      jmp     nt!ExpAllocateBigPool+0x13a (fffff800`015027ea)


    At this point we can dump the call stack and see the full context of what is happening when this memory is allocated.


    1: kd> k

    Child-SP          RetAddr           Call Site

    fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a

    fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e

    fffff880`04ec1860 fffff880`0496e727 myfault+0x1634

    fffff880`04ec19b0 fffff800`017fca97 myfault+0x1727

    fffff880`04ec1a10 fffff800`017fd2f6 nt!IopXxxControlFile+0x607

    fffff880`04ec1b40 fffff800`014e0ed3 nt!NtDeviceIoControlFile+0x56

    fffff880`04ec1bb0 00000000`7756138a nt!KiSystemServiceCopyEnd+0x13

    00000000`000df4c8 000007fe`fd5fa249 ntdll!ZwDeviceIoControlFile+0xa

    00000000`000df4d0 00000000`7740683f KERNELBASE!DeviceIoControl+0x75

    00000000`000df540 00000000`ff222384 kernel32!DeviceIoControlImplementation+0x7f

    00000000`000df590 00000000`00000000 NotMyfault+0x2384


    Repeating the ‘g’ and ‘k’ commands multiple times will begin to give you an understanding of the various ways this code may be used.  This can be automated by modifying the ‘int 3’ instruction and using a breakpoint.  Note that system performance may suffer because output to the debug port is serialized.


    The commands shown below use addresses specific to big pool allocations (larger than 4KB). The ‘int 3’ instruction may be located elsewhere depending on the scenario you are debugging.


    To modify the operation from a debug break to a breakpoint, change the ‘int 3’ to ‘nop’.  In x86 and x64 the opcode for ‘nop’ is 90.  Coincidentally these instructions are the same length.


    1: kd> eb fffff800`014798f6 90


    Confirm that the instruction was reset properly.


    1: kd> u fffff800`014798f6 l1

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 90              nop


    Set a breakpoint on the ‘nop’ instruction and configure the breakpoint to automatically dump the stack and go the debugger.


    1: kd> bp fffff800`014798f6 "k;g"

    1: kd> g


    If you find that the pool is sometimes allocated and occasionally freed, you may need to edit the ‘int 3’ used when ExFreePool is called, and set a similar breakpoint on that address.


    Break instruction exception - code 80000003 (first chance)


    fffff800`0160f5b7 cc              int     3

    1: kd> eb fffff800`0160f5b7 90

    1: kd> bp fffff800`0160f5b7 "k;g"

    1: kd> g


    Once you have sufficient data to understand the scenario where the memory is allocated and freed use Ctrl+Break to break into the debugger, clear the breakpoints and reset the PoolHitTag.  Then go the debugger to allow the system to continue normal operation.


    1: kd> bc *

    1: kd> ed nt!PoolHitTag ffffff0f

    1: kd> g


    The data collected with these steps should provide an indication to a developer of what memory is being leaked and when.


    PoolHitTag isn’t the only option for collecting call stack information.  Our final articles will cover alternative techniques for obtaining this information.

  • Ntdebugging Blog

    Understanding !PTE, Part2: Flags and Large Pages


    Hello, it's Ryan Mangipano with part two of my PTE series. Today I'll discuss PDE/PTE flags, the TLB, and show you a manual conversion of x86 PAE Large Page Virtual Addresses to Physical. If you haven’t read the first part of this series please find it here. It's a good primer before proceeding.

    PDE and PTE flags 

    I'll start with a discussion about the PDE/PTE flags. If you recall from part one not all of the bits of the Page Directory Entry (PDE) are related to the index (used to form the pointer to the base of the next level).  This is true of the table entries in all the levels. For example, on a PAE x86 systems only 9 bits of the PTE (page table entry) are used for the index. During our previous conversion,  we only used some of the bits for the index into the next table. The rest of the data, we simply dropped off and replaced with zeros as needed. So what are the other bits used for? They are used for a series of flags. You will observe the state of these flags output by !PTE in the following manner: (-G-DA—KWEV).

    These flags are documented in the Intel Manuals. Intel and AMD reserved some of the flags for use by the Operating System. All of these are also documented in chapter 9 (Memory Management) of “Windows Internals, 5th edition”. Let’s dump the PDE from the virtual address we dissected last time. This will allow you to see some of the flags that are present in the other bits

    Obtaining the Virtual Address of the PDE

    1: kd> !pte 0xf9a10054

    VA f9a10054

    PDE at 00000000C0603E68    PTE at 00000000C07CD080

    contains 000000000102D963  contains 0000000002010121

    pfn 102d       -G-DA--KWEV    pfn 2010       -G--A—KREV


    Here is the data-type of our PDE

    1: kd> dt nt!_MMPTE u.Hard

       +0x000 u      :

          +0x000 Hard   : _MMPTE_HARDWARE


    Dumping the PDE and flags

    1: kd> dt _MMPTE_HARDWARE 00000000C0603E68


       +0x000 Valid            : 0y1

       +0x000 Writable         : 0y1

       +0x000 Owner            : 0y0

       +0x000 WriteThrough     : 0y0

       +0x000 CacheDisable     : 0y0

       +0x000 Accessed         : 0y1

       +0x000 Dirty            : 0y1

       +0x000 LargePage        : 0y0

       +0x000 Global           : 0y1

       +0x000 CopyOnWrite      : 0y0

       +0x000 Prototype        : 0y0

       +0x000 Write            : 0y1

       +0x000 PageFrameNumber  : 0y00000000000001000000101101 (0x102d)

       +0x000 reserved1        : 0y00000000000000000000000000 (0)


    Take note of the Letters in the PDE and PTE section of the !pte output, such as -G-DA--KWEV    . These letters represent various flags. The presence or absence of the letter in the !PTE output tells you the state of the flag. These flags can also be seen in the hardware pte output above. 

    Valid                   (V)         - Indicates that the data is located in physical memory. If this flag is not set, then the software can use ALL of the rest of the bits for whatever it wants(like storing the pagefile number and offset where the page is stored.
    Write                 (W/R)     - Indicates if the data is writeable or read-only. Multiprocessor or Vista or later. Hardware bit is documented in the processor manuals. Reserved Bit 11’s use is documented in Windows Internals, Chap. 9.
    Owner               (K/U)      - Indicates if the page is owned kernel mode or usermode. Kernel if cleared. User if set.
    WriteThrough  (T)           - When set indicates Writethrough caching policy. When not set indicates write-back caching policy
    CacheDisable   (N)           - If set, the page translation table or physical page it points to cannot be cached.   
    Accessed           (A)           - Set when the page itself, or the table referencing it has been read from or written to
    Dirty                   (D)          - Indicates if any data on this page has been updated
    LargePage         (L)            - This field is only used on PDEs, not PTEs. It indicates whether or not the PDE is the last table level (meaning that this entry references an actual page in memory) or if it is instead referencing a Page Table. If this bit is set in the PDE, this PDE points directly to a 2-MB page when PAE is in use. If PAE is not being used, the large page size that we are referencing is 4-MB. So basically, this is the page size bit. If this bit is cleared, the final destination page is 4k and can be found in the page table that this PDE points to. If this bit is set, then the final destination page is equal to the size of a large page on your system (2MB when PAE is in use) and can be located using the index value of this particular PDE since it becomes the last level. Keep in mind that a larger offset will be needed to reference all the positions in this large page since it is larger. To use this feature, the PSE bit (bit 4 which is the 5th bit over) must be set in CR4. The setting in CR4 is a global setting, enable the use of large pages on the system. The flag in the PDE only applies to the individual PDE.
    Global                 (G)        - If not set Translation Caching flushes affect this bit. If set, other processes use this translation also, so don’t flush it from the Translation Lookaside Buffer cache upon process context switches.
    CopyOnWrite    (C)       - Intel states this is a software field. Windows uses this for processes to share the same copy of a page. The system will give the process a private copy of this page if there is any attempt to write to the page by the process (by copying it). Any attempt to execute code in this page occurs on a No execute system will cause an access violation.
    Prototype           (P)        - Intel states this is a software field. Windows uses this to indicate that this is a prototype PTE.
    Reserved0                       - These Bits are Reserved
    E                            (E)       - Executable page. E is always displayed on platformst that Do not support hardware No-Execute.

    Inspecting the state of the flags is important when attempting to manually convert addresses from Virtual Addresses to Physical. For example, since the valid bit is not set in the following invalid PTE, all of the fields are available for Windows to use. This means the information in the processor manuals doesn’t apply. Instead it is an nt!_MMPTE_SOFTWARE which references data located in the page file.

    3: kd> !pte b8ae900c

                   VA b8ae900c

    PDE at 00000000C0602E28    PTE at 00000000C05C5748

    contains 000000000B880863  contains 000B8AF500000000

    pfn b880       ---DA--KWEV    not valid

                           PageFile:  0

                           Offset: b8af5

                           Protect: 0


    For more information on the different types of invalid PTEs, refer to page 775 of “Windows Internals, 5th edition”.

    Manually Converting x86 PAE Large Page Virtual Address to Physical

    In part one of this blog, we manually translated a PAE  4-KByte Page Virtual Address (VA). Now we are going to manually translate a VA that represents a Large Page from our PAE system. As discussed in the previous section on PTE flags, a large page allocation means that the page size is larger and the PDE points directly to the page itself. The PDE will not point to the base of a page table.  This means that there will be one less level of tables used in the translation.  This also means that more bits will be needed to represent the offsets in the large page.  I found the following address on my system that references a Large Page, 8054099e. Once again, all the required information was obtained from the processor manuals, debugger help file, and Windows Internals Book.

    1: kd> !pte 8054099e

                   VA 8054099e

    PDE at 00000000C0602010    PTE at 00000000C0402A00

    contains 00000000004009E3  contains 0000000000000000

    pfn 400        -GLDA--KWEV    LARGE PAGE pfn 540  


    Below is the Virtual Address in binary.

    1: kd> .formats 8054099e

      Binary:  10000000 01010100 00001001 10011110


    I have split this VA into it's three parts.

    10                     Page Directory Pointer Table Offset

    000000 010      Page Directory Table Offset

    10100 00001001 10011110     This is the Offset into the large page


    Let’s get the base of the Page Directory Pointer Table and indentify which of the four entries we will need to follow.

    1: kd> !dq (@cr3 & 0xffffffe0) + ( 0y10 * 8) L1

    # 23406f0 00000000`06c46801


    Now take our address from above, add our zeros and we have the base of Page Directory Table. Then add the offset from our Virtual Address and we'll dump out the PDE.

    1: kd> !dq (6c46801 & 0xFFFFFF000) + ( 0y000000010 * 8) L1

    # 6c46010 00000000`004009e3


    Let’s convert the PDE to binary format to analyze the lower 12 bits. This will allow us to analyze the flags. The last Twelve bits (0-11) are used for the PFN. They are used for the flags that we discussed earlier.

    1: kd> .formats (00000000`004009e3 & 0x0000000FFF)

      Binary:  00000000 00000000 00001001 11100011


    Let’s analyze the flags from this VA using the information we learned earlier....

    ·         Bit Zero is set indicating that the page is Valid, located in physical memory, and all other bits

    ·         Bit One is set indicating that this page is Writeable (Hardware Field)

    ·         Bit Two is cleared indicating that this is a Kernel Mode Page

    ·         Bit Three is cleared indicating a Write-Back Caching policy (caching of writes to the page is enabled)

    ·         Bit Four is cleared indicating that caching is not disabled for the page.

    ·         Bit Five is set indicating this page has been Accessed

    ·         Bit Six is set indicating that this page is Dirty

    ·         Bit Seven is set indicating that this is a Large Page. This PDE points directly to a page, not a Page Table.

    ·         Bit Eight is set indicating other process share this Global PDE. No Delete upon TLB Cache Flush for process context switches.

    ·         Bit Nine is cleared indicating this page is not Copy-On-Write

    ·         Bit Ten is cleared indicating this is NOT a Prototype PTE

    ·         Bit Eleven is set also indicating this page is Writeable (Reserved Field, See Windows Internals, Chap. 9.)

    ...and compare our findings to the Flags output from !PTE, -GLDA—KWEV. My system doesn’t support No-Execute, so the E is also displayed.  For more information, .hh !PTE in windbg.

    We know this is a Large Page and is Valid, so we can obtain the directory of our 2-MB Large Page (on this PAE system) from this PDE. The Intel Manual states that in our PDE the last 21 bits aren’t part of the address base.

    1: kd> .formats (004009e3 & 0y11111111111000000000000000000000)

      Binary:  00000000 01000000 00000000 00000000



    So let’s combine the data from the PDE (Highlighted)  with the offset from the VA (Virtual Address).

    00000000 010  10100 00001001 10011110  

    Now I'll remove the spaces, precede this binary value with 0y, and send it to .formats.

    1: kd> .formats 0y00000000010101000000100110011110

      Hex:     0054099e

    We could have obtained the same data in this manner

    1: kd> ? (004009e3 & 0y11111111111000000000000000000000) + (8054099e & 0y00000000000111111111111111111111)

    Evaluate expression: 5507486 = 0054099e

    Now let’s dump the data in memory at this physical address

    1: kd> !db 0054099e

    #  54099e 33 db 8b 75 18 8b 7d 1c-0f 23 fb 0f 23 c6 8b 5d 3..u..}..#..#..]

    #  5409ae 20 0f 23 cf 0f 23 d3 8b-75 24 8b 7d 28 8b 5d 2c  .#..#..u$.}(.],

    #  5409be 0f 23 de 0f 23 f7 0f 23-fb e9 43 ff ff ff 8b 44 .#..#..#..C....D

    Now let’s dump the same data using the virtual address

    1: kd> db 8054099e

    8054099e  33 db 8b 75 18 8b 7d 1c-0f 23 fb 0f 23 c6 8b 5d  3..u..}..#..#..]

    805409ae  20 0f 23 cf 0f 23 d3 8b-75 24 8b 7d 28 8b 5d 2c   .#..#..u$.}(.],

    805409be  0f 23 de 0f 23 f7 0f 23-fb e9 43 ff ff ff 8b 44  .#..#..#..C....D

    So now you can see how I used the debugger to translate virtual addresses to physical adrresess. This concludes part two of this blog and in part three we will cover translation of x86 Non-PAE Virtual Address Translation, x64 Address Translation, and the TLB.

Page 22 of 24 (240 items) «2021222324