• Ntdebugging Blog

    Troubleshooting Pool Leaks Part 6 – Driver Verifier


    In part 5 we used poolhittag to get call stacks of pool being allocated and freed.  This information is often essential to identifying the cause of a memory leak; however it is not always feasible to configure a live kernel debug to obtain this information.  Fortunately there are alternative methods to get such call stacks.


    Driver verifier has an option to enable pool tracking for a specific driver, or for multiple drivers.  This functionality was first introduced in Windows Vista and Windows Server 2008.  This information is also captured when driver verifier is used to enable special pool, however for the purposes of this article we will focus on using pool tracking.


    The data stored by driver verifier requires a debugger to view.  Any method of debugging can be used for this.  You can use a live kernel debug as we described in part 4, you can get a memory dump (kernel or complete, a small dump is insufficient), or you can use livekd.


    If you have used the steps from Part 1, Part 2, or Part 3, you likely have an idea which drivers are likely involved in creating the pool leak.  In this example we are generating the leak using notmyfault, the same tool we have been using in prior examples.  As seen in Part 2, the relevant driver is myfault.sys.


    Although driver verifier has GUI, the easiest way to enable this functionality is with the below command from an elevated command prompt:


    Verifier /flags 8 /driver myfault.sys


    The above command will provide the following output, allowing you to confirm that the expected settings are enabled:


    New verifier settings:


    Special pool: Disabled

    Pool tracking: Enabled

    Force IRQL checking: Disabled

    I/O verification: Disabled

    Deadlock detection: Disabled

    DMA checking: Disabled

    Security checks: Disabled

    Force pending I/O requests: Disabled

    Low resources simulation: Disabled

    IRP Logging: Disabled

    Miscellaneous checks: Disabled


    Verified drivers:





    You must restart this computer for the changes to take effect.


    After rebooting the system, reproduce the memory leak and attach a debugger or generate a memory dump after the memory has been leaked.


    Break in with the debugger (Ctrl+Break or Ctrl+C) or load the dump in windbg (File – Open Crash Dump).


    Set the symbol path and reload symbols.


    1: kd> .symfix c:\symbols

    1: kd> .reload

    Loading Kernel Symbols



    The !verifier command has various options to view information about driver verifier.  To view the pool allocations which have been tracked by verifier for notmyfault.sys, use the following:


    0: kd> !verifier 3 myfault.sys


    Verify Level 8 ... enabled options are:

          All pool allocations checked on unload


    Summary of All Verifier Statistics


    RaiseIrqls                             0x0

    AcquireSpinLocks                       0x0

    Synch Executions                       0x0

    Trims                                  0x0


    Pool Allocations Attempted             0xb

    Pool Allocations Succeeded             0xb

    Pool Allocations Succeeded SpecialPool 0xa

    Pool Allocations With NO TAG           0x1

    Pool Allocations Failed                0x0

    Resource Allocations Failed Deliberately   0x0


    Current paged pool allocations         0x0 for 00000000 bytes

    Peak paged pool allocations            0x1 for 00000080 bytes

    Current nonpaged pool allocations      0xa for 009CE000 bytes

    Peak nonpaged pool allocations         0xa for 009CE000 bytes


    Driver Verification List


    Entry     State           NonPagedPool   PagedPool   Module


    fffffa80031b5830 Loaded           009ce000       00000000    myfault.sys


    Current Pool Allocations  0000000a    00000000

    Current Pool Bytes        009ce000    00000000

    Peak Pool Allocations     0000000a    00000001

    Peak Pool Bytes           009ce000    00000080


    PoolAddress  SizeInBytes    Tag       CallersAddress

    fffffa8005400000     0x000fb000     Leak      fffff8800447d634

    fffffa80052fb000     0x000fb000     Leak      fffff8800447d634

    fffffa8005200000     0x000fb000     Leak      fffff8800447d634

    fffffa80050fb000     0x000fb000     Leak      fffff8800447d634

    fffffa8005000000     0x000fb000     Leak      fffff8800447d634

    fffffa8004efb000     0x000fb000     Leak      fffff8800447d634

    fffffa8004e00000     0x000fb000     Leak      fffff8800447d634

    fffffa8004cfb000     0x000fb000     Leak      fffff8800447d634

    fffffa8004c00000     0x000fb000     Leak      fffff8800447d634

    fffffa8004a66000     0x000fb000     Leak      fffff8800447d634


    At the bottom of the above output is the list of allocations made by notmyfault.sys.  For our purposes we are going to assume that these allocations have been leaked, as opposed to just being normal allocations that were not yet freed when the debugger was attached.


    The !verifier command has an option to view call stacks for one of the tracked allocations.  Keep in mind that the size of the database is limited and only more recent allocations will be kept in the database.


    0: kd> !verifier 80 fffffa8005400000


    Log of recent kernel pool Allocate and Free operations:


    There are up to 0x10000 entries in the log.


    Parsing 0x0000000000010000 log entries, searching for address 0xfffffa8005400000.




    Pool block fffffa8005400000, Size 00000000000fa000, Thread fffffa80044ceb60

    fffff80001927cc6 nt!VeAllocatePoolWithTagPriority+0x2b6

    fffff80001927d3d nt!VerifierExAllocatePoolEx+0x1d

    fffff8800447d634 myfault!MyfaultDeviceControl+0x358

    fffff8800447d727 myfault!MyfaultDispatch+0xb7

    fffff80001932750 nt!IovCallDriver+0xa0

    fffff800017a7a97 nt!IopXxxControlFile+0x607

    fffff800017a82f6 nt!NtDeviceIoControlFile+0x56

    fffff8000148bed3 nt!KiSystemServiceCopyEnd+0x13

    Parsed entry 0000000000010000/0000000000010000...

    Finished parsing all pool tracking information.


    The above output shows the call stack leading to the pool allocation.  This is the same information we had seen in Part 5, however we are able to obtain this information using a dump or livekd, whereas the steps from Part 5 required an invasive debug and extended system downtime.


    When you have completed troubleshooting, disable driver verifier with the following command and reboot:

    verifier /reset

  • Ntdebugging Blog

    Breaking down the "Cl" in !irp


    Hey there NTDEBUGGERS my name is Randy Monteleone and today we are going to talk about IRPs. In the past we have talked about the IRP structure in passing and showed a field here and there that can be pulled out and used to find answers to stalled IO. I was recently working on a debugger extension and found something interesting in the IRP I was looking at. I had been looking at a !irp output much like the one pictured below. I found that I was asking myself what do the "Success Error Cancel" fields mean?


    After doing some digging and working with a few of my co-workers we found the mystery to the meaning behind these words and why we see them in our output. Lets break this IRP down starting with the ">" marker that indicates the current stack frame. In the output below you see this marker is indicating that we are working on something in partmgr.


    1: kd> !irp fffffa809a1f3af0

    Irp is active with 9 stacks 5 is current (= 0xfffffa809a1f3ce0)

    Mdl=fffffa814e9c4f40: No System Buffer: Thread fffffa80d05b67e0:  Irp stack trace. 

         cmd  flg cl Device   File     Completion-Context

    [  0, 0]   0  0 00000000 00000000 00000000-00000000   


                   Args: 00000000 00000000 00000000 00000000

    [  0, 0]   0  0 00000000 00000000 00000000-00000000   


                   Args: 00000000 00000000 00000000 00000000

    [  0, 0]   0  0 00000000 00000000 00000000-00000000   


                   Args: 00000000 00000000 00000000 00000000

    [  0, 0]   0  0 00000000 00000000 00000000-00000000   


                   Args: 00000000 00000000 00000000 00000000

    >[  4,34]  1c e0 fffffa80920c0060 00000000 fffff880010301b0-00000000 Success Error Cancel <-- What are you trying to tell me?

           \Driver\Disk           partmgr!PmReadWriteCompletion

                  Args: 00001000 00000000 14625000 00000000

    [  4, 0]  1c e0 fffffa80920c0b90 00000000 fffff88001063010-fffffa80b01eec00 Success Error Cancel

           \Driver\partmgr        volmgr!VmpReadWriteCompletionRoutine

                  Args: 3993d69568d 00000000 14625000 00000000

    [  4, 0]   c e0 fffffa80b01eeab0 00000000 fffff88001d59150-fffffa80d05b2180 Success Error Cancel

           \Driver\volmgr         volsnap!VspRefCountCompletionRoutine

                  Args: 00001000 00000000 3993d69568a 00000000

    [  4, 0]   c e1 fffffa80d05b2030 00000000 fffff88001845344-fffff8800d52bb38 Success Error Cancel pending

           \Driver\volsnap        Ntfs!NtfsMasterIrpSyncCompletionRoutine

                   Args: 00001000 00000000 14615000 00000000

    [  4, 0]   0  0 fffffa80c2829030 fffffa80d04f18a0 00000000-00000000   


                   Args: 00001000 00000000 00002000 00000000


    In the example below we see the “Success Error Cancel” fields and in this case we can also see that the pending flag has been set. The pending field indicates that STATUS_PENDING was returned to the caller. This is used so that I/O completion can determine whether or not to fully complete the I/O operation requested by the packet. Drivers can do this by calling IoMarkIrpPending.


    Now look at the "cl" column as it holds the key to unlocking what "Success Error Cancel pending" really means.


    1: kd> !irp 0xfffffa80`920c2340

    Irp is active with 3 stacks 3 is current (= 0xfffffa80920c24a0)   <--- _IO_STACK_LOCATION

    Mdl=fffffa814e9c4f40: No System Buffer: Thread 00000000: Irp stack trace. 

         cmd  flg cl Device   File     Completion-Context

    [  0, 0]   0  0 00000000 00000000 00000000-00000000   


                   Args: 00000000 00000000 00000000 00000000

    [  f, 0]  1c  0 fffffa809209a060 00000000 fffff880010061a0-fffffa80920bfcc0   

           \Driver\mpio        mpio!MPIOPdoCompletion

                   Args: fffffa80920c22b0 00000000 00000000 fffffa80920bfcc0

    >[  f, 0] 1c  e1 fffffa809209a060 00000000 fffff88001a01a00-fffffa80920c2190 Success Error Cancel pending

           \Driver\mpio        CLASSPNP!TransferPktComplete

    Args: fffffa80920c22b0 00000000 00000000 fffffa80920bfcc0


    Focusing on the cl column we see that our active stack frame is working in MPIO, but what is it trying to tell us about its status and intent? Well to figure that out we need to break down the value "e1" we see listed in our cl or Control column. This field is being used as a flag where the high and low parts represent two different values. We get this value in !irp from an _IO_STACK_LOCATION Control member. This can be found by dumping the stack location displayed by !irp.


    1: kd> dt_IO_STACK_LOCATION 0xfffffa80920c24a0


      +0x000 MajorFunction    : 0xf ''

      +0x001 MinorFunction    : 0 ''

      +0x002 Flags            : 0x1c ''

      +0x003 Control          : 0xe1 ''  < -- Control flag

      +0x008 Parameters       : <unnamed-tag>

      +0x028 DeviceObject     : 0xfffffa80`9209a060 _DEVICE_OBJECT

      +0x030 FileObject       : (null)

      +0x038 CompletionRoutine : 0xfffff880`01a01a00     long  CLASSPNP!TransferPktComplete+0

      +0x040 Context          : 0xfffffa80`920c2190 Void


    So now that we know where we get this value we still have to decode what it means and how this value results in what we see in our !IRP output. Let's split our number into two, our high and low parts. In our example above we had the value e1. Taking our value apart gives us an E and a 1. The first bit of our value will indicates the last returned value as our IRP was processed. In our case this is a 1, meaning that our IRP is Pending. Thus we see the word pending at the end of current stack frame in !irp. There are three possible values for this bit, 0, 1 or 2. 0 being nothing, 1 being pending and 2 being Error Returned.


    Moving to our high bits we are left with the "e". Lets take this number and convert it to binary. We end up with 1110.  This upper number indicates which invoke types were requested for the completion routine for the driver listed on that stack frame. If we look at MSDN we see that IoSetCompletionRoutine takes in three bool values to set this flag. These options specify whether the completion routine is called if the IRP is completed with that corresponding status.


    VOID IoSetCompletionRoutine(

      _In_      PIRP Irp,

      _In_opt_  PIO_COMPLETION_ROUTINE CompletionRoutine,

      _In_opt_  PVOID Context,

      _In_      BOOLEAN InvokeOnSuccess,

      _In_      BOOLEAN InvokeOnError,

      _In_      BOOLEAN InvokeOnCancel



    Doing some source review I was able to pin together how these values translate to the !irp output. If we look back at our binary value of "e" we see that we have a set of bits that get set based on the what the driver wanted to do when an IRP was completed with one of our defined status values.


    Cancel   = 2

    Success  = 4

    Error    = 8


    Add each of these values up and the sum is 14 or e in hex. Going back to our binary "1110" we see that our first bit is set to zero or off. This would be the pending and Error Returned values that I mentioned above. The next three bits represent the corresponding Cancel, Success and Error bool values passed to the driver at the time IoSetCompletionRoutine was called for this stack location.


    Example : IoSetCompletionRoutine(pirp, pCompletionRoutine,pContex, True,False,True); would yield a value of 6.


    Remember the important thing here is not that the !irp output is trying to tell us that one of these things happened. It's telling us that this driver would like to be notified if one of those things does happen. This area also provides us with information if pending or or an error is returned.


    Well that's all I have for now, for more documentation for the _IRP and _IO_STACK_LOCATION structures please see the following links to MSDN.


    More on IRP: http://msdn.microsoft.com/en-us/library/windows/hardware/ff550694(v=vs.85).aspx

    More on _IO_STACK_LOCATION: http://msdn.microsoft.com/en-us/library/windows/hardware/ff550659(v=vs.85).aspx

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 5 – PoolHitTag


    In Part 4 we narrowed the source of the leaked pool memory to the specific driver which is allocating it, and we identified where in the driver this allocation was taking place.  However, we did not capture contextual information such as the call stack leading up to this code.  Also, we didn’t capture information about when this allocated pool is freed.  In this article we will use the PoolHitTag feature to break into the debugger when a specific tag is used.


    As in Part 4, a live debug must be configured to use this feature.  The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup (see screenshot in part 4).


    These steps are typically only effective if you are able to perform them while the leak is happening.  There may be a scenario in which a developer wants to know what “normal” looks like, but most often the steps in this article are used to investigate “broken”.


    The PoolHitTag is a global in the kernel binary.  When this global is set to a pool tag, the system will break into the debugger whenever pool with this tag is allocated or freed.  By default the PoolHitTag is set to ffffff0f.


    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  ffffff0f                             ....


    To turn on this feature, edit the PoolHitTag to the tag that is known to leak.  The value 3261654c is little endian ASCII for the string ‘Lea2’.  I found this value in the “Confirm your edits” step in Part 4. 


    1: kd> ed nt!PoolHitTag 3261654c

    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  3261654c                             Lea2


    With PoolHitTag now set to the leaking tag, issue the ‘g’ command to release debugger and it will automatically break in whenever the Lea2 tag is used.


    1: kd> g

    Break instruction exception - code 80000003 (first chance)

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3


    In the above example the debugger broke in because the ‘int 3’ instruction triggered a breakpoint.  The symbols seem to indicate that we are in a function named “?? ::FNODOBFM::`string'”, but this is simply a lack of symbolic information for this optimized code.  Unassembling the surrounding code shows that this code is a piece of ExpAllocateBigPool, one of the functions used in the kernel to allocate pool allocations larger than 4096 bytes.


    1: kd> u fffff800`014798f6

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3

    fffff800`014798f7 e9ee8e0800      jmp     nt!ExpAllocateBigPool+0x13a (fffff800`015027ea)


    At this point we can dump the call stack and see the full context of what is happening when this memory is allocated.


    1: kd> k

    Child-SP          RetAddr           Call Site

    fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a

    fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e

    fffff880`04ec1860 fffff880`0496e727 myfault+0x1634

    fffff880`04ec19b0 fffff800`017fca97 myfault+0x1727

    fffff880`04ec1a10 fffff800`017fd2f6 nt!IopXxxControlFile+0x607

    fffff880`04ec1b40 fffff800`014e0ed3 nt!NtDeviceIoControlFile+0x56

    fffff880`04ec1bb0 00000000`7756138a nt!KiSystemServiceCopyEnd+0x13

    00000000`000df4c8 000007fe`fd5fa249 ntdll!ZwDeviceIoControlFile+0xa

    00000000`000df4d0 00000000`7740683f KERNELBASE!DeviceIoControl+0x75

    00000000`000df540 00000000`ff222384 kernel32!DeviceIoControlImplementation+0x7f

    00000000`000df590 00000000`00000000 NotMyfault+0x2384


    Repeating the ‘g’ and ‘k’ commands multiple times will begin to give you an understanding of the various ways this code may be used.  This can be automated by modifying the ‘int 3’ instruction and using a breakpoint.  Note that system performance may suffer because output to the debug port is serialized.


    The commands shown below use addresses specific to big pool allocations (larger than 4KB). The ‘int 3’ instruction may be located elsewhere depending on the scenario you are debugging.


    To modify the operation from a debug break to a breakpoint, change the ‘int 3’ to ‘nop’.  In x86 and x64 the opcode for ‘nop’ is 90.  Coincidentally these instructions are the same length.


    1: kd> eb fffff800`014798f6 90


    Confirm that the instruction was reset properly.


    1: kd> u fffff800`014798f6 l1

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 90              nop


    Set a breakpoint on the ‘nop’ instruction and configure the breakpoint to automatically dump the stack and go the debugger.


    1: kd> bp fffff800`014798f6 "k;g"

    1: kd> g


    If you find that the pool is sometimes allocated and occasionally freed, you may need to edit the ‘int 3’ used when ExFreePool is called, and set a similar breakpoint on that address.


    Break instruction exception - code 80000003 (first chance)


    fffff800`0160f5b7 cc              int     3

    1: kd> eb fffff800`0160f5b7 90

    1: kd> bp fffff800`0160f5b7 "k;g"

    1: kd> g


    Once you have sufficient data to understand the scenario where the memory is allocated and freed use Ctrl+Break to break into the debugger, clear the breakpoints and reset the PoolHitTag.  Then go the debugger to allow the system to continue normal operation.


    1: kd> bc *

    1: kd> ed nt!PoolHitTag ffffff0f

    1: kd> g


    The data collected with these steps should provide an indication to a developer of what memory is being leaked and when.


    PoolHitTag isn’t the only option for collecting call stack information.  Our final articles will cover alternative techniques for obtaining this information.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 4 – Debugging Multiple Users for a Tag


    In our previous articles we discussed various techniques for identifying a pool memory leak and narrowing the scope of the leak to an individual pool tag.  Knowing the leaking pool tag is often sufficient to identify the cause of the problem and find a solution.  However, there may be a scenario where multiple drivers use the same pool tag (such as DDK) or when one driver uses the same tag in multiple places.  In this scenario you will need more information to identify the source of the leak.  In our next several articles we will present techniques to get this information.


    This article will present a basic technique where we modify each pool tag to identify what code in which driver is allocating the memory that gets leaked.


    This technique requires a live debug of the problematic system.  There are many resources with steps for how to configure a system for a live debug.  The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup.




    Using the same technique as in Part 3, identify where the tag in question is used.


    0: kd> !for_each_module s -a @#Base @#End "Leak"

    fffff880`0496e3aa  4c 65 61 6b 3b c1 0f 42-c1 41 8d 49 fd 8b d0 ff  Leak;..B.A.I....

    fffff880`0496e621  4c 65 61 6b 3b c1 0f 42-c1 33 c9 8b d0 ff 15 cc  Leak;..B.3......


    Next, edit each instance so that they are unique.  The ASCII code for numeral 1 is 0x31, and the codes for each numeral increase sequentially.  Using this information edit each tag to be Lea1, Lea2, etc.


    0: kd> eb fffff880`0496e3aa+3 31

    0: kd> eb fffff880`0496e621+3 32


    Confirm your edits resulted in the expected tags using the dc command.


    0: kd> dc fffff880`0496e3aa l1

    fffff880`0496e3aa  3161654c                             Lea1

    0: kd> dc fffff880`0496e621 l1

    fffff880`0496e621  3261654c                             Lea2


    Now wait for the leak to happen and repeat the steps from Part 3 to identify which of the tags is leaked.  This tells you what code allocates the memory that gets leaked.  Below we can see that Lea2 is the tag being leaked.


    0: kd> !poolused /t5 2


    Sorting by NonPaged Pool Consumed


                   NonPaged                  Paged

    Tag     Allocs         Used     Allocs         Used


    Lea2       257    263168000          0            0  UNKNOWN pooltag 'Lea2', please update pooltag.txt

    nVsC       664      1531552          0            0  UNKNOWN pooltag 'nVsC', please update pooltag.txt

    netv      4369      1172224          1          144  UNKNOWN pooltag 'netv', please update pooltag.txt

    Leak         1      1024000          0            0  UNKNOWN pooltag 'Leak', please update pooltag.txt

    EtwB        94       945136          4       163840  Etw Buffer, Binary: nt!etw


    TOTAL     41296    281814544      44077     68102368


    Knowing what code allocates the leaked pool may be very valuable to a driver developer who needs to narrow the scope of the problem.  Often this information is sufficient for a developer to code review the use of this memory and identify why it would be leaked.


    There are times when more information is needed to determine the cause of the leak.  A developer may need the call stacks of memory being allocated and freed.  We will capture this information in Part 5.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 3 – Debugging


    In our previous articles we discussed identifying a pool leak with perfmon, and narrowing the source of the leak with poolmon.  These tools are often preferred because they are easy to use, provide verbose information, and can be run on a system without forcing downtime.  However, it is not always possible to get perfmon and poolmon data.  If a system is experiencing poor performance you may have a business need to get the system up and running as quickly as possible without allowing time to troubleshoot.  It is also possible to completely exhaust memory through a pool leak, leaving the system in a state where tools such as perfmon and poolmon will not work.  In these scenarios it may be possible to troubleshoot the poor performance by forcing a bugcheck, gathering a memory dump, and performing a post mortem analysis.


    Although a dump is not the ideal data to troubleshoot a leak, it can be done.  I say less than ideal because a dump is a snapshot of the system memory, and does not provide the historical data which perfmon would provide.  The lack of historical data makes it difficult to differentiate between high memory usage due to load and high memory usage due to a leak.  It is up to you, as the troubleshooter, to determine if the dump is sufficient evidence of a leak.  Sometimes identifying the tag and the driver will help you identify a known issue that causes a leak, or your knowledge of the driver architecture may allow you to determine if the memory usage is normal or not.  In some scenarios you may decide to start monitoring with perfmon and collect additional data for a future occurrence.


    The first step to debug a pool leak using a dump is to load the dump in windbg, set the symbol path, and reload symbols.


    1: kd> .symfix c:\symbols

    1: kd> .reload

    Loading Kernel Symbols





    The !vm command will show memory utilization, the 1 flag will limit the verbosity of this command.  For the scenario of a pool leak, the significant values are “NonPagedPool Usage:” and “PagedPool Usage:”.  If the debugger identifies a value that is out of the normal range it will flag it, and we can see here that the debugger has flagged excessive nonpaged pool usage.  This is similar to the information we obtained in Part 1 using perfmon, but unlike perfmon we do not have trend data to indicate if this is temporary high pool usage due to load or if this is a leak.


    1: kd> !vm 1


    *** Virtual Memory Usage ***

            Physical Memory:      403854 (   1615416 Kb)

            Page File: \??\C:\pagefile.sys

              Current:   1048576 Kb  Free Space:   1015644 Kb

              Minimum:   1048576 Kb  Maximum:      4194304 Kb

            Available Pages:      106778 (    427112 Kb)

            ResAvail Pages:       225678 (    902712 Kb)

            Locked IO Pages:           0 (         0 Kb)

            Free System PTEs:   33533355 ( 134133420 Kb)

            Modified Pages:         4844 (     19376 Kb)

            Modified PF Pages:      4838 (     19352 Kb)

            NonPagedPool Usage:   155371 (    621484 Kb)

            NonPagedPool Max:     191078 (    764312 Kb)

            ********** Excessive NonPaged Pool Usage *****

            PagedPool 0 Usage:     27618 (    110472 Kb)

            PagedPool 1 Usage:      3848 (     15392 Kb)

            PagedPool 2 Usage:       299 (      1196 Kb)

            PagedPool 3 Usage:       283 (      1132 Kb)

            PagedPool 4 Usage:       344 (      1376 Kb)

            PagedPool Usage:       32392 (    129568 Kb)

            PagedPool Maximum:  33554432 ( 134217728 Kb)

            Session Commit:         7764 (     31056 Kb)

            Shared Commit:          6371 (     25484 Kb)

            Special Pool:              0 (         0 Kb)

            Shared Process:         5471 (     21884 Kb)

            PagedPool Commit:      32394 (    129576 Kb)

            Driver Commit:          2458 (      9832 Kb)

            Committed pages:      326464 (   1305856 Kb)

            Commit limit:         665998 (   2663992 Kb)


    The debugger can parse the pool tagging database and present similar information as poolmon.  The !poolused command will do this, the /t5 option will limit output to the top 5 consumers, the 2 flag will sort by nonpaged pool usage (use the 4 flag if your leak is in paged pool).


    1: kd> !poolused /t5 2


    Sorting by NonPaged Pool Consumed


                   NonPaged                  Paged

    Tag     Allocs         Used     Allocs         Used


    Leak       601    615424000          0            0    UNKNOWN pooltag 'Leak', please update pooltag.txt

    Pool         6      1717840          0            0    Pool tables, etc.

    nVsC       664      1531552          0            0    UNKNOWN pooltag 'nVsC', please update pooltag.txt

    netv      4369      1172224          1          144    UNKNOWN pooltag 'netv', please update pooltag.txt

    Thre       607       774048          0            0    Thread objects , Binary: nt!ps


    TOTAL     43424    634209952      63565    126487760


    The above output shows that the tag “Leak” is associated with almost all of the nonpaged pool usage.  This is the same information we obtained in Part 2 using poolmon.


    Now we must identify what drivers use the pool tag “Leak”. Because we have a snapshot of the system memory we can search the dump for this tag.  We can match each address to a module using the command lm a.


    1: kd> !for_each_module s -a @#Base @#End "Leak"

    fffff880`044b63aa  4c 65 61 6b 3b c1 0f 42-c1 41 8d 49 fd 8b d0 ff  Leak;..B.A.I....

    fffff880`044b6621  4c 65 61 6b 3b c1 0f 42-c1 33 c9 8b d0 ff 15 cc  Leak;..B.3......

    1: kd> lm a fffff880`044b63aa

    start             end                 module name

    fffff880`044b5000 fffff880`044bc000   myfault    (no symbols)

    1: kd> lm a fffff880`044b6621

    start             end                 module name

    fffff880`044b5000 fffff880`044bc000   myfault    (no symbols)


    The tag and driver name can be used to search the internet for known problems.  If a known issue is found a driver update may be available, and installing this update may prevent a future memory leak.


    If there are no updates available for the driver, or if this is your driver and you need to identify the cause of the leak, don’t panic.  In future articles we will show techniques for getting call stacks of pool allocations, these call stacks be used to identify under what conditions the driver leaks memory.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 2 – Poolmon


    In our previous article we discussed how to identify a pool leak using perfmon.  Although it may be interesting to know that you have a pool leak, most customers are interested in identifying the cause of the leak so that it can be corrected.  In this article we will begin the process of identifying what kernel mode driver is leaking pool, and possibly identify why.


    Often when we are collecting data for a poor performance scenario there are two pieces of data that we collect.  Perfmon log data is one, as we discussed in our previous article.  The other piece of data is poolmon logs.  The memory manager tracks pool usage according to the tag associated with the pool allocations, using a technique called pool tagging.  Poolmon gathers this data and displays it in an easy to use format.  Poolmon can also be configured to dump data to a log, and in some scenarios it is beneficial to schedule poolmon to periodically collect such logs.  There are several available techniques to schedule poolmon, however that is beyond the scope of this article.


    Poolmon has shipped with many different packages over the years; it is currently available with the Windows Driver Kit.  If you install the WDK to the default folder, poolmon will be in “C:\Program Files (x86)\Windows Kits\8.0\Tools\x64”.  Poolmon does not have dependencies on other modules in this folder; you can copy it to your other computers when you need to investigate pool usage.


    How does pool tagging work?  When a driver allocates pool it calls the ExAllocatePoolWithTag API.  This API accepts a tag - a four-letter string - that will be used to label the allocation.  It is up to a driver developer to choose this tag.  Ideally each developer will choose a tag that is unique to their driver and use a different tag for each code path which calls ExAllocatePoolWithTag.  Because each tag should be unique to each driver, if we can identify the tag whose usage corresponds with the leak we can then begin to identify the driver which is leaking the memory.  The tag may also give the driver developer clues as to why the memory is being leaked, if they use a unique tag for each code path.


    To view the pool usage associated with each tag run “poolmon -b” from a command prompt.  This will sort by the number of bytes associated with each tag.  If you are tracking pool usage over a period of time, you can log the data to a file with “poolmon -b -n poolmonlog1.txt”, replacing 1 with increasing numbers to obtain a series of logs.  Once you have a series of logs you may be able to view usage increasing for a specific tag, in a corresponding fashion to what you see in perfmon.


    When analyzing poolmon the important data is at the top.  Typically the tag with the largest usage in bytes is the cause of the leak.


    Poolmon -b


    In the above data we can see that the tag with the most pool usage is “Leak”.  Now that we know what tag is leaking we need to identify what driver is using this tag.  Techniques for associating a leak with a tag vary, but findstr is often effective.  Most drivers are located in c:\windows\system32\drivers, so that is a good starting point when looking for the driver.  If you don’t find a result in that folder, go up a folder and try again, repeating until you get to the root of the drive.


    C:\>findstr /s Leak *.sys



    ·∟   §£ Θ─  A╗☻   E"Θ╡☺ Hï♣╔♂  $Θª  Hï:Hc ê\@ë

      └δ_Aï ╞♣@  ë♣▓  Aï@ë¼←  δCAâ∙uAïAìIALeak;┴☼B3ï §


    In the above output we can see that “Leak” is used in myfault.sys.  If we hadn’t forced this leak with notmyfault, the next step in troubleshooting would be an internet search for the tag and the driver.  Often such a search will allow you to identify a specific fault within the driver and a solution.


    Don’t panic if findstr doesn’t find your tag, or if you find the tag but it is not unique to one driver.  In future articles we will cover additional techniques for associating drivers with tags, and for associating allocations with specific code within a driver.

  • Ntdebugging Blog

    Troubleshooting Pool Leaks Part 1 – Perfmon


    Over the years the NTDebugging Blog has published several articles about pool memory and pool leaks.  However, we haven’t taken a comprehensive approach to understanding and troubleshooting pool memory usage.  This upcoming series of articles is going to tackle pool leaks from the basics to advanced troubleshooting techniques.  Most of the examples will use the Windows Sysinternals tool NotMyFault to generate a leak so our readers will be able to reproduce the described behavior and repeat the troubleshooting steps.


    We need to start by understanding what pool is and how it is used.  Pool is virtual memory that is used by drivers in much the same way user mode applications use heap.  A driver developer calls ExAllocatePoolWithTag to get a block of memory that can be used in much the same way a user mode programmer would use memory returned by HeapAlloc or malloc.  The memory manager, which is responsible for managing pool, is able to efficiently handle small allocations by taking a page of virtual memory (typically 4KB) and breaking it up into smaller blocks.  The memory manager is also able to allocate pool in blocks larger than a page.  There are two types of pool a developer can request from ExAllocatePoolWithTag, paged pool and nonpaged pool.  As the names suggest one type of pool memory can be paged out, and the other cannot be paged.  Paged pool is used for most allocations, nonpagedpool is used for memory that will be written or read at an IRQL of DISPATCH_LEVEL or above.


    Pool leaks happen when a driver calls ExAllocatePoolWithTag but never calls the corresponding ExFreePool or ExFreePoolWithTag routine.  A leak is different than just high memory utilization, which may happen in normal conditions as load increases.  For example, the srv.sys driver creates work items for incoming requests, and when there is a large amount of SMB traffic to a server the pool usage from srv.sys may increase to handle this traffic.  Typically the differentiation between a leak and high memory usage due to load is that a leak never decreases.  Memory usage that is load related should decrease when the load is reduced.  Monitoring is required to differentiate between these two scenarios.  Performance Monitor (aka perfmon) is typically the most effective tool to begin such an investigation.


    The symptom of a pool leak is often poor system performance when the system runs out of pool, or on 64-bit systems the pool may begin to consume a large amount of the available memory.  This symptom makes perfmon an ideal tool to begin troubleshooting as it can be used to identify a wide variety of potential causes of poor performance.  Perfmon is most useful when it is started before a system enters a state of poor performance so that trend data can be analyzed leading up to the problem.


    You can use the below commands from an elevated command prompt to collect perfmon data from such a scenario.


    First create the data collector.  This command collects data from a variety of counters at a 5 minute interval and is designed to be run for several hours prior to and during a the time a system experiences poor performance (shorter intervals can be used for leaks that happen faster than several hours). We often recommend collecting these counters to perform general performance troubleshooting because we usually don’t know that there is a memory leak until after this data is collected and analyzed.

    Logman.exe create counter PerfLog-Long -o "c:\perflogs\\%computername%_PerfLog-Long.blg" -f bincirc -v mmddhhmm -max 300 -c "\LogicalDisk(*)\*" "\Memory\*" "\Cache\*" "\Network Interface(*)\*" "\Paging File(*)\*" "\PhysicalDisk(*)\*" "\Processor(*)\*" "\Processor Information(*)\*" "\Process(*)\*" "\Redirector\*" "\Server\*" "\System\*" "\Server Work Queues(*)\*" "\Terminal Services\*" –si 00:05:00


    Then start collecting data:

    Logman.exe start PerfLog-Long


    When the performance problem is being experienced, stop collecting data:

    Logman.exe stop PerfLog-Long


    After you have collected the data, open the .blg file in the Performance Monitor MMC snap-in.  Browse to the Memory object, and add the counters “Pool Nonpaged Bytes” and “Pool Paged Bytes”.  This should leave you with a view similar to the below screenshot.




    The steadily increasing line in the above screenshot, without a substantial decrease in the line, is an indicator that nonpaged pool memory is being leaked.  If we look at the maximum count we see that nonpaged pool has consumed 540MB.  The significance of this value depends on the amount of RAM in the system.  In this instance the system has 1GB of RAM so nonpaged pool is consuming 54% of the available memory.  We can now conclude that the cause of the performance problem is a nonpaged pool memory leak, which is consuming a large amount of RAM and preventing other components from using this RAM.


    Next we need to start investigating which driver has allocated the most pool.  We will begin that in part 2.

  • Ntdebugging Blog

    How To Deadlock Yourself (Don’t Do This)


    Some APIs should come with a warning in big red letters saying “DANGER!”, or perhaps more subtly “PROCEED WITH CAUTION”.  One such API is ExSetResourceOwnerPointer. Although the documentation contains an explanation of what limited activity you can do with the resource after making this call, its warning is not very strongly worded.


    You may see evidence of a call to ExSetResourceOwnerPointer in a debug.  A lock in !locks will have an unusual owner field, such as the one shown below:

    2: kd> !locks
    KD: Scanning for held locks...

    Resource @ 0xfffffa8011efede8 Exclusively owned
    Contention Count = 20
    NumberOfSharedWaiters = 16
    Threads: fffff88007fab7f3-02<*> *** Unknown owner, possibly FileSystem
    fffffa80169538a0-01 fffffa801ea69b60-01 fffffa8017dfd430-01 fffffa800cd76b60-01
    fffffa801512a410-01 fffffa801279b340-01 fffffa8016d079a0-01 fffffa8015452aa0-01
    fffffa801607bb60-01 fffffa8012f79b60-01 fffffa8013b4e040-01 fffffa801b03e300-01
    fffffa800cd77040-01 fffffa8013a8e040-01 fffffa800cd76040-01 fffffa80172d7490-01


    The error “*** Unknown owner, possibly FileSystem” is an indicator that the owner field of this eresource has likely been modified by ExSetResourceOwnerPointer.  Fortunately for us debuggers, programmers often point the owner field to a location on the thread stack.  You can pass an address to the !thread command and it will interpret the address as a stack value.

    2: kd> !thread fffff88007fab7f3 e
    fffff88007fab7f3 is not a thread object, interpreting as stack value...
    THREAD fffffa80169538a0 Cid 0004.0638 Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (WrResource) KernelMode Non-Alertable
    fffffa80139ea3e0 Semaphore Limit 0x7fffffff
    IRP List:
    fffffa8016fd1010: (0006,0310) Flags: 00000884 Mdl: 00000000
    Not impersonating
    DeviceMap fffff8a000008aa0
    Owning Process fffffa800cd6a5f0 Image: System
    Attached Process N/A Image: N/A
    Wait Start TickCount 27606952 Ticks: 141 (0:00:00:02.199)
    Context Switch Count 90787
    UserTime 00:00:00.000
    KernelTime 00:00:02.496
    Win32 Start Address nt!ExpWorkerThread (0xfffff80002293a50)
    Stack Init fffff88007fabdb0 Current fffff88007fab3a0
    Base fffff88007fac000 Limit fffff88007fa6000 Call 0
    Priority 14 BasePriority 13 UnusualBoost 1 ForegroundBoost 0 IoPriority 2 PagePriority 5
    Child-SP RetAddr Call Site
    fffff880`07fab3e0 fffff800`0228da52 nt!KiSwapContext+0x7a
    fffff880`07fab520 fffff800`0228fbaf nt!KiCommitThreadWait+0x1d2
    fffff880`07fab5b0 fffff800`0224ec9e nt!KeWaitForSingleObject+0x19f
    fffff880`07fab650 fffff800`022ad98c nt!ExpWaitForResource+0xae
    fffff880`07fab6c0 fffff880`0140fc10 nt!ExAcquireSharedStarveExclusive+0x1bc
    fffff880`07fab720 fffff880`0140f8e2 sis!SipDereferenceCSFile+0x40
    fffff880`07fab750 fffff880`0140f608 sis!SipDereferencePerLink+0x62
    fffff880`07fab780 fffff880`014102e7 sis!SipDereferenceScb+0x184
    fffff880`07fab7c0 fffff800`025796e6 sis!SiFilterContextFreedCallback+0xaf
    fffff880`07fab7f0 fffff880`016b9bcc nt!FsRtlTeardownPerStreamContexts+0xe2
    fffff880`07fab840 fffff880`016b98d5 Ntfs!NtfsDeleteScb+0x108
    fffff880`07fab880 fffff880`0162ccb4 Ntfs!NtfsRemoveScb+0x61
    fffff880`07fab8c0 fffff880`016b72dc Ntfs!NtfsPrepareFcbForRemoval+0x50
    fffff880`07fab8f0 fffff880`01635882 Ntfs!NtfsTeardownStructures+0xdc
    fffff880`07fab970 fffff880`016ce813 Ntfs!NtfsDecrementCloseCounts+0xa2
    fffff880`07fab9b0 fffff880`016a838f Ntfs!NtfsCommonClose+0x353
    fffff880`07faba80 fffff880`016cd7ef Ntfs!NtfsFspClose+0x15f
    fffff880`07fabb50 fffff880`01635c0d Ntfs!NtfsCommonCreate+0x193f
    fffff880`07fabd30 fffff800`0227e787 Ntfs!NtfsCommonCreateCallout+0x1d
    fffff880`07fabd60 fffff800`0227e741 nt!KySwitchKernelStackCallout+0x27 (TrapFrame @ fffff880`07fabc20)
    fffff880`085fffe0 fffff800`0229620a nt!KiSwitchKernelStackContinue
    fffff880`08600000 fffff880`01635b2f nt!KeExpandKernelStackAndCalloutEx+0x29a
    fffff880`086000e0 fffff880`016d29c0 Ntfs!NtfsCommonCreateOnNewStack+0x4f
    fffff880`08600140 fffff880`013330b6 Ntfs!NtfsFsdCreate+0x1b0
    fffff880`086002f0 fffff800`0258d717 fltmgr!FltpCreate+0xa6
    fffff880`086003a0 fffff800`0258379f nt!IopParseDevice+0x5a7
    fffff880`08600530 fffff800`02588b16 nt!ObpLookupObjectName+0x32f
    fffff880`08600630 fffff800`0258f827 nt!ObOpenObjectByName+0x306
    fffff880`08600700 fffff800`02599438 nt!IopCreateFile+0x2b7
    fffff880`086007a0 fffff880`01405bcf nt!NtCreateFile+0x78
    fffff880`08600830 fffff880`01405fbf sis!SipOpenBackpointerStream+0x10b
    fffff880`086008f0 fffff880`0140657d sis!SipOpenCSFileWork+0x3bf
    fffff880`08600c70 fffff800`02293b61 sis!SipOpenCSFile+0x21
    fffff880`08600cb0 fffff800`0252ea26 nt!ExpWorkerThread+0x111
    fffff880`08600d40 fffff800`02264866 nt!PspSystemThreadStartup+0x5a
    fffff880`08600d80 00000000`00000000 nt!KxStartSystemThread+0x16


    Looking at the call stack for the above thread we can see that sis.sys is trying to acquire the eresource shared.  Ordinarily, if a thread already owns an eresource exclusive, it can obtain it shared without first releasing the exclusive ownership.  In this scenario the kernel will compare the eresource’s owner field to the current thread and if they match the thread will be allowed to take shared ownership of the eresource.  This is where the danger of ExSetResourceOwnerPointer comes into play.  If you change the owner field with ExSetResourceOwnerPointer then this check fails because the owner field doesn’t match the current thread.


    The result of this scenario is that the thread waits for the exclusive owner to release the lock so this thread can get shared access.  Unfortunately this thread is the exclusive owner, and it is the shared waiter.  The thread has deadlocked on itself.


    Even if you are careful in your handling of the resource after calling ExSetResourceOwnerPointer, there is often a risk that your driver may be re-entered in the same thread and you may end up in a scenario you didn’t initially anticipate.  This is why using this API is dangerous, and should be avoided when not absolutely necessary.


    This issue demonstrated in this article was addressed in KB2608658 (issue 3), which is available for download from the Microsoft Download Center.

  • Ntdebugging Blog

    What Did Storport Do With My I/O?


    In a previous article I showed how to track an I/O request from the filesystem, through the class driver, and to the storage driver.  In that article I concluded with "From this data we can usually assume that the request has been sent to the disk drive and we are waiting for the disk to respond" and "There may be conditions where the request is stuck in storport, or in the miniport".  In this article I want to show what happens to the I/O request when gets to storport.sys.


    You will find that the structures needed to investigate a request inside of storport can be dumped with the dt command using publicly available storport.sys symbols.  However, until now those structures have not been useful to most debuggers because we have not described how an IRP goes through storport, or how to find an IRP once it has been sent to storport.


    In the previous article I showed how to find the IRP that was sent to the storport miniport driver.  Below is the example IRP we will be using today.


    48: kd> !irp fffffa81`4135a010

    Irp is active with 5 stacks 3 is current (= 0xfffffa814135a170)

    Mdl=fffffa8232ad1100: No System Buffer: Thread 00000000:  Irpstack trace.

         cmd  flg cl Device   File     Completion-Context

    [  0, 0]   0  0 00000000 00000000 00000000-00000000


                            Args: 00000000 00000000 00000000 00000000

    [  0, 0]   0  0 00000000 00000000 00000000-00000000


                            Args: 00000000 00000000 00000000 00000000

    >[  f, 0]   0 e1 fffffa813ba0c060 00000000 fffff880011651a0-fffffa8141609cb0 Success Error Cancel pending

                   \Driver\stormini mpio!MPIOPdoCompletion

                            Args: fffffa8231a15c10 00000000 00000000 fffffa813ba0c1b0

    [  f, 0]   0 e1 fffffa80c329f400 00000000 fffff88001487f8c-fffffa81414bc9c0 Success Error Cancel pending

                   \Driver\mpio     EmcpBase!PowerPlatformIoEnqueue

                            Args: fffffa8231a15c10 00000000 00000000 fffffa8141609cb0

    [  f, 0]   0 e1 fffffa80c34ab550 00000000 fffff88001dbaa00-fffffa8231a15af0 Success Error Cancel pending

                   \Driver\EmcpBase CLASSPNP!TransferPktComplete

                            Args: fffffa8231a15c10 00000000 00000000 00000000


    Although the device object in the IRP is named \Driver\stormini, the dispatch routines for the driver all point to storport.  Storport is the port driver that handles queuing the request, stormini is the miniport.


    48: kd> !drvobj \Driver\stormini 2

    Driver object (fffffa80c19d5bf0) is for:


    DriverEntry:   fffff88001729008 stormini!StateHandler

    DriverStartIo: 00000000

    DriverUnload:  fffff8800168f28c stormini

    AddDevice:     fffff88001416dc0 storport!DriverAddDevice


    Dispatch routines:

    [00] IRP_MJ_CREATE                      fffff880014582a0        storport!RaDriverCreateIrp

    [01] IRP_MJ_CREATE_NAMED_PIPE           fffff80001a721d4        nt!IopInvalidDeviceRequest

    [02] IRP_MJ_CLOSE                       fffff880014581c0        storport!RaDriverCloseIrp

    [03] IRP_MJ_READ                        fffff80001a721d4        nt!IopInvalidDeviceRequest

    [04] IRP_MJ_WRITE                       fffff80001a721d4        nt!IopInvalidDeviceRequest

    [05] IRP_MJ_QUERY_INFORMATION           fffff80001a721d4        nt!IopInvalidDeviceRequest

    [06] IRP_MJ_SET_INFORMATION             fffff80001a721d4        nt!IopInvalidDeviceRequest

    [07] IRP_MJ_QUERY_EA                    fffff80001a721d4        nt!IopInvalidDeviceRequest

    [08] IRP_MJ_SET_EA                      fffff80001a721d4        nt!IopInvalidDeviceRequest

    [09] IRP_MJ_FLUSH_BUFFERS               fffff80001a721d4        nt!IopInvalidDeviceRequest

    [0a] IRP_MJ_QUERY_VOLUME_INFORMATION    fffff80001a721d4        nt!IopInvalidDeviceRequest

    [0b] IRP_MJ_SET_VOLUME_INFORMATION      fffff80001a721d4        nt!IopInvalidDeviceRequest

    [0c] IRP_MJ_DIRECTORY_CONTROL           fffff80001a721d4        nt!IopInvalidDeviceRequest

    [0d] IRP_MJ_FILE_SYSTEM_CONTROL         fffff80001a721d4        nt!IopInvalidDeviceRequest

    [0e] IRP_MJ_DEVICE_CONTROL              fffff88001453010        storport!RaDriverDeviceControlIrp

    [0f] IRP_MJ_INTERNAL_DEVICE_CONTROL     fffff880014016c0        storport!RaDriverScsiIrp

    [10] IRP_MJ_SHUTDOWN                    fffff80001a721d4        nt!IopInvalidDeviceRequest

    [11] IRP_MJ_LOCK_CONTROL                fffff80001a721d4        nt!IopInvalidDeviceRequest

    [12] IRP_MJ_CLEANUP                     fffff80001a721d4        nt!IopInvalidDeviceRequest

    [13] IRP_MJ_CREATE_MAILSLOT             fffff80001a721d4        nt!IopInvalidDeviceRequest

    [14] IRP_MJ_QUERY_SECURITY              fffff80001a721d4        nt!IopInvalidDeviceRequest

    [15] IRP_MJ_SET_SECURITY                fffff80001a721d4        nt!IopInvalidDeviceRequest

    [16] IRP_MJ_POWER                       fffff8800141a9d0        storport!RaDriverPowerIrp

    [17] IRP_MJ_SYSTEM_CONTROL              fffff88001458f40        storport!RaDriverSystemControlIrp

    [18] IRP_MJ_DEVICE_CHANGE               fffff80001a721d4        nt!IopInvalidDeviceRequest

    [19] IRP_MJ_QUERY_QUOTA                 fffff80001a721d4        nt!IopInvalidDeviceRequest

    [1a] IRP_MJ_SET_QUOTA                   fffff80001a721d4        nt!IopInvalidDeviceRequest

    [1b] IRP_MJ_PNP                         fffff88001459b00        storport!RaDriverPnpIrp


    To find what storport has done with the IRP we must look at the unit.  The device extension contains the information regarding the state of the unit.


    48: kd> !devobj fffffa813ba0c060

    Device object (fffffa813ba0c060) is for:

    000000aa \Driver\stormini DriverObject fffffa80c19d5bf0

    Current Irp 00000000 RefCount 0 Type 00000007 Flags 00001050

    Dacl fffff9a10049abb0 DevExt fffffa813ba0c1b0 DevObjExt fffffa813ba0c630 DevNode fffffa80c19dad90

    ExtensionFlags (0x00000800)

                                 Unknown flags 0x00000800

    AttachedDevice (Upper) fffffa80c31ef9d0 \Driver\Disk

    Device queue is not busy.

    48: kd> dt storport!_RAID_UNIT_EXTENSION fffffa813ba0c1b0

       +0x000 ObjectType       : 1 ( RaidUnitObject )

       +0x008 DeviceObject     : 0xfffffa81`3ba0c060 _DEVICE_OBJECT

       +0x010 Adapter          : 0xfffffa80`c25a61b0 _RAID_ADAPTER_EXTENSION

       +0x018 SlowLock         : 0

       +0x020 DeviceState      : 1 ( DeviceStateWorking )

       +0x028 NextUnit         : _LIST_ENTRY [ 0xfffffa80`c25941d8 - 0xfffffa80`c25a6228 ]

       +0x038 UnitTableLink    : _LIST_ENTRY [ 0xfffffa80`c25a3320 - 0xfffffa80`c25e7b48 ]

       +0x048 Address          : _STOR_SCSI_ADDRESS

       +0x050 Identity         : _STOR_SCSI_IDENTITY

       +0x070 AlignmentInfo    : _STOR_ALIGNMENT_INFO

       +0x080 Flags            : <unnamed-tag>

       +0x088 ZeroOutstandingEvent : _KEVENT

       +0x0a0 OutstandingCount : 0n30

       +0x0a4 PagingPathCount  : 0

       +0x0a8 CrashDumpPathCount : 0

       +0x0ac HiberPathCount   : 0

       +0x0b0 TagList          : _QUEUE_TAG_LIST

       +0x0d0 IoQueue          : _IO_QUEUE

       +0x140 MaxQueueDepth    : 0xff

       +0x144 Power            : _RAID_POWER_STATE

       +0x150 PendingQueue     : _STOR_EVENT_QUEUE

       +0x170 PendingTimer     : _KTIMER

       +0x1b0 PendingDpc       : _KDPC

       +0x1f0 PauseTimer       : _KTIMER

       +0x230 PauseTimerDpc    : _KDPC

       +0x270 RestartDpc       : _KDPC

       +0x2b0 CommonBufferVAs  : (null)

       +0x2b8 CommonBufferSize : 0

       +0x2bc CommonBufferBlocks : 0

       +0x2c0 UnitExtension    : 0xfffffa80`c185dda0 Void

       +0x2c8 DefaultTimeout   : 0xa

       +0x2d0 DeferredList     : <unnamed-tag>

       +0x410 ResetCount       : 0n1

       +0x418 ResetResources   : _RAID_IO_RESOURCES

       +0x438 ResetResourcesAcquired : 0n1

       +0x43c SenseInfoSize    : 0x14 ''

       +0x440 IdleCounter      : (null)

       +0x448 PowerUpRequired  : 0n0

       +0x450 HierarchicalResetWorkItem : (null)

       +0x458 HierarchicalResetWorkItemTimeoutCountdown : 0n0

       +0x45c HierarchicalResetAbortCount : 1

       +0x460 HierarchicalResetWorkItemSpinLock : 0

       +0x468 RegistryParameters : _RAID_UNIT_REGISTRY_PARAMETERS


    Two parts of the device extension are relevant to finding the I/O request, the PendingQueue and the IoQueue.  Under most conditions, when a request is sent to storport it will be sent to the miniport driver and put on the PendingQueue.  The PendingQueue has a list of outstanding requests and a timer that is decremented every second.  The timer is refreshed when the request on the head of the queue is completed.  A request is completed when the miniport calls StorPortNotification with RequestComplete.  If the timer is decremented to 0, the request has timed out and storport will do a hierarchical reset to attempt to correct the problem.


    In the below PendingQueue we can see that there is one entry in the queue, and the Timeout is set to -2.  Under most conditions the Timeout will be a value between 0 and 0x3c (0n60), indicating the number of seconds left in the timer.  A Timeout value of -2 indicates that there was a timeout and it is being handled.  A Timeout value of -1 indicates that the queue is empty.


    48: kd> dt storport!_RAID_UNIT_EXTENSION fffffa813ba0c1b0 PendingQueue.

       +0x150 PendingQueue  :

          +0x000 List          : _LIST_ENTRY [ 0xfffff880`05906040 - 0xfffff880`05906040]

          +0x010 Lock          : 0

          +0x018 Timeout       : 0xfffffffe


    Items in the queue will be in the form of an XRB, or Extended Request Block.  There is one request in the pending queue (we know there is one request because the Flink and Blink point to the same address).  The Irp for this XRB is not our IRP, so we need to go look somewhere else for our request.


    48: kd> dt storport!_EXTENDED_REQUEST_BLOCK 0xfffff880`05906040-30

       +0x000 Signature        : 0x1f2e3d4c

       +0x008 Pool             : (null)

       +0x010 OwnedMdl         : 0y0

       +0x010 RemoveFromEventQueue : 0y1

       +0x010 State            : 0y011

       +0x010 RemappedSenseInfo : 0y1

       +0x012 InitiatingProcessor : _PROCESSOR_NUMBER

       +0x018 InitiatingToken  : (null)

       +0x020 CompletedLink    : _SLIST_ENTRY

       +0x030 PendingLink      : _STOR_EVENT_QUEUE_ENTRY

       +0x048 Mdl              : (null)

       +0x050 SgList           : (null)

       +0x058 RemappedSgListMdl : (null)

       +0x060 RemappedSgList   : (null)

       +0x068 Irp              : 0xfffffa82`35142ee0 _IRP

       +0x070 Srb              : 0xfffffa81`41cf54a0 _SCSI_REQUEST_BLOCK

       +0x078 SrbData          : <unnamed-tag>

       +0x098 Adapter          : 0xfffffa80`c25a61b0 _RAID_ADAPTER_EXTENSION

       +0x0a0 Unit             : 0xfffffa81`3ba0c1b0 _RAID_UNIT_EXTENSION

       +0x0a8 ScatterGatherBuffer : [424]  ""

       +0x250 CompletionRoutine : 0xfffff880`014199d0     void  storport!RaidUnitCompleteResetRequest+0

       +0x258 u                : <unnamed-tag>

       +0x270 RequestStartTimeStamp : 0


    If the unit is in a state where the request cannot be sent to the miniport, the requests will be queued on the DeviceQueue in the IoQueue. Storport may not be able to send the request to the miniport because the queue is full, the queue is paused, or because an untagged request is outstanding and tagged requests cannot be sent until the untagged request is completed.


    In the below DeviceQueue we can see that the unit is paused (the PauseCount is 1).  This would explain why our request is not in the PendingQueue.  It should be in the DeviceQueue.


    48: kd> dt storport!_RAID_UNIT_EXTENSION fffffa813ba0c1b0 IoQueue.DeviceQueue.

       +0x0d0 IoQueue              :

          +0x010 DeviceQueue          :

             +0x000 Type                 : 0n258

             +0x002 Size                 : 0n88

             +0x008 Gateway              : 0xfffffa80`c25a6470_STOR_IO_GATEWAY

             +0x010 DeviceList           : _LIST_ENTRY [ 0xfffffa81`4232d3a8 - 0xfffffa81`4172c768 ]

             +0x020 ByPassList           : _LIST_ENTRY [ 0xfffffa81`3ba0c2b0 - 0xfffffa81`3ba0c2b0 ]

             +0x030 Lock                 : 0

             +0x038 Depth                : 0n32

             +0x03c OutstandingCount     : 0n0

             +0x040 DeviceCount          : 0n27

             +0x044 ByPassCount          : 0n0

             +0x048 ByPassPowerCount     : 0n0

             +0x04c PauseCount           : 0n1

             +0x050 BusyCount            : 0n0

             +0x054 Frozen               : 0 ''

             +0x055 Locked               : 0 ''

             +0x056 Untagged             : 0 ''

             +0x057 PowerLocked          : 0 ''


    Requests from the DeviceQueue are in the DeviceList.  We can dump the DeviceList with !list and look for our IRP.


    48: kd> !list "-t nt!_IRP.Tail.Overlay.DeviceQueueEntry.DeviceListEntry.Flink -x\"??@$extret\" 0xfffffa814232d3a8-78"

    unsigned int64 0xfffffa81`4232d330


    unsigned int64 0xfffffa81`bea59630


    unsigned int64 0xfffffa81`41dad420


    unsigned int64 0xfffffa82`32e90770


    unsigned int64 0xfffffa81`4135a010

    <snip the end of the list>


    Now that we have found our request in the DeviceQueue's DeviceList we may want to look into why the request has not been sent to the disk and put on the PendingQueue.  Earlier I mentioned that the Timeout value in the PendingQueue is -2, indicating that a request in the PendingQueue has timed out.  If we take another look at the XRB from the PendingQueue, we see that the CompletionRoutine is RaidUnitCompleteResetRequest.  This indicates that a LUN reset request was sent to the disk, and it timed out. When a LUN reset times out storport calls the miniport’s bus reset function. When the bus reset function is called the miniport is required to complete all requests issued to it.  In this instance the reset was issued, but for some reason the reset has not yet completed and so all of the outstanding requests are stuck.  For more information on storport timeouts, see our article on Event 129 Errors.

  • Ntdebugging Blog

    Hotfix to Enable Mini-Filter Performance Diagnostics With XPerf for Windows Server 2008R2


    Greetings ntdebugging community, Bob here again and today I would like to let everyone know about a new feature implemented in Windows Server 2008 R2’s kernel and filter manager binaries released in knowledge base article 2666390.


    Beginning with this update, a minifilter that is adversely affecting system performance can be identified in Windows 2008 R2. These measurements are taken within the filter manager and measure the execution time of the minifilter’s pre and post operations.  On every file access file system minifilters are given the chance to perform pre and post modifications to IO operations within the driver stack.


    To start recording the operations and to view the results, the latest version of the Windows Performance Toolkit from the Windows 8 beta ADK needs to be installed on the system. How to install the package can be found in the KB article.


    The command to start recording and to record stack traces is (note that these commands may wrap depending on your screen resolution, they are intended to be typed in one line):

    xperf -on  PROC_THREAD+LOADER+FLT_IO_INIT+FLT_IO+FLT_FASTIO+FLT_IO_FAILURE+FILENAME -stackwalk MiniFilterPreOpInit+MiniFilterPostOpInit


    The command above starts a trace with most of the buffering done in memory. The command below specifies a file used as a backing store to use less memory:

    xperf -start -f <traceFile> -on  PROC_THREAD+LOADER+FLT_IO_INIT+FLT_IO+FLT_FASTIO+FLT_IO_FAILURE+FILENAME -stackwalk MiniFilterPreOpInit+MiniFilterPostOpInit


    Note: <traceFile> is the path of the file to store the backing file (i.e. trace.etl).  It is recommended that the backing file be placed on a drive other than the one experiencing the problem, to avoid xperf operations interfering with or contributing to the scenario you are measuring.  You can add the +FILENAME switch if you want to log names of the files that the minifilter is operating on.


    You may not know if a minifilter is causing the problem until after you begin data analysis, so you can add the filter manager tracing to the Driver Delays command presented in my previous blog, Using Xperf to investigate slow I/O issues.

    xperf -on PROC_THREAD+LOADER+CSWITCH+FILENAME+FILE_IO+FILE_IO_INIT+DRIVERS+FLT_IO_INIT+FLT_IO+FLT_FASTIO+FLT_IO_FAILURE -f kernel.etl -stackwalk CSwitch+DiskReadInit+DiskWriteInit+DiskFlushInit+FileCreate+FileCleanup+FileClose+FileRead+FileWrite+MiniFilterPreOpInit+MiniFilterPostOpInit -BufferSize 1024 -MaxBuffers 1024 -MaxFile 1024 -FileMode Circular


    To stop the trace and flush data to the trace file, type the following command:

    xperf –stop –d mymergedtrace.etl


    The file specified must be a different file name than the one provided with the –f switch if that option was used when starting the trace.


    Before we begin looking at the trace, let’s talk a little about what we are tracing. We are tracing fltmgr calling into the minifilter.  The fltmgr manages minifilters.  When a file operation is done, each minifilter has a chance to process and optionally modify an operation on a file before and after the call goes into the file system.  For example, in this case we are going to see results of the time it took for the minifilter to do a “pre create” operation. That is the function that is called before the “create file” function is sent to the file system.  This tracing can also track the time it took for the minifilter to do a “post create” operation.  That is the minifilter function that gets called after the create file is sent to the file system.  So the minifilter is called before and after every file system operation. The minifilter may not have functions to process before and after, however it has the option.


    Once the problem is traced, the data can be viewed by clicking on the merged etl file and you will see a similar presentation as below.



    The Mini-Filter Delays are in the Storage display. So click on the triangle on the left side of Storage and you will see:



    To get more detail, right click on the “Mini-Filter Delays” section and select “Add To New Analysis View”and that will bring up minifilter delays detail as below.



    The Duration is the time in micro seconds of the total time delay in the minifilter. So in the example above the scanner minifilter did 4,068 requests and it took 30 seconds total. 


    We can expand the display by clicking on the triangle next to scanner.sys to get a more detailed view, I chose to investigate scanner.sys because it has the longest Duration.  I dragged the “Major Function Name (Count)” and “File Name” columns to the left of the yellow bar to get detail and used the triangles to expand the ProcessID with the longest Duration, and then I expanded the File Name with the longest Duration.   To get more usable space on your monitor, you can click the X to close the Graph Explorer (to get it back, from the Window menu choose Graph Explorer).



    On the display above it shows the scanner.sys filter was operating on the file “C:\users\Administrator\Desktop\scanner\test2.txt”   The green 6 is the total operations. So there were three cleanups and three creates.  So the display is telling us that the filter did 3 “create” operations that took close to 15 seconds and 3 cleanup operations that took 5 milliseconds on the file.


    If you prefer operating from a command line rather than a GUI, this same data can be extracted using the below command.  Note that -min specifies the time in ns, so this command will show delays longer than 100 ms.

    xperf -i mymergedtrace.etl -a minifilterdelay -min 100000


    This will give you output similar to the below information:

    [1/2]    100.0%

    [2/2]    100.0%

    There are 3 MiniFilter Delay completion events in given time range.

       Process Name,     Module Name,       Call (us),     Return (us),  Time (us), Major Func Name,                       Filename

       explorer.exe,     scanner.sys,        18649145,        23645502,    4996357,          CREATE, C:\Users\Administrator\Desktop\scanner\test2.txt

       explorer.exe,     scanner.sys,        23646355,        28644638,    4998282,          CREATE, C:\Users\Administrator\Desktop\scanner\test2.txt

       explorer.exe,     scanner.sys,        28645194,        33645158,    4999964,          CREATE, C:\Users\Administrator\Desktop\scanner\test2.txt


    It seems there is a problem in the create file handling of the scanner.sys minifilter.  For some reason the create file function in the minifiltertook 5 seconds for each operation.  The vendor of this filter would need to do additional investigation to identify why the filter has this delay.

Page 5 of 25 (241 items) «34567»