• Ntdebugging Blog

    Troubleshooting Pool Leaks Part 5 – PoolHitTag


    In Part 4 we narrowed the source of the leaked pool memory to the specific driver which is allocating it, and we identified where in the driver this allocation was taking place.  However, we did not capture contextual information such as the call stack leading up to this code.  Also, we didn’t capture information about when this allocated pool is freed.  In this article we will use the PoolHitTag feature to break into the debugger when a specific tag is used.

     

    As in Part 4, a live debug must be configured to use this feature.  The debugging tools include setup instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup (see the screenshot in Part 4).

     

    These steps are typically only effective if you are able to perform them while the leak is happening.  There may be a scenario in which a developer wants to know what “normal” looks like, but most often the steps in this article are used to investigate “broken”.

     

    PoolHitTag is a global variable in the kernel binary (nt!PoolHitTag).  When this global is set to a pool tag, the system will break into the debugger whenever pool with this tag is allocated or freed.  By default PoolHitTag is set to ffffff0f.

     

    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  ffffff0f                             ....

     

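    To turn on this feature, set PoolHitTag to the tag that is known to leak.  The value 3261654c is the little-endian ASCII encoding of the string ‘Lea2’: the bytes 4c 65 61 32 (‘L’, ‘e’, ‘a’, ‘2’) sit in memory in that order, so they read as 3261654c when displayed as a DWORD.  I found this value in the “Confirm your edits” step in Part 4.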

     

    1: kd> ed nt!PoolHitTag 3261654c

    1: kd> dc nt!PoolHitTag l1

    fffff800`016530fc  3261654c                             Lea2
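The conversion between a four-character tag and the DWORD passed to `ed` can be checked outside the debugger. The following is a minimal Python sketch; the helper names `tag_to_dword` and `dword_to_tag` are illustrative and are not part of the debugging tools.

```python
def tag_to_dword(tag: str) -> int:
    """Return the 32-bit value whose little-endian bytes spell the tag."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("pool tags are exactly four ASCII characters")
    # The first character ends up in the least significant byte.
    return int.from_bytes(raw, "little")


def dword_to_tag(value: int) -> str:
    """Inverse of tag_to_dword: recover the tag text from the DWORD."""
    return value.to_bytes(4, "little").decode("ascii", errors="replace")


print(f"{tag_to_dword('Lea2'):08x}")  # 3261654c
print(dword_to_tag(0x3261654C))       # Lea2
```

The printed value is what would follow `ed nt!PoolHitTag` for the Lea2 tag used in this article.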

     

    With PoolHitTag now set to the leaking tag, issue the ‘g’ command to resume the target.  The debugger will automatically break in whenever the Lea2 tag is used.

     

    1: kd> g

    Break instruction exception - code 80000003 (first chance)

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3

     

    In the above example the debugger broke in because the ‘int 3’ instruction triggered a breakpoint.  The symbols seem to indicate that we are in a function named “?? ::FNODOBFM::`string'”, but this name only reflects the lack of symbolic information for this optimized code.  Unassembling the surrounding code shows that it is a piece of ExpAllocateBigPool, one of the functions the kernel uses to service pool allocations larger than 4096 bytes.

     

    1: kd> u fffff800`014798f6

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 cc              int     3

    fffff800`014798f7 e9ee8e0800      jmp     nt!ExpAllocateBigPool+0x13a (fffff800`015027ea)
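The jump target the debugger prints can be reproduced by hand, which is a useful sanity check when symbols are this unhelpful. An `e9` jmp carries a signed 32-bit little-endian displacement that is relative to the address of the next instruction. The Python sketch below decodes the bytes from the output above; the function name `jmp_rel32_target` is illustrative.

```python
def jmp_rel32_target(instr_addr: int, encoding: bytes) -> int:
    """Compute the destination of a 5-byte x86/x64 'jmp rel32' (opcode E9)."""
    if len(encoding) != 5 or encoding[0] != 0xE9:
        raise ValueError("expected a 5-byte E9 rel32 jmp")
    # Displacement is a signed little-endian 32-bit value.
    rel = int.from_bytes(encoding[1:], "little", signed=True)
    # It is applied to the address of the instruction that follows the jmp.
    return (instr_addr + len(encoding) + rel) & 0xFFFFFFFFFFFFFFFF


target = jmp_rel32_target(0xFFFFF800014798F7, bytes.fromhex("e9ee8e0800"))
print(f"{target:016x}")  # fffff800015027ea
```

The result matches the address the debugger resolves to nt!ExpAllocateBigPool+0x13a.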

     

    At this point we can dump the call stack and see the full context of what is happening when this memory is allocated.

     

    1: kd> k

    Child-SP          RetAddr           Call Site

    fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a

    fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e

    fffff880`04ec1860 fffff880`0496e727 myfault+0x1634

    fffff880`04ec19b0 fffff800`017fca97 myfault+0x1727

    fffff880`04ec1a10 fffff800`017fd2f6 nt!IopXxxControlFile+0x607

    fffff880`04ec1b40 fffff800`014e0ed3 nt!NtDeviceIoControlFile+0x56

    fffff880`04ec1bb0 00000000`7756138a nt!KiSystemServiceCopyEnd+0x13

    00000000`000df4c8 000007fe`fd5fa249 ntdll!ZwDeviceIoControlFile+0xa

    00000000`000df4d0 00000000`7740683f KERNELBASE!DeviceIoControl+0x75

    00000000`000df540 00000000`ff222384 kernel32!DeviceIoControlImplementation+0x7f

    00000000`000df590 00000000`00000000 NotMyfault+0x2384

     

    Repeating the ‘g’ and ‘k’ commands multiple times will begin to give you an understanding of the various ways this code may be used.  This can be automated by modifying the ‘int 3’ instruction and using a breakpoint.  Note that system performance may suffer because output to the debug port is serialized.

     

    The commands shown below use addresses specific to big pool allocations (larger than 4KB). The ‘int 3’ instruction may be located elsewhere depending on the scenario you are debugging.

     

    To change the operation from a hard-coded debug break to a debugger breakpoint, replace the ‘int 3’ with ‘nop’.  On x86 and x64 the opcode for ‘nop’ is 90.  Conveniently, ‘int 3’ (cc) and ‘nop’ (90) are both one byte long, so the edit does not disturb the instructions that follow.

     

    1: kd> eb fffff800`014798f6 90

     

    Confirm that the instruction was reset properly.

     

    1: kd> u fffff800`014798f6 l1

    nt! ?? ::FNODOBFM::`string'+0x24a2a:

    fffff800`014798f6 90              nop

     

    Set a breakpoint on the ‘nop’ instruction and configure the breakpoint to automatically dump the stack and then resume execution.

     

    1: kd> bp fffff800`014798f6 "k;g"

    1: kd> g

     

    If you find that the pool is sometimes freed as well as allocated, you may also need to patch the ‘int 3’ that is hit when ExFreePool is called, and set a similar breakpoint on that address.

     

    Break instruction exception - code 80000003 (first chance)

    nt!ExDeferredFreePool+0xb57:

    fffff800`0160f5b7 cc              int     3

    1: kd> eb fffff800`0160f5b7 90

    1: kd> bp fffff800`0160f5b7 "k;g"

    1: kd> g
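With the "k;g" breakpoints running, the debugger window fills with stacks quickly. If the session output is saved to a text file (for example with the debugger's `.logopen` command), the stacks can be grouped offline. The following Python sketch is one possible approach and is not part of the debugging tools; it assumes every stack starts with the "Child-SP" header line shown earlier, and the function name `count_stacks` and the `depth` parameter are illustrative.

```python
import re
from collections import Counter

# A frame line: Child-SP, RetAddr, then the call site (which may contain spaces).
FRAME = re.compile(
    r"^\s*[0-9a-f]{8}`[0-9a-f]{8}\s+[0-9a-f]{8}`[0-9a-f]{8}\s+(.+?)\s*$",
    re.IGNORECASE,
)


def count_stacks(log_text: str, depth: int = 4) -> Counter:
    """Count how often each distinct call stack (top `depth` frames) appears."""
    stacks = Counter()
    current = None  # frames of the stack being read, or None before the first header
    for line in log_text.splitlines():
        if line.strip().startswith("Child-SP"):
            if current:
                stacks[tuple(current[:depth])] += 1
            current = []
            continue
        match = FRAME.match(line)
        if match and current is not None:
            current.append(match.group(1))
    if current:  # flush the final stack in the log
        stacks[tuple(current[:depth])] += 1
    return stacks
```

Calling `count_stacks(text).most_common()` on the saved log lists the most frequent callers first, which makes it easier to spot an allocation path that has no matching free path.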

     

    Once you have sufficient data to understand the scenario where the memory is allocated and freed, use Ctrl+Break to break into the debugger, clear the breakpoints, and reset PoolHitTag to its default value.  Then issue ‘g’ to allow the system to continue normal operation.  The bytes edited from ‘int 3’ to ‘nop’ stay patched in memory until the system is rebooted.

     

    1: kd> bc *

    1: kd> ed nt!PoolHitTag ffffff0f

    1: kd> g

     

    The data collected with these steps should provide an indication to a developer of what memory is being leaked and when.

     

    PoolHitTag isn’t the only option for collecting call stack information.  Our final articles will cover alternative techniques for obtaining this information.


    Troubleshooting Pool Leaks Part 4 – Debugging Multiple Users for a Tag


    In our previous articles we discussed various techniques for identifying a pool memory leak and narrowing the scope of the leak to an individual pool tag.  Knowing the leaking pool tag is often sufficient to identify the cause of the problem and find a solution.  However, there may be a scenario where multiple drivers use the same pool tag (such as DDK) or where one driver uses the same tag in multiple places.  In this scenario you will need more information to identify the source of the leak.  In our next several articles we will present techniques to get this information.

     

    This article will present a basic technique where we modify each pool tag to identify what code in which driver is allocating the memory that gets leaked.

     

    This technique requires a live debug of the problematic system.  There are many resources with steps for how to configure a system for a live debug.  The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup.

     

    [Screenshot: debugger.chm help file, Kernel-Mode Setup topic]

     

    Using the same technique as in Part 3, identify where the tag in question is used.

     

    0: kd> !for_each_module s -a @#Base @#End "Leak"

    fffff880`0496e3aa  4c 65 61 6b 3b c1 0f 42-c1 41 8d 49 fd 8b d0 ff  Leak;..B.A.I....

    fffff880`0496e621  4c 65 61 6b 3b c1 0f 42-c1 33 c9 8b d0 ff 15 cc  Leak;..B.3......

     

    Next, edit each instance so that each one is unique.  The ASCII code for the numeral 1 is 0x31, and the codes for the following numerals increase sequentially (2 is 0x32, 3 is 0x33, and so on).  Using this information, edit the fourth character of each tag so that the tags become Lea1, Lea2, etc.

     

    0: kd> eb fffff880`0496e3aa+3 31

    0: kd> eb fffff880`0496e621+3 32
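When the search returns more than a couple of hits, typing the `eb` commands by hand becomes error prone. The following Python sketch generates them from a list of hit addresses. It is only an illustration of the numbering scheme described above: the function name `retag_commands` is made up, and it is limited to nine hits because it only uses the digits 1 through 9.

```python
def retag_commands(addresses, digit_offset=3):
    """Build one 'eb' command per hit, overwriting the 4th tag byte with '1', '2', ..."""
    if len(addresses) > 9:
        raise ValueError("this sketch only handles up to nine hits (digits 1-9)")
    return [
        # ord('1') is 0x31, ord('2') is 0x32, and so on.
        f"eb {addr}+{digit_offset} {ord(str(n)):x}"
        for n, addr in enumerate(addresses, start=1)
    ]


for cmd in retag_commands(["fffff880`0496e3aa", "fffff880`0496e621"]):
    print(cmd)
# eb fffff880`0496e3aa+3 31
# eb fffff880`0496e621+3 32
```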

     

    Confirm your edits resulted in the expected tags using the dc command.

     

    0: kd> dc fffff880`0496e3aa l1

    fffff880`0496e3aa  3161654c                             Lea1

    0: kd> dc fffff880`0496e621 l1

    fffff880`0496e621  3261654c                             Lea2

     

    Now wait for the leak to happen and repeat the steps from Part 3 to identify which of the tags is leaked.  This tells you what code allocates the memory that gets leaked.  Below we can see that Lea2 is the tag being leaked.

     

    0: kd> !poolused /t5 2

    ..

    Sorting by NonPaged Pool Consumed

     

                   NonPaged                  Paged

    Tag     Allocs         Used     Allocs         Used

     

    Lea2       257    263168000          0            0  UNKNOWN pooltag 'Lea2', please update pooltag.txt

    nVsC       664      1531552          0            0  UNKNOWN pooltag 'nVsC', please update pooltag.txt

    netv      4369      1172224          1          144  UNKNOWN pooltag 'netv', please update pooltag.txt

    Leak         1      1024000          0            0  UNKNOWN pooltag 'Leak', please update pooltag.txt

    EtwB        94       945136          4       163840  Etw Buffer, Binary: nt!etw

     

    TOTAL     41296    281814544      44077     68102368
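Dividing the bytes used by the number of allocations gives the average allocation size for a tag, which can be another clue about which code path is responsible. The Python sketch below does this for a single line of the output above. It assumes the tag contains no spaces and that the NonPaged columns come first, as in this output; the function name `average_nonpaged_size` is illustrative.

```python
def average_nonpaged_size(line: str):
    """Return (tag, average bytes per nonpaged allocation) for one !poolused row."""
    fields = line.split()
    tag, allocs, used = fields[0], int(fields[1]), int(fields[2])
    return tag, (used // allocs if allocs else 0)


row = "Lea2       257    263168000          0            0  UNKNOWN pooltag 'Lea2', please update pooltag.txt"
print(average_nonpaged_size(row))  # ('Lea2', 1024000)
```

Each Lea2 allocation averages 1,024,000 bytes, the same size as the single allocation still carrying the original Leak tag.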

     

    Knowing what code allocates the leaked pool may be very valuable to a driver developer who needs to narrow the scope of the problem.  Often this information is sufficient for a developer to code review the use of this memory and identify why it would be leaked.

     

    There are times when more information is needed to determine the cause of the leak.  A developer may need the call stacks of memory being allocated and freed.  We will capture this information in Part 5.
