In our previous articles we discussed various techniques for identifying a pool memory leak and narrowing the scope of the leak to an individual pool tag. Knowing the leaking pool tag is often sufficient to identify the cause of the problem and find a solution. However, there may be a scenario where multiple drivers use the same pool tag (such as DDK) or when one driver uses the same tag in multiple places. In this scenario you will need more information to identify the source of the leak. In our next several articles we will present techniques to get this information.
This article will present a basic technique where we modify each pool tag to identify what code in which driver is allocating the memory that gets leaked.
This technique requires a live debug of the problematic system. There are many resources with steps for how to configure a system for a live debug. The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup.
Using the same technique as in Part 3, identify where the tag in question is used.
0: kd> !for_each_module s -a @#Base @#End "Leak"
fffff880`0496e3aa 4c 65 61 6b 3b c1 0f 42-c1 41 8d 49 fd 8b d0 ff Leak;..B.A.I....
fffff880`0496e621 4c 65 61 6b 3b c1 0f 42-c1 33 c9 8b d0 ff 15 cc Leak;..B.3......
Next, edit each instance so that it is unique. The ASCII code for the numeral 1 is 0x31, and the codes for the following numerals increase sequentially (2 is 0x32, 3 is 0x33, and so on). Using this information, overwrite the fourth byte of each tag so that the tags become Lea1, Lea2, etc.
0: kd> eb fffff880`0496e3aa+3 31
0: kd> eb fffff880`0496e621+3 32
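When the search returns more than a couple of hits, typing each edit by hand gets error prone. The following is a minimal Python sketch, not part of the debugger, that generates the eb commands from a list of addresses. The function name unique_tag_edits is hypothetical, and the sketch assumes at most nine hits so that a single digit is enough.

```python
def unique_tag_edits(addresses, start_digit=1):
    """Return (digit, command) pairs, one per address where the tag was found.

    Each command overwrites the fourth byte of the tag (address+3) with the
    ASCII code of a distinct digit, e.g. 'Leak' becomes 'Lea1', 'Lea2', ...
    """
    edits = []
    for offset, address in enumerate(addresses):
        digit = start_digit + offset
        if not 1 <= digit <= 9:
            raise ValueError("this sketch only handles the digits 1 through 9")
        byte = ord(str(digit))  # '1' -> 0x31, '2' -> 0x32, ...
        edits.append((str(digit), f"eb {address}+3 {byte:02x}"))
    return edits


# Addresses taken from the search output shown earlier.
addresses = ["fffff880`0496e3aa", "fffff880`0496e621"]
for digit, command in unique_tag_edits(addresses):
    print(f"Lea{digit}: {command}")
```

The generated commands can be pasted into the debugger one at a time, which keeps a record of which address received which tag.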
Confirm your edits resulted in the expected tags using the dc command.
0: kd> dc fffff880`0496e3aa l1
fffff880`0496e3aa 3161654c Lea1
0: kd> dc fffff880`0496e621 l1
fffff880`0496e621 3261654c Lea2
Now wait for the leak to happen and repeat the steps from Part 3 to identify which of the tags is leaked. This tells you what code allocates the memory that gets leaked. Below we can see that Lea2 is the tag being leaked.
0: kd> !poolused /t5 2
Sorting by NonPaged Pool Consumed
            NonPaged              Paged
 Tag    Allocs       Used    Allocs       Used
 Lea2      257  263168000         0          0  UNKNOWN pooltag 'Lea2', please update pooltag.txt
 nVsC      664    1531552         0          0  UNKNOWN pooltag 'nVsC', please update pooltag.txt
 netv     4369    1172224         1        144  UNKNOWN pooltag 'netv', please update pooltag.txt
 Leak        1    1024000         0          0  UNKNOWN pooltag 'Leak', please update pooltag.txt
 EtwB       94     945136         4     163840  Etw Buffer, Binary: nt!etw
 TOTAL   41296  281814544     44077   68102368
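If the debugger output is saved to a log, the comparison can be scripted. The sketch below is an illustration only: it assumes the first pair of numeric columns is nonpaged allocations and bytes and the second pair is paged, and that tags contain no spaces. The function name parse_poolused_line is hypothetical.

```python
# Rows copied from the !poolused output above (TOTAL line omitted).
POOLUSED_LINES = [
    "Lea2 257 263168000 0 0 UNKNOWN pooltag 'Lea2', please update pooltag.txt",
    "nVsC 664 1531552 0 0 UNKNOWN pooltag 'nVsC', please update pooltag.txt",
    "netv 4369 1172224 1 144 UNKNOWN pooltag 'netv', please update pooltag.txt",
    "Leak 1 1024000 0 0 UNKNOWN pooltag 'Leak', please update pooltag.txt",
    "EtwB 94 945136 4 163840 Etw Buffer, Binary: nt!etw",
]


def parse_poolused_line(line):
    """Split one row into the tag and its four numeric columns."""
    tag, np_allocs, np_bytes, p_allocs, p_bytes = line.split()[:5]
    return {
        "tag": tag,
        "nonpaged_allocs": int(np_allocs),
        "nonpaged_bytes": int(np_bytes),
        "paged_allocs": int(p_allocs),
        "paged_bytes": int(p_bytes),
    }


rows = [parse_poolused_line(line) for line in POOLUSED_LINES]

# The tag holding the most nonpaged pool is the leak candidate.
top = max(rows, key=lambda row: row["nonpaged_bytes"])
average = top["nonpaged_bytes"] // top["nonpaged_allocs"]
print(top["tag"], average)  # prints "Lea2 1024000"
```

The average allocation size for Lea2 works out to exactly 1024000 bytes, the same size as the single allocation still carrying the original Leak tag, which suggests both come from the same kind of request.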
Knowing what code allocates the leaked pool may be very valuable to a driver developer who needs to narrow the scope of the problem. Often this information is sufficient for a developer to code review the use of this memory and identify why it would be leaked.
There are times when more information is needed to determine the cause of the leak. A developer may need the call stacks of memory being allocated and freed. We will capture this information in Part 5.
In Part 4 we narrowed the source of the leaked pool memory to the specific driver which is allocating it, and we identified where in the driver this allocation was taking place. However, we did not capture contextual information such as the call stack leading up to this code. Also, we didn’t capture information about when this allocated pool is freed. In this article we will use the PoolHitTag feature to break into the debugger when a specific tag is used.
As in Part 4, a live debug must be configured to use this feature. The debugging tools have instructions in the debugger.chm help file, under Debugging Tools for Windows\Debuggers\Installation and Setup\Kernel-Mode Setup (see screenshot in part 4).
These steps are typically only effective if you are able to perform them while the leak is happening. There may be a scenario in which a developer wants to know what “normal” looks like, but most often the steps in this article are used to investigate “broken”.
PoolHitTag is a global variable in the kernel binary. When this variable is set to a pool tag, the system will break into the debugger whenever pool with this tag is allocated or freed. By default PoolHitTag is set to ffffff0f.
1: kd> dc nt!PoolHitTag l1
fffff800`016530fc ffffff0f ....
To turn on this feature, edit the PoolHitTag to the tag that is known to leak. The value 3261654c is little endian ASCII for the string ‘Lea2’. I found this value in the “Confirm your edits” step in Part 4.
1: kd> ed nt!PoolHitTag 3261654c
1: kd> dc nt!PoolHitTag l1
fffff800`016530fc 3261654c Lea2
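If you did not note the value during Part 4, it can be computed from the tag text. The following Python sketch shows the conversion in both directions; the helper names tag_to_dword and dword_to_tag are hypothetical, and the sketch assumes a four-character ASCII tag.

```python
import struct


def tag_to_dword(tag):
    """Convert a 4-character pool tag to the value expected by 'ed'."""
    if len(tag) != 4:
        raise ValueError("pool tags are exactly four characters")
    # The tag bytes sit in memory in reading order, so interpreting them
    # as a little-endian 32-bit integer reverses them in the hex value.
    return struct.unpack("<I", tag.encode("ascii"))[0]


def dword_to_tag(value):
    """Convert a value displayed by 'dc' back to the tag text."""
    return struct.pack("<I", value).decode("ascii")


print(f"{tag_to_dword('Lea2'):08x}")  # prints "3261654c"
print(dword_to_tag(0x3161654C))       # prints "Lea1"
```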
With PoolHitTag now set to the leaking tag, issue the ‘g’ command to resume the system. The debugger will automatically break in whenever the Lea2 tag is used.
1: kd> g
Break instruction exception - code 80000003 (first chance)
nt! ?? ::FNODOBFM::`string'+0x24a2a:
fffff800`014798f6 cc int 3
In the above example the debugger broke in because the ‘int 3’ instruction triggered a breakpoint. The symbols seem to indicate that we are in a function named “?? ::FNODOBFM::`string'”, but this is simply a lack of symbolic information for this optimized code. Unassembling the surrounding code shows that it is a piece of ExpAllocateBigPool, one of the functions the kernel uses to satisfy pool requests larger than 4096 bytes.
1: kd> u fffff800`014798f6
fffff800`014798f7 e9ee8e0800 jmp nt!ExpAllocateBigPool+0x13a (fffff800`015027ea)
At this point we can dump the call stack and see the full context of what is happening when this memory is allocated.
1: kd> k
Child-SP RetAddr Call Site
fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a
fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e
fffff880`04ec1860 fffff880`0496e727 myfault+0x1634
fffff880`04ec19b0 fffff800`017fca97 myfault+0x1727
fffff880`04ec1a10 fffff800`017fd2f6 nt!IopXxxControlFile+0x607
fffff880`04ec1b40 fffff800`014e0ed3 nt!NtDeviceIoControlFile+0x56
fffff880`04ec1bb0 00000000`7756138a nt!KiSystemServiceCopyEnd+0x13
00000000`000df4c8 000007fe`fd5fa249 ntdll!ZwDeviceIoControlFile+0xa
00000000`000df4d0 00000000`7740683f KERNELBASE!DeviceIoControl+0x75
00000000`000df540 00000000`ff222384 kernel32!DeviceIoControlImplementation+0x7f
00000000`000df590 00000000`00000000 NotMyfault+0x2384
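To tie the stack back to the tag edited in Part 4, it helps to find the first frame outside the kernel and compare its address with the address of the tag bytes. The Python sketch below does this for the kernel-mode portion of the stack above. It is an illustration under assumptions: the helper names are hypothetical, and each line is assumed to have the three columns Child-SP, RetAddr and Call Site.

```python
# Kernel-mode frames copied from the 'k' output above.
STACK = [
    "fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a",
    "fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e",
    "fffff880`04ec1860 fffff880`0496e727 myfault+0x1634",
    "fffff880`04ec19b0 fffff800`017fca97 myfault+0x1727",
    "fffff880`04ec1a10 fffff800`017fd2f6 nt!IopXxxControlFile+0x607",
]


def parse_address(text):
    """Convert a debugger address such as fffff880`0496e634 to an int."""
    return int(text.replace("`", ""), 16)


def first_caller_outside(stack_lines, module="nt"):
    """Return (call_site, address) for the first frame not in 'module'.

    The RetAddr on one row is the address inside the next row's call
    site, so the previous row's RetAddr is carried along.
    """
    previous_ret = None
    for line in stack_lines:
        _child_sp, ret_addr, call_site = line.split(None, 2)
        if not call_site.startswith(module + "!"):
            return call_site.strip(), previous_ret
        previous_ret = parse_address(ret_addr)
    return None, None


caller, address = first_caller_outside(STACK)
tag_address = parse_address("fffff880`0496e621")  # Lea2, from Part 4
print(caller, hex(address - tag_address))  # prints "myfault+0x1634 0x13"
```

The return address is only 0x13 bytes past the Lea2 tag bytes, which is consistent with the tag being loaded as an operand immediately before the call to ExAllocatePoolWithTag.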
Repeating the ‘g’ and ‘k’ commands multiple times will begin to give you an understanding of the various ways this code may be used. This can be automated by modifying the ‘int 3’ instruction and using a breakpoint. Note that system performance may suffer because output to the debug port is serialized.
The commands shown below use addresses specific to big pool allocations (larger than 4KB). The ‘int 3’ instruction may be located elsewhere depending on the scenario you are debugging.
To modify the operation from a debug break to a breakpoint, change the ‘int 3’ to ‘nop’. In x86 and x64 the opcode for ‘nop’ is 0x90. Conveniently, ‘int 3’ (0xcc) and ‘nop’ are both a single byte, so the edit does not disturb the surrounding instructions.
1: kd> eb fffff800`014798f6 90
Confirm that the instruction was reset properly.
1: kd> u fffff800`014798f6 l1
fffff800`014798f6 90 nop
Set a breakpoint on the ‘nop’ instruction and configure the breakpoint to automatically dump the stack and then resume execution.
1: kd> bp fffff800`014798f6 "k;g"
If you find that the pool is sometimes allocated and occasionally freed, you may need to edit the ‘int 3’ used when ExFreePool is called, and set a similar breakpoint on that address.
fffff800`0160f5b7 cc int 3
1: kd> eb fffff800`0160f5b7 90
1: kd> bp fffff800`0160f5b7 "k;g"
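With both breakpoints running "k;g", the debugger log fills with call stacks quickly. The Python sketch below groups them so that the distinct code paths stand out. It is a rough illustration: it assumes the log contains only ‘k’ output, that every stack starts with the "Child-SP RetAddr Call Site" header, and the function name count_stacks is hypothetical. The sample log simply repeats frames from the stack shown earlier.

```python
from collections import Counter

HEADER = "Child-SP RetAddr Call Site"


def count_stacks(log_text, depth=4):
    """Count how often each call stack (top 'depth' call sites) appears."""
    counts = Counter()
    current = None
    for raw_line in log_text.splitlines():
        line = " ".join(raw_line.split())  # collapse column padding
        if line == HEADER:
            if current:
                counts[tuple(current[:depth])] += 1
            current = []
        elif current is not None and line:
            parts = line.split(None, 2)
            if len(parts) == 3:
                current.append(parts[2])  # keep only the Call Site column
    if current:
        counts[tuple(current[:depth])] += 1
    return counts


# Illustrative log: the same allocation stack captured twice.
sample_log = """\
Child-SP RetAddr Call Site
fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a
fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e
fffff880`04ec1860 fffff880`0496e727 myfault+0x1634
Child-SP RetAddr Call Site
fffff880`04ec1680 fffff800`0161090e nt! ??::FNODOBFM::`string'+0x24a2a
fffff880`04ec1770 fffff880`0496e634 nt!ExAllocatePoolWithTag+0x82e
fffff880`04ec1860 fffff880`0496e727 myfault+0x1634
"""

for stack, hits in count_stacks(sample_log).most_common():
    print(hits, " <- ".join(stack))
```

Comparing the counts for allocation stacks against the counts for free stacks gives a quick indication of which path allocates without a matching free.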
Once you have sufficient data to understand the scenario where the memory is allocated and freed, use Ctrl+Break to break into the debugger, clear the breakpoints, and reset PoolHitTag. Then issue the ‘g’ command to allow the system to continue normal operation.
1: kd> bc *
1: kd> ed nt!PoolHitTag ffffff0f
The data collected with these steps should provide an indication to a developer of what memory is being leaked and when.
PoolHitTag isn’t the only option for collecting call stack information. Our final articles will cover alternative techniques for obtaining this information.