Hi, this is Chad from the OEM team. You may remember me from such posts as “Debugging a bluescreen at home.”
Some time ago I debugged a bluescreen on a friend’s home computer, and I thought the results were interesting enough to share. My friend had an older Windows XP system that had been experiencing random crashes for a while. I had looked at a memory dump or two, and while there wasn’t enough information to pinpoint a specific cause, I noticed what appeared to be pool corruption, so I recommended he enable Driver Verifier against all third-party drivers on the system in an effort to track down the offending driver.
(You can learn more about Driver Verifier in the following Knowledge Base article: “Using Driver Verifier to identify issues with Windows drivers for advanced users.”)
With Verifier enabled, the machine crashed with a bugcheck, and I debugged the memory dump. As always, I started with the “!analyze -v” command:
1: kd> !analyze -v
* Bugcheck Analysis *
Memory was referenced after it was freed.
This cannot be protected by try-except.
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arg1: 88328eac, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: 86c6929b, if non-zero, the address which referenced memory.
Arg4: 00000000, (reserved)
READ_ADDRESS: 88328eac Special pool
86c6929b 8b423c mov eax,dword ptr [edx+3Ch]
TRAP_FRAME: f7516bf0 -- (.trap 0xfffffffff7516bf0)
ErrCode = 00000000
eax=00000000 ebx=88328e70 ecx=00000003 edx=88328e70 esi=806e6410 edi=86c31af8
eip=86c6929b esp=f7516c64 ebp=f7516ca4 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
86c6929b 8b423c mov eax,dword ptr [edx+3Ch] ds:0023:88328eac=????????
Resetting default scope
LAST_CONTROL_TRANSFER: from 8052037a to 804f9f43
f7516b70 8052037a 00000050 88328eac 00000000 nt!KeBugCheckEx+0x1b
f7516bd8 80544588 00000000 88328eac 00000000 nt!MmAccessFault+0x9a8
f7516bd8 86c6929b 00000000 88328eac 00000000 nt!KiTrap0E+0xd0
WARNING: Frame IP not in any known module. Following frames may be wrong.
f7516ca4 86c695ec 86cc71bd 86d3c318 88328e70 0x86c6929b
f7516ccc ba631459 f7516cf8 86c69605 86d3cbc8 0x86c695ec
f7516cd4 86c69605 86d3cbc8 88328e70 88328e70 sr!SrPassThrough+0x31
f7516cf8 8057f982 f7516d64 0007f964 80579e64 0x86c69605
f7516d0c 80579ec1 86d3cbc8 88328e70 86c31af8 nt!IopSynchronousServiceTail+0x70
f7516d30 8054162c 00000578 00000000 00000000 nt!NtQueryDirectoryFile+0x5d
f7516d30 7c90e514 00000578 00000000 00000000 nt!KiFastCallEntry+0xfc
0007f9ac 00000000 00000000 00000000 00000000 0x7c90e514
8057f982 807d1c00 cmp byte ptr [ebp+1Ch],0
A good way to get more information about a particular bugcheck code is to search for it in the Windows Debugger help file. Under the entry “Bug Check 0xD5: DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL” we learn that this particular bugcheck occurs when “The Driver Verifier Special Pool option has caught the driver accessing memory which was earlier freed.”
So, as a result of having previously enabled Driver Verifier, we have some memory allocations coming out of Special Pool. (Incidentally, you can use the “!verifier” command in the debugger to get a list of which drivers are being verified and various information about them.) Accesses to Special Pool memory undergo additional verification checking, and in this case, the verifier has thrown a bugcheck because the memory in question is free.
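As a quick sanity check, the address in Arg1 can be reproduced from the trap frame: the faulting instruction is `mov eax,dword ptr [edx+3Ch]` and edx holds 88328e70. Here is a minimal Python sketch of that arithmetic. The helper names are my own and purely illustrative; only the numeric values come from the debugger output above.

```python
# Values copied from the !analyze -v output above.
EDX = 0x88328E70           # edx in the trap frame
DISPLACEMENT = 0x3C        # from "mov eax,dword ptr [edx+3Ch]"
ARG1 = 0x88328EAC          # Arg1: memory referenced
ARG2 = 0                   # Arg2: 0 = read, 1 = write


def effective_address(base: int, displacement: int) -> int:
    """Return base + displacement, wrapped to 32 bits as on x86."""
    return (base + displacement) & 0xFFFFFFFF


def access_kind(arg2: int) -> str:
    """Translate Arg2 of the bugcheck into a readable access type."""
    return "write" if arg2 else "read"


addr = effective_address(EDX, DISPLACEMENT)
print(f"{access_kind(ARG2)} of {addr:08x}, matches Arg1: {addr == ARG1}")
```

So the bugcheck parameters and the trap frame agree: the code read a field at offset 0x3C from a structure at 88328e70, and that structure lives in freed Special Pool.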
Using !pte against the address in question (88328eac) shows that, in fact, it’s not a valid virtual address at all:
1: kd> !pte 88328eac
PDE at C0602208 PTE at C0441940
contains 000000000676C963 contains FFFFFFFF00000000
pfn 676c -G-DA--KWEV not valid
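For the curious, the PDE and PTE addresses that !pte prints can be derived by hand. The sketch below assumes the 32-bit PAE layout, which the 64-bit "contains" values suggest: 8-byte entries, PTEs mapped starting at C0000000 and PDEs at C0600000. Those base constants and the function names are assumptions for illustration, not something taken from the dump itself.

```python
PAGE_SHIFT = 12            # 4 KB pages
PDE_SHIFT = 21             # each PAE page-directory entry maps 2 MB
ENTRY_SIZE = 8             # PAE entries are 64 bits wide
PTE_BASE = 0xC0000000      # assumed base of the PTE mapping (x86 PAE)
PDE_BASE = 0xC0600000      # assumed base of the PDE mapping (x86 PAE)
VALID_BIT = 0x1            # bit 0 of an entry: page is present
PFN_MASK = 0x000FFFFFFFFFF000


def pte_address(va: int) -> int:
    """Virtual address of the PTE that maps va."""
    return PTE_BASE + (va >> PAGE_SHIFT) * ENTRY_SIZE


def pde_address(va: int) -> int:
    """Virtual address of the PDE that maps va."""
    return PDE_BASE + (va >> PDE_SHIFT) * ENTRY_SIZE


def is_valid(entry: int) -> bool:
    """True if the hardware valid (present) bit is set."""
    return bool(entry & VALID_BIT)


def pfn(entry: int) -> int:
    """Page frame number stored in a valid entry."""
    return (entry & PFN_MASK) >> PAGE_SHIFT


va = 0x88328EAC
print(f"PDE at {pde_address(va):08X}  PTE at {pte_address(va):08X}")
print("PDE valid:", is_valid(0x000000000676C963))
print("PTE valid:", is_valid(0xFFFFFFFF00000000))
```

With these assumptions the computed addresses line up with the !pte output, and the valid bit is set in the PDE but clear in the PTE, which is why the debugger reports "not valid".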
So, some code tried to read from a memory location that was completely invalid. This isn’t altogether uncommon, but there is something more unusual about this crash: If we look at the call stack leading up to the crash, the debugger isn’t even displaying a module name for the function that did the bad memory access! Let’s use the .trap command (helpfully supplied in the !analyze output above) to look at the instruction that actually failed, and dump the stack again.
1: kd> .trap 0xfffffffff7516bf0
1: kd> kb
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
So, we were doing some file I/O (NtQueryDirectoryFile) and somehow ended up running some code which is loaded in memory around 0x86c6929b. But the debugger isn’t even able to match this code up with a module name. Why not? Well, because there’s nothing in the kernel’s loaded module list that matches up with this address. (You can dump the loaded module list with the “lm” command.)
This most likely means that this module was not loaded into memory using the standard Win32 APIs, since these APIs would always add the module to the loaded list. Alternatively, the loaded module list may have been corrupted or tampered with in some way.
Let’s do a “!address” on the location of this function to see if we can tell anything more:
1: kd> !address 0x86c6929b
86c68000 - 00037000
Now things are looking really strange indeed – this code resides in nonpaged pool memory. We do not ordinarily execute code out of pool memory. Worse, it’s not even allocated pool:
1: kd> !pool 0x86c6929b
Pool page 86c6929b region is Nonpaged pool
86c69000 is not a valid small pool allocation, checking large pool...
unable to get pool big page table - either wrong symbols or pool tagging is disabled
86c69000 is freed (or corrupt) pool
Bad previous allocation size @86c69000, last size was 0
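It is worth confirming that all three unnamed frames in the stack belong to the same chunk of memory. The sketch below assumes the two numbers printed by !address are the region's base and length; the function names are made up for this example.

```python
REGION_BASE = 0x86C68000   # first number in the !address output
REGION_SIZE = 0x00037000   # second number, assumed to be the length
PAGE_SIZE = 0x1000         # 4 KB pages on x86

# Addresses the debugger could not match to a module in the stack trace.
UNNAMED_FRAMES = [0x86C6929B, 0x86C695EC, 0x86C69605]


def in_region(addr: int, base: int = REGION_BASE, size: int = REGION_SIZE) -> bool:
    """True if addr lies inside [base, base + size)."""
    return base <= addr < base + size


def page_base(addr: int) -> int:
    """Round addr down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


for addr in UNNAMED_FRAMES:
    print(f"{addr:08x}: in region={in_region(addr)}, "
          f"page={page_base(addr):08x}, "
          f"offset from region base={addr - REGION_BASE:#x}")
```

All three addresses sit on the page at 86c69000 (the one !pool complained about), one page past the start of the region at 86c68000.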
This all looked pretty strange, so I started dumping memory to get a look at the code in question. Turns out there is a module header at the start of the previous page, at 0x86c68000. You can always identify these by the “MZ” string at the beginning of the header. (Fun trivia fact: These are the initials of Mark Zbikowski, the Microsoft developer who originally designed the .exe file format, way back in the early days of MS-DOS.)
1: kd> dc 0x86c68000
86c68000 00905a4d 00000003 00000004 0000ffff MZ..............
86c68010 000000b8 00000000 00000040 00000000 ........@.......
86c68020 00000000 00000000 00000000 00000000 ................
86c68030 00000000 00000000 00000000 00000248 ............H...
86c68040 0eba1f0e cd09b400 4c01b821 685421cd ........!..L.!Th
86c68050 70207369 72676f72 63206d61 6f6e6e61 is program canno
86c68060 65622074 6e757220 206e6920 20534f44 t be run in DOS
86c68070 65646f6d 0a0d0d2e 00000024 00000000 mode....$.......
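To make the dump a bit less opaque, here is a small Python sketch that rebuilds the raw bytes from the dwords shown above and reads two fields of the DOS header: the "MZ" signature at offset 0 and e_lfanew at offset 0x3C, which gives the offset of the PE header. The function names are illustrative only.

```python
import struct

# The 32 dwords from the "dc 0x86c68000" output, in order.
DWORDS = [
    0x00905A4D, 0x00000003, 0x00000004, 0x0000FFFF,
    0x000000B8, 0x00000000, 0x00000040, 0x00000000,
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0x00000000, 0x00000000, 0x00000248,
    0x0EBA1F0E, 0xCD09B400, 0x4C01B821, 0x685421CD,
    0x70207369, 0x72676F72, 0x63206D61, 0x6F6E6E61,
    0x65622074, 0x6E757220, 0x206E6920, 0x20534F44,
    0x65646F6D, 0x0A0D0D2E, 0x00000024, 0x00000000,
]


def dwords_to_bytes(dwords):
    """Convert little-endian dwords back into the bytes they came from."""
    return struct.pack(f"<{len(dwords)}I", *dwords)


def parse_dos_header(image: bytes) -> int:
    """Check the MZ signature and return e_lfanew (offset of the PE header)."""
    if image[:2] != b"MZ":
        raise ValueError("no MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    return e_lfanew


image = dwords_to_bytes(DWORDS)
e_lfanew = parse_dos_header(image)
print(f"signature={image[:2]!r}, e_lfanew={e_lfanew:#x}")
print(f"PE header expected at {0x86C68000 + e_lfanew:08x}")
```

The value 0x248 is the same dword you can see at 86c6803c in the dump, so the PE header for this image should start at 86c68248.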
If you have an executable file in memory, you can dump the header using the “!dh” command. The resulting output is pretty long, so I’ve trimmed it for this post. One thing really stood out: the debug directory entry on the last line of the output below.
1: kd> !dh 0x86c68000
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
14C machine (i386)
5 number of sections
4A53A574 time date stamp Tue Jul 07 12:43:48 2009
0 file pointer to symbol table
0 number of symbols
E0 size of optional header
32 bit word machine
SECTION HEADER #2
106 virtual size
1980 virtual address
180 size of raw data
1980 file pointer to raw data
0 file pointer to relocation table
0 file pointer to line numbers
0 number of relocations
0 number of line numbers
(no align specified)
Type Size Address Pointer
cv 5a 1a2c 1a2c Format: RSDS, guid, 1, c:\programs\revolution6\innerdrv\objfre_w2k_x86\i386\InnerDrv.pdb
Aha! When you build a module, the compiler puts information in the header to help debuggers find the appropriate symbol file. In this case, we can see that this program seems to be called “InnerDrv.” Now we have something to go on!
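If you ever need to pull this string out of a raw image yourself, the CodeView "RSDS" record is simple to parse: a 4-byte signature, a 16-byte GUID, a 4-byte age, then the NUL-terminated PDB path. The sketch below builds a stand-in record and parses it. The all-zero GUID is a placeholder (the real one is not shown in this post), the age of 1 is my reading of the "1" in the !dh line, and the function name is hypothetical.

```python
import ntpath
import struct
import uuid


def parse_rsds(record: bytes):
    """Split a CodeView RSDS record into (guid, age, pdb_path)."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS record")
    guid = uuid.UUID(bytes_le=record[4:20])
    (age,) = struct.unpack_from("<I", record, 20)
    # The path runs from offset 24 up to the first NUL byte.
    path = record[24:].split(b"\0", 1)[0].decode("utf-8", errors="replace")
    return guid, age, path


PDB_PATH = r"c:\programs\revolution6\innerdrv\objfre_w2k_x86\i386\InnerDrv.pdb"
PLACEHOLDER_GUID = uuid.UUID(int=0)   # placeholder: real GUID not shown in the post

record = (b"RSDS" + PLACEHOLDER_GUID.bytes_le + struct.pack("<I", 1)
          + PDB_PATH.encode("utf-8") + b"\0")

guid, age, path = parse_rsds(record)
module_name = ntpath.splitext(ntpath.basename(path))[0]
print(f"record size={len(record):#x}, age={age}, module={module_name}")
```

A record built this way comes to 0x5a bytes, the same as the Size column in the !dh output, which is a nice hint that the layout assumed here fits what the debugger parsed.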
A quick Bing search for “InnerDrv.pdb” shows that this particular code is part of a rootkit known as “Pushdo”. My friend’s system had been infected by this malware. In the end, my friend opted to play it safe and simply reformat and reinstall this machine.