Earlier this week one of my co-workers asked me to look at a virtual machine he had which was crashing frequently; and by frequently I mean around 40 times since 11/25. I already knew that the stop code was 0xA (IRQL_NOT_LESS_OR_EQUAL) and found out that it also had stopped with 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) a few times. The first dump that I looked at was corrupt but this box had plenty more crashes in it, but until then I decided to reinstall the VM Additions package and see if that was an easy fix. I checked back the next day and found that it had bugchecked 3 more times since. I took a quick look at the latest dump (pool corruption), here is some of the output.
kd> kv
ChildEBP RetAddr Args to Child
87d97bac 81cd5b2a badb0d00 00000001 00000001 nt!_KiTrap0E+0x2ac (FPO: [0,0] TrapFrame @ 87d97bac) (CONV: cdecl)
87d97c68 81ce0117 000009c0 00000001 000039c0 nt!ExpAllocateBigPool+0x201 (CONV: stdcall)
87d97cc8 81db9017 00000001 000039b8 44506343 nt!ExAllocatePoolWithTag+0x112 (CONV: stdcall)
87d97d10 81db8ea0 87d97d2c 84d9deb0 81cf563c nt!PfSnBuildDumpFromTrace+0x5a (CONV: stdcall)
87d97d38 81db9332 84d9deb0 87d97d7c 81c6b602 nt!PfSnEndTrace+0xa2 (CONV: stdcall)
87d97d44 81c6b602 84d9deb0 00000000 847ed7d8 nt!PfSnEndTraceWorkerThreadRoutine+0xe (CONV: stdcall)
87d97d7c 81dafc1d 84d9deb0 87d9c680 00000000 nt!ExpWorkerThread+0xfd (CONV: stdcall)
87d97dc0 81c9a19e 81c6b505 80000001 00000000 nt!PspSystemThreadStartup+0x9d (CONV: stdcall)
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16
kd> .trap 87d97bac
ErrCode = 00000002
eax=00000003 ebx=000039c0 ecx=00000600 edx=00000001 esi=00000003 edi=9dd859c0
eip=81cd5b2a esp=87d97c20 ebp=87d97c68 iopl=0 nv up di pl nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010016
nt!ExpAllocateBigPool+0x201:
0008:81cd5b2a 895ffc mov dword ptr [edi-4],ebx ds:0023:9dd859bc=????????
kd> dd edi-4 l1
9dd859bc ????????
kd> !pool 9dd859bc
Pool page 9dd859bc region is Paged pool
*9dd82000 : large page allocation, Tag is CcPD, size is 0x39c0 bytes
Pooltag CcPD : Prefetcher trace dump, Binary : nt!ccpf
One of the things that was of interest is the pool tag as you can see in the output of !pool. Hmmm, Prefetcher. Since pool corruption is hard to catch without help so I used Driver Verifier to analyze some suspicious drivers including fileinfo.sys and fltmgr.sys. When I rebooted the machine it only took about 2 minutes for it to bugcheck and here is stack trace from the dump with driver verification turned on.
85617b58 81cd36d4 000000c4 000000c0 89e70e28 nt!KeBugCheckEx+0x1e (CONV: stdcall)
85617b70 81ec9e0d 92747008 85617b9c 81ec262b nt!VfBugCheckNoStackUsage+0x23 (FPO: [0,0,0]) (CONV: stdcall)
85617b7c 81ec262b 84aa5020 84ab0438 84aa5020 nt!VfBeforeCallDriver+0x88 (CONV: stdcall)
85617b9c 81c679d7 853cab29 89e70e28 84aa5020 nt!IovCallDriver+0x1fd (CONV: fastcall)
85617bb0 853cab29 92758b04 927e4960 853c1e00 nt!IofCallDriver+0x1b (CONV: fastcall)
85617be4 853ce827 84aa5020 84ab0438 92758b04 fltmgr!FltpQueryInformationFile+0xc3 (FPO: [Non-Fpo]) (CONV: stdcall)
85617c18 853cf910 84aa5020 927e4960 85617c44 fltmgr!FltpGetFileName+0x63 (FPO: [Non-Fpo]) (CONV: stdcall)
85617c28 853ccdb4 927e4960 00000000 927e4960 fltmgr!FltpGetOpenedFileName+0x18 (FPO: [Non-Fpo]) (CONV: stdcall)
85617c44 853cf990 927e4960 00000000 927e4960 fltmgr!FltpCallOpenedFileNameHandler+0x8a (FPO: [Non-Fpo]) (CONV: stdcall)
85617c60 853d0129 927e4960 00000000 92758b04 fltmgr!FltpGetNormalizedFileNameWorker+0x16 (FPO: [Non-Fpo]) (CONV: stdcall)
85617c78 853cd51f 927e4960 00000000 927e4960 fltmgr!FltpGetNormalizedFileName+0x19 (FPO: [Non-Fpo]) (CONV: stdcall)
85617c90 853b8f13 927e4960 84ab0438 00000000 fltmgr!FltpCreateFileNameInformation+0x81 (FPO: [Non-Fpo]) (CONV: stdcall)
85617cc0 853cda61 84f84d54 00000001 00000000 fltmgr!FltpGetFileNameInformation+0x31b (FPO: [Non-Fpo]) (CONV: stdcall)
85617cd4 80409378 84ab0438 84ab6890 927e4960 fltmgr!FltGetFileNameInformationUnsafe+0x59 (FPO: [Non-Fpo]) (CONV: stdcall)
85617d14 853d3b70 83823c40 84ab6890 00000228 fileinfo!FIStreamQueryWorker+0x76 (FPO: [Non-Fpo]) (CONV: stdcall)
85617d44 81c6b602 00000000 00000000 832c92d8 fltmgr!FltpProcessGenericWorkItem+0x32 (FPO: [Non-Fpo]) (CONV: stdcall)
85617d7c 81dafc1d 83823c40 8561c680 00000000 nt!ExpWorkerThread+0xfd (CONV: stdcall)
85617dc0 81c9a19e 81c6b505 00000001 00000000 nt!PspSystemThreadStartup+0x9d (CONV: stdcall)
Alright, now we have a more interesting call stack and the debugger says that fileinfo!FIStreamQueryWorker is the culprit. FileInfo.sys is associated with Superfetch/Prefetch which corroborates the evidence in the very first dump I looked at where you can see the pool allocated with the CcPD tag. Let's check the dates on fltmgr and fileinfo to see if they're old.
kd> lmvm fileinfo
start end module name
80402000 80412000 fileinfo (private pdb symbols) c:\symbols\fileinfo.pdb\B768A0AE91484C23A64BA91BEF9C57AD1\fileinfo.pdb
Loaded symbol image file: fileinfo.sys
Image path: \SystemRoot\system32\drivers\fileinfo.sys
Image name: fileinfo.sys
Timestamp: Thu Nov 02 01:36:47 2006 (4549AE1F)
CheckSum: 0001C10F
ImageSize: 00010000
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
kd> lmvm fltmgr
853b1000 853e2000 fltmgr (private pdb symbols) c:\symbols\fltMgr.pdb\FA42D1F108C1425599238F576CBDDE831\fltMgr.pdb
Loaded symbol image file: fltmgr.sys
Image path: \SystemRoot\system32\drivers\fltmgr.sys
Image name: fltmgr.sys
Timestamp: Thu Nov 02 01:30:58 2006 (4549ACC2)
CheckSum: 00035410
ImageSize: 00031000
After checking another x86 Vista box that was hanging around these file dates should be 1/20/2008 instead of 11/2/2006. Doh, this machine didn't have SP1! I should have checked all of this at first but I assumed that the machine was up-to-date. Apparently there were some funny apps on it that we didn't want to break so SP1 had not been installed. I stopped and disabled the Superfetch service and the machine has been happy ever since.
SERVICE_NAME: sysmain
TYPE : 20 WIN32_SHARE_PROCESS
START_TYPE : 4 DISABLED
ERROR_CONTROL : 0 IGNORE
BINARY_PATH_NAME : C:\Windows\system32\svchost.exe -k LocalSystemNetworkRestricted
LOAD_ORDER_GROUP :
TAG : 0
DISPLAY_NAME : Superfetch
DEPENDENCIES : rpcss
: fileinfo
SERVICE_START_NAME : LocalSystem
If you notice under dependencies you will see the system driver fileinfo.sys.
The key takeaway here is to make sure that you have all appropriate updates installed on your system or you could run into some strange behavior.