In Windows 7, we've added some techniques and tools that assist with troubleshooting hangs involving multiple processes. (If you're interested in the evolutionary history, check out Ryan's "Let There Be Hangs" series, where he describes how hang reporting has changed from Windows XP on.) With these improvements come some new concepts, terms, and techniques, which I will attempt to convey so that, after reading this, you can fire up the debugger and understand a bit more about what’s going on.
In XPROC, the “X” stands for cross and the “PROC” stands for process. In other words, we’re talking about a cross-process dump. And what is that exactly? Well, when an app is hung due to cross-process communication, we capture a dump of the hung process and, when possible, the process determined to be at the end of the “wait chain” in the inter-process communication. We include both dumps, as well as other analysis-related information, in the cab submitted with the error report. XPROC dumps were introduced in Vista, but there were cases where the state of the hang would be lost before the heap dump was captured.
At its core, Process Reflection is used to make a clone of an existing process. This clone, also referred to as the reflected process, contains a single thread and a copy of the original process’ address space. Mark Russinovich talks more about it here (31:00).
In the case of hang reporting, when a chain of processes is involved in the hang, we use Process Reflection to create a clone of the process that the GetThreadWaitChain API identifies as being at the end of the wait chain. A heap dump is taken of the clone instead of the original. Cloning takes far less time than generating a heap dump file, so with this technique we have a better chance of capturing the hung state than if we were to just create a dump of the original process. It’s also worth pointing out that the single thread in the clone does not execute during hang reporting; it is simply parked and left idle. Here’s what its stack looks like today:
ChildEBP RetAddr
0028fdbc 77a97b8c ntdll!KiFastSystemCallRet
0028fdc0 77af02ef ntdll!ZwWaitForSingleObject+0xc
0028fe38 75ef0239 ntdll!RtlpProcessReflectionStartup+0x1bf
0028fe44 77ab1ec2 kernel32!BaseThreadInitThunk+0xe
0028fe84 77ab1e95 ntdll!__RtlUserThreadStart+0x70
0028fe9c 00000000 ntdll!_RtlUserThreadStart+0x1b
When it comes to debugging analysis, some things aren’t readily apparent with the reflected process. For instance, what about the original threads? Does the handle data conflict with that of the original process? After all, the reflected process is a process in its own right. The short answer is that the debugger handles all of this for you. As they say, “the devil is in the details,” so we’ll get into how the debugger does this shortly. But first, it’s important to understand the files contained in an AppHangXProcB1 bucket’s cab.
So to recap: A hang has occurred. We get a heap dump of the hung process. Using GetThreadWaitChain(), we discover the process at the end of the wait chain and clone it using Process Reflection. Next, we grab a heap dump of the clone. But we also have to deal with the fact that the thread and handle data from the original aren’t copied to the clone. So, in addition to this heap dump, we also capture a mini dump of the original process that has the thread and handle data. The debugger knows how to “stitch” both files together into a single, coherent target.
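To make the recap concrete, here is a toy model in Python of the decision it describes. Everything in it is hypothetical: `WaitNode`, `end_of_chain_process`, and `dumps_to_collect` are illustrative names, the chain is a hand-built list rather than real GetThreadWaitChain output, and an actual wait chain carries more information than this sketch models. It is not how hang reporting is implemented, only a way to see which dumps the recap says get collected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WaitNode:
    """Hypothetical, simplified stand-in for one thread in a wait chain."""
    process_id: int
    thread_id: int


def end_of_chain_process(chain: List[WaitNode]) -> Optional[int]:
    """Return the process id at the end of the chain, but only if it
    differs from the hung process (i.e. the hang crosses processes)."""
    if not chain:
        return None
    hung_pid = chain[0].process_id
    last_pid = chain[-1].process_id
    return last_pid if last_pid != hung_pid else None


def dumps_to_collect(chain: List[WaitNode]) -> List[str]:
    """List the dumps described in the recap for a given wait chain."""
    if not chain:
        return []
    dumps = [f"heap dump of hung process {chain[0].process_id}"]
    target = end_of_chain_process(chain)
    if target is not None:
        # The clone is dumped instead of the original; the original only
        # contributes a mini dump with its thread and handle data.
        dumps.append(f"heap dump of reflected clone of process {target}")
        dumps.append(f"mini dump (thread and handle data) of process {target}")
    return dumps


if __name__ == "__main__":
    # Assumed example: process 100 is hung, waiting on a thread in process 200.
    chain = [WaitNode(100, 1), WaitNode(100, 2), WaitNode(200, 7)]
    for line in dumps_to_collect(chain):
        print(line)
```

When the chain never leaves the hung process, the sketch returns only the first dump, which mirrors the "when possible" qualifier above.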
Let’s take a look at an example cab:
File name                                File size  Date        Time      Attrs
--------------------------------------  ----------  ----------  --------  -----
iexplore.exe.xml                              5496  2010/07/14  16:25:22  ----
iexplore.exe.6e79.wxhu.dmp.dbgcfg.ini          224  2010/07/14  16:25:22  ----
WER57FB.tmp.ref                            1596362  2010/07/14  16:25:22  ----
iexplore.exe.6e79.wxhu.dmp               150380775  2010/07/14  16:25:22  ----
memory.hdmp                               40343241  2010/07/14  16:26:18  ----
WERInternalMetadata.xml                       3226  2010/07/14  16:26:30  ----
AppCompat.txt                                47354  2010/07/14  16:26:30  ----
Look at the contents of the dbgcfg.ini file and you'll see something like this:
You may be able to discern what’s going on here. The *.tmp.ref file is the default target, which is the original process mini dump that I described earlier; the one with the thread and handle data. Using the debugger, you can view the flags passed to MiniDumpWriteDump used to create this file and thus the data contained within:
0:004> .dumpdebug
----- User Mini Dump Analysis
MINIDUMP_HEADER:
Version         A793 (61B0)
NumberOfStreams 11
Flags           1104
                0004 MiniDumpWithHandleData
                0100 MiniDumpWithProcessThreadData
                1000 MiniDumpWithThreadInfo
The *.wxhu.dmp file is the reflected process dump. Its naming convention hasn’t changed from Vista. So, those of you who have been working with XPROC dumps in Vista may have been a little confused when you analyzed this file from a Windows 7 machine, only to see one thread waiting in RtlpProcessReflectionStartup. Open the file directly in the debugger and you’ll see that it has full data, thread and memory info, etc.:
0:000> .dumpdebug
----- User Mini Dump Analysis
MINIDUMP_HEADER:
Version         A793 (61B0)
NumberOfStreams 14
Flags           51B25
                0001 MiniDumpWithDataSegs
                0004 MiniDumpWithHandleData
                0020 MiniDumpWithUnloadedModules
                0100 MiniDumpWithProcessThreadData
                0200 MiniDumpWithPrivateReadWriteMemory
                0800 MiniDumpWithFullMemoryInfo
                1000 MiniDumpWithThreadInfo
                10000 MiniDumpWithPrivateWriteCopyMemory
                40000 MiniDumpWithTokenInformation
You’ll probably notice that this dump also has handle data. But again, hang reporting does not duplicate the handle data from the original to the reflected process, so the only handle you should see is a process handle from the original process:
Stream 10: type HandleDataStream (12), size 00000038, RVA 08F35E0F
  1 descriptors, header size is 16, descriptor size is 40
    Handle(0000000000000008,"Process","")
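The Flags values in the two .dumpdebug headers above are bitmasks, and the lines beneath each value are simply the individual bits that are set. As a rough illustration, the Python sketch below decodes them by hand. Its table only includes the flag values that appear in the output in this article, not the full MINIDUMP_TYPE enumeration, and `decode_minidump_flags` is a made-up helper name, not part of any debugger API.

```python
# Flag bits exactly as printed by .dumpdebug in the two outputs above.
# This is deliberately NOT the complete MINIDUMP_TYPE enumeration.
MINIDUMP_FLAGS = {
    0x00001: "MiniDumpWithDataSegs",
    0x00004: "MiniDumpWithHandleData",
    0x00020: "MiniDumpWithUnloadedModules",
    0x00100: "MiniDumpWithProcessThreadData",
    0x00200: "MiniDumpWithPrivateReadWriteMemory",
    0x00800: "MiniDumpWithFullMemoryInfo",
    0x01000: "MiniDumpWithThreadInfo",
    0x10000: "MiniDumpWithPrivateWriteCopyMemory",
    0x40000: "MiniDumpWithTokenInformation",
}


def decode_minidump_flags(flags):
    """Split a MINIDUMP_HEADER Flags value into known flag names.

    Returns (names, leftover), where leftover holds any bits that are
    set in `flags` but missing from the table above.
    """
    names = []
    leftover = flags
    for bit in sorted(MINIDUMP_FLAGS):
        if flags & bit:
            names.append(MINIDUMP_FLAGS[bit])
            leftover &= ~bit
    return names, leftover


if __name__ == "__main__":
    for label, value in (("*.tmp.ref", 0x1104), ("*.wxhu.dmp", 0x51B25)):
        names, leftover = decode_minidump_flags(value)
        print(f"{label}: Flags {value:X}")
        for name in names:
            print(f"    {name}")
        if leftover:
            print(f"    (bits not in the table: {leftover:X})")
```

Decoding 1104 yields only the handle, process/thread, and thread-info flags, which matches the description of the *.tmp.ref file as the small dump that carries thread and handle data, while 51B25 adds the memory-related flags seen in the reflected process dump.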
There are at least two ways you can view the stitched target (using the same cab as an example):
Make sure you have the latest debugger or else it may not recognize this type of target. You can download it from here.