Today we’ll examine a case where a crash is occurring in a Microsoft process, in core Windows code, but the culprit isn’t the crashing code. In fact, the culprit isn’t even running in the process that crashed! But before I get ahead of myself, let’s start by examining a crash dump that shows the problem...
// The crashing process is the Windows Sidebar in Vista...
. 0 id: 1b3c create name: sidebar.exe
// The exception record shows that we have hit an access violation in RtlInitUnicodeString while reading from 00080000
0:001> .exr -1
ExceptionAddress: 77837e8b (ntdll!RtlInitUnicodeString+0x0000001b)
ExceptionCode: c0000005 (Access violation)
Attempt to read from address 00080000
// Here’s the register context at the time of failure.
// Looks like we are trying to find the end of a unicode string
Last set context:
eax=00000000 ebx=00000000 ecx=ffffffff edx=0088fa88 esi=00000000 edi=00080000
eip=77837e8b esp=0088fa60 ebp=0088fabc iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
77837e8b 66f2af repne scas word ptr es:[edi]
// Here’s the call stack that the debugger has tried to assemble for us...
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
0088fad0 77154911 00080000 0088fb1c 7782e4b6 ntdll!RtlInitUnicodeString+0x1b
0088fadc 7782e4b6 00080000 77030f7b 00000000 kernel32!BaseThreadInitThunk+0xe
0088fb1c 7782e489 7713361f 00080000 00000000 ntdll!__RtlUserThreadStart+0x23
0088fb34 00000000 7713361f 00080000 00000000 ntdll!_RtlUserThreadStart+0x1b
// The address we are trying to read is freed memory...
0:003> du 00080000
So we crash because RtlInitUnicodeString attempted to dereference an invalid pointer (address 00080000). Where did RtlInitUnicodeString get this value? Let’s unassemble the function and see...
0:012> uf ntdll!RtlInitUnicodeString
77567e70 57 push edi
77567e71 8b7c240c mov edi,dword ptr [esp+0Ch] // edi comes from here
77567e75 8b542408 mov edx,dword ptr [esp+8]
77567e79 c70200000000 mov dword ptr [edx],0
77567e7f 897a04 mov dword ptr [edx+4],edi
77567e82 0bff or edi,edi
77567e84 7422 je ntdll!RtlInitUnicodeString+0x38 (77567ea8)
77567e86 83c9ff or ecx,0FFFFFFFFh
77567e89 33c0 xor eax,eax
77567e8b 66f2af repne scas word ptr es:[edi] // We crash here
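Read as C, the instructions above amount to roughly the following. This is only a sketch for following the disassembly: MODEL_UNICODE_STRING and model_init_unicode_string are hypothetical names, the struct simply mirrors the documented UNICODE_STRING field order, and the real function's handling of very long strings is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UNICODE_STRING (same field order). */
typedef struct {
    uint16_t Length;          /* bytes, not counting the terminator */
    uint16_t MaximumLength;   /* bytes, counting the terminator */
    const uint16_t *Buffer;
} MODEL_UNICODE_STRING;

/* Sketch of what the disassembled instructions do. */
void model_init_unicode_string(MODEL_UNICODE_STRING *dest,
                               const uint16_t *source)
{
    assert(dest != NULL);

    dest->Length = 0;          /* mov dword ptr [edx],0 clears both */
    dest->MaximumLength = 0;   /* 16-bit fields in one store */
    dest->Buffer = source;     /* mov dword ptr [edx+4],edi */

    if (source == NULL)        /* or edi,edi / je */
        return;

    /* repne scas word ptr es:[edi]: walk 16-bit units until a zero is
       found. This is the first read through the pointer, so an invalid
       source faults right here. */
    const uint16_t *p = source;
    while (*p != 0)
        p++;

    size_t bytes = (size_t)(p - source) * sizeof(uint16_t);
    dest->Length = (uint16_t)bytes;
    dest->MaximumLength = (uint16_t)(bytes + sizeof(uint16_t));
}
```

The point to take away is that the function stores the pointer before it ever reads through it, so nothing goes wrong until the scan for the terminator begins.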
We can see from the above assembly that edi came from [esp+0Ch]. Because the function has already pushed edi at that point, that slot holds the second parameter to this function (MSDN tells us that this is the SourceString parameter). So naturally we’ll want to examine the caller of RtlInitUnicodeString to see if it is at fault. If we are to believe the call stack that the debugger gave us, then the caller of RtlInitUnicodeString is BaseThreadInitThunk...but that doesn’t make sense. BaseThreadInitThunk normally is the function that calls the thread’s start function. Why would anyone try to make RtlInitUnicodeString the start function for a thread? Let’s assume for a moment that the debugger isn’t showing us the real call stack, and look at the raw stack values...
0:001> dds esp
0088fa60 77837e70 ntdll!RtlInitUnicodeString
0088fa64 7713312c kernel32!LoadLibraryExW+0x6f
0088fab0 7711e289 kernel32!_except_handler4
0088fac0 77133630 kernel32!LoadLibraryW+0x11
0088fad4 77154911 kernel32!BaseThreadInitThunk+0xe
0088fae0 7782e4b6 ntdll!__RtlUserThreadStart+0x23
0088fae8 77030f7b urlmon!g_StaticLiteralTreeCode <PERF> (urlmon+0xe0f7b)
0088fafc 771af389 kernel32!UnhandledExceptionFilter
0088fb00 771af389 kernel32!UnhandledExceptionFilter
0088fb10 777f9834 ntdll!_except_handler4
0088fb20 7782e489 ntdll!_RtlUserThreadStart+0x1b
0088fb24 7713361f kernel32!LoadLibraryW
0088fb3c 7713361f kernel32!LoadLibraryW
<snip - all zeros to base of stack at 00890000>
We can see from the information above that there is more going on here than the “kb” output would lead us to believe. Notice that the second non-zero value on the stack (working from the bottom up) is the address of the LoadLibraryW function. This is not a return address, it is the start of the function. This is a clue that the start address of this thread, specified by some other thread that called CreateThread, is actually LoadLibraryW. This is a parameter to _RtlUserThreadStart, which is the bottommost function on this thread’s call stack. So we can examine the raw data above, and reconstruct the call stack like so...
// Call stack reconstructed by hand from the faulting instruction and the return addresses in the raw stack above
ntdll!RtlInitUnicodeString+0x1b
kernel32!LoadLibraryExW+0x6f
kernel32!LoadLibraryW+0x11
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x23
ntdll!_RtlUserThreadStart+0x1b
Does this make sense? Let’s think it through. In Vista, it is typical to see the bottom 3 functions (_RtlUserThreadStart, __RtlUserThreadStart, and BaseThreadInitThunk) at the start of a thread. Following this is the start address of the thread, which in this case is LoadLibraryW. It also makes sense that LoadLibraryW would call LoadLibraryExW. That leaves us with the question of whether LoadLibraryExW actually calls RtlInitUnicodeString. Let’s find out...
0:011> uf kernel32!LoadLibraryExW
76f7311d ff7508 push dword ptr [ebp+8]
76f73120 8d45cc lea eax,[ebp-34h]
76f73123 50 push eax
76f73124 8b3d6c10f576 mov edi,dword ptr [kernel32!_imp__RtlInitUnicodeString (76f5106c)]
76f7312a ffd7 call edi
76f7312c a14cd50177 mov eax,dword ptr [kernel32!BasepExeLdrEntry (7701d54c)]
76f73131 3bc3 cmp eax,ebx
76f73133 0f8498dafeff je kernel32!LoadLibraryExW+0x78 (76f60bd1)
We can see from the above assembly that LoadLibraryExW does indeed call RtlInitUnicodeString, so our re-assembled call stack makes sense.
Now we know how we got to RtlInitUnicodeString, but where did the bad pointer value come from? Note that the bad value, 00080000, is actually the very first non-zero value on the stack (again working from the bottom up), right before the address of LoadLibraryW. When a call is made to CreateThread, not only is the thread start address specified, but also an optional parameter to the start function. That is what we are seeing here. Both the thread start address and the parameter end up as parameters to _RtlUserThreadStart.
Let’s sum up what we know so far. Some other thread is calling CreateThread with LoadLibraryW as the start function address, and 00080000 as the parameter. This leads LoadLibraryExW to call RtlInitUnicodeString with 00080000 as the SourceString parameter, and when RtlInitUnicodeString attempts to read from this address we crash because 00080000 is the address of freed memory. Incidentally, what should 00080000 point to? It is being used as the parameter to LoadLibrary, so it should be a pointer to a null-terminated string that specifies the name of the library file to load.
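To make the flow of the two values concrete, here is a small C model of how a start address and its parameter travel through the thread-start wrappers. All of the model_ names are hypothetical stand-ins and not the real ntdll or kernel32 internals. The only thing the sketch relies on is that a thread start routine takes a single pointer-sized parameter, which is also the shape of LoadLibraryW.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape of a thread start routine: one pointer in, one 32-bit value out. */
typedef uint32_t (*model_start_fn)(void *param);

/* Stand-in for kernel32!BaseThreadInitThunk: calls the start routine
   with the parameter that was given at thread creation. */
uint32_t model_base_thread_init_thunk(model_start_fn start, void *param)
{
    return start(param);
}

/* Stand-in for ntdll!_RtlUserThreadStart: receives both values and
   hands them down unchanged. */
uint32_t model_rtl_user_thread_start(model_start_fn start, void *param)
{
    assert(start != NULL);
    return model_base_thread_init_thunk(start, param);
}

/* Records what the stand-in loader was asked to load. */
static const uint16_t *g_last_file_name;

/* Stand-in for LoadLibraryW used as a start routine: whatever was
   passed as the thread parameter arrives here as the file name. */
uint32_t model_load_library(void *file_name)
{
    g_last_file_name = (const uint16_t *)file_name;
    return 1;
}

const uint16_t *model_last_file_name(void)
{
    return g_last_file_name;
}
```

Calling model_rtl_user_thread_start(model_load_library, name) hands name straight to the loader stand-in without any validation along the way. In the crash dump the same thing happens with 00080000.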
Why would one want to start a thread with LoadLibraryW as the start address? The most common reason is code injection. Typically a thread outside of the target process will use the CreateRemoteThread function to invoke LoadLibrary against a particular DLL, thus loading their code into the target process. Assuming that this is what is going on, we’ll need to move beyond the crash dump and debug a live system that is having the problem. Fortunately we noticed that the crashing systems we observed all had a particular third party application installed. We set up a test system with the third party application, and we were able to reproduce the crash some of the time. Since the repro wasn’t 100%, this looked like a timing-related crash.
So let’s move forward with the debug of the live system. Note that the memory addresses in the live debug notes will differ from the above. Also, addresses and module names that could identify the application vendor have been changed. First things first: setting a breakpoint on ntdll!NtCreateThreadEx and running through the code showed that the LoadLibraryW thread was starting without any thread inside the process creating it. It also revealed that the LoadLibraryW thread started without crashing this time, and that the library it was loading was a DLL that belonged to the third party app in question...
0:003> du 00b90000
00b90000 "C:\Program Files\AppVendor\App N"
We now know that the thread that is creating the LoadLibraryW thread isn’t in the sidebar.exe process, and we know that the third party application is definitely using LoadLibraryW to remotely inject their code into sidebar.exe. Sometimes this works, and sometimes it crashes because the DLL name buffer points to freed memory. Next step, let’s debug the third party process to confirm our suspicions, and hopefully figure out why the memory is freed sometimes. We’ll set breakpoints on VirtualAllocEx, WriteProcessMemory, CreateRemoteThread, and VirtualFreeEx in the third party process...
// Third party app calls VirtualAllocEx against the sidebar.exe process
ChildEBP RetAddr Args to Child
0200f8c0 4c9864db 000002b0 00000000 00000090 kernel32!VirtualAllocEx
WARNING: Stack unwind information not available. Following frames may be wrong.
0200f914 4c986af9 000aada0 0200f938 000adc20 AppName!Ordinal1+0x411c0
0200f96c 4c986d1f 00000000 0200f9a0 00000003 AppName!Ordinal1+0x417de
0200f8c0 6c9884db kernel32!VirtualAllocEx(
void * hProcess = 0x000002b0,
void * lpAddress = 0x00000000,
unsigned long dwSize = 0x90,
unsigned long flAllocationType = 0x1000,
unsigned long flProtect = 4)
// PID 2836 == sidebar.exe
0:003> !handle 0x000002b0 f
Object Specific Information
Process Id 2836
Parent Process 2944
Base Priority 8
// 00b90000 is the base address of the allocation in this run
eax=00b90000 ebx=0200f9ac ecx=0200f89c edx=77249a94 esi=0200f904 edi=00000090
eip=4c9844db esp=0200f8dc ebp=0200f914 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
// Now the app calls WriteProcessMemory writing the following string:
// "C:\Program Files\AppVendor\App Name\applib.dll" to the new virtual alloc
Breakpoint 5 hit
eax=00000000 ebx=0200f9ac ecx=0200f904 edx=00b90000 esi=00000090 edi=00000090
eip=77371cc6 esp=0200f8bc ebp=0200f8d8 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
77371cc6 8bff mov edi,edi
0200f8b8 6c9885ba kernel32!WriteProcessMemory(
void * lpBaseAddress = 0x00b90000,
void * lpBuffer = 0x000aada0,
unsigned long nSize = 0x90,
unsigned long * lpNumberOfBytesWritten = 0x0200f8e0)
0200f8d8 4c986895 AppName!Ordinal1+0x4129f
0:003> du 0x000aada0
000aada0 "C:\Program Files\AppVendor\App Name\applib.dll"
// Within the sidebar.exe process, we can see that after WriteProcessMemory in the app returns,
// the address has the expected string...
// The app then calls CreateRemoteThread against the sidebar.exe process,
// using LoadLibraryW as the start address, and the address of the virtual alloc as the parameter.
Breakpoint 0 hit
eax=000002b0 ebx=77370000 ecx=7739361f edx=77249a94 esi=0200f9ac edi=00000000
eip=773b46ef esp=0200f8a0 ebp=0200f8d0 iopl=0 ov up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000a02
773b46ef 6858010000 push 158h
0200f89c 6c988714 000002b0 00000000 00000000 kernel32!CreateRemoteThread
0200f8d0 4c9688ea 0200f8f8 00000000 6cade78c AppName!Ordinal1+0x413f9
0200f914 4c968af9 000aada0 0200f938 000adc20 AppName!Ordinal1+0x415cf
0:003> kP L1
0200f89c 6c988714 kernel32!CreateRemoteThread(
struct _SECURITY_ATTRIBUTES * lpThreadAttributes = 0x00000000,
unsigned long dwStackSize = 0,
<function> * lpStartAddress = 0x7739361f,
void * lpParameter = 0x00b90000,
unsigned long dwCreationFlags = 0,
unsigned long * lpThreadId = 0x00000000)
0:003> ln 0x7739361f
kernel32!LoadLibraryW (wchar_t *)
// The application then calls VirtualFreeEx on the virtual alloc.
0200f8cc 6c98851c 000002b0 00b90000 00000000 kernel32!VirtualFreeEx
0200f914 4c988af9 000aada0 0200f938 000adc20 AppName!Ordinal1+0x41201
0200f96c 4c988d1f 00000000 0200f9a0 00000003 AppName!Ordinal1+0x417de
// Within the sidebar.exe process, the thread for LoadLibrary starts...
0412f814 7722e489 7739361f 00b90000 00000000 ntdll!__RtlUserThreadStart
0412f82c 00000000 7739361f 00b90000 00000000 ntdll!_RtlUserThreadStart+0x1b
// But the memory has already been freed by the third party application...
// ...and sidebar.exe crashes once the address is read
(b14.4d8): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000000 ecx=ffffffff edx=0412f780 esi=00000000 edi=00b90000
eip=77237e8b esp=0412f758 ebp=0412f7b4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
77237e8b 66f2af repne scas word ptr es:[edi]
We’ve now come full circle, back to the original crash, at the same instruction in RtlInitUnicodeString. So now we know that the problem is that the third party vendor frees the memory that it allocated in sidebar.exe, before it is read. Sometimes the call to VirtualFreeEx happens after the memory is read, and in that scenario the crash doesn’t occur. But if VirtualFreeEx completes before RtlInitUnicodeString has a chance to read it, then sidebar.exe crashes.
This information was passed on to the third party vendor, and hopefully they’ll make a change to their code to avoid this problem. There are multiple ways that this could be addressed, but one potential fix would be for the third party app to wait on the thread handle returned from CreateRemoteThread before freeing the memory.
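The ordering rule behind that suggestion can be sketched in portable C. This is an analogy and not the vendor's code: POSIX threads and malloc/free stand in for the Win32 calls named in the comments, and the function names are hypothetical. The buffer is released only after the thread that reads it has finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* What the started thread managed to read from its parameter. */
static char g_loaded_name[128];

/* Stand-in for LoadLibraryW as the start routine: it reads the string
   that its single parameter points to. */
static void *model_start_routine(void *param)
{
    strncpy(g_loaded_name, (const char *)param, sizeof g_loaded_name - 1);
    g_loaded_name[sizeof g_loaded_name - 1] = '\0';
    return NULL;
}

const char *last_loaded_name(void)
{
    return g_loaded_name;
}

/* Returns 0 on success, -1 on failure. */
int start_thread_then_free(const char *dll_path)
{
    assert(dll_path != NULL);

    size_t size = strlen(dll_path) + 1;
    char *buffer = malloc(size);          /* role of VirtualAllocEx */
    if (buffer == NULL)
        return -1;
    memcpy(buffer, dll_path, size);       /* role of WriteProcessMemory */

    pthread_t thread;                     /* role of CreateRemoteThread */
    if (pthread_create(&thread, NULL, model_start_routine, buffer) != 0) {
        free(buffer);                     /* no reader was started */
        return -1;
    }

    /* Role of WaitForSingleObject on the thread handle: do not go on
       until the reader is done with the buffer. */
    if (pthread_join(thread, NULL) != 0)
        return -1;                        /* leak rather than free under a reader */

    free(buffer);                         /* role of VirtualFreeEx */
    return 0;
}
```

In Win32 terms, the join corresponds to waiting on the handle returned by CreateRemoteThread before calling VirtualFreeEx. Moving the free above the join would reintroduce the race we saw in the debugger.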
- Matthew Justice
The conference was held virtually via Live Meeting with Q&A sessions following every presentation.
Each recorded presentation will be accompanied by a link to the corresponding slide deck.
T. Roy, founder of www.codemachine.com
T. Roy presented on instrumentation techniques for developers and engineers who instrument binaries to help resolve complex operating system problems.
Mark discussed new features and tools in the Sysinternals Tools suite.
Members of the Escalation Services team share best practices for engaging Microsoft support. The video includes a demo of the new MPSReports tool and a demo of using Hyper-V to package up reproducible problems and send them to Microsoft.
Jeff discusses the different locations where you can find great content from the Escalation Services team to help solve common and difficult problems.
Tate Calhoun - Escalation Engineer
Tate walks through the use of the new XPERF tool and how to analyze the data for specific scenarios.
Citrix - Escalation Engineers
Citrix engineers Nicholas Vasile, Dmitry Vostokov, and Kapil Ramlal share tools they have created to take advantage of the ETW tracing infrastructure in Windows, along with debugging scripts and best practices for engaging on issues that require multi-vendor support.
Dennis Smeltzer - Executive Director TSANet
Dennis discusses a proposed new TSANet offering which would allow Escalation engineers to contact each other to collaborate on issues that require multi-vendor support.
Windows NT Debugging Blog Live Chat
Microsoft Platform Global Escalation Services is hosting our second live group debug chat session for the debugging community on March 17, 2009 at 11 AM Pacific Time. We will be focusing on debugging techniques and any questions you may have about anything we’ve previously blogged about.
Details about the “PGES-Windows NT Debugging Blog Live Chat” can be found here: http://www.microsoft.com/communities/chats/default.mspx