Below I want to share the basics of setting up a debugging session in one of the debuggers from Debugging Tools for Microsoft Windows (WinDbg, ntsd, cdb or kd). After performing all of the recommended steps, the session should be ‘actionable’, that is, ready for the actual debugging (which I will not discuss in this post; more about that later in individual case scenarios). In environments where debugging is a daily routine, these initialization steps should preferably be automated, for example through the dbgeng.dll API directly, or through managed code or PowerShell scripts that use a third-party or custom managed wrapper of dbgeng.dll (I created a simple one, mostly P/Invoke calls into the native library).
First of all, you must decide how you will be debugging: live local user mode, live remote user mode, post-mortem (dump), kernel mode, or user mode through a kernel mode session.
Let’s have a quick look at each of these: how to set them up, and when to choose which.
Live local user mode debugging is the most common in a typical development scenario. The easiest way to set up such a session is to attach to an already running process (in WinDbg use File -> Attach to a Process, or run <debugger> -p <PID> or <debugger> -pn <process name>), or to start the process directly under the debugger (File -> Open Executable, or run <debugger> <commandline>). But there are also some more interesting ways of creating a live user mode session:
Have the OS start the process under the debugger whenever a particular executable is about to start. This can be done by setting the ‘debugger’ global flag for the executable image. There are a few ways to set this flag (direct registry edit, from a kernel debugging session, from the command line), but the most intuitive is the Global Flags UI editor (gflags.exe in the Debuggers package).
Have the OS attach the debugger to a process with an unhandled exception before the process is forcefully terminated. This OS behavior (applied to all native user mode processes) is controlled by the AeDebug registry key:
HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug
First add or edit the ‘Auto’ string value and set it to “1” so that the debugger attaches automatically, without prompting the user. Then add or edit the ‘Debugger’ string value and set it to the full path of the debugger (relative paths are not recommended because the OS starts the debugger in the context of the crashing application, so the working directory may vary). For a more detailed description of this setting see MSDN. The same registry key can also be updated with the -iae (install AeDebug) or -iaec (install AeDebug with command line) switch of the user mode Windows debugger of your choice (cdb, ntsd, windbg). To automatically attach to crashing managed applications edit:
HKLM\Software\Microsoft\.NETFramework
First add or edit the DbgJITDebugLaunchSettings value and set it to 2, then add or edit the DbgManagedDebugger string value and set it to the full path of the debugger. For more information about this setting for post-mortem debugging of managed applications see MSDN.
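The two AeDebug values described above can be captured in a .reg file that is easy to review and import. Here is a minimal Python sketch that only generates the file text. The debugger path in the usage note is hypothetical, and the `-p %ld -e %ld -g` argument template is my assumption about what the debugger expects when the OS launches it, so compare it with what -iae writes on your machine before relying on it.

```python
def reg_escape(value: str) -> str:
    """Escape backslashes and double quotes for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def aedebug_reg(debugger_path: str, auto: bool = True) -> str:
    """Return .reg file text that sets the AeDebug 'Auto' and 'Debugger' values."""
    # Assumed argument template: process id, event handle, then 'go'.
    command = f'"{debugger_path}" -p %ld -e %ld -g'
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug]",
        f'"Auto"="{1 if auto else 0}"',
        f'"Debugger"="{reg_escape(command)}"',
        "",
    ]
    return "\r\n".join(lines)
```

Calling `aedebug_reg(r"C:\Debuggers\windbg.exe")` returns the text; writing it to a file and importing it is left out on purpose, since that step needs administrative rights.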
Attaching manually but noninvasively. In this mode the debugger can ‘read’ the state of the process (memory, registers ...), but it cannot ‘write’. To be able to change the process state, and so fully debug the process, the debug port member of the kernel’s object representation of the process needs to point to a debug object (a kernel mode structure), which can be associated with only one debugger process at any point in time. Therefore only one debugger can invasively debug a user mode process. Noninvasive debugging doesn’t need to communicate through the debug port (as it only reads the state of the process), so we can debug one process with multiple debuggers (e.g. we want to issue a few diagnostic commands from windbg while we are debugging from Visual Studio). This way we also don’t have to worry about corrupting the state of the debugged process. For noninvasive debugging use the ‘-pv’ switch:
<debugger> -pv -p <PID>
<debugger> -pv -pn <ProcessName>
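When these attach variants are launched from automation, it helps to build the argument list in one place. The following is only a sketch: the function name and parameters are mine, and it uses nothing beyond the -p, -pn and -pv switches mentioned above.

```python
from typing import List, Optional


def attach_args(debugger: str,
                pid: Optional[int] = None,
                process_name: Optional[str] = None,
                noninvasive: bool = False) -> List[str]:
    """Build the argument list for attaching a debugger to a running process."""
    if (pid is None) == (process_name is None):
        raise ValueError("pass exactly one of pid or process_name")
    args = [debugger]
    if noninvasive:
        # Read-only attach; leaves the debug port free for another debugger.
        args.append("-pv")
    if pid is not None:
        args += ["-p", str(pid)]
    else:
        args += ["-pn", process_name]
    return args
```

The resulting list can be passed straight to `subprocess.run`, which avoids quoting problems with process names that contain spaces.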
Once you have a live user mode debugging session set up, you can share it remotely over TCP/IP or named pipes (1394 and COM transports are also available, but those are much more common for kernel mode debugging). Some of these techniques (with the exception of smart client debugging) can also be used to share a dump debugging session. There are several ways of doing this.
In the debugging session that you want to share, start the debugging server:
.server <protocol>:<parameters>[,IcfEnable]
where the protocol is either ‘tcp’ or ‘npipe’ and the parameters are ‘port=<port>’ or ‘pipe=<pipeName>’ respectively. IcfEnable is an optional convenience parameter that causes the debugger to create the needed firewall exceptions on your behalf. Alternatively you can set up the debugging server when starting the debugger:
windbg -server tcp:port=55555,IcfEnable notepad.exe
To list the debugging servers in your session use the .servers command (helpful for easy copy-paste sharing of the connection string with the person expected to perform the remote debugging):
0:000> .servers
On the client, use any of these command lines
0: <debugger> -remote tcp:Port=12345,Server=JAKRIVAN-DEV
From the output it should be clear how to attach to the remote session.
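The server and client sides share the same transport string, so in scripts I find it convenient to generate both from one helper. This is a sketch only; the function names are hypothetical, and it covers just the ‘tcp’ and ‘npipe’ protocols and the IcfEnable option discussed above.

```python
from typing import List, Optional, Union

# Which parameter name carries the endpoint for each protocol.
_ENDPOINT_KEY = {"tcp": "port", "npipe": "pipe"}


def transport(protocol: str, endpoint: Union[int, str],
              server: Optional[str] = None, icf_enable: bool = False) -> str:
    """Compose a '<protocol>:<parameters>' transport string."""
    if protocol not in _ENDPOINT_KEY:
        raise ValueError(f"unsupported protocol: {protocol}")
    parts = [f"{_ENDPOINT_KEY[protocol]}={endpoint}"]
    if server:
        parts.append(f"server={server}")
    if icf_enable:
        parts.append("IcfEnable")
    return f"{protocol}:" + ",".join(parts)


def server_command(protocol: str, endpoint: Union[int, str],
                   icf_enable: bool = False) -> str:
    """Debugger command that starts a debugging server in the current session."""
    return ".server " + transport(protocol, endpoint, icf_enable=icf_enable)


def client_args(debugger: str, protocol: str, endpoint: Union[int, str],
                server: str) -> List[str]:
    """Argument list for a debugger client connecting to that server."""
    return [debugger, "-remote", transport(protocol, endpoint, server=server)]
```

Keeping the transport in one function means the port or pipe name cannot drift between the two sides.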
Remember that when you start a debugging server, the debugging engine (dbgeng.dll) runs on the remote machine, so symbol and source resolution and all other file system operations are relative to the remote system (there are ways to send symbols to the remote machine through the debugging session, or to have sources loaded on the local machine; I’ll discuss those later in the sections on symbol and source resolution).
In situations where we want the debugging engine to run on the local machine (usually because of access to symbols and sources, a typical scenario when debugging a live issue on a customer machine) the Debugging Proxy is our friend. The Debugging Proxy started on the remote machine (it needs to have the debug privilege) communicates with the connecting debuggers via a low level protocol (memory reads, memory writes etc.), and all the actual debugging logic happens on the side of the connecting debugger. To start the Debugging Proxy, run on the remote machine:
dbgsrv.exe -t <transport>:<parameters>
To connect to the remote Debugging Proxy run:
<debugger> -premote <transport>:<parameters>,server=<remote server>
You can connect to the actual process later or at the time of starting the debugger:
<debugger> -premote <transport>:<parameters>,server=<remote server> -p <remote process PID>
Debugging via the Debugging Proxy is very similar to what Visual Studio uses for native debugging and, in version 2012, also for managed debugging. Managed debugging in older versions of Visual Studio is more similar to basic remote debugging (so symbols need to be available on the remote machine ...).
Another way of remote debugging is to use the remote.exe executable. Since this executable just redirects input and output remotely, it is impossible to use the debugger engine API or the rich windbg user interface to interact with the debugger. Therefore this way is recommended only in a few limited scenarios, e.g. when you need to see the actual output of the debugger while it is initializing and may be failing for various reasons (incorrect arguments, insufficient privileges or resources etc.). In that case the native transport mechanism (the debugging server) wouldn’t start at all, but remote.exe can already transfer the output. In those special scenarios it might be helpful to use both remoting mechanisms when starting the debugger:
remote.exe /s "<debugger> -server <transport>:<parameters>,IcfEnable <debugger parameters>" <pipename>
To connect to such a remote server you can use either remote.exe:
remote.exe /c <server name> <pipename>
Or the debugger itself (which is preferable as then you have all API and windbg UI richness available):
<debugger> -remote <transport>:<parameters>,server=<server name>
One beauty of remote.exe is that it can redirect the console input/output of an arbitrary application, so even though it’s usually mentioned in connection with debugging, you can effectively use it for tasks like redirecting the diagnostic output of a remote automated task (compiling, installation ...).
Controlling a user mode debugger from a kernel mode debugger is technically also a live remote user mode debugging technique, but due to its specific behavior I’ll discuss it as a separate debugging technique.
To be able to debug an issue post-mortem, you need a memory dump of the process (or system) to debug. There are a few ways to take a memory dump:
Taking a memory dump in Task Manager. This is the least recommended way, since you cannot influence what information is included in or excluded from the dump, and on 64-bit systems Task Manager takes a 64-bit dump of 32-bit processes, which unnecessarily complicates debugging.
Taking the memory dump in a live debugging session. We can use the .dump command for this. You can consult the debugger help for the detailed options, but I usually recommend taking a minidump with all advanced options, overwriting any existing file:
.dump /ma /o <dump path with extension>
Taking the dump with one of the SysInternals tools: Process Explorer or, probably preferably, ProcDump (it can also take dumps based on several conditions such as 1st and 2nd chance exceptions, application hangs etc.).
Kernel mode debugging is a very large topic, so I’ll probably dedicate a separate post to it in the future. For now let me just note that you can debug a real physical machine over a serial, USB or FireWire cable, or recently also over TCP/IP (with the introduction of kdnet); for more on setting up kernel debugging please see MSDN. You can also debug a virtual machine over a synthetic serial port. But probably the most convenient way to try kernel debugging (as it doesn't require any setup on the target machine, so in particular you don’t need to reboot!) is LiveKd. From a very simplistic point of view, LiveKd opens the memory as a dump file and this way tricks the kernel debugger into debugging the local system. This is especially helpful when we need to debug a local user mode process in a kernel mode context (e.g. to troubleshoot synchronization issues).
There might be situations where local or remote user mode debugging is impossible, e.g. due to very limited resources (a machine under load or stress test) or when debugging early boot-up processes where no session is available yet to start the console debugger. For those cases one viable way of debugging is to set up a post-mortem debugger (as described in Live local user mode debugging) and set it to ‘ntsd -d’, e.g. by running:
ntsd -iaec -d
That way, once the debugger starts, it immediately redirects its input and output through the kernel mode session (while the rest of the OS is frozen, unless some activity is required by the debugger, e.g. loading symbols or traversing memory).
Because the kernel mode session needs to be set up and healthy, and the user mode debugger just redirects its input and output through it, this approach combines the disadvantages of kernel mode debugging (problems with maintaining the debugging infrastructure, keeping sessions in sync ...) with those of redirected input and output (no direct interaction with the dbgeng.dll API, no advanced UI functionality of windbg). It should therefore be used only when you have no other choice (debugging early boot-up processes, or processes on a machine with limited resources).
In some cases a kernel mode debugging session can offer more information than a simple user mode session. Let’s have a quick look at an example of how we can debug a user mode process this way (e.g. in LiveKd):
With the !process command we can query the currently running processes (the second parameter is a flags value that controls how much information is printed):
0: kd> !process 0 0 cmd.exe
PROCESS fffffa8008136740
    SessionId: 1 Cid: 1564 Peb: 7f7d8dd8000 ParentCid: 0c94
    DirBase: 150382000 ObjectTable: fffff8a014733400 HandleCount: 119.
    Image: cmd.exe

PROCESS fffffa800b4c5940
    SessionId: 1 Cid: 12b4 Peb: 7f7d8a26000 ParentCid: 0c94
    DirBase: 3435b000 ObjectTable: fffff8a0123f0e80 HandleCount: 19.
    Image: cmd.exe

PROCESS fffffa8011f9d940
    SessionId: 1 Cid: 1440 Peb: 7f7d835f000 ParentCid: 0c94
    DirBase: c1209000 ObjectTable: 00000000 HandleCount: 0.
    Image: cmd.exe

0: kd> !process 0 0 visio.exe
PROCESS fffffa800d3666c0
    SessionId: 1 Cid: 1558 Peb: 7ea05000 ParentCid: 18cc
    DirBase: 4b4d2000 ObjectTable: fffff8a0133418c0 HandleCount: 222.
    Image: VISIO.EXE

0: kd> !process 0 f visio.exe
PROCESS fffffa800d3666c0
    SessionId: 1 Cid: 1558 Peb: 7ea05000 ParentCid: 18cc
    DirBase: 4b4d2000 ObjectTable: fffff8a0133418c0 HandleCount: 222.
    Image: VISIO.EXE
    VadRoot fffffa800c1d30f0 Vads 216 Clone 0 Private 1531. Modified 240. Locked 0.
    DeviceMap fffff8a001c1cf70
    Token fffff8a015a59050
    ElapsedTime 3 Days 01:38:29.603
    UserTime 00:00:00.187
    KernelTime 00:00:02.449
    QuotaPoolUsage[PagedPool] 607592
    QuotaPoolUsage[NonPagedPool] 27328
    Working Set Sizes (now,min,max) (3892, 50, 345) (15568KB, 200KB, 1380KB)
    PeakWorkingSetSize 6260
    VirtualSize 307 Mb
    PeakVirtualSize 327 Mb
    PageFaultCount 61551
    MemoryPriority BACKGROUND
    BasePriority 8
    CommitCharge 2627

        THREAD fffffa8006e6bb00 Cid 1558.1994 Teb: 000000007e8da000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Non-Alertable
            fffffa8008c74b00 QueueObject
        Not impersonating
        DeviceMap fffff8a001c1cf70
        Owning Process fffffa800d3666c0 Image: VISIO.EXE
        Attached Process N/A Image: N/A
        Wait Start TickCount 3710528 Ticks: 13765657 (2:11:39:05.625)
        Context Switch Count 5 IdealProcessor: 2
        UserTime 00:00:00.000
        KernelTime 00:00:00.000
        Win32 Start Address 0x00000000728b14aa
        Stack Init fffff88007f2cdd0 Current fffff88007f2c7a0
        Base fffff88007f2d000 Limit fffff88007f27000 Call 0
        Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
        Kernel stack not resident.
        Child-SP RetAddr Call Site
        fffff880`07f2c7e0 fffff801`98f2b99c nt!KiSwapContext+0x76
        (Inline Function) --------`-------- nt!KiSwapThread+0xf4 (Inline Function @ fffff801`98f2b99c)
        fffff880`07f2c920 fffff801`98f36ddb nt!KiCommitThreadWait+0x23c
        fffff880`07f2c9e0 fffff801`992ceb6c nt!KeRemoveQueueEx+0x26b
        fffff880`07f2ca90 fffff801`992adcb5 nt!IoRemoveIoCompletion+0x4c
        fffff880`07f2cb20 fffff801`98f00d53 nt!NtRemoveIoCompletion+0x135
        fffff880`07f2cbd0 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`07f2cc40)
        00000000`03deec98 00000000`76fe2847 0x76fe2ad2
        00000000`03deeca0 00000023`770e0e48 0x76fe2847
        00000000`03deeca8 00000000`00000023 0x23`770e0e48
        00000000`03deecb0 00000000`00000000 0x23

        THREAD fffffa8007fb5080 Cid 1558.168c Teb: 000000007ea0a000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
            fffffa800bff8800 ProcessObject
        Not impersonating
        DeviceMap fffff8a001c1cf70
        Owning Process fffffa800d3666c0 Image: VISIO.EXE
        Attached Process N/A Image: N/A
        Wait Start TickCount 3710528 Ticks: 13765657 (2:11:39:05.625)
        Context Switch Count 3 IdealProcessor: 2
        UserTime 00:00:00.000
        KernelTime 00:00:00.000
        Win32 Start Address 0x000000007450d97d
        Stack Init fffff88006844dd0 Current fffff88006844900
        Base fffff88006845000 Limit fffff8800683f000 Call 0
        Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
        Kernel stack not resident.
        Child-SP RetAddr Call Site
        fffff880`06844940 fffff801`98f2b99c nt!KiSwapContext+0x76
        (Inline Function) --------`-------- nt!KiSwapThread+0xf4 (Inline Function @ fffff801`98f2b99c)
        fffff880`06844a80 fffff801`98f27c1f nt!KiCommitThreadWait+0x23c
        fffff880`06844b40 fffff801`992c7df6 nt!KeWaitForSingleObject+0x1cf
        fffff880`06844bd0 fffff801`98f00d53 nt!NtWaitForSingleObject+0xb6
        fffff880`06844c40 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`06844c40)
        00000000`004ceb28 00000000`76fe2941 0x76fe2ad2
        00000000`004ceb30 00000000`770e2fac 0x76fe2941
        00000000`004ceb38 00000000`00000023 0x770e2fac
        00000000`004ceb40 00000000`00000202 0x23
        00000000`004ceb48 00000000`01fdfd68 0x202
        00000000`004ceb50 00000000`0000002b 0x1fdfd68
        00000000`004ceb58 00000000`00000000 0x2b
In the output we can find a lot of helpful diagnostic information (the state of the process and its threads, what they are waiting on etc.). We can also switch context to a particular process/thread (using the OS-wide process/thread addresses from the informational output of the !process command) and debug as we are used to in user mode debugging:
0: kd> .process fffffa800d3666c0
Implicit process is now fffffa80`0d3666c0
0: kd> .thread fffffa8007fb5080
Implicit thread is now fffffa80`07fb5080
0: kd> k
  *** Stack trace for last set context - .thread/.cxr resets it
Child-SP RetAddr Call Site
fffff880`06844940 fffff801`98f2b99c nt!KiSwapContext+0x76
(Inline Function) --------`-------- nt!KiSwapThread+0xf4
fffff880`06844a80 fffff801`98f27c1f nt!KiCommitThreadWait+0x23c
fffff880`06844b40 fffff801`992c7df6 nt!KeWaitForSingleObject+0x1cf
fffff880`06844bd0 fffff801`98f00d53 nt!NtWaitForSingleObject+0xb6
fffff880`06844c40 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13
00000000`004ceb28 00000000`76fe2941 0x76fe2ad2
00000000`004ceb30 00000000`770e2fac 0x76fe2941
00000000`004ceb38 00000000`00000023 0x770e2fac
00000000`004ceb40 00000000`00000202 0x23
00000000`004ceb48 00000000`01fdfd68 0x202
00000000`004ceb50 00000000`0000002b 0x1fdfd68
00000000`004ceb58 00000000`00000000 0x2b
But in addition to what user mode debugging offers, we also have all the kernel mode debugging commands and access to the whole address space (including other processes).
0: kd> !object fffffa800bff8800
Object: fffffa800bff8800 Type: (fffffa800675f750) Process
    ObjectHeader: fffffa800bff87d0 (new version)
    HandleCount: 6 PointerCount: 1621333
0: kd> !process fffffa800bff8800 0
PROCESS fffffa800bff8800
    SessionId: 1 Cid: 15f8 Peb: 7f722b67000 ParentCid: 1558
    DirBase: 143f0d000 ObjectTable: fffff8a0126addc0 HandleCount: 174.
    Image: splwow64.exe
So in this particular case we found that the Visio process was waiting on the MS spooler service, and the brief Visio hangs were tracked down to glitches in the connection to our network printer.
However, debugging in a kernel mode debugger can be challenging for many developers, and even for extensions. Many commands are different in kernel mode, and some commands mean different things in user mode and kernel mode. For example, the “~” command switches threads in user mode but switches processors in kernel mode; for a long time this caused the !analyze extension to behave strangely for user mode breaks in a kernel mode context.
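The brief `!process 0 0` listings shown earlier are regular enough to post-process in a script, for example to find the EPROCESS address for a given image name. Below is a small sketch that parses captured output text. The field names in the returned dictionaries are my own, and I treat the Cid field as the process id printed in hexadecimal.

```python
import re
from typing import Dict, List

# One record: PROCESS <address>, then SessionId/Cid, and later the Image name.
_PROCESS_RE = re.compile(
    r"PROCESS\s+(?P<eprocess>[0-9a-fA-F]+)\s+"
    r"SessionId:\s+(?P<session>\S+)\s+"
    r"Cid:\s+(?P<cid>[0-9a-fA-F]+)\s+"
    r".*?Image:\s+(?P<image>\S+)",
    re.DOTALL,
)


def parse_process_list(output: str) -> List[Dict[str, object]]:
    """Extract address, process id and image name from '!process 0 0' text."""
    records = []
    for match in _PROCESS_RE.finditer(output):
        records.append({
            "eprocess": match.group("eprocess"),
            "pid": int(match.group("cid"), 16),  # Cid is printed in hex
            "image": match.group("image"),
        })
    return records
```

Because the pattern tolerates any whitespace between fields, it works both on the debugger's multi-line layout and on output that was flattened into a single line by a log collector.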
With this many ways of debugging it can sometimes be confusing to find out what type of session we are actually using (especially in remoting scenarios). There is one helpful command that tells us the type of the debugging session:
0:000> ||
. 0 Live user mode: <Local>
0: kd> ||
. 0 64-bit Full kernel dump: C:\WINDOWS\livekd.dmp
This way we can see, for example, that LiveKd actually ‘just’ creates a full dump of memory and then starts the debugger to debug that dump.
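If a script needs to branch on the session type, the `||` output can be classified with a few string checks. This is a heuristic sketch based only on the two output lines shown above; other session types may be worded differently, and the `SessionInfo` type is hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SessionInfo(NamedTuple):
    index: int
    description: str
    target: str
    is_live: bool
    is_kernel: bool
    is_dump: bool


# Optional '.' marker for the current system, index, description, ':', target.
_LINE_RE = re.compile(r"^\s*\.?\s*(\d+)\s+(.+?):\s+(.*)$")


def parse_session_line(line: str) -> Optional[SessionInfo]:
    """Classify one line of '||' output; returns None for non-matching lines."""
    match = _LINE_RE.match(line)
    if match is None:
        return None
    description = match.group(2)
    lowered = description.lower()
    return SessionInfo(
        index=int(match.group(1)),
        description=description,
        target=match.group(3).strip(),
        is_live=lowered.startswith("live"),
        is_kernel="kernel" in lowered,
        is_dump="dump" in lowered,
    )
```

Prompt lines such as `0:000> ||` do not match the pattern, so the whole captured transcript can be fed through the function line by line.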
Now I’m finally getting to the actual preparation of the debugging session. This usually means resolving symbols, binaries and sources, and finding the owner of the code who should perform further investigation. In kernel mode debugging the first step should be making sure that the session is properly synchronized with the target, but I’m not going to discuss this step further in this post.
Without debugging symbols (preferably private ones), the debugging engine must sometimes ‘guess’, and as a result we cannot fully trust it. It often happens that the location of a crash (AV, unhandled exception ...) is attributed to the wrong function (sometimes even the wrong module) until the symbols for all modules on the particular stack are resolved. The debugger will usually tell us that missing symbols might cause it to ‘guess’ wrong:
0:001> k
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
004efe54 764b8543 ntdll!DbgBreakPoint
004efe60 770fac69 KERNEL32!BaseThreadInitThunk+0xe
004efea4 770fac3c ntdll!RtlInitializeExceptionChain+0x85
004efebc 00000000 ntdll!RtlInitializeExceptionChain+0x58
Usually we’ll have OS binaries, our own binaries and sometimes also third-party binaries on the stack. The debuggers have a built-in command for pointing the symbol path to the Microsoft symbol server (the external one for shipping builds, and the internal one for internal-only builds):
3: kd> .symfix
3: kd> .sympath
Symbol search path is: srv*
Expanded Symbol search path is: SRV*http://msdl.microsoft.com/download/symbols
Ideally you’ll not need symbols for third-party components (hopefully they are either reliable, or they are not in the middle of the stack when the crash happens in your component), as it’s usually hard to get symbols from third parties.
Your own symbols (and the same applies if you happen to be able to get symbols for third-party components) should ideally be on a symbol server. In that case you need to prefix the path to your symbol server with ‘srv*’. This indicates that the path will be searched using the ‘index’, a GUID that is baked into the binary and the symbol file at build time. This GUID also ensures that only a symbol file coming from the same build as the binary will match that binary (unless you explicitly force loading of a mismatched symbol file). If ‘srv*’ is not prefixed, the supplied path is searched as a flat store: each directory is examined (and potentially also ‘dll’, ‘sym’, ‘exe’ and other similar subdirectories, one level deep) to find the matching symbol file. This can be a time-consuming task, especially if accidentally applied to a big symbol server. An incorrectly set up symbol search path is actually one of the most common reasons for a debugger hanging during some task (including the Visual Studio debugger).
In the debugger you can set your symbol search path with the .sympath command. The syntax of the symbol path is:
.sympath <local flat symbol path>;cache*<local cache for remote symbol locations>;<remote flat symbol path>;srv*<remote symbols server>
For example:
.sympath C:\MyBuildLocation\Symbols;cache*C:\MySymbolsCache;\\CorporateSymbolsServer\Symbols;srv*http://msdl.microsoft.com/download/symbols
However, instead of setting this path every time you start the debugger, it’s much better to set your _NT_SYMBOL_PATH environment variable to that path. As a bonus, many other applications can use it, such as Process Explorer, Process Monitor etc.
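Since the order of the path elements matters (the ‘cache*’ element comes before the remote locations it is meant to cache), I like to compose the string with a helper instead of editing it by hand. The following sketch follows the syntax shown above; the function and parameter names are hypothetical.

```python
from typing import Iterable, Optional


def build_symbol_path(local_dirs: Iterable[str] = (),
                      cache_dir: Optional[str] = None,
                      remote_dirs: Iterable[str] = (),
                      servers: Iterable[str] = ()) -> str:
    """Compose a symbol path: local flat dirs, cache, remote flat dirs, servers."""
    parts = list(local_dirs)
    if cache_dir:
        # Placed before the remote elements so that their files get cached.
        parts.append(f"cache*{cache_dir}")
    parts.extend(remote_dirs)
    # Symbol servers are searched by index, so they get the 'srv*' prefix.
    parts.extend(f"srv*{server}" for server in servers)
    return ";".join(parts)
```

The result can be assigned to `os.environ["_NT_SYMBOL_PATH"]` before launching the debugger from the same script, or passed to .sympath.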
Once your symbol path is set correctly, you can reload symbols with the .reload command (you may need to force reloading with /f if stripped symbols were already loaded). You can also point it to specific binaries (and use wildcards):
0:001> .reload
Reloading current modules
..........................
0:001> k
ChildEBP RetAddr
004efe24 7713dcbc ntdll!DbgBreakPoint
004efe54 764b8543 ntdll!DbgUiRemoteBreakin+0x39
004efe60 770fac69 KERNEL32!BaseThreadInitThunk+0xe
004efea4 770fac3c ntdll!__RtlUserThreadStart+0x72
004efebc 00000000 ntdll!_RtlUserThreadStart+0x1b
Let me simulate symbol resolution problems by setting a wrong symbol path:
0:001> .sympath aaa
Symbol search path is: aaa
Expanded Symbol search path is: aaa
WARNING: Inaccessible path: 'aaa'
0:001> .reload /f
Reloading current modules
.*** ERROR: Module load completed but symbols could not be loaded for notepad.exe
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHCORE.DLL -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\dwmapi.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\uxtheme.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\WINSPOOL.DRV -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\bcryptPrimitives.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\CRYPTBASE.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\SspiCli.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHLWAPI.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\combase.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\GDI32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\msvcrt.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\KERNELBASE.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHELL32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\ole32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\OLEAUT32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\KERNEL32.DLL -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\COMDLG32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\sechost.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\IMM32.DLL -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\RPCRT4.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\MSCTF.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\ADVAPI32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\SysWOW64\USER32.dll -
.*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntdll.dll -
Already in the basic output we see some hints that there are problems with loading symbols, and where the issue might be. But let’s dig deeper. First we want to verify that there really is a problem with loading symbols for a particular module of interest:
0:001> lmvm ntdll
start end module name
770a0000 771f7000 ntdll (export symbols) C:\WINDOWS\SYSTEM32\ntdll.dll
    Loaded symbol image file: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image path: ntdll.dll
    Image name: ntdll.dll
    Timestamp: Wed Sep 19 22:32:50 2012 (505AAA82)
    CheckSum: 0015B576
    ImageSize: 00157000
    File version: 6.2.9200.16420
    Product version: 6.2.9200.16420
    File flags: 0 (Mask 3F)
    File OS: 40004 NT Win32
    File type: 2.0 Dll
    File date: 00000000.00000000
    Translations: 0409.04b0
    CompanyName: Microsoft Corporation
    ProductName: Microsoft® Windows® Operating System
    InternalName: ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion: 6.2.9200.16420
    FileVersion: 6.2.9200.16420 (win8_gdr.120919-1813)
    FileDescription: NT Layer DLL
    LegalCopyright: © Microsoft Corporation. All rights reserved.
And we can see that the (private or public) symbols are not being loaded. Let’s track down why:
0:001> !sym noisy
noisy mode - symbol prompts off
0:001> .reload /f ntdll.dll
DBGHELP: aaa\wntdll.pdb - file not found
DBGHELP: aaa\dll\wntdll.pdb - file not found
DBGHELP: aaa\symbols\dll\wntdll.pdb - file not found
DBGHELP: C:\WINDOWS\SYSTEM32\wntdll.pdb - file not found
DBGHELP: wntdll.pdb - file not found
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntdll.dll -
DBGHELP: ntdll - export symbols
Now we see all the paths that are being searched, and we can go and find out why the correct symbol file is not present in any of them. Once we fix the issue and/or our symbol search path, we can successfully reload the symbols.
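When the noisy output gets long (many modules, long symbol paths), pulling out just the probed locations makes it easier to spot the wrong path element. The following is a sketch that works on captured text and assumes the ‘DBGHELP: <path> - file not found’ line format shown above; the function name is mine.

```python
import re
from typing import List

# Matches lines like: DBGHELP: aaa\dll\wntdll.pdb - file not found
_NOT_FOUND_RE = re.compile(r"^DBGHELP:\s+(.+?)\s+-\s+file not found\s*$")


def searched_locations(noisy_output: str) -> List[str]:
    """Return, in order, every location that was probed without success."""
    locations = []
    for line in noisy_output.splitlines():
        match = _NOT_FOUND_RE.match(line.strip())
        if match:
            locations.append(match.group(1))
    return locations
```

Lines that report something else, such as the final ‘export symbols’ fallback, are simply skipped.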
Sometimes the engine doesn’t know which symbol file to load because it doesn’t even have the image loaded (this can happen, for example, when we debug some minidumps). For this reason we might want to have image loading set up as well.
When Windows runs your application it loads the executable binaries (.exe, .dll) into memory, at which point we call them executable images. As they are in memory, they can be part of the memory dump of an application. However, some smaller dumps may not contain them, and it can also happen that an image was unloaded, so there are situations where the debugger needs to load them from a separate location (this becomes extremely helpful when debugging managed code from a dump; I’ll write about this in a separate post).
The executable image path follows the same rules as the symbol search path, and in many situations we might want to set both to the same locations, because (although many people don’t realize this) binaries can also be indexed on a symbol server.
.exepath C:\MyBuildLocation\Binaries;cache*C:\MySymbolsCache;\\CorporateSymbolsServer\Symbols;srv*http://msdl.microsoft.com/download/symbols
And as with symbols, it’s better to have this path already set in the environment; we can use the _NT_EXECUTABLE_IMAGE_PATH variable for this purpose.
Why wouldn’t we make our debugging experience less frustrating by debugging with source code? The debugger engine can extract the source paths from the symbol files. If we opened a symbol file (e.g. ‘foo.pdb’) in a simple text editor we would see that it contains the locations of the source code; those locations are recorded at build time. So if you debug on the same machine where you build your code, or if your debug machine has the same structure of code files in the same location as your build machine, the debugger will be able to ‘magically’ bring up the source code file. But it’s not guaranteed that it’s the correct version of that source file (let’s say someone changed it since the debugged binary was built). Therefore the best way to get a good source code debugging experience is to use a source control system and source index your symbols.
Source indexing symbols means that the symbol file contains information about how to retrieve each file from the source control server and which version to retrieve (basically it has a section that translates local paths to full source control commands). You then need to make sure that the debugger has access to the tool for retrieving sources from source control (e.g. p4c.exe for Perforce; you can simply add its location to the PATH) and that the source path indicates that you want to use a ‘source server’.
To indicate that you want to use a source server, set the source path to the ‘srv*’ string (beware that the syntax and meaning are completely different from the symbol or executable image path! For the source path you just use ‘srv*’ with no other specification, and it means that the debugger will attempt to retrieve the source control commands from the symbol files). To use a local path, point the source path to the flat directory with the source code (no recursive search will be performed!). Or you can combine both approaches. Use .srcpath to set your source path:
.srcpath srv*;<any additional flat source locations>
Or you can use the .srcfix command for this (it sets the source path to ‘srv*’ plus any additional path that you pass, as in .srcfix+ <path>).
However, the best way is again to use an environment variable: _NT_SOURCE_PATH
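The three environment variables mentioned so far can be prepared together when a script launches the debugger. Below is a minimal sketch; the function name is hypothetical, and defaulting the image path to the symbol path is just my convenience choice for the case where binaries are indexed on the same server.

```python
import os
from typing import Dict, Mapping, Optional


def debugger_environment(symbol_path: str,
                         image_path: Optional[str] = None,
                         source_path: str = "srv*",
                         base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the debugger search paths set."""
    env = dict(os.environ if base is None else base)
    env["_NT_SYMBOL_PATH"] = symbol_path
    # Convenience assumption: reuse the symbol path when no image path is given.
    env["_NT_EXECUTABLE_IMAGE_PATH"] = symbol_path if image_path is None else image_path
    env["_NT_SOURCE_PATH"] = source_path
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.Popen`, so the settings apply only to that debugger instance and the machine-wide environment stays untouched.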
As mentioned earlier, during basic remote debugging all commands are executed on the remote system. This may be problematic when you are debugging a remote issue but have the source files, or the source control utility, on your local machine. In this situation the .srcpath and .srcfix commands will have no effect (which can be confusing). You will need to set the local source path and/or use the local source server:
.lsrcfix
.lsrcpath+ <local source codes path>
This will resolve the source code on your local machine (as opposed to the remote machine actually running the debugging engine), and the source code will finally ‘magically’ load (unless you have other issues, which I’ll try to cover in a second). Also keep in mind that those two commands cannot be scripted.
Sometimes you change frames in the stack and expect the appropriate source code to appear, but nothing happens. There is a way to troubleshoot such an issue (I’ve censored some irrelevant information):
0:000> .lines -e
Line number information will be loaded
0:000> kn
 # Child-SP RetAddr Call Site
00 00000070`af71f6f0 000007ff`ecc72cc9 ntdll!LdrpDoDebuggerBreak+0x30 [d:\XXXXXX\ntdll\ldrinit.c @ 2737]
01 00000070`af71f730 000007ff`ecc0216a ntdll!LdrpInitializeProcess+0x1927 [d:\XXXXXX\ntdll\ldrinit.c @ 5131]
02 00000070`af71fa30 000007ff`ecbf32ae ntdll!_LdrpInitialize+0xee9a [d:\XXXXXX\ldrinit.c @ 1334]
03 00000070`af71faa0 00000000`00000000 ntdll!LdrInitializeThunk+0xe [d:\XXXXXX\ntdll\ldrstart.c @ 90]
0:000> !srcnoisy 3
Noisy source output: on
Noisy source server output: on
Filter out everything but source server output: on
0:000> .frame 3
03 00000070`af71faa0 00000000`00000000 ntdll!LdrInitializeThunk+0xe [d:\XXXXXX\ntdll\ldrstart.c @ 90]
DBGENG: Check plain file:
DBGENG: 'd:\XXXXXX\ntdll\ldrstart.c' - not found
0:000> .srcfix
DBGHELP: Symbol Search Path: srv*c:\mysymcache*http://XXXXXX
Source search path is: SRV*
DBGENG: Scan srcsrv SRV* for:
DBGENG: '!d:\XXXXXX\ntdll\ldrstart.c'
SRCSRV: XX.exe -p XXXXXXXXXXXXX.ntdev.microsoft.com:2020 print -o "D:\Debuggers_64bit\src\XXXXXXXXXX\ntdll\ldrstart.c\1\ldrstart.c" -q //depot/XXXXXXXXXX/ntdll/ldrstart.c#1
DBGENG: found file 'd:\XXXXXXXXXX\ntdll\ldrstart.c'
DBGENG: server path 'SRV*'
DBGENG: local 'D:\Debuggers_64bit\src\XXXXXXXXXXXX\ntdll\ldrstart.c\1\ldrstart.c'
So we see that we can use the .lines command to enable (or, with -d, disable) displaying of the build-time paths baked into the symbol file. We can also see that !srcnoisy switches on more verbose logging of what’s happening while sources are being loaded. Finally we see how the source file is loaded from the source control server, provided that the symbol file was properly source indexed. Here is what would happen if the symbol file weren’t source indexed:
0:000> .frame 10
SRCSRV: d:\YYYYYYYYYYY\identityauthority.cpp not indexed
Once we are done with the debugging session we have several options for ending it. If we are not debugging a live user mode session, we might not care about how we exit; simply closing the debugger application or using the q command might be enough. However, when debugging a live user mode process we might want to preserve the process. To end debugging without killing the target application use the .detach command or the qd (quit and detach) command. The best way might be to start the debugger with the -pd option, which causes the target application to remain running no matter how you quit your debugging session.