Below I want to share the basics of setting up a debugging session in one of the debuggers from Debugging Tools for Microsoft Windows (WinDbg, ntsd, cdb or kd). After performing all of the recommended steps, the session should hopefully be ‘actionable’, that is, ready for the actual debugging (which I’ll not discuss in this post; more about that later in individual case scenarios). In environments where debugging is a daily routine, these initialization steps should preferably be automated, for example through the dbgeng.dll API directly, through managed code or PowerShell scripting on top of a third-party managed wrapper of dbgeng.dll, or through a custom wrapper of dbgeng.dll (I created a simple one consisting mostly of P/Invoke calls to the native library).

Creating a debugging session

First of all, you must decide how you will be debugging:

  • Live local user mode debugging
  • Live remote user mode debugging
  • Post-mortem (dump) debugging
  • Kernel mode debugging (live, or dump)
  • Live user mode debugging piped through kernel mode session
  • Debugging live user mode process in kernel mode context

Let’s have a quick look at each of those: how we can set them up, and when to choose which.

Live local user mode debugging

Live local user mode debugging is the most common in a typical development scenario. The easiest way to set up such a session is to break into an already running process (in WinDbg use File -> Attach to a Process, or run <debugger> -p <PID> or <debugger> -pn <process name>), or to start the process directly under the debugger (File -> Open Executable, or run <debugger> <commandline>). But there are also some more interesting ways of creating a live user mode session.

Have the OS start the process under the debugger whenever a particular process is about to start. This can be done by setting the ‘debugger’ Executable Image global OS flag. There are a few ways to set this flag (direct registry edit, from a kernel debugging session, command line), but the most intuitive one is the Global Flags UI editor (the gflags.exe executable in the Debuggers package):

gflags

Have the OS attach the debugger to a process with an unhandled exception before it is forcefully terminated. This OS behavior (applied to all native user mode processes) is controlled by the AeDebug registry key:

HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug

First add or edit the ‘Auto’ string value and set it to “1” so that the debugger is attached automatically, without a prompt dialog being shown to the user. Then add or edit the ‘Debugger’ string value and set it to the full path to the debugger (relative paths are not recommended, because the OS starts the debugger in the context of the crashing app, so the working directory might vary). For a more detailed description of this setting see MSDN. The same registry key can also be updated with the -iae (install AeDebug) or -iaec (install AeDebug with command line) switches of the user mode Windows debugger of your choice (cdb, ntsd, windbg).
To automatically attach to crashing managed applications, edit:

HKLM\Software\Microsoft\.NETFramework

First add or edit the DbgJITDebugLaunchSettings value and set it to 2, then add or edit the DbgManagedDebugger string value and set it to the full path to the debugger. For more information about this setting for post-mortem debugging of managed applications see MSDN.
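
The AeDebug values described above can also be prepared as a .reg file and imported on each machine. The following is only a minimal sketch: the debugger location is a hypothetical example, and the argument template `-p %ld -e %ld -g` is an assumption based on the commonly used AeDebug command line, so verify it against the documentation of your debugger version before importing anything.

```python
def reg_escape(value: str) -> str:
    """Escape backslashes and double quotes for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def aedebug_reg(debugger_path: str, auto: bool = True) -> str:
    """Return .reg file text that sets the AeDebug 'Auto' and 'Debugger' values."""
    # Assumed argument template: attach to the crashing process, signal the
    # event handle passed by the OS, and let the target run.
    command = f'"{debugger_path}" -p %ld -e %ld -g'
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug]",
        f'"Auto"="{"1" if auto else "0"}"',
        f'"Debugger"="{reg_escape(command)}"',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical install location; use the full path on your machine.
    print(aedebug_reg(r"C:\Debuggers\windbg.exe"))
```

The script only prints the text; saving it to a file and importing it is left as a manual, reviewed step because the setting affects every native process on the machine.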

Attaching manually but noninvasively. In this mode the debugger can ‘read’ the state of the process (memory, registers …), but it cannot ‘write’ it. To be able to change the process state, and so fully debug the process, the Debug port member of the kernel’s object representation of the process needs to point to a Debug object (a kernel mode structure), which can be associated with only one debugger process at any point in time. Therefore only one debugger can invasively debug a user mode process. Noninvasive debugging doesn’t need to communicate through the Debug port (as it only reads the state of the process), so we can debug one process with multiple debuggers (e.g. we want to issue a few diagnostic commands from windbg while we are debugging from Visual Studio). This way we also don’t have to worry about corrupting the state of the debugged process. For noninvasive debugging use the ‘-pv’ switch:

<debugger> -pv -p <PID>
<debugger> -pv -pn <ProcessName>
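
When attaching is automated, the switches above can be assembled by a small helper. This is a sketch only; the function name and its parameters are hypothetical, and it uses nothing beyond the -p, -pn and -pv switches already mentioned.

```python
from typing import List, Optional


def attach_command(debugger: str,
                   pid: Optional[int] = None,
                   process_name: Optional[str] = None,
                   noninvasive: bool = False) -> List[str]:
    """Build an argument list for attaching a debugger to a running process."""
    # Exactly one way of identifying the target must be given.
    if (pid is None) == (process_name is None):
        raise ValueError("pass exactly one of pid or process_name")
    args = [debugger]
    if noninvasive:
        args.append("-pv")  # read-only attach, does not take the debug port
    if pid is not None:
        args += ["-p", str(pid)]
    else:
        args += ["-pn", process_name]
    return args


if __name__ == "__main__":
    print(attach_command("cdb", pid=1234, noninvasive=True))
```

The returned list can be handed to subprocess.run on a machine where the debugger is on the PATH.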

Live remote user mode debugging

Once you have live user mode debugging set up, you can share the session remotely over TCP/IP or named pipes (or over 1394 or COM transports, but those are much more common for kernel mode debugging). Some of the techniques (with the exception of smart client debugging) can also be used to share a dump debugging session. There are several ways of doing this.

Starting debugging server

In the debugging session you want to share, start the debugging server:

.server <protocol>:<parameters>[,IcfEnable]

where protocol is either ‘tcp’ or ‘npipe’ and parameters are either ‘port=<port>’ or ‘pipe=<pipeName>’. IcfEnable is an optional convenience parameter which will cause the debugger to create the needed firewall exceptions on your behalf.
Alternatively, you can set up the debugging server when starting the debugger:

windbg -server tcp:port=55555,IcfEnable notepad.exe
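
The transport string has the same shape in the .server command and in the -server switch, so it can be produced in one place. A minimal sketch, assuming only the two protocols and the IcfEnable option described above; the function name is hypothetical.

```python
from typing import Optional


def server_transport(protocol: str,
                     port: Optional[int] = None,
                     pipe: Optional[str] = None,
                     icf_enable: bool = False) -> str:
    """Build the '<protocol>:<parameters>[,IcfEnable]' string for a debugging server."""
    if protocol == "tcp":
        if port is None:
            raise ValueError("tcp requires a port")
        transport = f"tcp:port={port}"
    elif protocol == "npipe":
        if not pipe:
            raise ValueError("npipe requires a pipe name")
        transport = f"npipe:pipe={pipe}"
    else:
        raise ValueError(f"unsupported protocol: {protocol}")
    if icf_enable:
        transport += ",IcfEnable"  # ask the debugger to open the firewall
    return transport


if __name__ == "__main__":
    transport = server_transport("tcp", port=55555, icf_enable=True)
    print(f".server {transport}")                     # inside a session
    print(f"windbg -server {transport} notepad.exe")  # at startup
```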

To list the debugging servers in your session use the .servers command (helpful for easy copy-paste sharing of the connection string with the person who will perform the remote debugging):

0:000> .servers
On the client, use any of these command lines
0: <debugger> -remote tcp:Port=12345,Server=JAKRIVAN-DEV

From the output it should be clear how to attach to the remote session.
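
If the connection details need to be passed to a script rather than a person, the client lines can be picked out of the .servers output. This sketch assumes the output format shown above (one `<debugger> -remote ...` line per server); the function name is hypothetical and other debugger versions may print the lines differently.

```python
import re
from typing import Dict, List

# Matches lines such as "0: <debugger> -remote tcp:Port=12345,Server=HOST".
_CLIENT_LINE = re.compile(r"^\s*\d+:\s*<debugger>\s+-remote\s+(\S+)\s*$", re.MULTILINE)


def parse_servers_output(output: str) -> List[Dict[str, str]]:
    """Extract the transport and its parameters from each client line."""
    results = []
    for connection in _CLIENT_LINE.findall(output):
        transport, _, params = connection.partition(":")
        info = {"transport": transport, "connection": connection}
        for item in params.split(","):
            key, sep, value = item.partition("=")
            if sep:
                info[key.lower()] = value
        results.append(info)
    return results


if __name__ == "__main__":
    sample = (
        "0:000> .servers\n"
        "On the client, use any of these command lines\n"
        "0: <debugger> -remote tcp:Port=12345,Server=JAKRIVAN-DEV\n"
    )
    for server in parse_servers_output(sample):
        print(f"cdb -remote {server['connection']}")
```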

Remember that if you start the debugging server, the debugging engine (dbgeng.dll) actually runs on the remote machine, so symbol and source resolution and all other file system operations are relative to the remote system (there are ways to send symbols to the remote machine through the debugging session, or to have sources loaded on the local machine; I’ll discuss those later in the sections on symbol and source resolution).

Starting debugging proxy server and smart client debugger

In situations where we want the debugging engine to run on the local machine (usually because of access to symbols and sources, a typical scenario when debugging a live issue on a customer machine), the Debugging Proxy is our friend. The Debugging Proxy started on the remote machine (it needs to have the debug privilege) communicates with the remote debuggers via a low level protocol (memory reads, memory writes etc.), and all the actual debugging logic happens on the client side. To start the Debugging Proxy, run on the remote machine:

dbgsrv.exe -t <transport>:<parameters>

To connect to the remote Debugging Proxy, run:

<debugger> -premote <transport>:<parameters>,server=<remote server> 

You can connect to the actual process later or at the time of starting the debugger:

<debugger> -premote <transport>:<parameters>,server=<remote server> -p <remote process PID>
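
Both halves of this setup can be generated from the same transport string, which helps keep the proxy and the smart client in sync. A minimal sketch using only the -t, -premote and -p switches shown above; the function names, host name and port are hypothetical.

```python
from typing import List, Optional


def proxy_command(transport: str) -> List[str]:
    """Command line to run on the remote machine, e.g. transport 'tcp:port=5005'."""
    return ["dbgsrv.exe", "-t", transport]


def smart_client_command(debugger: str, transport: str, server: str,
                         pid: Optional[int] = None) -> List[str]:
    """Command line to run locally; optionally attach to a remote PID right away."""
    args = [debugger, "-premote", f"{transport},server={server}"]
    if pid is not None:
        args += ["-p", str(pid)]
    return args


if __name__ == "__main__":
    transport = "tcp:port=5005"  # hypothetical port
    print(" ".join(proxy_command(transport)))
    print(" ".join(smart_client_command("windbg", transport, "CUSTOMER-PC", pid=4321)))
```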

Debugging via the Debugging Proxy is a technique very similar to what Visual Studio uses for native debugging, and in version 2012 also for managed debugging. Managed debugging in older versions of Visual Studio is more similar to basic remote debugging (so symbols need to be available on the remote machine …).

Piping console input output through remote.exe

Another way of remote debugging is the remote.exe executable. Since it just remotely redirects console input and output, it is impossible to use the debugger engine API or the rich windbg user interface to interact with the debugger. This way is therefore recommended only in a very few limited scenarios, e.g. when you need to see the actual output of the debugger while it is initializing and may be failing for various reasons (incorrect arguments, insufficient privileges or resources etc.): in that case the native transport mechanism (the debugging server) wouldn’t even start, but remote.exe can already transfer the output. In those special scenarios it might be helpful to use both remoting mechanisms when starting the debugger:

remote.exe /s "<debugger> -server <transport>:<parameters>,IcfEnable <debugger parameters>" <pipename>

To connect to such a remote server you can use either remote.exe:

remote.exe /c <server name> <pipename>

Or the debugger itself (which is preferable, as you then have the whole API and the windbg UI richness available):

<debugger> -remote <transport>:<parameters>,server=<server name>

One beauty of remote.exe is that it can redirect the console input/output of an arbitrary application, so even though it’s usually mentioned in connection with debugging, you can effectively use it for tasks like redirecting the diagnostic output of a remote automated task (compiling, installation …).

Piping console input and output through live kd session

This is technically a live user mode remote debugging technique, but due to its specific behavior I’ll discuss it as a separate debugging technique.

 

Post-mortem debugging

In order to debug an issue post-mortem, you need a memory dump of the process (or system) to debug. There are a few ways to take a memory dump:

Taking a memory dump in Task Manager. This is the least recommended way, since you cannot influence what information is included in or excluded from the dump, and on 64bit systems Task Manager takes a 64bit dump of 32bit processes, which unnecessarily complicates the debugging.

Taking a memory dump in a live debugging session. We can use the .dump command for this. You can consult the debugger help for the detailed options, but I usually recommend taking a minidump with all the advanced options, overwriting any existing file:

.dump /ma /o <dump path with extension>
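
For unattended dump collection the same .dump command can be passed to a console debugger at startup. This is a sketch under stated assumptions: it relies on the debugger’s -c switch (run a command string at startup) and the qd command (quit and detach), and it assumes a dump path without spaces, quotes or semicolons. The function name and the paths are hypothetical.

```python
from typing import List


def dump_command(pid: int, dump_path: str, debugger: str = "cdb") -> List[str]:
    """Build a command line that attaches noninvasively, writes a dump and detaches."""
    # ';' separates debugger commands and spaces or quotes would need extra
    # escaping, so this sketch simply refuses such paths.
    if any(ch in dump_path for ch in ';" '):
        raise ValueError("dump path must not contain spaces, quotes or semicolons")
    script = f".dump /ma /o {dump_path};qd"
    return [debugger, "-pv", "-p", str(pid), "-c", script]


if __name__ == "__main__":
    print(dump_command(1234, r"C:\Dumps\app.dmp"))
```

Attaching with -pv keeps the collection read-only, which matches the noninvasive mode described earlier.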

Taking the dump with one of the SysInternals tools: Process Explorer, or probably most preferably ProcDump (it can also take dumps based on several conditions like 1st and 2nd chance exceptions, application hangs etc.).

Kernel mode debugging

Kernel mode debugging is a very large topic, so I’ll probably dedicate a separate post to it in the future. For now let me just note that you can debug a real physical machine over a serial, USB or FireWire cable, or more recently also over TCP/IP (with the introduction of kdnet); for more on setting up kernel debugging please see MSDN. You can also debug a virtual machine over a synthetic serial port. But probably the most convenient way to try kernel debugging (as it doesn't require any setup on the target machine, so above all you don’t need to reboot!) is LiveKd. From a very simplistic point of view, LiveKd opens memory as a dump file and this way tricks the kernel debugger into debugging the local system. This is especially helpful when we need to debug a local user mode process in kernel mode context (e.g. to troubleshoot synchronization issues).

Live user mode debugging piped through kernel mode session

There might be situations where local or remote user mode debugging is impossible, e.g. due to very limited resources (a machine under load or stress test), or when debugging early boot-up processes where no session is yet available to start the console debugger. For those cases one viable option is to set up a post-mortem debugger (as described in Live local user mode debugging) and set it to ‘ntsd -d’, e.g. by running:

ntsd -iaec -d

That way, once the debugger starts, it immediately redirects its input and output through the kernel mode session (while the rest of the OS freezes, unless some activity is required by the debugger, e.g. loading symbols, traversing memory etc.).

Because the kernel mode session needs to be set up and healthy, and the user mode debugger just redirects its input and output through it, this approach combines the disadvantages of kernel mode debugging (problems with maintaining the debugging infrastructure, keeping sessions in sync …) with those of redirected input and output (no direct interaction with the dbgeng.dll API, no advanced UI functionality of windbg). It should therefore be used only when you have no other choice (debugging early boot-up processes, or processes on a machine with limited resources).

Debugging live user mode process in the kernel mode context

In some cases a kernel mode debugging session can offer more information than a simple user mode session. Let’s have a quick look at an example of how we can debug a user mode process (e.g. in LiveKd):

With the !process command we can query the currently running processes (the second parameter is a flags value that controls the amount of information printed):

  

0: kd> !process 0 0 cmd.exe
PROCESS fffffa8008136740
    SessionId: 1  Cid: 1564    Peb: 7f7d8dd8000  ParentCid: 0c94
    DirBase: 150382000  ObjectTable: fffff8a014733400  HandleCount: 119.
    Image: cmd.exe

PROCESS fffffa800b4c5940
    SessionId: 1  Cid: 12b4    Peb: 7f7d8a26000  ParentCid: 0c94
    DirBase: 3435b000  ObjectTable: fffff8a0123f0e80  HandleCount:  19.
    Image: cmd.exe

PROCESS fffffa8011f9d940
    SessionId: 1  Cid: 1440    Peb: 7f7d835f000  ParentCid: 0c94
    DirBase: c1209000  ObjectTable: 00000000  HandleCount:   0.
    Image: cmd.exe

0: kd> !process 0 0 visio.exe
PROCESS fffffa800d3666c0
    SessionId: 1  Cid: 1558    Peb: 7ea05000  ParentCid: 18cc
    DirBase: 4b4d2000  ObjectTable: fffff8a0133418c0  HandleCount: 222.
    Image: VISIO.EXE

0: kd> !process 0 f visio.exe
PROCESS fffffa800d3666c0    SessionId: 1  Cid: 1558    Peb: 7ea05000  ParentCid: 18cc
    DirBase: 4b4d2000  ObjectTable: fffff8a0133418c0  HandleCount: 222.
    Image: VISIO.EXE
    VadRoot fffffa800c1d30f0 Vads 216 Clone 0 Private 1531. Modified 240. Locked 0.
    DeviceMap fffff8a001c1cf70
    Token                             fffff8a015a59050
    ElapsedTime                       3 Days 01:38:29.603
    UserTime                          00:00:00.187
    KernelTime                        00:00:02.449
    QuotaPoolUsage[PagedPool]         607592
    QuotaPoolUsage[NonPagedPool]      27328
    Working Set Sizes (now,min,max)  (3892, 50, 345) (15568KB, 200KB, 1380KB)
    PeakWorkingSetSize                6260
    VirtualSize                       307 Mb
    PeakVirtualSize                   327 Mb
    PageFaultCount                    61551
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      2627

        THREAD fffffa8006e6bb00  Cid 1558.1994  Teb: 000000007e8da000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Non-Alertable
            fffffa8008c74b00  QueueObject
        Not impersonating
        DeviceMap                 fffff8a001c1cf70
        Owning Process            fffffa800d3666c0       Image:         VISIO.EXE
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      3710528        Ticks: 13765657 (2:11:39:05.625)
        Context Switch Count      5              IdealProcessor: 2
        UserTime                  00:00:00.000
        KernelTime                00:00:00.000
        Win32 Start Address 0x00000000728b14aa
        Stack Init fffff88007f2cdd0 Current fffff88007f2c7a0
        Base fffff88007f2d000 Limit fffff88007f27000 Call 0
        Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
        Kernel stack not resident.
        Child-SP          RetAddr           Call Site
        fffff880`07f2c7e0 fffff801`98f2b99c nt!KiSwapContext+0x76
        (Inline Function) --------`-------- nt!KiSwapThread+0xf4 (Inline Function @ fffff801`98f2b99c)
        fffff880`07f2c920 fffff801`98f36ddb nt!KiCommitThreadWait+0x23c
        fffff880`07f2c9e0 fffff801`992ceb6c nt!KeRemoveQueueEx+0x26b
        fffff880`07f2ca90 fffff801`992adcb5 nt!IoRemoveIoCompletion+0x4c
        fffff880`07f2cb20 fffff801`98f00d53 nt!NtRemoveIoCompletion+0x135
        fffff880`07f2cbd0 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`07f2cc40)
        00000000`03deec98 00000000`76fe2847 0x76fe2ad2
        00000000`03deeca0 00000023`770e0e48 0x76fe2847
        00000000`03deeca8 00000000`00000023 0x23`770e0e48
        00000000`03deecb0 00000000`00000000 0x23

        THREAD fffffa8007fb5080  Cid 1558.168c  Teb: 000000007ea0a000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
            fffffa800bff8800  ProcessObject
        Not impersonating
        DeviceMap                 fffff8a001c1cf70
        Owning Process            fffffa800d3666c0       Image:         VISIO.EXE
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      3710528        Ticks: 13765657 (2:11:39:05.625)
        Context Switch Count      3              IdealProcessor: 2
        UserTime                  00:00:00.000
        KernelTime                00:00:00.000
        Win32 Start Address 0x000000007450d97d
        Stack Init fffff88006844dd0 Current fffff88006844900
        Base fffff88006845000 Limit fffff8800683f000 Call 0
        Priority 10 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
        Kernel stack not resident.
        Child-SP          RetAddr           Call Site
        fffff880`06844940 fffff801`98f2b99c nt!KiSwapContext+0x76
        (Inline Function) --------`-------- nt!KiSwapThread+0xf4 (Inline Function @ fffff801`98f2b99c)
        fffff880`06844a80 fffff801`98f27c1f nt!KiCommitThreadWait+0x23c
        fffff880`06844b40 fffff801`992c7df6 nt!KeWaitForSingleObject+0x1cf
        fffff880`06844bd0 fffff801`98f00d53 nt!NtWaitForSingleObject+0xb6
        fffff880`06844c40 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`06844c40)
        00000000`004ceb28 00000000`76fe2941 0x76fe2ad2
        00000000`004ceb30 00000000`770e2fac 0x76fe2941
        00000000`004ceb38 00000000`00000023 0x770e2fac
        00000000`004ceb40 00000000`00000202 0x23
        00000000`004ceb48 00000000`01fdfd68 0x202
        00000000`004ceb50 00000000`0000002b 0x1fdfd68
        00000000`004ceb58 00000000`00000000 0x2b

 

In the output we can find a lot of helpful diagnostic information (the state of the process and its threads, what they are waiting on etc.). We can also switch context to a particular process/thread (using the process/thread object addresses shown in the output of the !process command) and debug as we are used to in user mode debugging:

0: kd> .process fffffa800d3666c0
Implicit process is now fffffa80`0d3666c0        

0: kd> .thread fffffa8007fb5080
Implicit thread is now fffffa80`07fb5080

0: kd> k
  *** Stack trace for last set context - .thread/.cxr resets it
Child-SP          RetAddr           Call Site
fffff880`06844940 fffff801`98f2b99c nt!KiSwapContext+0x76
(Inline Function) --------`-------- nt!KiSwapThread+0xf4
fffff880`06844a80 fffff801`98f27c1f nt!KiCommitThreadWait+0x23c
fffff880`06844b40 fffff801`992c7df6 nt!KeWaitForSingleObject+0x1cf
fffff880`06844bd0 fffff801`98f00d53 nt!NtWaitForSingleObject+0xb6
fffff880`06844c40 00000000`76fe2ad2 nt!KiSystemServiceCopyEnd+0x13
00000000`004ceb28 00000000`76fe2941 0x76fe2ad2
00000000`004ceb30 00000000`770e2fac 0x76fe2941
00000000`004ceb38 00000000`00000023 0x770e2fac
00000000`004ceb40 00000000`00000202 0x23
00000000`004ceb48 00000000`01fdfd68 0x202
00000000`004ceb50 00000000`0000002b 0x1fdfd68
00000000`004ceb58 00000000`00000000 0x2b

 

But in addition to what user mode debugging offers, we also have all the kernel mode debugging commands and access to the whole address space (including other processes).

0: kd> !object fffffa800bff8800
Object: fffffa800bff8800  Type: (fffffa800675f750) Process
    ObjectHeader: fffffa800bff87d0 (new version)
    HandleCount: 6  PointerCount: 1621333

0: kd> !process fffffa800bff8800 0
PROCESS fffffa800bff8800
    SessionId: 1  Cid: 15f8    Peb: 7f722b67000  ParentCid: 1558
    DirBase: 143f0d000  ObjectTable: fffff8a0126addc0  HandleCount: 174.
    Image: splwow64.exe

So in this particular case we found that the Visio process was waiting on the MS spooler service, and the briefly hanging Visio was tracked down to network glitches with our network printer.
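
When such an investigation is repeated often, the brief `!process 0 0` listing can be turned into records, for example to look up the object address of every instance of an image. This is only a sketch that assumes the exact field layout shown in the listings above; the function name is hypothetical, and other Windows or debugger versions may print additional fields.

```python
import re
from typing import Dict, List

_PROCESS = re.compile(
    r"PROCESS\s+(?P<eprocess>[0-9a-fA-F]+)\s+"
    r"SessionId:\s+(?P<session>\S+)\s+Cid:\s+(?P<cid>[0-9a-fA-F]+)\s+"
    r"Peb:\s+(?P<peb>[0-9a-fA-F]+)\s+ParentCid:\s+(?P<parent_cid>[0-9a-fA-F]+)\s+"
    r"DirBase:\s+\S+\s+ObjectTable:\s+\S+\s+HandleCount:\s+(?P<handles>\d+)\.\s+"
    r"Image:\s+(?P<image>[^\r\n]+)"
)


def parse_process_list(output: str) -> List[Dict[str, object]]:
    """Parse brief !process output into a list of dictionaries."""
    processes = []
    for match in _PROCESS.finditer(output):
        processes.append({
            "eprocess": match.group("eprocess"),             # argument for .process
            "pid": int(match.group("cid"), 16),              # Cid is printed in hex
            "parent_pid": int(match.group("parent_cid"), 16),
            "handle_count": int(match.group("handles")),
            "image": match.group("image").strip(),
        })
    return processes


if __name__ == "__main__":
    sample = """PROCESS fffffa8008136740
    SessionId: 1  Cid: 1564    Peb: 7f7d8dd8000  ParentCid: 0c94
    DirBase: 150382000  ObjectTable: fffff8a014733400  HandleCount: 119.
    Image: cmd.exe
"""
    for process in parse_process_list(sample):
        print(process["image"], process["pid"], f".process {process['eprocess']}")
```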

However, debugging in a kernel mode debugger can be challenging for many developers, and even for extensions. Many commands are different in kernel mode, and some commands even mean different things in user mode and kernel mode. For example, the “~” command switches threads in user mode but switches processors in kernel mode; for a long time this caused the !analyze extension to behave weirdly for user mode breaks in kernel mode context.

What debugging type is used in current session

With this many ways of debugging it can sometimes be confusing to find out what type of session we are actually using (especially in remoting scenarios). There is one helpful command that tells us the type of debugging:

0:000> ||
.  0 Live user mode: <Local>

0: kd> ||
.  0 64-bit Full kernel dump: C:\WINDOWS\livekd.dmp

This way we can see, for example, that LiveKd actually ‘just’ creates a full dump of memory and then starts the debugger on that dump.
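
In automated setups it can be handy to classify the session from the `||` output before deciding which preparation steps to run. The following sketch relies purely on the wording of the two sample lines above (‘Live’, ‘kernel’, ‘dump’), which is an assumption; the function name is hypothetical and other session types may be described differently.

```python
import re
from typing import Dict, List

_SYSTEM_LINE = re.compile(
    r"^\s*(?P<current>\.)?\s*(?P<index>\d+)\s+(?P<description>.+?):\s+(?P<target>.*?)\s*$",
    re.MULTILINE,
)


def parse_sessions(output: str) -> List[Dict[str, object]]:
    """Parse '||' output into one record per debugged system."""
    sessions = []
    for match in _SYSTEM_LINE.finditer(output):
        description = match.group("description")
        lowered = description.lower()
        sessions.append({
            "index": int(match.group("index")),
            "current": match.group("current") is not None,  # leading '.' marker
            "description": description,
            "target": match.group("target"),
            "mode": "kernel" if "kernel" in lowered else "user",
            "is_dump": "dump" in lowered,
        })
    return sessions


if __name__ == "__main__":
    print(parse_sessions(".  0 64-bit Full kernel dump: C:\\WINDOWS\\livekd.dmp\n"))
```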

 

Preparing session for debugging

Now I’m finally getting to the actual preparation of the debugging session. This usually means resolving symbols, binaries and sources, and finding the owner of the code who should perform further investigation. In kernel mode debugging the first step should be making sure that the session is properly synchronized with the target, but I’m not going to discuss that step further in this post.

Ensuring symbols resolution

Without debugging symbols (preferably private ones), the debugging engine must sometimes ‘guess’, and as a result we cannot fully trust it. It often happens that the location of a crash (AV, unhandled exception …) is attributed to a wrong function (sometimes even a wrong module) until the symbols for all modules on the particular stack are resolved. The debugger will usually tell us that missing symbols might cause it to ‘guess’ wrong:

0:001> k
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
004efe54 764b8543 ntdll!DbgBreakPoint
004efe60 770fac69 KERNEL32!BaseThreadInitThunk+0xe
004efea4 770fac3c ntdll!RtlInitializeExceptionChain+0x85
004efebc 00000000 ntdll!RtlInitializeExceptionChain+0x58

Usually the stack will contain OS binaries, our binaries and sometimes also third party binaries. The debuggers have a built-in command for pointing the symbol path to the Microsoft symbol server (to the external one for shipping builds, and to the internal one for internal-only builds):

3: kd> .symfix
3: kd> .sympath
Symbol search path is: srv*
Expanded Symbol search path is: SRV*http://msdl.microsoft.com/download/symbols

Ideally you’ll not need symbols for third party components (hopefully they are either reliable, or they are not in the middle of the stack when the crash happens in your component), as it’s usually hard to get symbols from third parties.

For your own symbols (and the same applies if you happen to get symbols for third party components) you will ideally have them on a symbol server. In that case you need to prefix the path to your symbol server with ‘srv*’. This indicates that the path will be searched using the ‘index’ (a GUID that is baked into the binary and the symbol file at build time; this GUID also ensures that only the symbol file coming from the same build as the binary will match that binary, unless you force loading of a mismatched symbol file). If ‘srv*’ is not prepended, the supplied path is searched as a flat store: each directory is examined (and potentially also the ‘dll’, ‘sym’, ‘exe’ and other similar subdirectories, one level deep) to find the matching symbol file. This can be a time consuming task, especially if accidentally applied to a big symbol server. An incorrectly set up symbol search path is actually one of the most common reasons for a debugger hanging during some task (including the Visual Studio debugger).

In the debugger you can set your symbol search path with the .sympath command. The syntax of the symbol path is as follows:

.sympath <local flat symbol path>;cache*<local cache for remote symbol locations>;<remote flat symbol path>;srv*<remote symbols server>

For example:

.sympath C:\MyBuildLocation\Symbols;cache*C:\MySymbolsCache;\\CorporateSymbolsServer\Symbols;srv*http://msdl.microsoft.com/download/symbols

However, instead of setting this path every time you start the debugger, it’s much better to set your _NT_SYMBOL_PATH environment variable to that path. As a bonus, many other applications can use it as well, like Process Explorer, Process Monitor etc.
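
When debuggers are launched from scripts, the symbol path can be composed from its parts and handed over through _NT_SYMBOL_PATH. A minimal sketch that follows the element order of the syntax above; the function names are hypothetical.

```python
import os
from typing import Dict, Iterable, Optional


def symbol_path(local_dirs: Iterable[str], cache_dir: Optional[str],
                remote_dirs: Iterable[str], servers: Iterable[str]) -> str:
    """Join flat directories, an optional cache and symbol servers into one path."""
    parts = list(local_dirs)
    if cache_dir:
        # Elements to the right of the cache entry are the ones that get cached.
        parts.append(f"cache*{cache_dir}")
    parts += list(remote_dirs)
    parts += [f"srv*{server}" for server in servers]
    return ";".join(parts)


def debugger_environment(path: str) -> Dict[str, str]:
    """Return a copy of the current environment with _NT_SYMBOL_PATH set."""
    env = dict(os.environ)
    env["_NT_SYMBOL_PATH"] = path
    return env


if __name__ == "__main__":
    path = symbol_path([r"C:\MyBuildLocation\Symbols"], r"C:\MySymbolsCache",
                       [r"\\CorporateSymbolsServer\Symbols"],
                       ["http://msdl.microsoft.com/download/symbols"])
    print(path)
```

The dictionary returned by debugger_environment can be passed as the env argument of subprocess.run when starting the debugger.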

Once your symbol path is set correctly, you can reload the symbols with the .reload command (you may need to force reloading with /f if stripped symbols were already loaded). You can also point it to specific binaries (and use wildcards):

0:001> .reload
Reloading current modules
..........................
0:001> k
ChildEBP RetAddr
004efe24 7713dcbc ntdll!DbgBreakPoint
004efe54 764b8543 ntdll!DbgUiRemoteBreakin+0x39
004efe60 770fac69 KERNEL32!BaseThreadInitThunk+0xe
004efea4 770fac3c ntdll!__RtlUserThreadStart+0x72
004efebc 00000000 ntdll!_RtlUserThreadStart+0x1b

Troubleshooting symbols resolution

Let me simulate symbol resolution problems by setting a wrong symbol path:

0:001> .sympath aaa
Symbol search path is: aaa
Expanded Symbol search path is: aaa
WARNING: Inaccessible path: 'aaa'
0:001> .reload /f
Reloading current modules
.*** ERROR: Module load completed but symbols could not be loaded for notepad.exe
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHCORE.DLL -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\dwmapi.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\uxtheme.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\WINSPOOL.DRV -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\bcryptPrimitives.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\CRYPTBASE.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\SspiCli.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHLWAPI.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\combase.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\GDI32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\msvcrt.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\KERNELBASE.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\SHELL32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\ole32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\OLEAUT32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\KERNEL32.DLL -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\COMDLG32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\sechost.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\IMM32.DLL -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\RPCRT4.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\MSCTF.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\ADVAPI32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\USER32.dll -
.*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntdll.dll -

Already in the basic output we see some hints that there are problems with loading symbols, and also where the issue might be. But let’s dig deeper. First we want to verify that there really is a problem with loading symbols for a particular module of interest:

0:001> lmvm ntdll
start    end        module name
770a0000 771f7000   ntdll      (export symbols)       C:\WINDOWS\SYSTEM32\ntdll.dll
    Loaded symbol image file: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image path: ntdll.dll
    Image name: ntdll.dll
    Timestamp:        Wed Sep 19 22:32:50 2012 (505AAA82)
    CheckSum:         0015B576
    ImageSize:        00157000
    File version:     6.2.9200.16420
    Product version:  6.2.9200.16420
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion:   6.2.9200.16420
    FileVersion:      6.2.9200.16420 (win8_gdr.120919-1813)
    FileDescription:  NT Layer DLL
    LegalCopyright:   © Microsoft Corporation. All rights reserved.

And we can see that neither private nor public symbols are loaded, only export symbols. Let’s track down why:

0:001> !sym noisy
noisy mode - symbol prompts off
0:001> .reload /f ntdll.dll
DBGHELP: aaa\wntdll.pdb - file not found
DBGHELP: aaa\dll\wntdll.pdb - file not found
DBGHELP: aaa\symbols\dll\wntdll.pdb - file not found
DBGHELP: C:\WINDOWS\SYSTEM32\wntdll.pdb - file not found
DBGHELP: wntdll.pdb - file not found
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntdll.dll -
DBGHELP: ntdll - export symbols

Now we see all the paths that are being searched, and we can go and find out why the correct symbol file is not present on any of them. Once we fix the issue and/or fix our symbol search path, we can successfully reload the symbols.
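
With many modules the noisy output gets long, so it can help to reduce it to two lists: the modules that fell back to export symbols, and the locations that were probed. This sketch assumes the message wording shown in the outputs above; the function names are hypothetical and the messages may differ between debugger versions.

```python
import re
from typing import List

_EXPORT_FALLBACK = re.compile(r"Defaulted to export symbols for (.+?) -\s*$", re.MULTILINE)
_NOT_FOUND = re.compile(r"^DBGHELP: (.+?) - file not found\s*$", re.MULTILINE)


def export_only_modules(output: str) -> List[str]:
    """Modules for which only export symbols could be loaded."""
    return _EXPORT_FALLBACK.findall(output)


def searched_paths(output: str) -> List[str]:
    """Locations that DBGHELP probed without finding a symbol file."""
    return _NOT_FOUND.findall(output)


if __name__ == "__main__":
    sample = (
        "DBGHELP: aaa\\wntdll.pdb - file not found\n"
        "DBGHELP: aaa\\dll\\wntdll.pdb - file not found\n"
        "*** ERROR: Symbol file could not be found.  "
        "Defaulted to export symbols for ntdll.dll -\n"
        "DBGHELP: ntdll - export symbols\n"
    )
    print("export symbols only:", export_only_modules(sample))
    print("probed locations:", searched_paths(sample))
```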

Sometimes the engine doesn’t know which symbol file to load because it doesn’t even have the image loaded (this can happen, for example, when we debug some minidumps). For this reason we might want to have image loading set up as well.

Executable images loading

When Windows runs your application it loads the executable binaries (.exe, .dll) into memory, at which point we call them executable images. As they are in memory, they can be part of a memory dump of an application. However, some smaller dumps may not contain them, and it can also happen that an image was unloaded, so there are situations where the debugger needs to load them from a separate location (this becomes extremely helpful when debugging managed code from a dump; I’ll write about this in a separate post).

The executable image path follows the same rules as the symbol search path, and in fact in many situations we might want to set both to the same locations, because (although many people don’t realize this) binaries can also be indexed on a symbol server.

.exepath C:\MyBuildLocation\Binaries;cache*C:\MySymbolsCache;\\CorporateSymbolsServer\Symbols;srv*http://msdl.microsoft.com/download/symbols

And just as with symbols, it’s more advantageous to have this path already set in the environment; we can use the _NT_EXECUTABLE_IMAGE_PATH variable for this purpose.

Source code loading

Why not make the debugging experience less frustrating by debugging with source code? The debugger engine can extract source paths from the symbol files. If we opened a symbol file (e.g. ‘foo.pdb’) in a simple text editor, we would see that it contains the locations of the source files as they were at build time. So if you debug on the same machine where you build your code, or if your debug machine has the same code files in the same locations as your build machine, the debugger will be able to ‘magically’ bring up the source file. However, it’s not guaranteed to be the correct version of that file (someone may have changed it since the debugged binary was built). Therefore the recommended way to get the best source-level debugging experience is to use a source control system and source index your symbols.

Source indexing means that the symbol file carries information about how to retrieve each file from the source control server and which version to retrieve (basically, it has a section that translates local paths into full source control commands). You then need to make sure that the debugger has access to the tool for retrieving sources from source control (e.g. p4.exe for Perforce; you can simply add its location to the path) and that the source path indicates that you want to use a ‘source server’.

To indicate that you want to use a source server, set the source path to the ‘srv*’ string (beware that the syntax and meaning are completely different from the symbols or executable image path! For the source path you use just ‘srv*’ with nothing else, and it means that the debugger will attempt to retrieve the source control commands from the symbol files). To use a local path, point the source path to a flat directory with the source files (no recursive search will be performed!). You can also combine both approaches. Use .srcpath to set your source path:
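To make the idea of ‘translating local paths to source control commands’ more concrete, here is a purely conceptual sketch in Python. It is not the real format stored in the symbol file: the SOURCE_INDEX table, the retrieval_command helper, the tool name scm.exe and the server name are all hypothetical, and the command shape only loosely mirrors a typical ‘print this revision to a local file’ call.

```python
# Hypothetical, simplified stand-in for the source indexing data in a symbol
# file: build-time path -> (path in source control, revision).
SOURCE_INDEX = {
    r"d:\build\ntdll\ldrstart.c": ("//depot/project/ntdll/ldrstart.c", 1),
}


def retrieval_command(build_path, cache_root, tool="scm.exe", server="scmserver:2020"):
    """Return the command that would fetch the indexed revision, or None.

    The tool and server names are placeholders, not real defaults.
    """
    entry = SOURCE_INDEX.get(build_path.lower())
    if entry is None:
        return None  # this file was not source indexed
    depot_path, revision = entry
    file_name = build_path.rsplit("\\", 1)[-1]
    # Each revision gets its own folder in the local cache.
    local_copy = f"{cache_root}\\{file_name}\\{revision}\\{file_name}"
    return [tool, "-p", server, "print", "-o", local_copy, "-q", f"{depot_path}#{revision}"]


print(retrieval_command(r"d:\build\ntdll\ldrstart.c", r"D:\Debuggers\src"))
```

The point of the sketch is only that the build-time path acts as a lookup key, and that an exact revision is part of the stored information, which is why a source indexed symbol file can bring up the correct version of a file.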

.srcpath srv*;<any additional flat source locations>

Or you can use the .srcfix command for this (it sets the source path to ‘srv*’ plus any additional paths that you pass; the .srcfix+ <path> form appends to the existing source path instead of replacing it).

However, the recommended way is again to use an environment variable: _NT_SOURCE_PATH

Source code loading during remote debugging

As mentioned earlier, during basic remote debugging all commands are executed on the remote system. This can be a problem when you are debugging a remote issue but have the source files, or the source control utility, only on your local machine. In this situation the .srcpath and .srcfix commands will have no effect (which can be confusing). You will need to set a local source path and/or use a local source server:

.lsrcfix
.lsrcpath+ <local source codes path>

This will resolve source files on your local machine (as opposed to the remote machine that actually runs the debugging engine), and the source code will finally ‘magically’ load (unless you have other issues, which I’ll try to cover in a second). Also keep in mind that these two commands cannot be scripted.

Troubleshooting source code loading

Sometimes you change frames on the stack and expect the appropriate source code to appear, but nothing happens. For that reason there is a way to troubleshoot such an issue (I’ve censored some irrelevant information):

0:000> .lines -e
Line number information will be loaded
0:000> kn
 # Child-SP          RetAddr           Call Site
00 00000070`af71f6f0 000007ff`ecc72cc9 ntdll!LdrpDoDebuggerBreak+0x30 [d:\XXXXXX\ntdll\ldrinit.c @ 2737]
01 00000070`af71f730 000007ff`ecc0216a ntdll!LdrpInitializeProcess+0x1927 [d:\XXXXXX\ntdll\ldrinit.c @ 5131]
02 00000070`af71fa30 000007ff`ecbf32ae ntdll!_LdrpInitialize+0xee9a [d:\XXXXXX\ldrinit.c @ 1334]
03 00000070`af71faa0 00000000`00000000 ntdll!LdrInitializeThunk+0xe [d:\XXXXXX\ntdll\ldrstart.c @ 90]

0:000> .srcnoisy 3
Noisy source output: on
Noisy source server output: on
Filter out everything but source server output: on
0:000> .frame 3
03 00000070`af71faa0 00000000`00000000 ntdll!LdrInitializeThunk+0xe [d:\XXXXXX\ntdll\ldrstart.c @ 90]
DBGENG:  Check plain file:
DBGENG:    'd:\XXXXXX\ntdll\ldrstart.c' - not found
0:000> .srcfix
DBGHELP: Symbol Search Path: srv*c:\mysymcache*http://XXXXXX
Source search path is: SRV*
DBGENG:  Scan srcsrv SRV* for:
DBGENG:    '!d:\XXXXXX\ntdll\ldrstart.c'
SRCSRV:  XX.exe -p XXXXXXXXXXXXX.ntdev.microsoft.com:2020 print -o "D:\Debuggers_64bit\src\
			XXXXXXXXXX\ntdll\ldrstart.c\1\ldrstart.c" -q //depot/XXXXXXXXXX/ntdll/ldrstart.c#1
DBGENG:      found file 'd:\XXXXXXXXXX\ntdll\ldrstart.c'
DBGENG:      server path 'SRV*'
DBGENG:      local 'D:\Debuggers_64bit\src\XXXXXXXXXXXX\ntdll\ldrstart.c\1\ldrstart.c'

So we see that we can use the .lines command to enable (or disable with -d) the display of the build-time paths baked into the symbol file. We can also see that .srcnoisy switches on more verbose logging of what happens during source loading. Finally, we see how the source file is loaded from the source control server, provided that the symbol file was properly source indexed. Here is what happens if the symbol file isn’t source indexed:

0:000> .frame 10
SRCSRV:  d:\YYYYYYYYYYY\identityauthority.cpp not indexed

 

Finishing the debugging

Once we are done with a debugging session, we have several options for ending it. If we are not debugging a live user mode process, we might not care how we exit; simply closing the debugger app or using the q command might be enough. However, when debugging a live user mode process we might want to preserve the process. To end debugging without killing the target application, use the .detach command or the qd (quit and detach) command. The best way might be to start the debugger with the -pd option, which keeps the target application running no matter how you quit your debugging session.
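For automated setups, the attach command line with -pd can be assembled by a script. This is just a small sketch in Python: the attach_command helper is hypothetical, cdb.exe is assumed to be reachable through the path, and the process ID 1234 is made up.

```python
def attach_command(pid, debugger="cdb.exe", detach_on_exit=True):
    """Build the argument list for attaching a debugger to a running process.

    attach_command is a hypothetical helper; it only assembles the arguments.
    """
    argv = [debugger]
    if detach_on_exit:
        # -pd: detach from the target instead of killing it when the session ends.
        argv.append("-pd")
    argv += ["-p", str(pid)]
    return argv


print(" ".join(attach_command(1234)))
# prints: cdb.exe -pd -p 1234
```

The returned list can be handed to subprocess.Popen on the debugging machine; keeping -pd as the default in such a helper means a forgotten qd or .detach does not take the target process down.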