• Ntdebugging Blog

    How the Clipboard Works, Part 1

    • 4 Comments

    Recently I had the opportunity to debug the clipboard in Windows, and I thought I’d share some of the things I learned.  The clipboard is one of those parts of Windows that many of us use dozens (hundreds?) of times a day and don’t really think about. Before working on this case, I had never even considered how it works under the hood.  It turns out that there’s more to it than you might think. In this first article, I’ll describe how applications store different types of data to the clipboard and how it is retrieved.  My next post will describe how applications can hook the clipboard to monitor it for changes.  In both, I’ll include debug notes to show you how to access the data from the debugger.

     

    Let’s start by discussing clipboard formats.  A clipboard format is used to describe what type of data is placed on the clipboard.  There are a number of predefined standard formats that an application can use, such as bitmap, ANSI text, Unicode text, and TIFF.  Windows also allows an application to specify its own formats. For example, a word processor may want to register a format that includes text, formatting, and images.  Of course, this leads to one problem, what happens if you want to copy from your word processor and paste into Notepad, which doesn’t understand all of the formatting and pictures?

     

    The answer is to allow multiple formats to be stored in the clipboard at one time.  When I used to think of the clipboard I thought of it as a single object (“my text” or “my image”), but in reality the clipboard usually has my data in several different forms.  The destination program gets a format it can use when I paste.

     

    So how does this data end up on the clipboard?  Simple, an application first takes ownership of the clipboard via the OpenClipboard function.   Once it has done that, it can empty the clipboard with EmptyClipboard.  Finally, it is ready to place data on the clipboard using SetClipboardData.  SetClipboardData takes two parameters; the first is the identifier of one of the clipboard formats we discussed above.  The second is a handle to the memory containing the data in that format. The application can continue to call SetClipboardData for each of the various formats it wishes to provide going from best to worst (since the destination application will select the first format in the list it recognizes).  To make things easier for the developer, Windows will automatically provide converted formats for some clipboard format types.  Once the program is done, it calls CloseClipboard.

     

    When a user hits paste, the destination application will call OpenClipboard and one of these functions to determine what data format(s) are available: IsClipboardFormatAvailable, GetPriorityClipboardFormat, or EnumClipboardFormats. Assuming the application finds a format it can use, it will then call GetClipboardData with the desired format identifier as a parameter to get a handle to the data.  Finally, it will call CloseClipboard.

     

    Now let’s take a look at how you can find what data being written to the clipboard using the debugger. (Note that all of my notes are taken from a Win7/2008 R2 system – so things might vary slightly in different versions of the OS.)   Since the clipboard is part of Win32k.sys, you’ll need to use a kernel debugger.  I like to use win32k!InternalSetClipboardData+0xe4 as a breakpoint.  The nice thing about this offset is that it is right after we’ve populated the RDI register with data from SetClipboardData in a structure known as tagCLIP.

     

    kd> u win32k!InternalSetClipboardData+0xe4-c L5

    win32k!InternalSetClipboardData+0xd8:

    fffff960`0011e278 894360          mov     dword ptr [rbx+60h],eax

    fffff960`0011e27b 8937            mov     dword ptr [rdi],esi

    fffff960`0011e27d 4c896708        mov     qword ptr [rdi+8],r12

    fffff960`0011e281 896f10          mov     dword ptr [rdi+10h],ebp

    fffff960`0011e284 ff15667e1900    call    qword ptr[win32k!_imp_PsGetCurrentProcessWin32Process (fffff960`002b60f0)]

     

    kd> dt win32k!tagCLIP

       +0x000 fmt              : Uint4B

       +0x008 hData            : Ptr64 Void

       +0x010fGlobalHandle     : Int4B

     

    Here’s what a call to SetClipboardData from Notepad looks like:

    kd> k

    Child-SP          RetAddr           Call Site

    fffff880`0513a940 fffff960`0011e14f win32k!InternalSetClipboardData+0xe4

    fffff880`0513ab90 fffff960`000e9312 win32k!SetClipboardData+0x57

    fffff880`0513abd0 fffff800`01482ed3 win32k!NtUserSetClipboardData+0x9e

    fffff880`0513ac20 00000000`7792e30ant!KiSystemServiceCopyEnd+0x13

    00000000`001dfad8 00000000`7792e494 USER32!ZwUserSetClipboardData+0xa

    00000000`001dfae0 000007fe`fc5b892b USER32!SetClipboardData+0xdf

    00000000`001dfb20 000007fe`fc5ba625 COMCTL32!Edit_Copy+0xdf

    00000000`001dfb60 00000000`77929bd1 COMCTL32!Edit_WndProc+0xec9

    00000000`001dfc00 00000000`779298da USER32!UserCallWinProcCheckWow+0x1ad

    00000000`001dfcc0 00000000`ff5110bc USER32!DispatchMessageWorker+0x3b5

    00000000`001dfd40 00000000`ff51133c notepad!WinMain+0x16f

    00000000`001dfdc0 00000000`77a2652d notepad!DisplayNonGenuineDlgWorker+0x2da

    00000000`001dfe80 00000000`77b5c521 kernel32!BaseThreadInitThunk+0xd

    00000000`001dfeb0 00000000`00000000ntdll!RtlUserThreadStart+0x1d

     

    So here, we can dt RDI as a tagCLIP to see what was written:

    kd> dt win32k!tagCLIP @rdi

       +0x000 fmt              : 0xd

       +0x008 hData            : 0x00000000`00270235 Void

       +0x010fGlobalHandle     : 0n1

     

    Fmt is the clipboard format. 0xd is 13, which indicates this data is Unicode text.  We can’t just ‘du’ the value in hData, however, because this is a handle, not a direct pointer to the data.  So now we need to look up the handle.  To do that, we need to look at a win32k global structure – gSharedInfo:

    kd> ?win32k!gSharedInfo

    Evaluate expression: -7284261440224 = fffff960`002f3520

    kd> dt win32k!tagSHAREDINFO fffff960`002f3520

       +0x000 psi              : 0xfffff900`c0980a70 tagSERVERINFO

       +0x008 aheList          : 0xfffff900`c0800000 _HANDLEENTRY

       +0x010 HeEntrySize      : 0x18

       +0x018 pDispInfo        : 0xfffff900`c0981e50 tagDISPLAYINFO

       +0x020ulSharedDelta     : 0

       +0x028 awmControl       : [31] _WNDMSG

       +0x218DefWindowMsgs     : _WNDMSG

       +0x228DefWindowSpecMsgs : _WNDMSG

     

    aheList in gSharedInfo contains an array of handle entries, and the last 2 bytes of hData multiplied by the size of a handle entry the address of our handle entry:

    kd> ?0x00000000`00270235 & FFFF

    Evaluate expression: 565 = 00000000`00000235

    kd> ??sizeof(win32k!_HANDLEENTRY)

    unsigned int64 0x18

    kd> ? 0xfffff900`c0800000 + (0x235*0x18)

    Evaluate expression: -7693351766792 = fffff900`c08034f8

     

    kd> dt win32k!_HANDLEENTRY fffff900`c08034f8

       +0x000 phead            : 0xfffff900`c0de0fb0 _HEAD

       +0x008 pOwner           : (null)

       +0x010 bType            : 0x6 ''

       +0x011 bFlags           : 0 ''

       +0x012 wUniq            : 0x27

     

    If we look in phead at offset 14, we’ll get our data (this offset may vary on different platforms):

    kd> du fffff900`c0de0fb0 + 0x14

    fffff900`c0de0fc4 "Hi NTDebugging readers!"

     

    Let’s consider one other scenario.  I copied some text out of Wordpad, and a number of SetClipboardData calls were made to accommodate different formats. The Unicode format entry looks like this:

    Breakpoint 0 hit

    win32k!InternalSetClipboardData+0xe4:

    fffff960`0011e284 ff15667e1900   call    qword ptr[win32k!_imp_PsGetCurrentProcessWin32Process (fffff960`002b60f0)]

    kd> dt win32k!tagCLIP @rdi

       +0x000 fmt              : 0xd

       +0x008 hData            : (null)

       +0x010fGlobalHandle    : 0n0

     

    hData is null!  Why is that?  It turns out that the clipboard allows an application to pass in null to SetClipboardData for a given format.  This indicates that the application can provide the data in that format, but is deferring doing so until it is actually needed.  Sure enough, if I paste the text into Notepad, which needs the text in Unicode, Windows sends a WM_RENDERFORMAT message to the WordPad window, and WordPad provides the data in the new format.  Of course, if the application exits before populating all of its formats, Windows needs all of the formats rendered.  In this case, Windows will send the WM_RENDERALLFORMATS message so other applications can use the clipboard data after the source application has exited.

     

    That’s all for now.  Next time we’ll look at how applications can monitor the clipboard for changes using two hooks.  If you want to know more about using the clipboard in your code, this is a great reference.

     

    -Matt Burrough

     

    Part 2 of this article can be found here:

    http://blogs.msdn.com/b/ntdebugging/archive/2012/03/29/how-the-clipboard-works-part-2.aspx

  • Ntdebugging Blog

    Debugging Backwards: Proving root cause

    • 0 Comments

    Matt Burrough here again.  On rare occasions when debugging, we'll actually know (or strongly suspect) what the root cause of a problem is at the beginning of our analysis - but we still need to investigate to confirm our assertion.  The following is a case study for an issue I worked on recently where the print spooler was hanging.

     

    This customer had recently upgraded their print servers from Windows 2003 to Windows 2008 R2.  After the upgrade, the spooler would frequently go unresponsive, and no jobs could be printed.  Rebooting the server provided some relief, but the problem would reoccur as the jobs started coming in again.

     

    One of my peers had completed an initial analysis of a user mode memory dump of the spooler process, and found that spooler seemed to be blocked waiting on PrintIsolationHost.exe.  For those not familiar with recent changes to the print spooler's design - Print Isolation Host was added in Windows 7/2008 R2 as a way to isolate print drivers from each other and from the spooler process so a crash in one driver doesn't take down the entire printing environment.  Given the large number of print drivers found on some print servers, this can be a great help for stability and availability of the spooler.  See http://blogs.technet.com/b/askperf/archive/2009/10/08/windows-7-windows-server-2008-r2-print-driver-isolation.aspx if you would like more details on Print Isolation Host.

     

    Unfortunately for my team mate, the data collected did not include dumps of Print Isolation Host, so he requested that the next time the problem happened that both spooler and PrintIsolationHost dumps would be gathered.  The customer had configured his server for the "Isolated" Print Isolation Host option during troubleshooting, which places each driver in its own process.  (The default is shared which places all drivers in one PrintIsolationHost.exe instance.  Driver isolation is configured using the Print Management Console.)

     

    The next morning, the newly requested data came in, and since the problem was urgent, I began looking at the new dumps immediately.  This dataset included both a spoolsv.exe dump, as well as nearly two dozen PrintIsolationHost dumps!  I knew from the past analysis that I should start with the PrintIsolationHost data - but where to begin?  In order to triage the dumps, I wrote a small batch file to open each dump (luckily, they were sequentially numbered), write the call stacks in each thread to a file, and close the log.  On every iteration, the script created a new cmd.txt file, which contained a set of commands that were passed to the debugger.  This allowed me to name the debugger output files sequentially with names that matched their dump (e.g. PrintIsolationHost1.txt contained output from PrintIsolationHost1.dmp).

     

    set x=1

    :Top

    echo .reload > c:\data\cmd.txt

    echo .logopen c:\data\PrintIsolationHost%x%.txt  >> c:\data\cmd.txt

    echo ~*kc >> c:\data\cmd.txt

    echo .logclose >> c:\data\cmd.txt

    echo qq >> c:\data\cmd.txt

    c:\debuggers\kd.exe -cf c:\data\cmd.txt -z "c:\data\PrintIsolationHost%x%.DMP"

    set /A x+=1

    IF %x% LEQ 23 GOTO Top

     

    Now that I had a directory full of text files, I used my favorite differencing tool to compare the stacks in each text file.  I used the first PrintIsolationHost file as a reference.  It only had four stacks, and these were common to all of the other files:

     

    .  0  Id: 40c.1c28 Suspend: 0 Teb: 000007ff`fffde000 Unfrozen

    Call Site

    user32!ZwUserGetMessage

    user32!GetMessageW

    PrintIsolationHost!ATL::CAtlExeModuleT<Host>::Run

    PrintIsolationHost!Host::RunUsingConfiguration

    PrintIsolationHost!wWinMain

    PrintIsolationHost!__wmainCRTStartup

    kernel32!BaseThreadInitThunk

    ntdll!RtlUserThreadStart

     

       1  Id: 40c.23c0 Suspend: 0 Teb: 000007ff`fffdb000 Unfrozen

    Call Site

    ntdll!ZwWaitForMultipleObjects

    ntdll!TppWaiterpThread

    kernel32!BaseThreadInitThunk

    ntdll!RtlUserThreadStart

     

       2  Id: 40c.2598 Suspend: 0 Teb: 000007ff`fffd3000 Unfrozen

    Call Site

    ntdll!ZwWaitForSingleObject

    KERNELBASE!WaitForSingleObjectEx

    PrintIsolationHost!Host::MonitorShutdown

    PrintIsolationHost!Host::MonitorProc

    kernel32!BaseThreadInitThunk

    ntdll!RtlUserThreadStart

     

       3  Id: 40c.820 Suspend: 0 Teb: 000007ff`fffd9000 Unfrozen

    Call Site

    ntdll!ZwWaitForWorkViaWorkerFactory

    ntdll!TppWorkerThread

    kernel32!BaseThreadInitThunk

    ntdll!RtlUserThreadStart

     

    I was able to rule out a number of other PrintIsolationHost instances that were either identical to this one (aside from the process/thread IDs and Tebs), or which just had one or two additional idle worker threads (like thread 3 above).

     

    Things got interesting when I looked at two of the PrintIsolationHost dumps.  Both had these two stacks not found in any other the other dumps (note that I do not have symbols for the 3rd-party print processor ProseWarePrintProcessor):

     

       2  Id: 20a4.1328 Suspend: 0 Teb: 000007ff`fffda000 Unfrozen

    Call Site

    ntdll!ZwWaitForSingleObject

    ntdll!RtlpWaitOnCriticalSection

    ntdll!RtlEnterCriticalSection

    ntdll!LdrLockLoaderLock

    *** ERROR: Symbol file could not be found.  Defaulted to export symbols for ProseWarePrintProcessor.dll -

    KERNELBASE!GetModuleFileNameW

    ProseWarePrintProcessor

    ProseWarePrintProcessor

    ProseWarePrintProcessor!ControlPrintProcessor

    PrintIsolationProxy!sandbox::PrintProcessor::ControlPrintProcessor

    rpcrt4!Invoke

    rpcrt4!Ndr64StubWorker

    rpcrt4!NdrStubCall3

    ole32!CStdStubBuffer_Invoke

    ole32!SyncStubInvoke

    ole32!StubInvoke

    ole32!CCtxComChnl::ContextInvoke

    ole32!AppInvoke

    ole32!ComInvokeWithLockAndIPID

    ole32!ThreadInvoke

    rpcrt4!DispatchToStubInCNoAvrf

    rpcrt4!RPC_INTERFACE::DispatchToStubWorker

    rpcrt4!RPC_INTERFACE::DispatchToStub

    rpcrt4!RPC_INTERFACE::DispatchToStubWithObject

    rpcrt4!LRPC_SCALL::DispatchRequest

    rpcrt4!LRPC_SCALL::HandleRequest

    rpcrt4!LRPC_ADDRESS::ProcessIO

    rpcrt4!LrpcIoComplete

    ntdll!TppAlpcpExecuteCallback

    ntdll!TppWorkerThread

    kernel32!BaseThreadInitThunk

    ntdll!RtlUserThreadStart

     

       6  Id: 20a4.1668 Suspend: 0 Teb: 000007ff`fffac000 Unfrozen

    Call Site

    ntdll!ZwWaitForSingleObject

    ntdll!RtlpWaitOnCriticalSection

    ntdll!RtlEnterCriticalSection

    ProseWarePrintProcessor

    ProseWarePrintProcessor

    ProseWarePrintProcessor

    ntdll!LdrpInitializeThread

    ntdll!_LdrpInitialize

    ntdll!LdrInitializeThunk

     

    Interesting.  Thread 6 is running the DllMain code for the ProseWarePrintProcessor DLL, which holds the loader lock. It is waiting on a critical section.  Meanwhile, thread 2 is waiting on the loader lock.  So who holds the critical section that thread 6 wants?  First, let's find what lock thread 6 wants:

     

    0:006> kn

     # Child-SP          RetAddr           Call Site

    00 00000000`0104eb18 00000000`777fe4e8 ntdll!ZwWaitForSingleObject+0xa

    01 00000000`0104eb20 00000000`777fe3db ntdll!RtlpWaitOnCriticalSection+0xe8

    02 00000000`0104ebd0 00000000`750c5d6b ntdll!RtlEnterCriticalSection+0xd1

    03 00000000`0104ec00 00000000`750c6256 ProseWarePrintProcessor+0xabf

    04 00000000`0104ec30 00000000`750c7015 ProseWarePrintProcessor+0xfaa

    05 00000000`0104f090 00000000`777dc76c ProseWarePrintProcessor+0x1d69

    06 00000000`0104f1f0 00000000`777dc42f ntdll!LdrpInitializeThread+0x17c

    07 00000000`0104f2f0 00000000`777dc32e ntdll!LdrpInitialize+0x9f

    08 00000000`0104f360 00000000`00000000 ntdll!LdrInitializeThunk+0xe

    0:006> .frame /c /r 2

    02 00000000`0104ebd0 00000000`750c5d6b ntdll!RtlEnterCriticalSection+0xd1

    rax=0000000300d1001a rbx=00000000750ca330 rcx=00000000001d0000

    rdx=0000000000000040 rsi=0000000000000001 rdi=0000000000000004

    rip=00000000777fe3db rsp=000000000104ebd0 rbp=00000000ff000000

     r8=00000000002c6a00  r9=00000000002c6a10 r10=0000000000000073

    r11=0000000000000001 r12=000007fffffd6000 r13=00000000778e2660

    r14=0000000000000001 r15=000000007780a280

    iopl=0         nv up ei pl zr na po nc

    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    ntdll!RtlEnterCriticalSection+0xd1:

    00000000`777fe3db 83f801          cmp     eax,1

    0:006> ub 00000000`750c5d6b      <<<< Let's see what gets passed to RtlEnterCriticalSection

    ProseWarePrintProcessor+0xaa2:

    00000000`750c5d4e f9              stc

    00000000`750c5d4f ff743f33        push    qword ptr [rdi+rdi+33h]

    00000000`750c5d53 d2ff            sar     bh,cl

    00000000`750c5d55 15aeb4ffff      adc     eax,0FFFFB4AEh

    00000000`750c5d5a 85c0            test    eax,eax

    00000000`750c5d5c 7533            jne     ProseWarePrintProcessor+0xae5 (00000000`750c5d91)

    00000000`750c5d5e 488d0dcb450000  lea     rcx,[ProseWarePrintProcessor+0x5084 (00000000`750ca330)] <<<< Critical section is in rcx

    00000000`750c5d65 ff15bdb3ffff    call    qword ptr [ProseWarePrintProcessor+0x1128 (00000000`750c1128)]

    0:006> u ntdll!RtlEnterCriticalSection

    ntdll!RtlEnterCriticalSection:

    00000000`77802fc0 fff3            push    rbx

    00000000`77802fc2 4883ec20        sub     rsp,20h

    00000000`77802fc6 f00fba710800    lock btr dword ptr [rcx+8],0

    00000000`77802fcc 488bd9          mov     rbx,rcx   <<<< Critical section is saved in RBX.  RBX isn't modified between here and our current position

    00000000`77802fcf 0f83e9b1ffff    jae     ntdll!RtlEnterCriticalSection+0x31 (00000000`777fe1be)

    00000000`77802fd5 65488b042530000000 mov   rax,qword ptr gs:[30h]

    00000000`77802fde 488b4848        mov     rcx,qword ptr [rax+48h]

    00000000`77802fe2 c7430c01000000  mov     dword ptr [rbx+0Ch],1

    0:006> r rbx

    Last set context:

    rbx=00000000750ca330  <<<< This is the address of the critical section thread 6 is waiting for

     

    Now let's look at the threads in this process, and the held locks:

     

    0:002> ~

    #  0  Id: 20a4.180c Suspend: 0 Teb: 000007ff`fffde000 Unfrozen

       1  Id: 20a4.2294 Suspend: 0 Teb: 000007ff`fffdc000 Unfrozen

    .  2  Id: 20a4.1328 Suspend: 0 Teb: 000007ff`fffda000 Unfrozen

       3  Id: 20a4.1c34 Suspend: 0 Teb: 000007ff`fffd8000 Unfrozen

       4  Id: 20a4.5bc Suspend: 0 Teb: 000007ff`fffd4000 Unfrozen

       5  Id: 20a4.3c4 Suspend: 0 Teb: 000007ff`fffae000 Unfrozen

       6  Id: 20a4.1668 Suspend: 0 Teb: 000007ff`fffac000 Unfrozen

    0:002> !locks

     

    CritSec ntdll!LdrpLoaderLock+0 at 00000000778e7490

    WaiterWoken        No

    LockCount          2

    RecursionCount     1

    OwningThread       1668

    EntryCount         0

    ContentionCount    2

    *** Locked

     

    CritSec ProseWarePrintProcessor+5084 at 00000000750ca330

    WaiterWoken        No

    LockCount          1

    RecursionCount     2

    OwningThread       1328

    EntryCount         0

    ContentionCount    1

    *** Locked

     

    Scanned 201 critical sections

     

    That's not good!  We can see that thread 6 indeed owns the loader lock, which thread 2 is waiting for.  But thread 2 owns the ProseWarePrintProcessor lock - and thread 6 is waiting for it!  This is a classic deadlock.  In fact, Raymond Chen even described this on his blog.  More information about LdrInitialize can be found here.

    image002

     

    So we know that there is a deadlock in the Print Isolation Host, but why exactly does this cause spooler to hang?  Here's where we work backwards.  We know that thread 6 is handling DLL initialization, but what is thread 2 doing?  From the stack we can see it is handling an RPC request that called into ProseWarePrintProcessor.  Let's determine who called into this thread.

     

    0:002> kn

     # Child-SP          RetAddr           Call Site

    00 00000000`0085dd88 00000000`777fe4e8 ntdll!ZwWaitForSingleObject+0xa

    01 00000000`0085dd90 00000000`777fe3db ntdll!RtlpWaitOnCriticalSection+0xe8

    02 00000000`0085de40 00000000`777db9e7 ntdll!RtlEnterCriticalSection+0xd1

    03 00000000`0085de70 000007fe`fd963706 ntdll!LdrLockLoaderLock+0x6d

    04 00000000`0085deb0 00000000`750c58f3 KERNELBASE!GetModuleFileNameW+0x96

    05 00000000`0085df10 00000000`750c5d77 ProseWarePrintProcessor!InstallPrintProcessor+0x647

    06 00000000`0085e380 00000000`750c51d9 ProseWarePrintProcessor!InstallPrintProcessor+0xacb

    07 00000000`0085e3b0 000007fe`f96766b4 ProseWarePrintProcessor!ControlPrintProcessor+0x25

    08 00000000`0085e3e0 000007fe`fe2f23d5 PrintIsolationProxy!sandbox::PrintProcessor::ControlPrintProcessor+0x34

    09 00000000`0085e420 000007fe`fe39b68e rpcrt4!Invoke+0x65

    0a 00000000`0085e470 000007fe`fe2f48d6 rpcrt4!Ndr64StubWorker+0x61b

    0b 00000000`0085ea30 000007fe`fdd50883 rpcrt4!NdrStubCall3+0xb5

    0c 00000000`0085ea90 000007fe`fdd50ccd ole32!CStdStubBuffer_Invoke+0x5b

    0d 00000000`0085eac0 000007fe`fdd50c43 ole32!SyncStubInvoke+0x5d

    0e 00000000`0085eb30 000007fe`fdc0a4f0 ole32!StubInvoke+0xdb

    0f 00000000`0085ebe0 000007fe`fdd514d6 ole32!CCtxComChnl::ContextInvoke+0x190

    10 00000000`0085ed70 000007fe`fdd5122b ole32!AppInvoke+0xc2

    11 00000000`0085ede0 000007fe`fdd4fd6d ole32!ComInvokeWithLockAndIPID+0x52b

    12 00000000`0085ef70 000007fe`fe2e50f4 ole32!ThreadInvoke+0x30d

    13 00000000`0085f010 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14

    14 00000000`0085f040 000007fe`fe2e775b rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146

    15 00000000`0085f160 000007fe`fe2e769b rpcrt4!RPC_INTERFACE::DispatchToStub+0x9b

    16 00000000`0085f1a0 000007fe`fe2e7632 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0x5b

    17 00000000`0085f220 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x422

    18 00000000`0085f300 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d

    19 00000000`0085f430 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    1a 00000000`0085f570 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5

    1b 00000000`0085f600 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b

    1c 00000000`0085f690 00000000`776a652d ntdll!TppWorkerThread+0x3f8

    1d 00000000`0085f990 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd

    1e 00000000`0085f9c0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

     

    I know that the code in frame 19 deals with processing the RPC and has a record of the calling process' PID and TID.  In fact, from a bit of code review, I know that at this portion of the code, we get back a value that contains a ntdll!_CLIENT_ID structure at offset 8:

     

    rpcrt4!LRPC_ADDRESS::ProcessIO+0xe6:

    000007fe`fe302ba6 ff151c050a00    call    qword ptr [rpcrt4!_imp_AlpcGetMessageFromCompletionList (000007fe`fe3a30c8)]

    000007fe`fe302bac 4885c0          test    rax,rax

    000007fe`fe302baf 0f84d1020000    je      rpcrt4!LRPC_ADDRESS::ProcessIO+0x3c6 (000007fe`fe302e86)

    000007fe`fe302bb5 4c8bac2488000000 mov     r13,qword ptr [rsp+88h]

    000007fe`fe302bbd 41bc01000000    mov     r12d,1

    000007fe`fe302bc3 33d2            xor     edx,edx

    000007fe`fe302bc5 488bf8          mov     rdi,rax

    000007fe`fe302bc8 41bfff000000    mov     r15d,0FFh

     

    Reviewing the assembly, from ProcessIO+0xe6 to where we are now (ProcessIO+0x3bf), we don't modify rdi again - and rdi is nonvolatile - so if we switch to that frame and check out rdi+8, we'll know who called this thread!

     

    0:002> .frame /c /r 19

    19 00000000`0085f430 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    rax=0000e5a47e7646ad rbx=0000000000000001 rcx=00000000750ca330

    rdx=0000000000000000 rsi=00000000002b8aa0 rdi=00000000002c4a80

    rip=000007fefe302e7f rsp=000000000085f430 rbp=0000000000ecff90

     r8=000000000085e2d8  r9=0000000000000002 r10=0000000000000000

    r11=0000000000000246 r12=0000000000ec8bf0 r13=0000000000000000

    r14=0000000000000000 r15=0000000000ec8080

    iopl=0         nv up ei pl zr na po nc

    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf:

    000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]

    0:002> dt _CLIENT_ID rdi+8

    ntdll!_CLIENT_ID

       +0x000 UniqueProcess    : 0x00000000`00002694 Void

       +0x008 UniqueThread     : 0x00000000`000005e8 Void

    0:002> ? 0x00000000`00002694

    Evaluate expression: 9876 = 00000000`00002694

    0:002> ? 0x00000000`000005e8

    Evaluate expression: 1512 = 00000000`000005e8

     

    So now we know that the caller was process 0x2694 and thread 0x5e8, or 9876 and 1512 in decimal, respectively.  Our current process (PrintIsolationHost.exe) is PID 20a4 (see above ~ output), or 8356 decimal.  So who is 9876?  I happen to have a process listing from the data collection:

     

    [0]  0 64 9876 spoolsv.exe     Svcs:  Spooler

         Command Line: C:\Windows\System32\spoolsv.exe

    [0]  0 64 8356 PrintIsolationHost.exe

         Command Line: C:\Windows\system32\PrintIsolationHost.exe -Embedding

     

    Okay, so I know the caller is thread 5e8 in spooler.  Loading up the spooler dump, what is that thread doing?

     

    0:000> ~~[5e8]s

    ntdll!NtAlpcSendWaitReceivePort+0xa:

    00000000`77801b6a c3              ret

    # Child-SP          RetAddr           Call Site

    00 00000000`0649d898 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa

    01 00000000`0649d8a0 000007fe`fe2f4e42 rpcrt4!LRPC_CCALL::SendReceive+0x156

    02 00000000`0649d960 000007fe`fdd528c0 rpcrt4!I_RpcSendReceive+0x42

    03 00000000`0649d990 000007fe`fdd5282f ole32!ThreadSendReceive+0x40

    04 00000000`0649d9e0 000007fe`fdd5265b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xa3

    05 00000000`0649da80 000007fe`fdc0daaa ole32!CRpcChannelBuffer::SendReceive2+0x11b

    06 00000000`0649dc40 000007fe`fdc0da0c ole32!CAptRpcChnl::SendReceive+0x52

    07 00000000`0649dd10 000007fe`fdd5205d ole32!CCtxComChnl::SendReceive+0x68

    08 00000000`0649ddc0 000007fe`fe39b949 ole32!NdrExtpProxySendReceive+0x45

    09 00000000`0649ddf0 000007fe`fdd521d0 rpcrt4!NdrpClientCall3+0x2e2

    0a 00000000`0649e0b0 000007fe`fdc0d8a2 ole32!ObjectStublessClient+0x11d

    0b 00000000`0649e440 000007fe`fb059070 ole32!ObjectStubless+0x42

    0c 00000000`0649e490 000007fe`fb057967 localspl!sandbox::PrintProcessorExecuteObserver::ControlPrintProcessor+0x10

    0d 00000000`0649e4c0 000007fe`fb055e27 localspl!sandbox::PrintProcessorAdapterImpl::ControlPrintProcessor+0x27

    0e 00000000`0649e4f0 000007fe`faff7392 localspl!sandbox::PrintProcessorAdapter::ControlPrintProcessor+0x1b

    0f 00000000`0649e520 000007fe`faff8a0a localspl!DeleteJob+0x1ce

    10 00000000`0649eae0 00000000`ff41fe25 localspl!SplSetJob+0x49e

    11 00000000`0649eb80 000007fe`f9683603 spoolsv!SetJobW+0x25

    12 00000000`0649ebc0 00000000`61001ce1 spoolss!SetJobW+0x1f

    13 00000000`0649ec00 00000000`61001d7d Contoso!InitializePrintMonitor+0x781

    14 00000000`0649ec40 000007fe`faffa674 Contoso!InitializePrintMonitor+0x81d

    15 00000000`0649ec70 00000000`ff41c9c7 localspl!SplEndDocPrinter+0x214

    16 00000000`0649ecd0 00000000`ff403ba6 spoolsv!EndDocPrinter+0x1f

    17 00000000`0649ed00 00000000`ff3fe772 spoolsv!YEndDocPrinter+0x22

    18 00000000`0649ed30 000007fe`fe2f23d5 spoolsv!RpcEndDocPrinter+0x3e

    19 00000000`0649ed60 000007fe`fe39b68e rpcrt4!Invoke+0x65

    1a 00000000`0649edb0 000007fe`fe2dac40 rpcrt4!Ndr64StubWorker+0x61b

    1b 00000000`0649f370 000007fe`fe2e50f4 rpcrt4!NdrServerCallAll+0x40

    1c 00000000`0649f3c0 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14

    1d 00000000`0649f3f0 000007fe`fe2e5679 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146

    1e 00000000`0649f510 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x149

    1f 00000000`0649f5f0 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d

    20 00000000`0649f720 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    21 00000000`0649f860 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5

    22 00000000`0649f8f0 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b

    23 00000000`0649f980 00000000`776a652d ntdll!TppWorkerThread+0x3f8

    24 00000000`0649fc80 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd

    25 00000000`0649fcb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

     

    It's calling into the print isolation host as we expect.  It looks like it is doing that to end a print job, based on an RPC call it received.  Using our same method as last time, let's pull out the PID and TID he is responding to:

     

    0:048> .frame /c /r 20

    20 00000000`0649f720 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    rax=000000000622986c rbx=0000000000000000 rcx=000000000622985c

    rdx=000007fefe3a3c40 rsi=00000000027e5150 rdi=0000000008e6abd0

    rip=000007fefe302e7f rsp=000000000649f720 rbp=0000000008e43480

     r8=0000000000000010  r9=0000000000000000 r10=000007fefe2c0000

    r11=0000000000000002 r12=000000000512d8c0 r13=0000000000000000

    r14=0000000000000000 r15=00000000102e3710

    iopl=0         nv up ei pl zr na po nc

    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf:

    000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]

    0:048> dt _CLIENT_ID rdi+8

    ntdll!_CLIENT_ID

       +0x000 UniqueProcess    : 0x00000000`000020a4 Void

       +0x008 UniqueThread     : 0x00000000`00001c34 Void

     

    Look at that, it's doing work for a different thread back in our Print Isolation Host.  Switching back to that dump, what is thread 1c34 doing?

     

    0:002> ~~[1c34]s

    ntdll!NtAlpcSendWaitReceivePort+0xa:

    00000000`77801b6a c3 

    0:003> kn

     # Child-SP          RetAddr           Call Site

    00 00000000`00ebc098 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa

    01 00000000`00ebc0a0 000007fe`fe39cc74 rpcrt4!LRPC_CCALL::SendReceive+0x156

    02 00000000`00ebc160 000007fe`fe39cf25 rpcrt4!NdrpClientCall3+0x244

    03 00000000`00ebc420 000007fe`f9bfd878 rpcrt4!NdrClientCall3+0xf2

    04 00000000`00ebc7b0 000007fe`f96845bf winspool!EndDocPrinter+0x15c

    05 00000000`00ebc7f0 00000000`750c4102 spoolss!EndDocPrinter+0x2f

    06 00000000`00ebc820 00000000`750c5013 ProseWarePrintProcessor+0x4102

    07 00000000`00ebe620 000007fe`f9676be2 ProseWarePrintProcessor!PrintDocumentOnPrintProcessor+0xb3

    08 00000000`00ebe650 000007fe`fe2f23d5 PrintIsolationProxy!sandbox::PrintProcessor::PrintDocThroughPrintProcessor+0x82

    09 00000000`00ebe6b0 000007fe`fe39b68e rpcrt4!Invoke+0x65

    0a 00000000`00ebe710 000007fe`fe2f48d6 rpcrt4!Ndr64StubWorker+0x61b

    0b 00000000`00ebecd0 000007fe`fdd50883 rpcrt4!NdrStubCall3+0xb5

    0c 00000000`00ebed30 000007fe`fdd50ccd ole32!CStdStubBuffer_Invoke+0x5b

    0d 00000000`00ebed60 000007fe`fdd50c43 ole32!SyncStubInvoke+0x5d

    0e 00000000`00ebedd0 000007fe`fdc0a4f0 ole32!StubInvoke+0xdb

    0f 00000000`00ebee80 000007fe`fdd514d6 ole32!CCtxComChnl::ContextInvoke+0x190

    10 00000000`00ebf010 000007fe`fdd5122b ole32!AppInvoke+0xc2

    11 00000000`00ebf080 000007fe`fdd4fd6d ole32!ComInvokeWithLockAndIPID+0x52b

    12 00000000`00ebf210 000007fe`fe2e50f4 ole32!ThreadInvoke+0x30d

    13 00000000`00ebf2b0 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14

    14 00000000`00ebf2e0 000007fe`fe2e775b rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146

    15 00000000`00ebf400 000007fe`fe2e769b rpcrt4!RPC_INTERFACE::DispatchToStub+0x9b

    16 00000000`00ebf440 000007fe`fe2e7632 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0x5b

    17 00000000`00ebf4c0 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x422

    18 00000000`00ebf5a0 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d

    19 00000000`00ebf6d0 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    1a 00000000`00ebf810 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5

    1b 00000000`00ebf8a0 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b

    1c 00000000`00ebf930 00000000`776a652d ntdll!TppWorkerThread+0x3f8

    1d 00000000`00ebfc30 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd

    1e 00000000`00ebfc60 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

     

    This thread called into spooler to end a document based on yet another RPC!  To be clear, this is what we're looking at so far:

    image004

     

    Again, who called into this thread?

     

    0:003> .frame /c /r 19

    19 00000000`00ebf6d0 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf

    rax=57633d3cd2c9c145 rbx=0000000000000000 rcx=0000000000861ff0

    rdx=ffffffffff88ffe0 rsi=00000000002b8aa0 rdi=0000000000ec6bc0

    rip=000007fefe302e7f rsp=0000000000ebf6d0 rbp=00000000002cdef0

     r8=000000000003dbf0  r9=00000000000000fe r10=6dc0575d3d3cc4c8

    r11=0000000000860020 r12=0000000000ec82b0 r13=0000000000000000

    r14=0000000000000000 r15=0000000000ec8080

    iopl=0         nv up ei pl zr na po nc

    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf:

    000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]

    0:003> dt _CLIENT_ID rdi+8

    ntdll!_CLIENT_ID

       +0x000 UniqueProcess    : 0x00000000`00002694 Void

       +0x008 UniqueThread     : 0x00000000`000014cc Void

     

    So another call from spooler (note the same PID as earlier) - let's go back to spoolsv.dmp.  Here is this thread:

    0:048> ~~[14cc]s

    ntdll!NtAlpcSendWaitReceivePort+0xa:

    00000000`77801b6a c3              ret

    0:049> k

    Child-SP          RetAddr           Call Site

    00000000`0351e538 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa

    00000000`0351e540 000007fe`fe2f4e42 rpcrt4!LRPC_CCALL::SendReceive+0x156

    00000000`0351e600 000007fe`fdd528c0 rpcrt4!I_RpcSendReceive+0x42

    00000000`0351e630 000007fe`fdd5282f ole32!ThreadSendReceive+0x40

    00000000`0351e680 000007fe`fdd5265b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xa3

    00000000`0351e720 000007fe`fdc0daaa ole32!CRpcChannelBuffer::SendReceive2+0x11b

    00000000`0351e8e0 000007fe`fdc0da0c ole32!CAptRpcChnl::SendReceive+0x52

    00000000`0351e9b0 000007fe`fdd5205d ole32!CCtxComChnl::SendReceive+0x68

    00000000`0351ea60 000007fe`fe39b949 ole32!NdrExtpProxySendReceive+0x45

    00000000`0351ea90 000007fe`fdd521d0 rpcrt4!NdrpClientCall3+0x2e2

    00000000`0351ed50 000007fe`fdc0d8a2 ole32!ObjectStublessClient+0x11d

    00000000`0351f0e0 000007fe`fb0591ac ole32!ObjectStubless+0x42

    00000000`0351f130 000007fe`fb057882 localspl!sandbox::PrintProcessorExecuteObserver::PrintDocThroughPrintProcessor+0x124

    00000000`0351f1f0 000007fe`fb05601d localspl!sandbox::PrintProcessorAdapterImpl::PrintDocumentOnPrintProcessor+0x3a

    00000000`0351f220 000007fe`fb013b70 localspl!sandbox::PrintProcessorAdapter::PrintDocumentOnPrintProcessor+0x9d

    00000000`0351f270 000007fe`fb014c7c localspl!PrintDocumentThruPrintProcessor+0x46c

    00000000`0351fa70 00000000`776a652d localspl!PortThread+0x4d0

    00000000`0351fd80 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd

    00000000`0351fdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

     

    Perfect.  So now we know that localspl was printing a document, which resulted in all of these RPC calls between spooler and Print Isolation Host, and ultimately the deadlock in Print Isolation Host is holding up this thread.  Just out of curiosity, are there any other threads blocked on this wait chain?

     

    0:049> !locks

    CritSec +5a00060 at 0000000005a00060

    WaiterWoken        No

    LockCount          105

    RecursionCount     1

    OwningThread       5e8

    EntryCount         0

    ContentionCount    84

    *** Locked

     

    Scanned 14148 critical sections

     

    Yes there are.  Thread 5e8, which we looked at earlier, is holding a lock that 104 other threads are waiting for!  This dump had 177 threads, so we know now that thread 5e8, 14cc, and 104 others in spooler are all hung on this deadlock. With about 60% of all the threads in spooler hung, the deadlock in ProseWarePrintProcessor is clearly the cause of our issue.  Here's the final wait chain diagram:

    image006

     

    To resolve the issue, ProseWarePrintProcessor needs to avoid calling GetModuleFileName while its DllMain routine is still running, since the former requires and the latter holds the Loader Lock.

  • Ntdebugging Blog

    What Should Never Happen... Did

    • 1 Comments

    Hi, this is Bob Golding; I wanted to write a blog about an interesting hardware issue I ran into. Hardware problems can be tricky to isolate. I recently came across one that I thought was interesting and gave an example of how to trace code execution.  The machine executed the filler “int 3” instructions generated by the compiler.  Execution should never reach these filler instructions, so we needed to determine how the instruction pointer got there.

     

    What was the issue?

    The issue was a bug check 8E (unhandled exception).  The exception was a debug exception (80000003), because a filler INT 3 instruction was executed.

     

    1: kd> .bugcheck

    Bugcheck code 0000008E

    Arguments 80000003 8082e3e0 f78aec38 00000000

     

    Below is the trap frame (the trap frame is the third argument in the bugcheck code). Note that the actual trapping instruction is 8082e3e0, the instruction pointer is incremented before the INT 3 generates a trap. The correct EIP is reported in the bug check values.

     

    1: kd> .trap f78aec38

    ErrCode = 00000000

    eax=0e364a01 ebx=f78aed50 ecx=f78aecb8 edx=000000e1 esi=f7727ec0 edi=f75d9ca2

    eip=8082e3e1 esp=f78aecac ebp=f78aecc0 iopl=0         nv up ei pl nz na po nc

    cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00000202

    nt!KiCheckForKernelApcDelivery+0x37:

    8082e3e1 cc              int     3

     

    The machine executed code at the end of a function that should never be executed.  This should never happen.

     

    1: kd> ub 8082e3e1

    nt!KiCheckForKernelApcDelivery+0x1c:

    8082e3c6 32c9            xor     cl,cl

    8082e3c8 ff2508118080    jmp     dword ptr [nt!_imp_KfLowerIrql (80801108)]

    8082e3ce 64a124010000    mov     eax,dword ptr fs:[00000124h]

    8082e3d4 c6403d01        mov     byte ptr [eax+3Dh],1

    8082e3d8 ff2598108080    jmp     dword ptr [nt!_imp_HalRequestSoftwareInterrupt (80801098)]

    8082e3de cc              int     3 <<< This is after the function.

    8082e3df cc              int     3

    8082e3e0 cc              int     3

     

    OK, So Now What?

    Now, we need to find the execution path that caused the machine to execute this INT 3. There are places to look to find clues that will tell us. The first place to start looking is the stack. If a “call” instruction was made, the return will be pushed on the stack. This way we can try to determine if we arrived at this bad instruction pointer from a call or a ret instruction.

     

    Using the value from esp in the above trap frame, let’s dump the stack.

     

    1: kd> dps f78aecac

    f78aecac  80a5d8fc hal!HalpIpiHandler+0xcc <<< Interesting?

    f78aecb0  f78aecc0

    f78aecb4  00000000

    f78aecb8  00000002

    f78aecbc  000000e1

    f78aecc0  f78aed50

    f78aecc4  f75d9ca2

    f78aecc8  badb0d00

    f78aeccc  000086a8

     

    In looking at the stack dump, we see that there may have been a call from HalpIpiHandler.  Let’s dump the code leading up to hal!HalpIpiHandler+0xcc to see what it did.

     

    1: kd> ub 80a5d8fc

    hal!HalpIpiHandler+0xb1:

    80a5d8e1 e8d82b0000      call    hal!HalBeginSystemInterrupt (80a604be)

    80a5d8e6 0ac0            or      al,al

    80a5d8e8 7509            jne     hal!HalpIpiHandler+0xc3 (80a5d8f3)

    80a5d8ea 83c408          add     esp,8

    80a5d8ed ff2510b0a580    jmp     dword ptr [hal!_imp_Kei386EoiHelper (80a5b010)]

    80a5d8f3 6a00            push    0

    80a5d8f5 55              push    ebp

    80a5d8f6 ff1520b0a580    call    dword ptr [hal!_imp__KiIpiServiceRoutine (80a5b020)]

     

    In the above assembly, we can see that there is a call made using a pointer in the import table.  Now, let’s have a look at that pointer.

     

    1: kd> dd 80a5b020 l 1

    80a5b020 8082e3e4

     

    The pointer is very close to the instruction we trapped on.  Is this a coincidence?

     

    It looks like due to an effective address calculation failure, the machine starting executing at 8082e3e0 instead of 8082e3e4.  Somewhere in the data path the processor executing this instruction stream dropped bit three, turning a 4 into a 0.

     

    1: kd> ?0y0100

    Evaluate expression: 4 = 00000000`00000004

    1: kd> ?0y0000

    Evaluate expression: 0 = 00000000`00000000

     

    What does all of this mean?

    There is some circumstantial evidence here that the machine was in the IPI handler.  The IPI Handler is used in multiprocessor systems so that one processor may interrupt another.  So, how can we further prove this is where we were?  Let’s try to match the trap frame registers to the assembly from HalpIpiHandler before it calls KiIpiServiceRoutine.

     

    1: kd> .trap f78aec38

    ErrCode = 00000000

    eax=0e364a01 ebx=f78aed50 ecx=f78aecb8 edx=000000e1 esi=f7727ec0 edi=f75d9ca2

    eip=8082e3e1 esp=f78aecac ebp=f78aecc0 iopl=0         nv up ei pl nz na po nc

    cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00000202

    nt!KiCheckForKernelApcDelivery+0x37:

    8082e3e1 cc              int     3

    1: kd> ub 80a5d8fc l10

    hal!HalpIpiHandler+0x94:

    80a5d8c4 c74508000ddbba  mov     dword ptr [ebp+8],0BADB0D00h <<< EBP+ 8 does equal badb0d00

    80a5d8cb 895d00          mov     dword ptr [ebp],ebx <<< ebp does equal ebx

     

    80a5d8ce 897d04          mov     dword ptr [ebp+4],edi <<< ebp+4 does equal edi

    80a5d8d1 68e1000000      push    0E1h

    80a5d8d6 83ec04          sub     esp,4

    80a5d8d9 54              push    esp

    80a5d8da 68e1000000      push    0E1h

    80a5d8df 6a1d            push    1Dh

    80a5d8e1 e8d82b0000      call    hal!HalBeginSystemInterrupt (80a604be)

    80a5d8e6 0ac0            or      al,al <<< al is equal to 1

    80a5d8e8 7509            jne     hal!HalpIpiHandler+0xc3 (80a5d8f3)

    80a5d8ea 83c408          add     esp,8

    80a5d8ed ff2510b0a580    jmp     dword ptr [hal!_imp_Kei386EoiHelper (80a5b010)]

    80a5d8f3 6a00            push    0

    80a5d8f5 55              push    ebp

    80a5d8f6 ff1520b0a580    call    dword ptr [hal!_imp__KiIpiServiceRoutine (80a5b020)]

     

    Below is the stack data that the instructions above produced (borrowed from the dps output above). It matches, so the machine did execute the above instructions.

     

    f78aecac  80a5d8fc  hal!HalpIpiHandler+0xcc <<< Return Pushed by call instruction

    f78aecb0  f78aecc0 <<< Pushed EBP

    f78aecb4  00000000 <<< Pushed a 0

     

    Finally, the last bit of evidence:

    Let's view the IPI state using the debugger command !ipi.  From this output we can see that processor 1 is the receiver of a cross interrupt from Processor 0.  This is consistent with the data we found on the stack.

     

    1: kd> !ipi

    IPI State for Processor 0

     

        As a sender, awaiting packet completion from processor 1.

     

        TargetSet             2  PacketBarrier         e  IpiFrozen      2 [Frozen]

        IpiFrame       8089a570  SignalDone     00000000  RequestSummary 0

        Packet State     Active                           WorkerRoutine  nt!KiFlushTargetMultipleTb

        Parameter[0]   00000000  Parameter[1]   80899f10  Parameter[2]   80899f04

     

    IPI State for Processor 1

     

        As a receiver, the following unhandled requests are pending: [Packet] [DPC]

     

        TargetSet             0  PacketBarrier         0  IpiFrozen      0 [Running]

        IpiFrame       b5ad7be8  SignalDone     ffdff120  RequestSummary 2 [DPC]

        Packet State      Stale                           WorkerRoutine  nt!KiFlushTargetMultipleTb

        Parameter[0]   00000000  Parameter[1]   b5ad7950  Parameter[2]   b5ad7948

     

    What went wrong?

    Based on the evidence in this dump it appears that call instruction transferred execution to the wrong address.  The machine ended up executing at address 8082e3e0 instead of 8082e3e4, a single bit difference.  This same bit was flipped in several crashes from this machine, so all the evidence pointed to a faulty processor.  After replacing the processor we were running on when we bugchecked, the issue did not occur again.

     

    Hardware can sometimes cause pretty specific failures, such as the flipped bit we see here.  To determine that this failure was a hardware issue, we had to reconstruct the execution path and trace how we ended up at the failing instruction.  We were able to match the register contents to what they would have been before the call to KiIpiServiceRoutine.  This demonstrated that the call should have been made to KiIpiServiceRoutine, but it unexpectedly went to the wrong address.

  • Ntdebugging Blog

    Identifying Global Atom Table Leaks

    • 6 Comments

    Hi, it's the Debug Ninja back again with another debugging adventure.  Recently I have encountered several instances where processes fail to initialize, and a review of available resources showed that there was no obvious resource exhaustion.  A more in depth review found that there were no available string atoms in the global atom table.

     

    Global atoms are organized on a per-session basis.  If atoms cannot be allocated in session 0, services may fail to start or processes launched by various services may fail to start.  However, a user logged in to a different session will not experience any such failures.

     

    String atoms are numbered from 0xC000 through 0xFFFF, providing a maximum of 0x4000 atoms per session.  For more information on atoms, and atom tables, see http://technet.microsoft.com/en-us/query/ms649053.

     

    When there are no more string atoms available, calls to APIs that allocate string atoms will fail.  Because atoms are often allocated at process or dll init time, the most common symptom is that processes fail to initialize.  The process may cleanly exit without an error.  You are likely experiencing this problem if you debug your application and find that the failure originates from an API that allocates string atoms such as RegisterClass, RegisterClassEx, GlobalAddAtom, or AddAtom.

     

    To determine if the global string atom table is full you will need to perform a kernel debug.  This can be a live debug or a post-mortem debug using a dump.

     

    First identify the session where the failures have occurred and set the process context to a process in this session.  In my example, w3wp.exe was launching a process and this process failed to initialize.

     

    2: kd> !process 0 0 w3wp.exe

    PROCESS fffffa8005083060

        SessionId: 0  Cid: 1668    Peb: fffdf000  ParentCid: 08ec

        DirBase: 8a2df000  ObjectTable: fffff8a0128bbe40  HandleCount: 441.

        Image: w3wp.exe

    2: kd> .process /p /r fffffa8005083060

    Implicit process is now fffffa80`05083060

    Loading User Symbols

    .....

     

    Next we need to analyze the global atom table.  The pointer to the table is stored in the UserAtomTableHandle global.

     

    2: kd> dq win32k!UserAtomTableHandle l1

    fffff960`003bf7a8  fffff8a0`05e5bc70

     

    The UserAtomTableHandle has a pointer to a handle table at offset 0x10 in 64-bit, and offset 0x8 in 32-bit.  Note that although the atom table is defined as a _RTL_ATOM_TABLE, the format shown by dt is for user mode and does not apply to the UserAtomTableHandle in kernel mode.

     

    2: kd> dq fffff8a0`05e5bc70+10 l1

    fffff8a0`05e5bc80  fffff8a0`05db7740

    2: kd> dt nt!_HANDLE_TABLE fffff8a0`05db7740

       +0x000 TableCode        : 0xfffff8a0`109c8001

       +0x008 QuotaProcess     : (null)

       +0x010 UniqueProcessId  : 0x00000000`00000184 Void

       +0x018 HandleLock       : _EX_PUSH_LOCK

       +0x020 HandleTableList  : _LIST_ENTRY [ 0xfffff8a0`05db7760 - 0xfffff8a0`05db7760 ]

       +0x030 HandleContentionEvent : _EX_PUSH_LOCK

       +0x038 DebugInfo        : (null)

       +0x040 ExtraInfoPages   : 0n0

       +0x044 Flags            : 0

       +0x044 StrictFIFO       : 0y0

       +0x048 FirstFreeHandle  : 0x10004

       +0x050 LastFreeHandleEntry : 0xfffff8a0`10ca4ff0 _HANDLE_TABLE_ENTRY

       +0x058 HandleCount      : 0x3fc0

       +0x05c NextHandleNeedingPool : 0x10400

       +0x060 HandleCountHighWatermark : 0x3fc1

     

    The FirstFreeHandle contains the handle number that will be given to the next handle allocated from this table.  This value is encoded, to get the next handle number we need to right shift the FirstFreeHandle by 2 bits.

     

    2: kd> ?00010004>>2

    Evaluate expression: 16385 = 00000000`00004001

     

    The result from above, 0x4001, is greater than the number of possible string atoms.  As I mentioned earlier, there is a limit of 0x4000 string atoms.  Now we know that the session is out of string atoms.

     

    The next step is to dump the string atoms to identify whether there is an observable pattern in the leaked strings.  The !atom command only works in user mode, so we need to dump the kernel mode strings manually.  An atom table is comprised of multiple buckets.   Each bucket is the head of a list of atoms.  The buckets start at offset 0x20 in the atom table in 64-bit, and offset 0x10 in 32-bit.

     

    2: kd> dq fffff8a0`05e5bc70+20

    fffff8a0`05e5bc90  fffff8a0`05e5ba60 fffff8a0`05db7be0

    fffff8a0`05e5bca0  fffff8a0`08cf1770 fffff8a0`05e5b3d0

    fffff8a0`05e5bcb0  fffff8a0`05ea9020 fffff8a0`05e5b8e0

    fffff8a0`05e5bcc0  fffff8a0`05ea9b10 fffff8a0`05ea9910

    fffff8a0`05e5bcd0  fffff8a0`05ea9f00 fffff8a0`05e5b650

    fffff8a0`05e5bce0  fffff8a0`05cda290 fffff8a0`05ea9e80

    fffff8a0`05e5bcf0  fffff8a0`05e5b200 fffff8a0`05ea9e30

    fffff8a0`05e5bd00  fffff8a0`05e5b7e0 fffff8a0`06c56210

    2: kd> dq

    fffff8a0`05e5bd10  fffff8a0`06d6b5a0 fffff8a0`05ea9d50

    fffff8a0`05e5bd20  fffff8a0`05e5b790 fffff8a0`05e5b9d0

    fffff8a0`05e5bd30  fffff8a0`06bd9bc0 fffff8a0`05ea9c90

    fffff8a0`05e5bd40  fffff8a0`05e5b0c0 fffff8a0`06ae2020

    fffff8a0`05e5bd50  fffff8a0`05e5b930 fffff8a0`04d2af40

    fffff8a0`05e5bd60  fffff8a0`05e5b690 fffff8a0`05e5b980

    fffff8a0`05e5bd70  fffff8a0`05e5b490 fffff8a0`05e5b410

    fffff8a0`05e5bd80  fffff8a0`05e5ba20 fffff8a0`05e5b4f0

    2: kd> dq

    fffff8a0`05e5bd90  fffff8a0`05e5baa0 fffff8a0`05e5b390

    fffff8a0`05e5bda0  fffff8a0`05e5b840 fffff8a0`05ea9c50

    fffff8a0`05e5bdb0  fffff8a0`05e5b250 00000000`00000000

    fffff8a0`05e5bdc0  00000000`00000000 00000000`00000000

    fffff8a0`05e5bdd0  00000000`00000000 00000000`00000000

    fffff8a0`05e5bde0  00000000`00000000 00000000`00000000

    fffff8a0`05e5bdf0  00000000`00000000 00000000`00000000

    fffff8a0`05e5be00  00000000`00000000 00000000`00000000

     

    The quick and dirty way to dump the buckets is with !list.  I am sure that some will say it is tedious to dump each bucket list by hand and that there are easier ways to accomplish this.  To prevent this article from becoming a lesson on debugger scripting, I am leaving that as an exercise to the reader.

     

    2: kd> !list "-t nt!_RTL_ATOM_TABLE_ENTRY.HashLink -e -x \"du @$extret+10\" fffff8a0`05e5ba60"

    du @$extret+10

    fffff8a0`05e5ba70  "Native"

     

    <snip strings that don't match a pattern>

     

    du @$extret+10

    fffff8a0`0838a120  "ControlOfs0210000000000700"

     

    du @$extret+10

    fffff8a0`0f7ff430  "ControlOfs021A000000000C30"

     

    du @$extret+10

    fffff8a0`162168c0  "ControlOfs020E000000001774"

     

    du @$extret+10

    fffff8a0`08c33870  "ControlOfs01F70000000007F4"

     

    du @$extret+10

    fffff8a0`07c46910  "ControlOfs0202000000000BF8"

     

    du @$extret+10

    fffff8a0`062aab50  "ControlOfs01F5000000001274"

     

    du @$extret+10

    fffff8a0`0777b150  "ControlOfs0202000000000C80"

     

    du @$extret+10

    fffff8a0`07dd3410  "ControlOfs0207000000000F00"

     

    du @$extret+10

    fffff8a0`0f01d190  "ControlOfs0214000000000DAC"

     

    Dumping the atoms I found that there is a continuous pattern of the string ControlOfs followed by 16 hexadecimal numbers.  Some time spent with your favorite search engine should find other reports of atom leaks involving the string ControlOfs, and that these leaks have been identified as a problem in some specific software.  In this instance the programmer using that software needs to change their application to avoid the problem.

  • Ntdebugging Blog

    Stop 0x19 in a Large Pool Allocation

    • 2 Comments

    Hello all, Scott Olson here again to share another interesting issue I recently debugged with pool corruption and found that using special pool does not work with large pool allocations (pool allocations greater than a PAGE_SIZE).

     

    Here is an example of a valid large page allocation. Notice the size is 0x1fb0 and a PAGE_SIZE is 0x1000 or 4kb.

     

    0: kd> !pool fffffa80`0dba6fa0

    Pool page fffffa800dba6fa0 region is Nonpaged pool

    *fffffa800dba5000 : large page allocation, Tag is Io  , size is 0x1fb0 bytes

                    Pooltag Io   : general IO allocations, Binary : nt!io

     

    In Windows 7, at the end of the large pool allocation it will have an allocation tag of “Frag” then a “Free” tag with the rest of the page size and is stored on the free pool list for allocation less than a page in size.

     

    0: kd> dc fffffa800dba5000 fffffa800dba5000+0x1fb0-4

    fffffa80`0dba5000  00558001 32373242 00000000 00000000  ..U.B272........

    fffffa80`0dba5010  55555555 55555555 98764321 01b75f55  UUUUUUUU!Cv.U_..

    fffffa80`0dba5020  00000001 00000001 704e6ff0 fffff981  .........oNp....

    …<cut>

    fffffa80`0dba6f80  55555555 55555555 55555555 55555555  UUUUUUUUUUUUUUUU

    fffffa80`0dba6f90  55555555 55555555 55555555 55555555  UUUUUUUUUUUUUUUU

    fffffa80`0dba6fa0  55555555 55555555 00001fb0 00000000  UUUUUUUU........

    0: kd> dc

    fffffa80`0dba6fb0  02010100 67617246 55555555 55555555  ....FragUUUUUUUU

    fffffa80`0dba6fc0  00040101 65657246 55555555 55555555  ....FreeUUUUUUUU

    fffffa80`0dba6fd0  00802170 fffff880 0e49cf70 fffffa80  p!......p.I.....

    fffffa80`0dba6fe0  15cc8fe8 fffff981 3b9c50a7 00000005  .........P.;....


    Displayed with the !pool command:

    0: kd> !pool fffffa80`0dba6fb0

    Pool page fffffa800dba6fb0 region is Nonpaged pool

    *fffffa800dba6fb0 size:   10 previous size:    0  (Allocated) *Frag

                    Owning component : Unknown (update pooltag.txt)

     fffffa800dba6fc0 size:   40 previous size:   10  (Free)       Free

     

    The example above demonstrates how this normally works.  The downside to this architecture is that if a driver were to overrun its pool allocation then special pool would not be useful because the large pool allocation has to be page-aligned. Special pool detects pool overruns by putting the data at the end of the page, which would not be feasible with a large pool allocation.

     

    In Windows 7 there is a check while freeing the pool memory that will determine if this allocation had written past the end of its allocation, and if so will bug check the machine with a Stop 0x19 BAD_POOL_HEADER with the first parameter being a 0x21.  Here is the definition along with what each parameter means:

     

    BAD_POOL_HEADER (19)

    The pool is already corrupt at the time of the current request.

    This may or may not be due to the caller.

    The internal pool links must be walked to figure out a possible cause of

    the problem, and then special pool applied to the suspect tags or the driver

    verifier to a suspect driver.

    Arguments:

    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.

    Arg2: fffffa800dc57000, The pool pointer being freed.

    Arg3: 0000000000002180, The number of bytes allocated for the pool block.

    Arg4: 006b0072006f0077, The corrupted value found following the pool block.

     

    Here is an example of what this corruption looks like compared to the above valid large pool allocation:

    0: kd> !pool fffffa800dc57000

    Pool page fffffa800dc57000 region is Nonpaged pool

    fffffa800dc57000 is not a valid large pool allocation, checking large session pool...

    fffffa800dc57000 is freed (or corrupt) pool

    Bad allocation size @fffffa800dc57000, zero is invalid

     

    ***

    *** An error (or corruption) in the pool was detected;

    *** Attempting to diagnose the problem.

    ***

    *** Use !poolval fffffa800dc57000 for more details.

     

     

    Pool page [ fffffa800dc57000 ] is __inVALID.

     

    Analyzing linked list...

    [ fffffa800dc57000 ]: invalid previous size [ 0x38 ] should be [ 0x0 ]

     

     

    Scanning for single bit errors...

     

    None found

     

    Next, I dump the allocation from the start to the end.  Notice the size of the allocation is stored in the bugcheck code as argument 3.

     

    0: kd> dc fffffa800dc57000 fffffa800dc57000+2180-4

    fffffa80`0dc57000  00000038 0000000e 00000000 00000000  8...............

    fffffa80`0dc57010  a24da497 01ccc5d6 c827993c 41946d1f  ..M.....<.'..m.A

    fffffa80`0dc57020  c0d75c9b b7cff1a5 00000000 00000020  .\.......... ...

    fffffa80`0dc57030  000021e0 00000006 0000006c 00000110  .!......l.......

    fffffa80`0dc57040  00000208 000003b8 00000208 00000660  ............`...

    fffffa80`0dc57050  00000208 00000910 00000208 00000bb0  ................

    <cut>

    fffffa80`0dc59150  002d0033 00300031 0063002e 006d006f  3.-.1.0...c.o.m.

    fffffa80`0dc59160  006c002e 00660065 00680074 006e0061  ..l.e.f.t.h.a.n.

    fffffa80`0dc59170  006e0064 00740065 006f0077 006b0072  d.n.e.t.w.o.r.k.

     

    This should be the end of the allocation.  The next thing we see should be the “Frag” and “Free” tags.

     

    0: kd> dc

    fffffa80`0dc59180  003a0073 0061006d 0061006e 00650067  s.:.m.a.n.a.g.e.

    fffffa80`0dc59190  0065006d 0074006e 0038003a 00390036  m.e.n.t.:.8.6.9.

    fffffa80`0dc591a0  0062003a 00670069 0075006c 00790063  :.b.i.g.l.u.c.y.

    fffffa80`0dc591b0  0064002d 00740061 002d0061 006e0069  -.d.a.t.a.-.i.n.

    fffffa80`0dc591c0  00650064 00650078 002d0073 00740063  d.e.x.e.s.-.c.t.

    fffffa80`0dc591d0  006c0072 0031005f 00000031 00000000  r.l._.1.1.......

     

    We clearly see that the Frag and Free tag have been overwritten with some string value which is causing the corruption.  At this point, you would need to look at the current stack to determine which driver had allocated the memory, and review the code to investigate when this corruption could have occurred.

  • Ntdebugging Blog

    Configuring a Hyper-V VM For Kernel Debugging

    • 2 Comments

    Yesterday's blog prompted some questions about how to set up a debugger for a Windows OS running in a Hyper-V VM.  I was surprised that I wasn't able to find good, publicly available, Microsoft issued documentation for this configuration.

     

    The first step is to configure the Windows OS in the VM to enable a kernel debugger on COM1.  One would use these same steps if you were preparing the OS to be debugged using a null modem cable.  Hyper-V will allow us to redirect the COM port so that we don't need such a cable.

    1. Start an administrative command prompt.
    2. Turn on debugging with this command:
      bcdedit /debug on
    3. Configure the debugger to use COM1 with this command:
      bcdedit /dbgsettings SERIAL DEBUGPORT:1 BAUDRATE:115200
      Note that these are the default settings and already exist in most bcd stores.  However setting them again won't damage anything, and guards against a situation where the dbgsettings have been previously modified.
    4. Reboot so that the boot loader can read the new settings and configure the OS for debugging.

    CommandPrompt

     

    Next, configure Hyper-V to redirect the COM1 port to a named pipe.  We will use this pipe in place of a traditional null modem cable.

    1. Open Hyper-V Manager and browse to the settings page of the VM you configured to debug.
    2. Under the Hardware list choose COM 1.
    3. Change the Attachment to 'Named pipe:' and provide a pipe name.
      1. Note that the Hyper-V Manager provides the complete path to your named pipe.  Make a note of this path as you will need it in the next step.

    Settings2

     

    After the OS and the VM are configured for debugging, we need to connect a debugger.

    1. On the Hyper-V parent partition download and install the Debugging Tools for Windows from http://msdn.microsoft.com/en-us/windows/hardware/gg463009.
    2. After installing the debugging tools you will have a ‘Debugging Tools for Windows’ entry in your start menu.
      1. From this folder right click ‘WinDbg’ and choose ‘Run as administrator’.  Windbg needs administrative rights to connect to the pipe.
    3. In windbg open the File menu and choose ‘Kernel Debug’.
    4. Enter a Baud Rate of 115200, to match the settings made in the VM.
    5. Enter the Port that you configured in the VM settings page.
      1. To connect to the pipe remotely, substitute the '.' in the path with the Hyper-V server name.
    6. Ensure that the Pipe and Reconnect boxes are checked.
    7. Set Resets to 0.
    8. Click OK to start debugging.
    9. Windbg should display the string ' Waiting to reconnect...'

    image004

     

    To test the debugger connection in windbg, from the ‘Debug’ menu choose ‘Break’.  This should cause the server to break into the debugger and display a kd> prompt.  Please note that breaking into the debugger will cause the OS running in the VM to halt until you tell the debugger to go, the OS will appear to be hung during this time.  The command 'g' followed by Enter will tell the debugger to ‘go’ causing the VM to resume operation.

     

    Windbg

  • Ntdebugging Blog

    My Kernel Debugger Won't Connect

    • 2 Comments

    Hello ntdebugging readers, the Debug Ninja is back again with a quick blog this holiday season.  I recently encountered a situation where the kernel debugger could not connect to a Windows Server 2008 R2 system running in a Hyper-V virtual machine.  The configuration appeared correct; however, the debugger would not connect to the VM.

     

    In windbg you can use Ctrl+Alt+D to view the debugger’s internal information flow.  In KD use Ctrl+D followed by ENTER to toggle the output.  Enabling this output I could see that the debugger was unable to read from the debug port, and that it was getting timeouts.  The error "SYNCTARGET: Timeout." is a clear indication that the debug host cannot communicate with the debug target, especially when this error appears after a “Send Break in” message.

    SYNCTARGET: Timeout

     

    Because I was using a named pipe on a Hyper-V VM I knew that I didn't have a bad cable, although this is a common cause of kernel debug failures.  I also knew that the configuration of the VM was correct, and I could use the debugger for other VMs on this server.  The problem was most likely with the OS running in the VM.

     

    By checking Device Manager I was able to confirm that there was a problem with the configuration of the OS running in the VM.  The bcdedit settings were configured to use COM1, and this should make COM1 unavailable in the OS, however, COM1 was present in device manager.  For some reason the debugger was not capturing COM1 on boot as it was configured to.

    Device Manager

     

    Examining the bcd configuration of this server I found that the bcd configuration was not correct.  In the bcd store of normal Windows 7 or Windows Server 2008 R2 OS, the Windows Boot Loader sections of bcdedit have an inherit setting.  You can view this information on your system from an elevated command prompt using the command ‘bcdedit /enum all’.  Ordinarily the Windows Boot Loader inherits the {bootloadersettings}, the {bootloadersettings} inherit the {globalsettings}, and the {globalsettings} inherit the {dbgsettings}.  Without the inherit settings, the debugger configuration will not be read by the boot loader.

     

    Below are the bcd settings from the broken VM.  You can see that all of the normal inherited settings are missing.

    C:\Windows\system32>bcdedit /enum all

     

    Windows Boot Manager

    --------------------

    identifier              {bootmgr}

    device                  partition=C:

    path                    \bootmgr

    description             Windows Boot Manager

    locale                  en-US

    default                 {current}

    displayorder            {current}

    timeout                 30

     

    Windows Boot Loader

    -------------------

    identifier              {current}

    device                  partition=C:

    path                    \Windows\system32\winload.exe

    description             Windows Server 2008 R2 Standard (recovered)

    locale                  en-US

    osdevice                partition=C:

    systemroot              \Windows

    resumeobject            {2ec5363f-2a92-11e1-bbe4-806e6f6e6963}

    usefirmwarepcisettings  No

    debug                   Yes

     

    Resume from Hibernate

    ---------------------

    identifier              {2ec5363f-2a92-11e1-bbe4-806e6f6e6963}

    device                  partition=C:

    path                    \Windows\system32\winresume.exe

    description             Windows Server 2008 R2 Standard (recovered)

    locale                  en-US

    inherit                 {resumeloadersettings}

    filedevice              partition=C:

    filepath                \hiberfil.sys

    debugoptionenabled      Yes

     

    Windows Memory Tester

    ---------------------

    identifier              {memdiag}

    device                  partition=C:

    path                    \boot\memtest.exe

    description             Windows Memory Diagnostic

    locale                  en-US

     

    Debugger Settings

    -----------------

    identifier              {dbgsettings}

    debugtype               Serial

    debugport               1

    baudrate                115200

     

    Because my only interest in this VM was to get the debugger working, I did not add all of the missing settings to the bcd store.  I was able to force the debugger configuration to be read on boot using this command:

    bcdedit /set inherit {dbgsettings}

     

    I hope this helps the next time you are trying to configure a debugger and it does not work.  Remember that we don't just need the debugger to be turned on and be configured; we need the settings to be inherited as well.

  • Ntdebugging Blog

    Fixing an ICorDebugUnmanagedCallback induced hang

    • 2 Comments

    Hi debuggers, Andrew Richards here with a NTDebugging post that is a little different to what is usually posted.  Instead of talking about debugging, I’m going to talk about an issue I just faced while writing a debugger.

     

    This debugger work is an extension of an upcoming article that I’ve written for MSDN Magazine (scheduled for the December 2011 issue). The MSDN Magazine article goes over how to write a native debugger using the DbgHelp API. It also explains how you can use this code to then make a plugin for Sysinternals ProcDump.

     

    When debugging a managed application, you can take debugging one step further by being both a managed and unmanaged (native) debugger. To do this, you use the CLR Debugger API instead of the DbgHelp API.

     

    What prompted this post was an issue that I hit while implementing the ICorDebugUnmanagedCallback::DebugEvent function of my unmanaged interface implementation. I was finding that the target process was hung after I processed in-band debug events but not out-of-band debug events. This was despite calling ICorDebugController::Continue, with or without calling ICorDebugProcess::ClearCurrentException first.

     

    ICorDebug Interface:

    Firstly, let’s take a step back and look at what it takes to get to the point of my issue. The goal in the initialization code is to get an instance of an ICorDebug based object.

     

    Below is an abridged version of the code to do this using .NET 4.0; I have omitted the error handling and some of the cleanup (IUnknown::Release) to keep the code brief.

     

    // Start COM

    CoInitialize(NULL);

     

    // Get a ICLRMetaHost instance (from .NET 4.0)

    ICLRMetaHost* pCLRMetaHost = NULL;

    CLRCreateInstance(CLSID_CLRMetaHost, IID_ICLRMetaHost, (LPVOID*)&pCLRMetaHost);

     

    // Get an enumeration of the loaded runtimes in the target process (opened prior with OpenProcess)

    IEnumUnknown* pEnumUnknown = NULL;

    pCLRMetaHost->EnumerateLoadedRuntimes(hProcess, &pEnumUnknown);

     

    // Use the first runtime found (Note, you can only debug one runtime at once)

    IUnknown* pUnknown = NULL;

    ULONG ulFetched = 0;

    pEnumUnknown->Next(1, &pUnknown, &ulFetched);

     

    // QueryInterface for the ICLRRuntimeInfo interface

    ICLRRuntimeInfo* pCLRRuntimeInfo = NULL;

    pUnknown->QueryInterface(__uuidof(ICLRRuntimeInfo), (void **)&pCLRRuntimeInfo);

     

    // Get the ICorDebug interface (this allows you to debug .NET 2.0 targets with the .NET 4.0 API)

    pCLRRuntimeInfo->GetInterface(CLSID_CLRDebuggingLegacy, IID_ICorDebug, (void **)&pCorDebug);

     

    // Initialize the .NET 2.0 debugging interface

    pCorDebug->Initialize();

     

    // Allocate our ICorDebugManagedCallback2 implementation and apply it to ICorDebug

    CCorDebugManagedCallback2* pCorDebugManagedCallback2 = new CCorDebugManagedCallback2();

    pCorDebug->SetManagedHandler((ICorDebugManagedCallback*)pCorDebugManagedCallback2);

     

    // Allocate our ICorDebugUnmanagedCallback implementation and apply it to ICorDebug

    CCorDebugUnmanagedCallback* pCorDebugUnmanagedCallback = new CCorDebugUnmanagedCallback();

    pCorDebug->SetUnmanagedHandler((ICorDebugUnmanagedCallback*)pCorDebugUnmanagedCallback);

     

    // Start debugging the process; returns the ICorDebugProcess we’ll need in the callbacks

    pCorDebug->DebugActiveProcess(nProcessId, TRUE, &pCorDebugProcess);

     

    This code is pretty linear; if any call fails you are out of luck.  By the end, you have associated your own managed and unmanaged callback classes with the ICorDebug object and are attached as a debugger. The code supports a target process using any of the.NET versions (v1.0, v1.1, v2.0, v4.0). Note that .NET v3.0 and v3.5 applications are actually v2.0 applications from a debugger point-of-view as these .NET releases just contain additional class libraries.

     

    My managed callback implementation supports the IUnknown, ICorDebugManagedCallback and ICorDebugManagedCallback2 interfaces. (I’m not going to discuss this code here).

     

    My unmanaged callback implementation supports the IUnknown and ICorDebugUnmanagedCallback interfaces. It is in this class that I had the issue.

     

    ICorDebugUnmanagedCallback Interface:

    The ICorDebugUnmanagedCallback interface has just one function:

     

    HRESULT DebugEvent (

        [in] LPDEBUG_EVENT  pDebugEvent,

        [in] BOOL           fOutOfBand

    );

     

    The function provides a DEBUG_EVENT structure in the same way that WaitForDebugEvent does. This is not surprising as under the covers, that is what the .NET 4.0 API is using – it is just passing it to us. As such, the rules for handling a DEBUG_EVENT structure apply here too.  Namely, close the handle passed with the CREATE_PROCESS_DEBUG_EVENT and LOAD_DLL_DEBUG_EVENT events.

     

    Following the DebugEvent documentation, I ended up with (roughly) the code below – which hangs the target process.

     

    STDMETHODIMP CCorDebugUnmanagedCallback::DebugEvent(LPDEBUG_EVENT pDebugEvent, BOOL fOutOfBand)

    {

          BOOL bClear = TRUE;

          switch (pDebugEvent->dwDebugEventCode)

          {

          case EXCEPTION_DEBUG_EVENT:

                if (pDebugEvent->u.Exception.dwFirstChance != 0)

                      bClear = FALSE;

                break;

          case CREATE_PROCESS_DEBUG_EVENT:

                if (pDebugEvent->u.CreateProcessInfo.hFile)

                      CloseHandle(pDebugEvent->u.CreateProcessInfo.hFile);

                break;

          case LOAD_DLL_DEBUG_EVENT:

                if (pDebugEvent->u.LoadDll.hFile)

                      CloseHandle(pDebugEvent->u.LoadDll.hFile);

                break;

          }

          if (bClear)

                pCorDebugProcess->ClearCurrentException(pDebugEvent->dwThreadId);

     

          pCorDebugProcess->Continue(fOutOfBand);

          return S_OK;

    }

     

    If you know what to look for, the answer to the ‘hang’ issue is on the MSDN page:

     

    You can call ICorDebugController::Continue only on a Win32 thread and only when continuing past an out-of-band event.

     

    So what does this really mean?

     

    What is means is that you must call ICorDebugController::Continue from any other thread than the one servicing the callback if the debug event is in-band (fOutOfBand == FALSE). The reason for this is to stop a race condition. In-band debug events can be interrupted by out-of-band debug events – that is, the DebugEvent function can be firing multiple times concurrently. By forcing the continuation on an alternate thread, the race condition is averted.

     

    I’m being brief here (on purpose) as I don’t want to incorrectly dissect for you the extremely complex internals of the CLR. You just need to know that you must use another thread for the hang to be averted.

     

    So what does the code look like now?  It’s something like this:

     

    STDMETHODIMP CCorDebugUnmanagedCallback::DebugEvent(LPDEBUG_EVENT pDebugEvent, BOOL fOutOfBand)

    {

          BOOL bClear = TRUE;

          switch (pDebugEvent->dwDebugEventCode)

          {

          case EXCEPTION_DEBUG_EVENT:

                if (pDebugEvent->u.Exception.dwFirstChance != 0)

                      bClear = FALSE;

                break;

          case CREATE_PROCESS_DEBUG_EVENT:

                if (pDebugEvent->u.CreateProcessInfo.hFile)

                      CloseHandle(pDebugEvent->u.CreateProcessInfo.hFile);

                break;

          case LOAD_DLL_DEBUG_EVENT:

                if (pDebugEvent->u.LoadDll.hFile)

                      CloseHandle(pDebugEvent->u.LoadDll.hFile);

                break;

          }

         

          if (bClear)

                pCorDebugProcess->ClearCurrentException(pDebugEvent->dwThreadId);

     

          if (fOutOfBand)

          {

                pCorDebugProcess->Continue(TRUE);

          }

          else

          {

                SetEvent(hEventContinueBegin);

                WaitForSingleEvent(hEventContinueDone, INFINITE);

          }

          return S_OK;

    }

     

    DWORD WINAPI CCorDebugUnmanagedCallbackThreadProc(LPVOID lpParameter)

    {

          while (!bQuit)

          {

                switch (WaitForSingleObject(hEventContinueBegin, 1000))

                {

                case WAIT_OBJECT_0:

                      pCorDebugProcess->Continue(FALSE);

                      SetEvent(hEventContinueDone);

                      break;

                }

          }

          return 0;

    }

     

    For out-of-band debug events, nothing has changed; the ICorDebugProcess::Continue call is made locally.

    For in-band debug events, an event is set to trigger the ICorDebugProcess::Continue on a dedicated thread. The dedicated thread sets an event to tell the callback thread that the Continue has been done.

     

    Note that the above code is a massive simplification of what is actually required – there is a ton of code missing that passes all the interface pointers & handles around and to create & shutdown the thread at the correct time.

     

    In-band vs. Out-of-band:

    So what is the difference between In-band vs. Out-of-band debug events?

     

    An out-of-band debug event causes all threads in the target process to suspend (it’s exactly the same as native debugger induced suspend). As such, it is not possible to use the managed debugging interfaces to gather information from the target – as the managed debugging thread is suspended.

     

    An in-band debug event only causes the managed threads in the target process to suspend – the managed debugging thread is still running. As such, it is possible to use the managed debugging interfaces to gather information from the target.

     

    The act of using the managed debugging thread from within an in-band debug event can cause an out-of-band debug event (the common examples being first chance exceptions).

     

    Cleanup/Detach:

    Just to be complete, below is the code to cleanup and (optionally) detach from the ICorDebug session. In .NET 4.0, the ICorDebugController::Detach will terminate the process if interop debugging (passing TRUE to ICorDebug::DebugActiveProcess) is used. Interop debugging is not supported in .NET 2.0 on x64 - so this is less of an issue.

     

    // If the target process is still running, we need to detach.

    if (bDetachNeeded)

    {

          ICorDebugController* pCorDebugController = NULL;

          pCorDebugProcess->QueryInterface(__uuidof(ICorDebugController), (void**)&pCorDebugController);

          pCorDebugController->Stop(INFINITE /* Note: Value is ignored – always INFINITE */);

          pCorDebugController->Detach();

          pCorDebugController->Release();

    }

    pCorDebug->SetUnmanagedHandler(NULL);

    pCorDebugUnmanagedCallback->Release();

     

    pCorDebug->SetManagedHandler(NULL);

    pCorDebugManagedCallback2->Release();

     

    pCorDebug->Terminate();

     

    pCorDebug->Release();

     

    CoUninitialize();

     

    There is still quite a big bit of code required to implement the debugger completely.

     

    You’ll need an ICorDebugManagedCallback implementation that handles process exiting, attaching to an application domain (ICorDebugAppDomain::Attach), handling name changes, and continuation.

     

    Plus, if you want to support .NET 2.0 debugging without .NET 4.0 installed, you’ll need to use LoadLibrary/GetProcAddress to call .NET 4.0 (optionally), and fall back to the .NET 2.0 GetVersionFromProcess and CreateDebuggingInterfaceFromVersion functions.

     

    Conclusion:

    The CLR Debugging API is not for the faint at heart.  There are numerous pitfalls when using the ICorDebug interface against different versions of the CLR, different versions of Windows, different architectures, and with or without interop debugging.

     

    If you have any questions about the API, post a comment here and I’ll do my best to answer them for you.

  • Ntdebugging Blog

    Where Did My Disk I/O Go?

    • 1 Comments

    Hello, Mr. Ninja back again.  I recently discovered that although my team often tracks I/O from the file system through to the disk controller, we have never publicly documented the steps required to do this.  This seems like a great opportunity for a blog because most of the structures are known, and they are even included in the public symbols.

     

    When debugging a system that is hung, slow, or otherwise unresponsive you will likely encounter an IRP that has been sent from ntfs to the disk.  Running !irp against such a request will show that the request has gone to disk.sys, but that is not really where the story ends.

     

    Below is one such example of ntfs waiting with an IRP that appears to be stuck in disk.sys.  You can determine what driver last handled the IRP by looking for the > character, this points to the current io stack location.

     

    2: kd> !thread fffffa804f151040 e

    THREAD fffffa804f151040  Cid 0004.33f8  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

        fffff8803836e730  NotificationEvent

    IRP List:

        fffffa804f379440: (0006,0310) Flags: 00060043  Mdl: fffffa803c854090

    Not impersonating

    DeviceMap                 fffff8a000008720

    Owning Process            fffffa8030cdeb30       Image:         System

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      34797397       Ticks: 1118 (0:00:00:17.440)

    Context Switch Count      5893

    UserTime                  00:00:00.000

    KernelTime                00:00:00.296

    Win32 Start Address nt!ExpWorkerThread (0xfffff80002ae2ef0)

    Stack Init fffff88038370db0 Current fffff8803836e0d0

    Base fffff88038371000 Limit fffff8803836b000 Call 0

    Priority 16 BasePriority 13 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`3836e110 fffff800`02addf32 nt!KiSwapContext+0x7a

    fffff880`3836e250 fffff800`02ae074f nt!KiCommitThreadWait+0x1d2

    fffff880`3836e2e0 fffff880`0164b3ff nt!KeWaitForSingleObject+0x19f

    fffff880`3836e380 fffff880`01654224 Ntfs!NtfsNonCachedIo+0x23f

    fffff880`3836e550 fffff880`0164f507 Ntfs!NtfsNonCachedUsaWrite+0x64

    fffff880`3836e5e0 fffff880`016501a3 Ntfs!NtfsCommonWrite+0x2ca4

    fffff880`3836e790 fffff800`02abebff Ntfs!NtfsFsdWrite+0x1c3

    fffff880`3836ea10 fffff800`02b1cc00 nt!IoSynchronousPageWrite+0x24f

    fffff880`3836ea90 fffff800`02b1b2d8 nt!MiFlushSectionInternal+0xb30

    fffff880`3836ecc0 fffff800`02b1a83c nt!MmFlushSection+0x1f4

    fffff880`3836ed80 fffff880`01653bb7 nt!CcFlushCache+0x7bc

    fffff880`3836ee80 fffff880`01700037 Ntfs!LfsFlushLfcb+0x647

    fffff880`3836f000 fffff880`017025b0 Ntfs!LfsFlushToLsnPriv+0x143

    fffff880`3836f090 fffff880`0172445f Ntfs!LfsWriteLfsRestart+0xf0

    fffff880`3836f0d0 fffff880`017242d0 Ntfs!LfsCloseLogFile+0x17f

    fffff880`3836f190 fffff880`01715810 Ntfs!NtfsStopLogFile+0x70

    fffff880`3836f1d0 fffff880`0172bfdb Ntfs!NtfsPerformDismountOnVcb+0x184

    2: kd> !irp fffffa804f379440

    Irp is active with 8 stacks 5 is current (= 0xfffffa804f379630)

     Mdl=fffffa803c854090: No System Buffer: Thread fffffa804f151040:  Irp stack trace.

         cmd  flg cl Device   File     Completion-Context

     [  0, 0]   0  0 00000000 00000000 00000000-00000000

     

                            Args: 00000000 00000000 00000000 00000000

     [  0, 0]   0  0 00000000 00000000 00000000-00000000

     

                            Args: 00000000 00000000 00000000 00000000

     [  0, 0]   0  0 00000000 00000000 00000000-00000000

     

                            Args: 00000000 00000000 00000000 00000000

     [  0, 0]   0  0 00000000 00000000 00000000-00000000

     

                            Args: 00000000 00000000 00000000 00000000

    >[  4,34]  1c e0 fffffa8032052060 00000000 fffff880011bb010-fffffa803a604c90 Success Error Cancel

                   \Driver\Disk     volmgr!VmpReadWriteCompletionRoutine

                            Args: 00001000 00000000 b5f8a000 00000000

     [  4, 0]   c e0 fffffa803a604b40 00000000 fffff88001cb5150-fffffa803a1ec180 Success Error Cancel

                   \Driver\volmgr   volsnap!VspRefCountCompletionRoutine

                            Args: 00001000 00000000 b5e8a000 00000000

     [  4, 0]   c e1 fffffa803a1ec030 00000000 fffff8800164c344-fffff8803836e728 Success Error Cancel pending

                   \Driver\volsnap  Ntfs!NtfsMasterIrpSyncCompletionRoutine

                            Args: 00001000 00000000 b5e8a000 00000000

     [  4, 0]   0  0 fffffa803d1bf030 fffffa803b268540 00000000-00000000

                   \FileSystem\Ntfs

                            Args: 00001000 00000000 01d0c000 00000000

     

    To learn more about what disk.sys is doing with this request we start by looking at the device extension.  Disk.sys is a miniclass driver, it depends on classpnp.sys to do most of the work.  The device extension will be a FUNCTIONAL_DEVICE_EXTENSION structure from classpnp.

     

    2: kd> !devobj fffffa8032052060

    Device object (fffffa8032052060) is for:

     DR36 \Driver\Disk DriverObject fffffa80319fa990

    Current Irp 00000000 RefCount 0 Type 00000007 Flags 01002050

    Vpb fffffa803204aba0 Dacl fffff9a100463450 DevExt fffffa80320521b0 DevObjExt fffffa8032052858 Dope fffffa803204ab30

    ExtensionFlags (0x00000800)

                                 Unknown flags 0x00000800

    AttachedDevice (Upper) fffffa8032052b90 \Driver\partmgr

    AttachedTo (Lower) fffffa8031dcc060 \Driver\mpio

    Device queue is not busy.

    2: kd> dt classpnp!_FUNCTIONAL_DEVICE_EXTENSION fffffa80320521b0

       +0x000 Version          : 3

       +0x008 DeviceObject     : 0xfffffa80`32052060 _DEVICE_OBJECT

       +0x000 CommonExtension  : _COMMON_DEVICE_EXTENSION

       +0x200 LowerPdo         : 0xfffffa80`31dcc060 _DEVICE_OBJECT

       +0x208 DeviceDescriptor : 0xfffffa80`320afeb0 _STORAGE_DEVICE_DESCRIPTOR

       +0x210 AdapterDescriptor : 0xfffffa80`32043910 _STORAGE_ADAPTER_DESCRIPTOR

       +0x218 DevicePowerState : 1 ( PowerDeviceD0 )

       +0x21c DMByteSkew       : 0

       +0x220 DMSkew           : 0

       +0x224 DMActive         : 0 ''

       +0x228 DiskGeometry     : _DISK_GEOMETRY

       +0x240 SenseData        : 0xfffffa80`320a65c0 _SENSE_DATA

       +0x248 TimeOutValue     : 0x3c

       +0x24c DeviceNumber     : 0x24

       +0x250 SrbFlags         : 0x200102

       +0x254 ErrorCount       : 0

       +0x258 LockCount        : 0n1

       +0x25c ProtectedLockCount : 0n0

       +0x260 InternalLockCount : 0n0

       +0x268 EjectSynchronizationEvent : _KEVENT

       +0x280 DeviceFlags      : 4

       +0x282 SectorShift      : 0x9 ''

       +0x283 CdbForceUnitAccess : 0 ''

       +0x288 MediaChangeDetectionInfo : (null)

       +0x290 Unused1          : (null)

       +0x298 Unused2          : (null)

       +0x2a0 KernelModeMcnContext : _FILE_OBJECT_EXTENSION

       +0x2b8 MediaChangeCount : 6

       +0x2c0 DeviceDirectory  : 0xffffffff`800003cc Void

       +0x2c8 ReleaseQueueSpinLock : 0

       +0x2d0 ReleaseQueueIrp  : (null)

       +0x2d8 ReleaseQueueSrb  : _SCSI_REQUEST_BLOCK

       +0x330 ReleaseQueueNeeded : 0 ''

       +0x331 ReleaseQueueInProgress : 0 ''

       +0x332 ReleaseQueueIrpFromPool : 0 ''

       +0x333 FailurePredicted : 0 ''

       +0x334 FailureReason    : 0

       +0x338 FailurePredictionInfo : (null)

       +0x340 PowerDownInProgress : 0 ''

       +0x344 EnumerationInterlock : 0

       +0x348 ChildLock        : _KEVENT

       +0x360 ChildLockOwner   : (null)

       +0x368 ChildLockAcquisitionCount : 0

       +0x36c ScanForSpecialFlags : 0

       +0x370 PowerRetryDpc    : _KDPC

       +0x3b0 PowerRetryTimer  : _KTIMER

       +0x3f0 PowerContext     : _CLASS_POWER_CONTEXT

       +0x478 PrivateFdoData   : 0xfffffa80`320bc010 _CLASS_PRIVATE_FDO_DATA

       +0x480 Reserved2        : 0

       +0x488 Reserved3        : 0

       +0x490 Reserved4        : 0

     

    The information about requests is stored in the PrivateFdoData .

    2: kd> dt 0xfffffa80`320bc010 _CLASS_PRIVATE_FDO_DATA

    CLASSPNP!_CLASS_PRIVATE_FDO_DATA

       +0x000 SqmData          : 0x62a05

       +0x008 TrackingFlags    : 0

       +0x00c UpdateDiskPropertiesWorkItemActive : 0

       +0x010 LocalMinWorkingSetTransferPackets : 0x200

       +0x014 LocalMaxWorkingSetTransferPackets : 0x800

       +0x018 AllFdosListEntry : _LIST_ENTRY [ 0xfffffa80`320be028 - 0xfffffa80`320b8028 ]

       +0x028 Perf             : <unnamed-tag>

       +0x038 HackFlags        : 0

       +0x040 HotplugInfo      : _STORAGE_HOTPLUG_INFO

       +0x048 Retry            : <unnamed-tag>

       +0x0f0 TimerInitialized : 0 ''

       +0x0f1 LoggedTURFailureSinceLastIO : 0 ''

       +0x0f2 LoggedSYNCFailure : 0 ''

       +0x0f3 ReleaseQueueIrpAllocated : 0x1 ''

       +0x0f8 ReleaseQueueIrp  : 0xfffffa80`320bcc40 _IRP

       +0x100 AllTransferPacketsList : _LIST_ENTRY [ 0xfffffa80`320bbe60 - 0xfffffa80`4ed53d10 ]

       +0x110 FreeTransferPacketsList : _SLIST_HEADER

       +0x120 NumFreeTransferPackets : 0xff

       +0x124 NumTotalTransferPackets : 0x100

       +0x128 DbgPeakNumTransferPackets : 0x100

       +0x130 DeferredClientIrpList : _LIST_ENTRY [ 0xfffffa80`320bc140 - 0xfffffa80`320bc140 ]

       +0x140 HwMaxXferLen     : 0x80000

       +0x148 SrbTemplate      : _SCSI_REQUEST_BLOCK

       +0x1a0 SpinLock         : 0

       +0x1a8 LastKnownDriveCapacityData : _READ_CAPACITY_DATA_EX

       +0x1b4 IsCachedDriveCapDataValid : 0x1 ''

       +0x1b8 ErrorLogNextIndex : 6

       +0x1c0 ErrorLogs        : [16] _CLASS_ERROR_LOG_DATA

       +0x9c0 NumHighPriorityPagingIo : 0

       +0x9c4 MaxInterleavedNormalIo : 0

       +0x9c8 ThrottleStartTime : _LARGE_INTEGER 0x0

       +0x9d0 ThrottleStopTime : _LARGE_INTEGER 0x0

       +0x9d8 LongestThrottlePeriod : _LARGE_INTEGER 0x0

       +0x9e0 IdlePrioritySupported : 0x1 ''

       +0x9e8 IdleListLock     : 0

       +0x9f0 IdleIrpList      : _LIST_ENTRY [ 0xfffffa80`320bca00 - 0xfffffa80`320bca00 ]

       +0xa00 IdleTimer        : _KTIMER

       +0xa40 IdleDpc          : _KDPC

       +0xa80 IdleTimerInterval : 0x19

       +0xa82 StarvationCount  : 0x14

       +0xa84 IdleTimerTicks   : 0

       +0xa88 IdleTicks        : 0

       +0xa8c IdleIoCount      : 0

       +0xa90 IdleTimerStarted : 0n0

       +0xa98 LastIoTime       : _LARGE_INTEGER 0x1cc8bde`4f571cca

       +0xaa0 ActiveIoCount    : 0n1

       +0xaa4 ActiveIdleIoCount : 0n0

       +0xaa8 InterpretSenseInfo : (null)

       +0xab0 MaxPowerOperationRetryCount : 0

       +0xab8 PowerProcessIrp  : 0xfffffa80`320bd010 _IRP

       +0xac0 PerfCounterFrequency : _LARGE_INTEGER 0x23c3c4

     

    The outstanding requests are stored in the AllTransferPacketsList.  Classpnp uses a transfer packet to send the request to the lower level drivers.  This allows the request to be split into smaller packets if necessary, and for the request to be retried if there is a failure.

     

    We can dump the AllTransferPacketsList with !list and then search for our irp, it will be in the OriginalIrp field of one of the transfer packets.  Note that the output from dt will displayed with a `, while the output from !thread does not, so you will need to add a ` when searching through the !list output.  Also, there may be multiple transfer packets with the same OriginalIrp.

     

    2: kd> !list "-t classpnp!_TRANSFER_PACKET.AllPktsListEntry.Flink -e -x \"??@$extret; dt classpnp!_TRANSFER_PACKET @$extret\" 0xfffffa80`320bbe60"

    ??@$extret; dt classpnp!_TRANSFER_PACKET @$extret

    unsigned int64 0xfffffa80`399ad5e0

       +0x000 AllPktsListEntry : _LIST_ENTRY [ 0xfffffa80`3bae2b40 - 0xfffffa80`3bc7cb30 ]

       +0x010 SlistEntry       : _SLIST_ENTRY

       +0x020 Irp              : 0xfffffa80`3bb71570 _IRP

       +0x028 Fdo              : 0xfffffa80`32052060 _DEVICE_OBJECT

       +0x030 OriginalIrp      : 0xfffffa80`4f379440 _IRP

       +0x038 CompleteOriginalIrpWhenLastPacketCompletes : 0x1 ''

       +0x03c NumRetries       : 8

       +0x040 RetryTimer       : _KTIMER

       +0x080 RetryTimerDPC    : _KDPC

       +0x0c0 RetryIn100nsUnits : 0n0

       +0x0c8 SyncEventPtr     : (null)

       +0x0d0 DriverUsesStartIO : 0 ''

       +0x0d1 InLowMemRetry    : 0 ''

       +0x0d8 LowMemRetry_remainingBufPtr : (null)

       +0x0e0 LowMemRetry_remainingBufLen : 0

       +0x0e8 LowMemRetry_nextChunkTargetLocation : _LARGE_INTEGER 0x0

       +0x0f0 BufPtrCopy       : 0xfffffa80`40d79000  "RCRD("

       +0x0f8 BufLenCopy       : 0x1000

       +0x100 TargetLocationCopy : _LARGE_INTEGER 0xb5f8a000

       +0x108 SrbErrorSenseData : _SENSE_DATA

       +0x120 Srb              : _SCSI_REQUEST_BLOCK

       +0x178 UsePartialMdl    : 0 ''

       +0x180 PartialMdl       : 0xfffffa80`3bfda010 _MDL

       +0x188 RetryHistory     : (null)

       +0x190 RequestStartTime : 0

     

    Now we can view the irp that classpnp sent to the lower level drivers and determine what it is doing.

     

    2: kd> !irp 0xfffffa80`3bb71570

    Irp is active with 3 stacks 3 is current (= 0xfffffa80`3bb71688)

     Mdl=fffffa803c854090: No System Buffer: Thread 00000000:  Irp stack trace.  Pending has been returned

         cmd  flg cl Device   File     Completion-Context

     [  0, 0]   0  2 00000000 00000000 00000000-00000000

     

                            Args: 00000000 00000000 00000000 ffffffffc0000185

    >[  f, 0]  1c  0 fffffa8031dcc060 00000000 fffff8800107d1a0-fffffa80413ec4c0

                   \Driver\elxstor  mpio!MPIOPdoCompletion

                            Args: fffffa80399ad700 00000000 00000000 fffffa80413ec4c0

     [  f, 0]  1c e1 fffffa8031dcc060 00000000 fffff88001d61a00-fffffa80399ad5e0 Success Error Cancel pending

                   \Driver\mpio     CLASSPNP!TransferPktComplete

                            Args: fffffa80399ad700 00000000 00000000 fffffa80413ec4c0

     

    We can see that the request has been sent to the disk driver.  More specifically the request has been sent to the storport miniport driver elxstor.  From this data we can usually assume that the request has been sent to the disk drive and we are waiting for the disk to respond.  There may be conditions where the request is stuck in storport, or in the miniport, however those conditions are beyond the scope of this article.

     

    As you can see, there are several drivers between the disk.sys mini class driver and the actual physical disk drive.  It is often necessary to determine how far down the storage driver stack a request has been before you can determine where it is stuck.

  • Ntdebugging Blog

    Call Stacks for Pool Allocations

    • 0 Comments

    Hello, it's the Debug Ninja back again for another NtDebugging Blog article.  For as long as I can remember user mode debuggers have had an easy way to get call stacks for heap allocations.  On more recent versions of Windows this has been as simple as using gflags +ust and umdh or !heap -k.  Kernel debuggers have not always had an easy way to determine who allocated a pool block.  Sure, we have pool tags to help us out, but often a programmer will use the same tag in many places and devalue this as a troubleshooting technique.

     

    Fortunately, starting in Windows Vista and Server 2008, kernel debuggers can get call stacks from pool allocations.  We can even get call stacks from pool frees.  This little known technique is not quite as useful as gflags +ust is for heap, but when it is needed it is very useful.

     

    First, you need to turn on special pool using driver verifier.  Verifier will obtain and track the call stack for the allocation and the free, so this technique will not work with traditional special pool as documented in KB188831 because those settings do not use driver verifier.  Because special pool requires additional memory overhead to run, this technique is not valuable for large memory leaks.  However, this technique is a good way to determine what code allocated or freed your pool block in other conditions.  For example, this works well if you find that pool has been freed when you expected it to be allocated.  This also works for smaller memory leaks, especially those for which you can easily reproduce the leak.  Analyzing the allocations and stacks for a leak must be done by hand, as there is no umdh-like tool for kernel mode.

     

    Step 1 - Turning on verifier

    In this example I am using Sysinternals’ notmyfault tool to generate the pool allocations.  Because I know the driver in question I set verifier to only monitor that driver.  Note that a reboot is required to make this setting take effect.

     

    Verifier /flags 1 /driver myfault.sys

     

    Step 2 - Finding the pool allocation to analyze

    For this example I am going to find the call stack of a leaked pool allocation.  First find the tag that is using the most pool by using !poolused.

     

    kd> !poolused 4

       Sorting by  Paged Pool Consumed

     

      Pool Used:

                NonPaged            Paged

     Tag    Allocs     Used    Allocs     Used

    Leak        0        0        23 23552000 UNKNOWN pooltag 'Leak', please update pooltag.txt

    CM31        0        0     20520 18514560 Internal Configuration manager allocations , Binary: nt!cm

    CIcr        0        0      2977  8511504 Code Integrity allocations for image integrity checking , Binary: ci.dll

     

    Next find the pool allocations for that tag with !poolfind.  There is some guessing to be done with all pool leak debugging techniques; you can’t be sure that the allocation you’re looking at has really been leaked and is not just in a state where it has not yet been freed.  You need to make an educated guess because there is no umdh-type functionality to analyze allocates and frees.  If you have the benefit of a live debug you can go the debugger and check back later to see if the memory has been freed or not.

     

    kd> !poolfind Leak

     

    Scanning large pool allocation table for Tag: Leak (fffffa8002e00000 : fffffa8002f80000)

     

    *fffff8a006a00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0058fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006200000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0068fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0060fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005a00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006c00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006400000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0062fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005afa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005c00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006e00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006600000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0064fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005cfa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006afa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005e00000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006800000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a0066fa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005efa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006cfa000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a006000000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

    *fffff8a005800000 :large page allocation, Tag  is Leak, size  is 0xfa000 bytes

     

    Step 3 – Dump the call stack for the allocation

    This step is the easy one.  Once you have the address of the allocation use !verifier 0x80 Address.  If you were interested in all of the call stacks in the log you can run !verifier 0x80 without the Address parameter.

     

    kd> !verifier 0x80 fffff8a005800000

     

    Log of recent kernel pool Allocate and Free operations:

     

    There are up to 0x10000 entries in the log.

     

    Parsing 0x0000000000010000 log entries, searching for address 0xfffff8a005800000.

     

     

    ======================================================================

    Pool block fffff8a005800000, Size 00000000000fa000, Thread fffffa8002be4060

    fffff80001923cc6 nt!VeAllocatePoolWithTagPriority+0x2b6

    fffff80001923d3d nt!VerifierExAllocatePoolEx+0x1d

    fffff880042881f6 myfault+0x11f6

    fffff8800428842f myfault+0x142f

    fffff8000192e750 nt!IovCallDriver+0xa0

    fffff800017a3a97 nt!IopXxxControlFile+0x607

    fffff800017a42f6 nt!NtDeviceIoControlFile+0x56

    fffff80001487ed3 nt!KiSystemServiceCopyEnd+0x13

    ======================================================================

    Pool block fffff8a005800000, Size 0000000000001000, Thread fffffa8002187060

    fffff8000192393a nt!VfFreePoolNotification+0x4a

    fffff800015b6a6f nt!ExDeferredFreePool+0x107b

    fffff800017273eb nt!HvFreeDirtyData+0x7f

    fffff800017269bb nt!HvOptimizedSyncHive+0x53

    fffff80001726303 nt!CmFlushKey+0xaf

    fffff80001726b22 nt!NtFlushKey+0x142

    fffff80001487ed3 nt!KiSystemServiceCopyEnd+0x13

    Parsed entry 0000000000010000/0000000000010000...

    Finished parsing all pool tracking information.

     

    Keep in mind that the log may contain allocate and free information that predates the current use of the pool block, and that the log is of a fixed size so eventually old data will fall off the end.  The most recent use of the pool will be at the top of the output.  Usually this is the stack at the top of the output is what you are interested in, I have highlighted the relevant call stack in red.  In this instance we can see that the pool was most recently allocated by myfault.sys.

     

    Sometimes it is useful to have historical information about previous uses of the pool block such as when dealing with pool that was improperly freed.  In that scenario the most recent call stack may be from an allocate call when the pool block was reused by the memory manager and so you may need to go down several levels to find out where the pool was improperly freed.

     

    For more information on using !verifier you can refer to the debugger help in MSDN, http://msdn.microsoft.com/en-us/library/ff565591.aspx.

Page 6 of 24 (238 items) «45678»