• Ntdebugging Blog

    Use Caution When Implementing IPC for Performance Counters

    • 0 Comments

     

    Recently I was working with a developer who had created performance counters that work in Performance Monitor but are never collected in a user defined data collector set.  The customer explained that their counters update named shared memory inside the application which should be read by perfmon or the data collector set.

     

    Putting counter data in shared memory is a common technique for performance counter developers.  A programmer can update performance data in a block of shared memory in their application and then use a performance extension dll (aka an “extensible counter”) to read from the shared memory.

     

    Shared memory is created by calling CreateFileMapping and MapViewOfFile.  This memory is then accessed by another application by calling OpenFileMapping.  All applications which use this shared memory must pass the same lpName to CreateFileMapping or OpenFileMapping.  An example of using these APIs to implement shared memory is available on MSDN.

     

    Based on the customer’s explanation that they are populating shared memory in their application, and their counters work in Performance Monitor but do not work in a user defined data collector set, I suspected that OpenFileMapping was failing for the data collector set.

     

    User defined data collector sets run in a rundll32.exe process.  If you have multiple rundll32.exe processes you may need to identify which one is related to your data collector set.  The relevant process has a command line similar to “rundll32.exe C:\Windows\system32\pla.dll,PlaHost”.  There are several tools that can be used to identify the command line of the process such as tlist.exe, which is included with the Debugging Tools for Windows.

     

    After attaching a debugger to rundll32.exe, I wanted to break on the ret instruction at the end of the OpenFileMappingW function.  This would allow me to determine if the function succeeds or fails.  According to MSDN “If the function fails, the return value is NULL. To get extended error information, call GetLastError.”

     

    The uf command is an easy way to unassemble a function and find the ret instruction to break on.

     

    0:001> uf kernelbase!OpenFileMappingW

    kernelbase!OpenFileMappingW:

    75b88e0d 8bff            mov     edi,edi

    <snip>

    kernelbase!OpenFileMappingW+0x8f:

    75b88e79 c9              leave

    75b88e7a c20c00          ret     0Ch

    0:001> bp 75b88e7a

    0:001> g

    Breakpoint 0 hit

    eax=00000000 ebx=00008022 ecx=7ffd8000 edx=00000002 esi=05abf03c edi=00000000

    eip=75b88e7a esp=05abeb20 ebp=05abeb3c iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    kernelbase!OpenFileMappingW+0x90:

    75b88e7a c20c00          ret     0Ch

     

    In the above output we can see that the eax register is NULL, indicating that the call to OpenFileMapping failed.  The !gle command will show the last error and last status.

     

    0:008> !gle

    LastErrorValue: (Win32) 0x2 (2) - The system cannot find the file specified.

    LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.

     

    The failure is that OpenFileMapping cannot find the file.  The file name is the third parameter to OpenFileMapping.  We can get the first three parameters from the kb command.

     

    0:008> kb

    ChildEBP RetAddr  Args to Child             

    05abf0d0 6abae355 0002001f 00000000 05abeb7c kernelbase!OpenFileMappingW+0x90

    WARNING: Stack unwind information not available. Following frames may be wrong.

    05abf0f8 7784fe67 02a7ae90 05abf224 05abf254 ninjaprf+0x10edb

    05abf110 7784fc97 00472158 02a7ae90 05abf224 advapi32!CallExtObj+0x17

    05abf270 7784efaf 05abf2bc 60fcfa02 05abf778 advapi32!QueryExtensibleData+0x735

    05abf654 75ff0468 80000004 05abf778 00000000 advapi32!PerfRegQueryValue+0x5da

    05abf748 75ffd505 80000004 05abf778 05abf790 kernel32!LocalBaseRegQueryValue+0x366

    05abf7b4 61247dc5 80000004 02a7ae90 00000000 kernel32!RegQueryValueExW+0xb7

    05abf830 61250595 80000004 02a7ae58 02a7ae90 pdh!GetSystemPerfData+0x92

    05abf89c 6124c753 02a407d0 05abf8e8 61241928 pdh!GetQueryPerfData+0xa4

    05abf8b8 61254463 02a407d0 05abf8e8 60fcf32f pdh!PdhiCollectQueryData+0x32

    05abf90c 611c6d04 02a58f08 00000000 75ffc3e0 pdh!PdhUpdateLogW+0xa2

    05abf9bc 611be128 0045c968 00000000 00000000 pla!HPerformanceCounterDataCollector::Cycle+0x48

    05abf9bc 00000000 0045c968 00000000 00000000 pla!PlaiCollectorControl+0x3b7

    0:008> da 05abeb7c

    05abeb7c  "Local\NINJAPERF_S-1-5-18"

     

    The user defined data collector set is failing to open the file "Local\NINJAPERF_S-1-5-18".  This is the name that the performance extension dll ninjaprf.dll has given to its shared memory.

     

    Based on the customer’s description this operation works in Performance Monitor.  Next, I attached a debugger to perfmon and set the same breakpoint.

     

    Breakpoint 0 hit

    eax=000009f8 ebx=00008022 ecx=a7330000 edx=080ee678 esi=06798070 edi=00000000

    eip=760be9bb esp=0a84e564 ebp=0a84e580 iopl=0         nv up ei pl zr na pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    kernelbase!OpenFileMappingW+0x90:

    760be9bb c20c00          ret     0Ch

     

    In the above output we can see that eax is a handle number, indicating that the function succeeded.

     

    Looking at the file being opened we can see why this works in one scenario and not in another.

     

    0:016> kb 1

    ChildEBP RetAddr  Args to Child             

    0a84e580 698e4ab9 0002001f 00000000 0a84e5c0 kernelbase!OpenFileMappingW+0x90

    0:016> da 0a84e5c0

    0a84e5c0  "Local\NINJAPERF_S-1-5-21-123578"

    0a84e5e0  "095-571698237-1598563147-18961"

     

    The file name used is unique for each user.  The ninjaprf dll has chosen a file name which includes the SID of the current user.  This works for Performance Monitor because the user who starts the application is the same as the user who runs Performance Monitor.  However, a user defined data collector set runs in rundll32.exe as the Local System account.

     

    It is possible to run the user defined data collector set as a different user, however the file name being used will not work in that scenario either.  By using the “Local\” prefix, the file is created in the local session namespace.  The application runs in the user’s session, while rundll32.exe is started by the Task Scheduler service and runs in session 0.  This prevents the user defined data collector set from seeing the file created by the application.

     

    If your performance counter uses shared memory to communicate, be aware that your performance extension dll may be run in the context of a different user account and a different user session.  Your inter process communication techniques must account for this, or your customers will ask you why your counters do not work outside of Performance Monitor.

  • Ntdebugging Blog

    Determining the source of Bug Check 0x133 (DPC_WATCHDOG_VIOLATION) errors on Windows Server 2012

    • 2 Comments

    What is a bug check 0x133?

    Starting in Windows Server 2012, a DPC watchdog timer is enabled which will bug check a system if too much time is spent in DPC routines. This bug check was added to help identify drivers that are deadlocked or misbehaving.  The bug check is of type "DPC_WATCHDOG_VIOLATION" and has a code of 0x133.  (Windows 7 also included a DPC watchdog but by default, it only took action when a kernel debugger was attached to the system.)  A description of DPC routines can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/ff544084(v=vs.85).aspx.

     

    The DPC_WATCHDOG_VIOLATION bug check can be triggered in two ways. First, if a single DPC exceeds a specified number of ticks, the system will stop with 0x133 with parameter 1 of the bug check set to 0.  In this case, the system's time limit for single DPC will be in parameter 3, with the number of ticks taken by this DPC in parameter 2.  Alternatively, if the system exceeds a larger timeout of time spent cumulatively in all DPCs since the IRQL was raised to DPC level, the system will stop with a 0x133 with parameter 1 set to 1.  Microsoft recommends that DPCs should not run longer than 100 microseconds and ISRs should not run longer than 25 microseconds, however the actual timeout values on the system are set much higher.

     

    How to debug a 0x133 (0, …

    In the case of a stop 0x133 with the first parameter set to 0, the call stack should contain the offending driver.  For example, here is a debug of a 0x133 (0,…) kernel dump:

     

    0: kd> .bugcheck

    Bugcheck code 00000133

    Arguments 00000000`00000000 00000000`00000283 00000000`00000282 00000000`00000000  

     

    Per MSDN, we know that this DPC has run for 0x283 ticks, when the limit was 0x282.

     

    0: kd> k

    Child-SP          RetAddr           Call Site

    fffff803`08c18428 fffff803`098525df nt!KeBugCheckEx

    fffff803`08c18430 fffff803`09723f11 nt! ??::FNODOBFM::`string'+0x13ba4

    fffff803`08c184b0 fffff803`09724d98 nt!KeUpdateRunTime+0x51

    fffff803`08c184e0 fffff803`09634eba nt!KeUpdateTime+0x3f9

    fffff803`08c186d0 fffff803`096f24ae hal!HalpTimerClockInterrupt+0x86

    fffff803`08c18700 fffff803`0963dba2 nt!KiInterruptDispatchLBControl+0x1ce

    fffff803`08c18898 fffff803`096300d0 hal!HalpTscQueryCounter+0x2

    fffff803`08c188a0 fffff880`04be3409 hal!HalpTimerStallExecutionProcessor+0x131

    fffff803`08c18930 fffff880`011202ee ECHO!EchoEvtTimerFunc+0x7d                //Here is our driver, and we can see it calls into StallExecutionProcessor

    fffff803`08c18960 fffff803`097258b4 Wdf01000!FxTimer::TimerHandler+0x92

    fffff803`08c189a0 fffff803`09725ed5 nt!KiProcessExpiredTimerList+0x214

    fffff803`08c18ae0 fffff803`09725d88 nt!KiExpireTimerTable+0xa9

    fffff803`08c18b80 fffff803`0971fe76 nt!KiTimerExpiration+0xc8

    fffff803`08c18c30 fffff803`0972457a nt!KiRetireDpcList+0x1f6

    fffff803`08c18da0 00000000`00000000 nt!KiIdleLoop+0x5a

     

    Let’s view the driver’s unassembled DPC routine and see what it is doing

     

    0: kd> ub fffff880`04be3409

    ECHO!EchoEvtTimerFunc+0x54:

    fffff880`04be33e0 448b4320        mov     r8d,dword ptr[rbx+20h]

    fffff880`04be33e4 488b0d6d2a0000  mov     rcx,qword ptr [ECHO!WdfDriverGlobals (fffff880`04be5e58)]

    fffff880`04be33eb 4883631800      and     qword ptr [rbx+18h],0

    fffff880`04be33f0 488bd7          mov     rdx,rdi

    fffff880`04be33f3 ff150f260000    call    qword ptr [ECHO!WdfFunctions+0x838(fffff880`04be5a08)]

    fffff880`04be33f9 bbc0d40100      mov     ebx,1D4C0h

    fffff880`04be33fe b964000000      mov     ecx,64h

    fffff880`04be3403 ff15f70b0000    call    qword ptr[ECHO!_imp_KeStallExecutionProcessor (fffff880`04be4000)]   //Its Calling KeStallExecutionProcessor with 0x64 (decimal 100) as a parameter

    0: kd> u fffff880`04be3409

    ECHO!EchoEvtTimerFunc+0x7d:

    fffff880`04be3409 4883eb01        sub     rbx,1

    fffff880`04be340d 75ef            jne     ECHO!EchoEvtTimerFunc+0x72 (fffff880`04be33fe)     //Here we can see it is jumping back to call KeStallExecutionProcessor in a loop

    fffff880`04be340f 488b5c2430      mov     rbx,qword ptr[rsp+30h]

    fffff880`04be3414 4883c420        add     rsp,20h

    fffff880`04be3418 5f              pop     rdi

    fffff880`04be3419 c3              ret

    fffff880`04be341a cc              int     3

    fffff880`04be341b cc              int     3

     

    0: kd> !pcr

    KPCR for Processor 0 at fffff80309974000:

        Major 1 Minor 1

          NtTib.ExceptionList: fffff80308c11000

              NtTib.StackBase: fffff80308c12080

             NtTib.StackLimit: 000000d70c7bf988

           NtTib.SubSystemTib: fffff80309974000

                NtTib.Version: 0000000009974180

            NtTib.UserPointer: fffff803099747f0

                NtTib.SelfTib: 000007f7ab80c000

     

                      SelfPcr: 0000000000000000

                         Prcb: fffff80309974180

                         Irql: 0000000000000000

                          IRR: 0000000000000000

                          IDR: 0000000000000000

                InterruptMode: 0000000000000000

                          IDT: 0000000000000000

                          GDT: 0000000000000000

                          TSS: 0000000000000000

     

                CurrentThread: fffff803099ce880

                   NextThread: fffffa800261cb00

                   IdleThread: fffff803099ce880

     

                    DpcQueue:  0xfffffa80020ce790 0xfffff880012e4e9c [Normal] NDIS!NdisReturnNetBufferLists

                               0xfffffa800185f118 0xfffff88000c0ca00 [Normal] ataport!AtaPortInitialize

                               0xfffff8030994fda0 0xfffff8030972bc30 [Normal] nt!KiBalanceSetManagerDeferredRoutine

                               0xfffffa8001dbc118 0xfffff88000c0ca00 [Normal] ataport!AtaPortInitialize

                               0xfffffa8002082300 0xfffff88001701df0 [Normal] USBPORT

     

    The !pcr output shows us queued DPCs for this processor. If you want to see more information about DPCs and the DPC Watchdog, you could dump the PRCB listed in the !pcr output like this:

     

    dt nt!_KPRCB fffff80309974180 Dpc*

     

    Often the driver will be calling into a function like KeStallExecutionProcessor in a loop, as in our example debug.  To resolve this problem, contact the driver vendor to request an updated driver version that spends less time in its DPC Routine.

     

    How to troubleshoot a 0x133 (1, …

    Determining the cause of a stop 0x133 with a first parameter of 1 is a bit more difficult because the problem is a result of DPCs running from multiple drivers, so the call stack is insufficient to determine the culprit.  To troubleshoot this stop, first make sure that the NT Kernel Logger or Circular Kernel Context Logger ETW traces are enabled on the system.  (For directions on setting this up, see http://blogs.msdn.com/b/ntdebugging/archive/2009/12/11/test.aspx.)

     

    Once the logging is enabled and the system bug checks, dump out the list of ETW loggers using !wmitrace.strdump. Find the ID of the NT Kernel logger or the Circular logger.  You can then use !wmitrace.logsave (ID) (path to ETL) to write out the ETL log to a file.  Load it up with Windows Performance Analyzer and add the DPC or DPC/ISR Duration by Module, Function view (located in the Computation group) to your current analysis window:

     

     

    Next, make sure the table is also shown by clicking the box in the upper right of the view:

     

     

    Ensure that the Address column is added on the left of the gold bar, then expand each address entry to see individual DPC enters/exits for each function.  Using this data, you can determine which DPC routines took the longest by looking at the inclusive duration column, which should be added to the right of the gold bar: 

     

    In this case, these DPCs took 1 second, which is well over the recommended maximum of 100 us.  The module column (and possible the function column, if you have symbols) will show which driver is responsible for that DPC routine.  Since our ECHO driver was based on WDF, that is the module named here.

     

    For an example of doing this type of analysis in xperf, see http://blogs.msdn.com/b/ntdebugging/archive/2008/04/03/windows-performance-toolkit-xperf.aspx.

     

    More Information

    For additional information about Stop 0x133 errors, see this page on MSDN: http://msdn.microsoft.com/en-us/library/windows/hardware/jj154556(v=vs.85).aspx.

     

    For DPC timing recommendations and for advice on capturing DPC timing information using tracelog, see http://msdn.microsoft.com/en-us/library/windows/hardware/ff545764(v=vs.85).aspx.

     

    Guidelines for writing DPC routines can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/ff546551(v=vs.85).aspx.

     

     

    -Matt Burrough

Page 1 of 1 (2 items)