• Ntdebugging Blog

    Virtual Machine Managment Hangs on Windows Server 2012 R2 Hyper-V Host

    • 0 Comments

    Hi, my name is Christian Sträßner from the Global Escalation Services team based in Munich, Germany. Today we will look at a hang scenario that involves user and kernel dump analysis.  Also, we will demonstrate how to analyze a hang from both user and kernel modes without forcing a crash dump of the machine.

     

    With the increasing popularity of virtualization using Hyper-V it becomes more unrealistic to get a full memory dump of a Hyper-V host due to the increasing amount of RAM installed into those machines.  Fortunately there are sometimes alternative tools available to debug these servers.

     

    Customer Scenario for this particular investigation:

    The customer was unable to execute any kind of management activity against virtual machines running on a Windows Server 2012 R2 Hyper-V host.  Already running Virtual Machines were operating fine.

     

    We discovered that the get-vm powershell cmdlet hung forever when executed on the host. Also, any kind of WMI operation executing management activities against the virtual machines hung forever or produced errors.

     

    Tools used to gather data on the affected host:

    Task Manager

    Livekd.exe

     

    Since Management operations against VMs involve the Service “Hyper-V Virtual Machine Management” (VMMS.EXE) we first take a look into that Service. 

     

    So we first create a User mode memory dump from that process.  To do so, simply open task manager and use the “create dump” option that is available when you right click on a process in the details tab view.

    image001

     

    Tip: you can copy and paste the path from the following message box:

    image002

     

    Save the dump and copy it to a workstation that has windbg installed.

     

    Analyzing the Usermode Dump:

    After opening the dump and setting the symbol path if required, check the number of threads in the process:

    0:020> ~

    #  0  Id: c08.c0c Suspend: 0 Teb: 00007ff7`765ae000 Unfrozen

       1  Id: c08.c34 Suspend: 0 Teb: 00007ff7`765a9000 Unfrozen

    <snip>

    181  Id: c08.36a0 Suspend: 0 Teb: 00007ff7`76360000 Unfrozen

    182  Id: c08.4c10 Suspend: 0 Teb: 00007ff7`7633a000 Unfrozen

     

    The process has 183 threads. This is high, but we cannot judge if this is abnormal or not purely by that number.  Looking at another Hyper-V server with similar load might provide an idea if 183 threads is a lot.

     

    Next we take a look into what these threads are doing using the command ~*kcn.  The output is very long, save it to a .txt file either via copy + paste or using windbg’s .logopen functionality.

     

    Looking at the text file in your favorite editor, we quickly see that there are 104 threads that have the function RtlpWaitOnCriticalSection on the stack.

     

    Next we use the !cs command to analyze the critical sections in that process.  The output shows 1 critical section in state LOCKED and the owning thread is 43d8.

    DebugInfo          = 0x00000092cfd20960

    Critical section   = 0x00000092cf31fb38 (+0x92CF31FB38)

    LOCKED

    LockCount          = 0x68

    WaiterWoken        = No

    OwningThread       = 0x00000000000043d8

    RecursionCount     = 0x1

    LockSemaphore      = 0x1B48

    SpinCount          = 0x00000000020007d0

     

    To verify the output above is a valid critical section, you can dump the DebugInfo member:

     

    0:000> dt _RTL_CRITICAL_SECTION_DEBUG 0x00000092cfd20960

    vmms!_RTL_CRITICAL_SECTION_DEBUG

      +0x000 Type             : 0

      +0x002 CreatorBackTraceIndex : 0

      +0x008 CriticalSection  : 0x00000092`cf31fb38 _RTL_CRITICAL_SECTION

      +0x010 ProcessLocksList : _LIST_ENTRY [ 0x00000092`d2fc01a0 - 0x00000092`d31a0e20 ]

      +0x020 EntryCount       : 0

      +0x024 ContentionCount  : 0xa27

      +0x028 Flags            : 0

      +0x02c CreatorBackTraceIndexHigh : 0

      +0x02e SpareUSHORT      : 0

     

    The CriticalSection member should point back to the Critical Section structure we looked at earlier.

     

    From the output of the ~ command run earlier we find owning thread 43d8 is thread 20 in the dump:

    . 20  Id: c08.43d8 Suspend: 0 Teb: 00007ff7`76460000 Unfrozen

     

    Switch to thread 20 and display the stack of the critical section owner:

    0:020> ~20s

    ntdll!NtDeviceIoControlFile+0xa:

    00007fff`97c3124a c3              ret

    0:020> kcn

    # Call Site

    00 ntdll!NtDeviceIoControlFile

    01 nsi!NsiAllocateAndGetTable

    02 IPHLPAPI!InternalGetIfTable2Ex

    03 vmms!VmNetworkInterfaceTable::Initialize

    04 vmms!NvspWmiINetCfgHelper::EnumerateNetworkAdapters

    05 vmms!NetConfigHelper::GetNetworkBridgeAdapter

    06 vmms!NvspWmiUtility::GetBridgedAdapterId

    07 vmms!VmEthernetSwitchPort::GetLogicalAdapterInternal

    08 vmms!VmEthernetSwitchPort::GetConnectedLogicalAdapter

    09 vmms!ExternalConnectionStatusRetriever::GetExternalSwitchAdapter

    0a vmms!ExternalConnectionStatusRetriever::GetExternalSwitchAdapterStatusInternal

    0b vmms!ExternalConnectionStatusRetriever::GetSwitchStatus

    0c vmms!VmEthernetConnectionProvider::GetSwitchConnectionStatus

    0d vmms!VmmsRealizedVirtualMachine::VerifyExternalSwitches

    0e vmms!VmmsClusterManager::IsVmHealthy

    0f vmms!ComVmmsClusterManager::IsVmHealthy

    10 rpcrt4!Invoke

    11 rpcrt4!Ndr64StubWorker

    12 rpcrt4!NdrStubCall3

    <snip>

     

    Reading the stack from bottom to top we can make a fairly educated guess what’s going on.  A function called IsVmHealthy is invoked through a RPC call (frame 10->0xf).

     

    Note:

    Typically, seeing these RPC calls at the bottom of a stack means that this thread is currently serving an incoming RPC call.

     

    We then execute multiple functions and end up enumerating the network adapters on the machine (frame 4).  In frame 0 we call ntdll!NtDeviceIoControlFile and transition to kernel mode.

     

    Here our journey in user mode ends. Since we only have a user mode dump from the vmms process, we cannot tell what’s going on in the kernel part of that thread.

     

    The Kernel Side:

    Ideally the next step would be to create a full memory dump of the machine to inspect both user mode and kernel in one piece of data. As mentioned before, this is sometimes not an option on Hyper-V host machines due to the amount of physical memory installed in these machines.

     

    This leaves us with a kernel dump. Creating a kernel dump requires inducing a crash (aka “bluescreen”) of the machine which would take down the host and all VMs running.  We may need to wait for a maintenance window to execute those actions.

     

    Live migrating the running VMs to a different host will not work because the VMMS process is hung.

     

    So we move to LiveKD.exe, a Sysinternals Tool that has existed for a long time.  In earlier versions of LiveKD the consistency of the memory dumps captured was not good.  Windows Server 2012 R2 and Windows 8.1 enable the new LiveKD -ML functionality. With the -ML switch, LiveKD asks the Operating System to create a Mirror Dump without crashing the machine.  This captures the kernel memory in a way that the dump content is consistent (call it a snapshot).

     

    Copy Livekd.exe to c:\temp.  Create an empty text file called KD.exe in the same directory.  This works around a limitation in LiveKD.exe which always checks for the existence of kd.exe even when it’s not required for the -ML option.

     

    Command to create the dump:

    livekd.exe -ml -o c:\temp\kerneldump.dmp

     

    Once this is finished, copy the dump file to your workstation and open it in windbg.

     

    From the User mode dump we know that this our process is c08 and our thread is 43d8:

    20  Id: c08.43d8 Suspend: 0 Teb: 00007ff7`76460000 Unfrozen

     

    So let’s search for our VMMS.exe process:

    0: kd> !process 0 0 vmms.exe

    PROCESS ffffe0005c21c080

        SessionId: 0  Cid: 0c08   Peb: 7ff7765ad000  ParentCid: 0554

        DirBase: 3705e000  ObjectTable: ffffc000717c3040  HandleCount: <Data Not Accessible>

        Image: vmms.exe

     

    Switch to the Address Space of that process:

    0: kd> .process /p /r ffffe0005c21c080

    Implicit process is now ffffe000`5c21c080

    Loading User Symbols

    PEB is paged out (Peb.Ldr = 00007ff7`765ad018).  Type ".hh dbgerr001" for details

     

    The Error is expected since we only have the Kernel Address Space available.

     

    Run the following command to get the threads inside that process:

    !process 0 2 vmms.exe

     

    In the output, search for our thread’s TID 43d8

    THREAD ffffe00069cdb880  Cid 0c08.43d8 Teb: 00007ff776460000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

                ffffd001b8931ed0  NotificationEvent

     

    Switch to that thread and display the stack:

    0: kd> .thread ffffe00069cdb880 

    Implicit thread is now ffffe000`69cdb880

    0: kd> kcn

      *** Stack trace for last set context - .thread/.cxr resets it

    # Call Site

    00 nt!KiSwapContext

    01 nt!KiSwapThread

    02 nt!KiCommitThreadWait

    03 nt!KeWaitForSingleObject

    04 NDIS!ndisQuerySetMiniportEx

    05 NDIS!ndisQuerySetMiniport

    06 NDIS!ndisIfGetMiniportStatistics

    07 NDIS!ndisIfQueryFilterObject

    08 NDIS!ndisIfQueryObject

    09 NDIS!ndisNsiGetInterfaceRodInformation

    0a NDIS!ndisNsiGetAllInterfaceInformation

    0b NETIO!NsiEnumerateObjectsAllParametersEx

    0c nsiproxy!NsippEnumerateObjectsAllParameters

    0d nsiproxy!NsippDispatch

    0e nt!IopSynchronousServiceTail

    0f nt!IopXxxControlFile

    10 nt!NtDeviceIoControlFile

    11 nt!KiSystemServiceCopyEnd

     

    0: kd> !thread ffffe00069cdb880

    THREAD ffffe00069cdb880  Cid 0c08.43d8  Teb: 00007ff776460000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

       IRP List:

        ffffe801316e7ae0: (0006,0118) Flags: 00060000  Mdl: 00000000

    Not impersonating

    DeviceMap                 ffffc0006ea0c4e0

    Owning Process            ffffe0005c21c080       Image:         vmms.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3667296        Ticks: 14135 (0:00:03:40.859)

    Context Switch Count      3149           IdealProcessor: 14            

    UserTime                  00:00:00.390

     

    So this is the kernel mode stack of our thread 43d8.

     

    To recap:  Thread ffffe00069cdb880 is the owner of the locked critical section in vmms.exe.

     

    Reading the kernel stack from bottom to top, we see the call to nt!NtDeviceIoControlFile which calls through the network stack components nsiproxy.sys and netio.sys before it enters NDIS.sys.

     

    The last function called in NDIS is NDIS!ndisQuerySetMiniportEx.  After calling ndisQuerySetMiniportEx the thread sits and waits for 0:00:03:40.859.  From the name of the function we can again make a good guess what’s going on, this thread is doing a query or set of something on an NDIS Miniport.

     

    So the remaining work is simple, due to the fact that the NDIS Team ships an excellent debugger extension to investigate the state of NIC’s, Miniports, filters, etc.  Type: !ndiskd.help to get the available help.

     

    Since the function ndisQuerySetMiniportEx executes something against some miniport, we investigate the miniports next: 

    0: kd> !ndiskd.miniports

        MiniDriver         Miniport            Name                                

        ffffe8012a67eba0   ffffe8012ac921a0    Microsoft ISATAP Adapter #5

        ffffe8012a67eba0   ffffe8012aa371a0    Microsoft 6to4 Adapter

        ffffe8012a67eba0   ffffe8012a97b1a0    Microsoft ISATAP Adapter #3

        ffffe8012a67eba0   ffffe8012a9521a0    Microsoft ISATAP Adapter #4

        ffffe8012a67eba0   ffffe8012a6d21a0    Microsoft ISATAP Adapter #9

        ffffe8012a67eba0   ffffe8012a6d01a0    Microsoft ISATAP Adapter #2

        ffffe0005169b6c0   ffffe000595311a0    Microsoft Network Adapter Multiplexor Driver

        ffffe0005169b6c0   ffffe0005876c1a0    Microsoft Network Adapter Multiplexor Driver #2

        ffffe000516be020   ffffe0005161d1a0    Microsoft Failover Cluster Virtual Adapter

        ffffe000516d0ba0   ffffe0005161f1a0    WAN Miniport (PPPOE)

        ffffe0004f3e2390   ffffe000516211a0    Hyper-V Virtual Ethernet Adapter #4

        ffffe0004f3e2390   ffffe000516231a0    Hyper-V Virtual Ethernet Adapter #3

        ffffe0004f3e2390   ffffe000516251a0    Microsoft Hyper-V Network Switch Default Miniport

        ffffe000516dbba0   ffffe000516271a0    WAN Miniport (L2TP)

        ffffe0005169b6c0   ffffe000516291a0    Microsoft Network Adapter Multiplexor Default Miniport

        ffffe0004e98fba0   ffffe0005162b1a0    WAN Miniport (Network Monitor)

        ffffe000516a55e0   ffffe0005162d1a0    WAN Miniport (PPTP)

        ffffe0004f5b9ba0   ffffe000516391a0    FooBar Network Adapter #7

        ffffe0004f5b9ba0   ffffe0005163b1a0    FooBar Network Adapter #4

        ffffe0004f5b9ba0   ffffe0005161b1a0    FooBar Network Adapter #8

        ffffe0004f5b9ba0   ffffe0004f5f51a0    FooBar Network Adapter #3

        ffffe0004f5b9ba0   ffffe0004f5f71a0    FooBar Network Adapter #2

        ffffe0004f5b9ba0   ffffe0004f5f91a0    FooBar Network Adapter

        ffffe0004e98fba0   ffffe0004f5fb1a0    WAN Miniport (IPv6)

        ffffe0004f2e9ba0   ffffe0004f5fd1a0    WAN Miniport (IKEv2)

        ffffe0004f5883a0   ffffe0004f55b1a0    WAN Miniport (SSTP)

        ffffe0004f58f580   ffffe0004f55d1a0    Hyper-V Virtual Switch Extension Adapter #2

        ffffe0004f58f580   ffffe0004f55f1a0    Hyper-V Virtual Switch Extension Adapter

        ffffe0004e98fba0   ffffe0004e95c1a0    WAN Miniport (IP)

     

    Lets start “bottom to top” through the network stack beginning with the Physical NIC  to check their health state.  Inspect each miniport using !ndiskd.miniport ffffe000516391a0.

     

    Since ndiskd uses DML (Debugger Markup Language) links, you can click on each address in the miniport column until you find a miniport that shows something abnormal (different from the other miniports)

    0: kd> !ndiskd.miniport ffffe0004f5f71a0

     

     

    MINIPORT

     

         FooBar Network Adapter #2

     

        Ndis handle        ffffe0004f5f71a0

        Ndis API version   v6.40

        Adapter context    ffffe00051a00000

        Miniport driver    ffffe0004f5b9ba0 - FooBarDriver v1.0

        Network interface  ffffe0004e7c0a20

     

        Media type         802.3

        Device instance    PCI\VEN_1234&DEV_5678&SUBSYS_11112222&REV_01\9877ffe0004f5883a0  

        Device object      ffffe0004f5f7050    More information

        MAC address        12-34-56-78-9a-bc

     

     

    STATE

     

        Miniport           Running

        Device PnP         Started             Show state history

        Datapath           DIVERTED_BECAUSE_MINIPORT_RESETTING

        NBL status         NDIS_STATUS_RESET_IN_PROGRESS

        Interface          Up

        Media              Connected

        Power              D0

        References         0n173               Show detail

        User handles       2

        Automatic resets   1

        Resets requested   0

        Pending OID        OID_RECEIVE_FILTER_SET_FILTER

        Flags              BUS_MASTER, 64BIT_DMA, SG_DMA, DEFAULT_PORT_ACTIVATED,

                           RESET_IN_PROGRESS, RESTORING_FILTERS,

                           SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,

                           MEDIA_CONNECTED

        PnP flags          PM_SUPPORTED, DEVICE_POWER_ENABLED, RECEIVED_START,

                           HARDWARE_DEVICE

        More flags         PROCESSING_REQUEST, REQUEST_TIMEOUT

     

    The above output indicates this Miniport is in status of resetting.  It also is currently processing an OID.

     

    Clicking on the DML link:

        Pending OID        OID_RECEIVE_FILTER_SET_FILTER

     

    Gives us:

    0: kd> !ndiskd.oid ffffe8012c504090

     

     

    OID REQUEST

     

        OID_RECEIVE_FILTER_SET_FILTER

     

        Request type:      OID_METHOD

        Completed:         Not completed

        Cloned from:       OID_RECEIVE_FILTER_SET_FILTER

     

     

     

    DATA PAYLOAD

    <snip>

     

    Setting a Filter via the OID above is a relatively common operation and should complete very quickly. It is very unusual to see such operation “in progress” in a memory dump collected at a random point in time.

     

    Note:

    You can read more on OID’s  and specifically on OID_RECEIVE_FILTER_SET_FILTER  here:

    https://msdn.microsoft.com/en-us/library/windows/hardware/ff569795(v=vs.85).aspx

     

    So here we have reached a point where it is worthwhile to approach the vendor of the NIC or the OEM of the server.  To be more confident, we could collect another kernel dump 5 minutes later and check if that miniport is still in the same state, but just out of interest, let dig around a bit more.

     

    FooDRV.sys is the driver serving this NIC

    !ndiskd.minidriver ffffe0004f5b9ba0 -handlers

     

    HANDLERS

     

        Miniport Handler                       Function pointer   Symbol (if available)

    *** ERROR: Module load completed but symbols could not be loaded for FooDRV.sys

        InitializeHandlerEx                    fffff80032712fc0   FooDRV+3cfc0

        SetOptionsHandler                      fffff8003274bb10   FooDRV+75b10

    <snip>

     

     

    Run !stacks 2 FooDRV to search for all threads in the dump that currently have this driver on the stack. We have a hit:

    4.001114  ffffe0007510c040 ffc7d369 Blocked    nt!KiSwapContext+0x76

         nt!KiSwapThread+0x14e

         nt!KiCommitThreadWait+0x129

                                            nt!KeWaitForSingleObject+0x373

         NDIS!NdisMSleep+0x59

         FooDRV+0x8ccbb

         FooDRV+0x8bc8b

         FooDRV+0x70d79

         FooDRV+0x14696

         nt!IopProcessWorkItem+0x80

         nt!ExpWorkerThread+0x69f

         nt!PspSystemThreadStartup+0x58

         nt!KiStartSystemThread+0x16

     

    FooDRV.sys is calling NdisMSleep.  The Function name already tells it what it does, but here is the documentation, https://msdn.microsoft.com/de-de/library/ff563677(v=vs.85).aspx

     

    This is where I stopped.  With a minimum impact to the production system we captured enough data to find the problem.  The vendor of the NIC driver will fix that in a newer version of the driver and in the meantime they found a configuration workaround till that fix is ready.  Also, we found by experimenting that disabling VMQ on the Machine does seem to be a valid workaround too.

  • Ntdebugging Blog

    When Special Pool is not so Special

    • 2 Comments

    Hi Everyone.  Richard here in the UK GES team bringing you an interesting case we saw recently where Special Pool gave us some unexpected results, but ultimately still helped us track down the cause of the problem.

     

    It started when we were looking in to some Blue Screen bugchecks on Windows 7 Client builds for a customer.  We saw several different bugcheck codes (0xA, 0xD1 and 0x19 being the most common) and everything pointed towards a pool corruption.

     

    We enabled Special Pool and Pool Tracking using Driver Verifier with the following parameters...

     

    verifier /flags 0x9 /all

     

    ...and sat back, made a cup of tea, and waited for the new dumps to arrive.

     

    When they did, it looked very promising.  All were bugcheck 0xC1 similar to the following...

     

    0: kd> .bugcheck

    Bugcheck code 000000C1

    Arguments ca0b4fe0 00000521 00000020 00000021

    SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION (c1)

    Special pool has detected memory corruption. Typically the current thread's

    stack backtrace will reveal the guilty party.

    Arguments:

    Arg1: ca0b4fe0, Address trying to free.

    Arg2: 00000521, Size of the memory block, as recorded in the pool block header.

    Arg3: 00000020, Size of the memory block, as computed based on the address being freed.

    Arg4: 00000021, Caller is trying to free an incorrect Special Pool memory block.

     

    It looked as if Verifier had done its magic and we were just a stack backtrace away from solving the issue.  However, when we looked the stacks were all different, such as the two examples below...

     

    1: kd> kL

    # ChildEBP RetAddr

    00 de0c5438 82eef4a3 nt!KeBugCheckEx+0x1e

    01 de0c545c 82eeff89 nt!MiCheckSpecialPoolSlop+0x6e

    02 de0c553c 82f29b90 nt!MmFreeSpecialPool+0x15b

    03 de0c55a4 8313df90 nt!ExFreePoolWithTag+0xd6

    04 de0c55b8 8e556044 nt!VerifierExFreePoolWithTag+0x30

    05 de0c55c8 8e561814 dxgmms1!operator delete[]+0x16

    06 de0c5608 8e56203d dxgmms1!VIDMM_GLOBAL::CloseOneAllocation+0x1a0

    07 de0c5628 8e54532b dxgmms1!VIDMM_GLOBAL::CloseAllocation+0x37

    08 de0c5638 9736b4e5 dxgmms1!VidMmCloseAllocation+0x13

    09 de0c56b4 9736a9ca dxgkrnl!DXGDEVICE::DestroyAllocations+0x19a

    0a de0c56d4 97361ab9 dxgkrnl!DXGDEVICE::DestroyResource+0x4d

    0b de0c56f4 97364ede dxgkrnl!DXGDEVICE::ProcessTerminationList+0x7a

    0c de0c5904 9736761e dxgkrnl!DXGCONTEXT::Present+0x1c3

    0d de0c5c18 9f01c011 dxgkrnl!DxgkPresent+0x2dd

    0e de0c5c28 82e45d86 win32k!NtGdiDdDDIPresent+0x19

    0f de0c5c28 77d16b94 nt!KiSystemServicePostCall

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    10 001fe3f0 00000000 0x77d16b94

     

     

    0: kd> kL

    # ChildEBP RetAddr 

    00 b5a078dc 82f25e99 nt!KeBugCheckEx+0x1e

    01 b5a079cc 82f5fb90 nt!MmFreeSpecialPool+0x6c

    02 b5a07a34 83173f90 nt!ExFreePoolWithTag+0xd6

    03 b5a07a48 8dd7754b nt!VerifierExFreePoolWithTag+0x30

    04 b5a07a64 8dd7a54d Npfs!NpRemoveDataQueueEntry+0x81

    05 b5a07a84 8dd7623f Npfs!NpSetClosingPipeState+0x11f

    06 b5a07ab8 8dd762e3 Npfs!NpCommonCleanup+0xf5

    07 b5a07ad0 831724d9 Npfs!NpFsdCleanup+0x19

    08 b5a07af4 82e75072 nt!IovCallDriver+0x73

    09 b5a07b08 83071fd7 nt!IofCallDriver+0x1b

    0a b5a07b48 8306344d nt!IopCloseFile+0x2f3

    0b b5a07b94 83084a95 nt!ObpDecrementHandleCount+0x139

    0c b5a07bdc 830847d5 nt!ObpCloseHandleTableEntry+0x203

    0d b5a07c0c 83084b6f nt!ObpCloseHandle+0x7f

    0e b5a07c28 82e7bd86 nt!NtClose+0x4e

    0f b5a07c28 77736b94 nt!KiSystemServicePostCall

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    10 00defd44 00000000 0x77736b94

     

    Usually we expect to see the guilty driver in the stack, but in this case we see multiple drivers in different stacks, but all bugchecking when we try to free the allocation.

    If we dump back a little from the first parameter we can see the slop pattern, the allocation (of size 0x20) and the following guard page

     

    0: kd> dc ca0b4fe0-30

    ca0b4fb0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS<

    ca0b4fc0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4fd0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4fe0 cb55f91c cb55f91c b5e70c50 00000000 ..U...U.P.......

    ca0b4ff0 00000000 00000400 00000400 53535353 ............SSSS

    ca0b5000 ???????? ???????? ???????? ???????? ????????????????

    ca0b5010 ???????? ???????? ???????? ???????? ????????????????

    ca0b5020 ???????? ???????? ???????? ???????? ????????????????

     

    The start of this page should show a 12 byte header, followed by the start of the slop bytes such as this example taken from an intact page in the same dump…

     

    0: kd> dc ca0b2000

    ca0b2000 001d4064 6c734d46 c7a66250 1d1d1d1d d@..FMslPb......

    ca0b2010 1d1d1d1d 1d1d1d1d 1d1d1d1d 1d1d1d1d ................

    ca0b2020 1d1d1d1d 1d1d1d1d 1d1d1d1d 1d1d1d1d ................

    ca0b2030 1d1d1d1d 1d1d1d1d 1d1d1d1d 1d1d1d1d ................

     

    …but what we see at the start of the damaged pages is…

     

    0: kd> dc ca0b4000

    ca0b4000 827f0521 40a47703 d610c641 01800011 !....w.@A.......

    ca0b4010 9073f69c 900f7422 48d9077a 2667a442 ..s."t..z..HB.g&

    ca0b4020 001c955c 42893020 8c3cc405 02051bb8 \... 0.B..<.....

    ca0b4030 c13675c1 52055328 73985e4a 041877a1 .u6.(S.RJ^.s.w..

     

    …garbage.

     

    However, all is not lost.  Further inspection shows that across all the 0xC1 dumps this “garbage” has a semi-predictable pattern…

     

    0: kd> dc ca0b4000 ca0b4fff

    ca0b4000 827f0521 40a47703 d610c641 01800011 !....w.@A.......

    ca0b4010 9073f69c 900f7422 48d9077a 2667a442 ..s."t..z..HB.g&

    ca0b4020 001c955c 42893020 8c3cc405 02051bb8 \... 0.B..<.....

    ca0b4030 c13675c1 52055328 73985e4a 041877a1 .u6.(S.RJ^.s.w..

    ca0b4040 665f093d e9bcfca4 efb0dded d65bf575 =._f........u.[.

    ca0b4050 10e40107 441909c4 a058556d b1844702 .......DmUX..G..

    ca0b4060 0ed49060 641b130c 86ba2210 82604002 `......d."...@`.

    ============================SNIP============================

    ca0b48d0 70006000 00000000 00000000 00000000 .`.p............

    ca0b48e0 74006000 00000000 00000000 00000000 .`.t............

    ca0b48f0 78006000 00000000 00000000 00000000 .`.x............

    ca0b4900 7c006000 00000000 00000000 00000000 .`.|............

    ca0b4910 80006000 00000000 00000000 00000000 .`..............

    ca0b4920 78090043 02000000 22220000 02400000 C..x......""..@.

    ca0b4930 11130000 02400000 11130000 02c0000c ......@.........

    ca0b4940 11110000 00000000 00000000 00000000 ................

    ca0b4950 00000000 00000000 00000000 00000000 ................

    ============================SNIP============================

    ca0b4a10 00000000 00000000 00000000 00000000 ................

    ca0b4a20 00000000 00000000 00000000 00000000 ................

    ca0b4a30 00000000 680b0000 00000000 00000000 .......h........

    ca0b4a40 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4a50 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4a60 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4a70 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4a80 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ============================SNIP============================

    ca0b4fb0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4fc0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4fd0 53535353 53535353 53535353 53535353 SSSSSSSSSSSSSSSS

    ca0b4fe0 cb55f91c cb55f91c b5e70c50 00000000 ..U...U.P.......

    ca0b4ff0 00000000 00000400 00000400 53535353 ............SSSS

     

    Most of the garbage data towards the start of the page varies between corruptions, but some parts, such as THIS, were consistent across all dumps.  So we chose this DWORD to search for in kernel memory (remembering to set .ignore_missing_pages first to keep screen-chaff to a minimum)…

     

    0: kd> .ignore_missing_pages 1;s -d 80000000 L?7fffffff 78090043

    Suppress kernel summary dump missing page error message

    8ade97f8 78090043 02000000 22220000 02400000 C..x......""..@.

    96f94038 78090043 00000000 20000000 1b000001 C..x....... ....

    ca0b4920 78090043 02000000 22220000 02400000 C..x......""..@.

     

    Always 3 hits in all the dumps: one for the corrupted pool page; one for the crashdump driver where it is dumping out this page; and the third one points to a 3rd party driver…

     

    0: kd> lmta 96f94038

    start end module name

    96616000 97332000 DRIVER_X Tue Mar 27 03:05:05 2012 (4F712051)

     

    Since these crashes were on 32-bit systems we could use the above search to check the entire range of Kernel memory for the pattern.  However, since 64-bit systems are more common these days, even on Client systems, we could instead have narrowed this down to just searching the address space of the loaded modules...

     

    0: kd> !for_each_module s -d @#Base @#End 78090043

    96f94038  78090043 00000000 20000000 1b000001  C..x....... ....

     

    Whichever method you prefer, an update to the third-party DRIVER_X.SYS resolved the issue, and there was much rejoicing.

  • Ntdebugging Blog

    We Are Hiring – North Carolina and Texas

    • 0 Comments

    Would you like to join the world’s best and most elite debuggers to enable the success of Microsoft solutions?

     

    As a trusted advisor to our top customers you will be working with to the most experienced IT professionals and developers in the industry. You will influence our product teams in sustained engineering efforts to drive improvements in our products.

     

    This role involves deep analysis of product source code and debugging to solve problems in multi-million dollar configurations and will give you an opportunity to stretch your critical thinking skills. During the course of debugging, you will uncover opportunities to improve the customer experience while influencing the current and future design of our products.

     

    In addition to providing support to customers while being the primary interface to our sustained engineering teams, you will also have the opportunity to work with new technologies and unreleased software. Through our continuous investment in depth training and hands-on experience with tough customer challenges you will become the world’s best in this area. Expect to partner with many various roles at Microsoft launching a very successful career!

     

    We currently have positions open at our sites in Charlotte, NC and Las Colinas, TX (Dallas-Fort Worth area).

    Learn more about what an Escalation Engineer does at:

    Microsoft JobsBlog JobCast with Escalation Engineer Jeff Dailey

    Microsoft JobsBlog JobCast with Escalation Engineer Scott Oseychik

     

    Apply here:

    https://careers.microsoft.com/jobdetails.aspx?ss=&pg=0&so=&rw=1&jid=203335&jlang=EN&pp=SS

  • Ntdebugging Blog

    Uncover the mystery of a bugcheck 0x24 (NTFS_FILE_SYSTEM)

    • 2 Comments

     

    My name is Nan, I am an Escalation Engineer in Platforms Global Escalation Services in GCR. Today I’d like to share an interesting sample case with respect to filter manager to showcase the powerful debugging extension fltkd.dll (it is included in the public debugging tools) and talk about Filter Manager/Minifilter architecture briefly through the investigation.

     

    Recently I worked on an interesting server bugcheck case where Windows Server 2012 R2 would crash with bugcheck code 0x24 (NTFS_FILE_SYSTEM) after mounting a ReFS volume.

     

    3: kd> .bugcheck

    Bugcheck code 00000024

    Arguments 000000b5`00190637 ffffd000`23f17e58 ffffd000`23f17660 fffff800`00cf95c7

     

    The 2nd and 3rd parameters are the exception record and context record. Do a .cxr on the 3rd parameter and we obtain a more informative stack.

     

    3: kd> .cxr ffffd00023f17660

    rax=ffffe0003c8dae20 rbx=ffffc00003dba2c0 rcx=0000000000000018

    rdx=ffffe0003c84bc60 rsi=ffffd00023f18210 rdi=ffffc00003dba120

    rip=fffff80000cf95c7 rsp=ffffd00023f18090 rbp=0000000000000000

    r8=0000000100000000  r9=0000000000000001 r10=0000000000000004

    r11=ffffd00023f18168 r12=0000000000000000 r13=ffffe0003c84bf70

    r14=0000000000000005 r15=0000000000000000

    iopl=0         nv up ei ng nz na pe nc

    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010282

    Ntfs!NtfsCommonQueryInformation+0xa7:

    fffff800`00cf95c7 418b4104        mov    eax,dword ptr [r9+4] ds:002b:00000000`00000005=???????? <<<< R9 has invalid data.

    3: kd> knL

      *** Stack trace for last set context - .thread/.cxr resets it

    # Child-SP          RetAddr           Call Site

    00 ffffd000`23f18090 fffff800`00d00813 Ntfs!NtfsCommonQueryInformation+0xa7

    01 ffffd000`23f18170 fffff800`00d01156 Ntfs!NtfsFsdDispatchSwitch+0xd3

    02 ffffd000`23f181f0 fffff800`00402f3e Ntfs!NtfsFsdDispatchWait+0x47

    03 ffffd000`23f18440 fffff800`00406986 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x25e

    04 ffffd000`23f184c0 fffff800`0042ef2a FLTMGR!FltPerformSynchronousIo+0x2b6

    05 ffffd000`23f18570 fffff800`01041843 FLTMGR!FltQueryInformationFile+0x52

    06 ffffd000`23f185b0 fffff800`010417bb av+0x3843

    07 ffffd000`23f18600 fffff800`00f1c06d av+0x37bb

    08 ffffd000`23f18630 fffff800`00401eea fltdrv1+0xc06d

    09 ffffd000`23f18720 fffff800`00405c13 FLTMGR!FltpPerformPreCallbacks+0x31a

    0a ffffd000`23f18830 fffff800`0042bcf2 FLTMGR!FltpPassThroughFastIo+0x73

    0b ffffd000`23f18890 fffff801`6c9b62b1 FLTMGR!FltpFastIoQueryStandardInfo+0x132

    0c ffffd000`23f18920 fffff801`6c7684b3 nt!NtQueryInformationFile+0x3e1

    0d ffffd000`23f18a90 00007ff9`e632668a nt!KiSystemServiceCopyEnd+0x13

    0e 00000000`09bcd828 00000000`00000000 0x00007ff9`e632668a

     

    An exception happens in Ntfs.sys because Ntfs!NtfsCommonQueryInformation attempts to access an invalid memory address in kernel mode.  We look back a few instructions to figure how R9 obtains the bogus value.

     

    3: kd> ub fffff80000cf95c7

    Ntfs!NtfsCommonQueryInformation+0x81:

    fffff800`00cf95a1 4c8b8fa8000000  mov    r9,qword ptr [rdi+0A8h] <<<< where R9 value is obtained.

    fffff800`00cf95a8 4d894bb0        mov     qword ptr [r11-50h],r9

    fffff800`00cf95ac 4d894ba0        mov     qword ptr [r11-60h],r9

    fffff800`00cf95b0 488b5820        mov     rbx,qword ptr [rax+20h]

    fffff800`00cf95b4 49895ba8        mov     qword ptr [r11-58h],rbx

    fffff800`00cf95b8 4c8b87a0000000  mov     r8,qword ptr [rdi+0A0h]

    fffff800`00cf95bf 4d894388        mov     qword ptr [r11-78h],r8

    fffff800`00cf95c3 4d894398        mov     qword ptr [r11-68h],r8

    3: kd> r rdi

    Last set context:

    rdi=ffffc00003dba120

     

    Data in R9 is obtained from memory being referenced by RDI+0A8h. Typically it is supposed to be an opaque structure pertaining to NTFS. Before giving up and engaging Microsoft support, we can give another shot by our old friend !pool extension.

     

    3: kd> !pool ffffc00003dba120 2

    Pool page ffffc00003dba120 region is Paged pool

    *ffffc00003dba000 size:  450 previous size:    0  (Allocated) *Reff

                Pooltag Reff : FCB_DATA, Binary : refs.sys

     

    To my surprise, the pool tag points to ReFS. Let’s revisit the call stack above, it crashes in Ntfs path clearly. How the query operation against Ntfs end up referencing something in ReFS?

     

    Instead of doing the bottom-up analysis, we can take the top-down approach in effort to figure out more information. Let’s revisit the call stack again. The top entry on the stack is nt!NtQueryInformationFile that enters Filter Manager and winds up calling FltpPerformPreCallbacks function, which calls into 3rd party driver fltdrv1.sys.

     

    3: kd> knL

      *** Stack trace for last set context - .thread/.cxr resets it

    # Child-SP          RetAddr           Call Site

    00 ffffd000`23f18090 fffff800`00d00813 Ntfs!NtfsCommonQueryInformation+0xa7

    01 ffffd000`23f18170 fffff800`00d01156 Ntfs!NtfsFsdDispatchSwitch+0xd3

    02 ffffd000`23f181f0 fffff800`00402f3e Ntfs!NtfsFsdDispatchWait+0x47

    03 ffffd000`23f18440 fffff800`00406986 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x25e

    04 ffffd000`23f184c0 fffff800`0042ef2a FLTMGR!FltPerformSynchronousIo+0x2b6

    05 ffffd000`23f18570 fffff800`01041843 FLTMGR!FltQueryInformationFile+0x52

    06 ffffd000`23f185b0 fffff800`010417bb av+0x3843

    07 ffffd000`23f18600 fffff800`00f1c06d av+0x37bb

    08 ffffd000`23f18630 fffff800`00401eea fltdrv1+0xc06d

    09 ffffd000`23f18720 fffff800`00405c13 FLTMGR!FltpPerformPreCallbacks+0x31a

    0a ffffd000`23f18830 fffff800`0042bcf2 FLTMGR!FltpPassThroughFastIo+0x73

    0b ffffd000`23f18890 fffff801`6c9b62b1 FLTMGR!FltpFastIoQueryStandardInfo+0x132

    0c ffffd000`23f18920 fffff801`6c7684b3 nt!NtQueryInformationFile+0x3e1

    0d ffffd000`23f18a90 00007ff9`e632668a nt!KiSystemServiceCopyEnd+0x13

    0e 00000000`09bcd828 00000000`00000000 0x00007ff9`e632668a

     

    As implied by the function name FLTMGR!FltpPerformPreCallbacks, we can suppose it is calling some call back routine in 3rd party driver fltdrv1.sys which is a minifilter registered in Filter Manager.

     

    In the Filter Manager architecture a minifilter driver can filter IRP-based I/O operations as well as fast I/O and file system filter (FSFilter) callback operations. For each of the I/O operations it chooses to filter, a minifilter driver can register a preoperation callback routine, a postoperation callback routine, or both. When handling an I/O operation, the filter manager calls the appropriate callback routine for each minifilter driver that registered for that operation. When that callback routine returns, the filter manager calls the appropriate callback routine for the next minifilter driver that registered for the operation.

     

    So FLTMGR!FltpPerformPreCallbacks is supposed to invoke the preoperation call back inside fltdrv1.sys, Here is how preoperation call back is declared.

     

    typedef FLT_PREOP_CALLBACK_STATUS ( *PFLT_PRE_OPERATION_CALLBACK)(

      _Inout_  PFLT_CALLBACK_DATA Data,

      _In_     PCFLT_RELATED_OBJECTS FltObjects,

      _Out_    PVOID *CompletionContext

    );

     

    The FLT_CALLBACK_DATA structure represents an I/O operation. The Filter Manager and minifilters use this structure to initiate and process I/O operations. In this structure Iopb pointers to an FLT_IO_PARAMETER_BLOCK structure that contains the parameters for the I/O operation.

     

    So let’s decode the assembly code FLTMGR!FltpPerformPreCallbacks and figure out the parameters it passes into the 3rd party callback first.

     

    3: kd> ub fffff800`00401eea

    FLTMGR!FltpPerformPreCallbacks+0x2ff:

    fffff800`00401ecf 7810            js     FLTMGR!FltpPerformPreCallbacks+0x311 (fffff800`00401ee1)

    fffff800`00401ed1 000f            add     byte ptr [rdi],cl

    fffff800`00401ed3 85d2            test    edx,edx

    fffff800`00401ed5 c00000          rol     byte ptr [rax],0

    fffff800`00401ed8 4c8d45bf        lea     r8,[rbp-41h]

    fffff800`00401edc 488d55d7        lea     rdx,[rbp-29h]

    fffff800`00401ee0 488d8bd8000000  lea     rcx,[rbx+0D8h] <<<< this has the first parameter Data

    fffff800`00401ee7 ff5618          call    qword ptr [rsi+18h]

    3: kd> .frame /r 09

    09 ffffd000`23f18720 fffff800`00405c13 FLTMGR!FltpPerformPreCallbacks+0x31a

    rax=ffffe0003c8dae20 rbx=ffffe0003c8d2c80 rcx=0000000000000018

    rdx=ffffe0003c84bc60 rsi=ffffe0000165b560 rdi=ffffe0003c884890

    rip=fffff80000401eea rsp=ffffd00023f18720 rbp=ffffd00023f187c9

    r8=0000000100000000  r9=0000000000000001 r10=0000000000000004

    r11=ffffd00023f18168 r12=ffffe0003c8d2d00 r13=ffffe0000165b010

    r14=ffffe0003c8d2df8 r15=0000000000000004

    iopl=0         nv up ei ng nz na pe nc

    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010282

    FLTMGR!FltpPerformPreCallbacks+0x31a:

    fffff800`00401eea 8bf8            mov     edi,eax

    3: kd> dt ffffe0003c8d2c80+d8 _FLT_CALLBACK_DATA

    FLTMGR!_FLT_CALLBACK_DATA

       +0x000 Flags            : 2

       +0x008 Thread           : 0xffffe000`0375c880 _KTHREAD

       +0x010 Iopb             : 0xffffe000`3c8d2db0 _FLT_IO_PARAMETER_BLOCK

    3: kd> dt 0xffffe000`3c8d2db0 _FLT_IO_PARAMETER_BLOCK

    FLTMGR!_FLT_IO_PARAMETER_BLOCK

       +0x000 IrpFlags         : 0

       +0x004 MajorFunction    : 0x5 ''

       +0x005 MinorFunction    : 0 ''

       +0x006 OperationFlags   : 0 ''

       +0x007 Reserved         : 0 ''

       +0x008 TargetFileObject : 0xffffe000`3c884890 _FILE_OBJECT

       +0x010 TargetInstance   : 0xffffe000`0165b010 _FLT_INSTANCE

     

    What we are concerned about are parameters TargetFileObject and TargetInstance.

    • TargetFileObject - A file object pointer for the file or directory that is the target for this I/O operation.
    • TargetInstance - An opaque instance pointer for the minifilter that is the target for this I/O operation.

     

    Let’s dump the TargetFileObject first, it is a file open on HarddiskVolume4.

     

    3: kd> !fileobj 0xffffe000`3c884890

     

    \AV-REDIRECT-F3392236-BA8C-4AD9-8AEA-48F099A8EE9C\ffffe0003c8dae20.ini

     

    Device Object: 0xffffe000011009b0   \Driver\volmgr

    Vpb is NULL

    Event signalled

     

    Flags:  0x40002

          Synchronous IO

          Handle Created

     

    File Object is currently busy and has 0 waiters.

     

    FsContext: 0x00000000   FsContext2: 0x46527641

    CurrentByteOffset: 0

     

    3: kd> !devstack 0xffffe000011009b0  

      !DevObj           !DrvObj            !DevExt          ObjectName

      ffffe00001143040  \Driver\volsnap    ffffe00001143190 

    > ffffe000011009b0  \Driver\volmgr     ffffe00001100b00  HarddiskVolume4

    !DevNode ffffe00001148a50 :

      DeviceInst is "STORAGE\Volume\{19df1540-2f74-11e4-80bb-0050568127c8}#0000000000100000"

      ServiceName is "volsnap"

     

    The target File open is \AV-REDIRECT-F3392236-BA8C-4AD9-8AEA-48F099A8EE9C\ffffe0003c8dae20.ini on HarddiskVolume4. Next, what is the target filter instance?

     

    In short, a minifilter driver attaches to the file system stack indirectly, by registering with the filter manager for the I/O operations the minifilter driver chooses to filter. An instance represents an attachment of a filter on a volume. Here is an excellent article to elaborate how filter attach/altitudes works.

     

    Public symbols for fltmgr.sys does include the definitions of various key filter manager structures. We don’t bother to browse the structure manually. There is a public debug extension fltkd.dll and we can use it to view the instance in a very neat way. Here is how we print out the filter instance in question.

     

    3: kd> !fltkd.instance 0xffffe000`0165b010

     

    FLT_INSTANCE: ffffe0000165b010 "FLTDRV1" "388920" <<<< instance name and altitude #

       FLT_OBJECT: ffffe0000165b010  [01000000] Instance

          RundownRef               : 0x0000000000000002 (1)

          PointerCount             : 0x00000001

          PrimaryLink              : [ffffe0000165dab0-ffffe00002501108]

       OperationRundownRef      : ffffe000025035b0

    Could not read field "Number" of fltmgr!_EX_RUNDOWN_REF_CACHE_AWARE from address: ffffe000025035b0

       Flags                    : [00000000]

       Volume                   : ffffe00002501010 "\Device\HarddiskVolume4"  <<<< the underlying volume filter instance attaches to.

       Filter                   : ffffe000011f1010 "fltdrv1"  <<<< filter name

       TrackCompletionNodes     : ffffe0000165bda0

       ContextLock              : (ffffe0000165b080)

       Context                  : 0000000000000000

       CallbackNodes            : (ffffe0000165b0a0)

       VolumeLink               : [ffffe0000165dab0-ffffe00002501108]

       FilterLink               : [ffffe00001650070-ffffe0000164c340]

     

    In !fltkd.instance output, we can see the instance passed into 3rd party pre-operation call back is attached to the volume "\Device\HarddiskVolume4”.  Filter in the output describes a minifilter.  We can run !fltkd.filter to print out the details of filter.

     

    3: kd> !fltkd.filter ffffe000011f1010

     

    FLT_FILTER: ffffe000011f1010 "fltdrv1" "388920"

       FLT_OBJECT: ffffe000011f1010  [02000000] Filter

          RundownRef               : 0x0000000000000026 (19)

          PointerCount             : 0x00000002

          PrimaryLink              : [ffffe00000a9ac60-ffffe00000ab50c0]

       Frame                    : ffffe00000ab5010 "Frame 0"

       Flags                    : [00000002] FilteringInitiated

       DriverObject             : ffffe0000119f9c0

       FilterLink               : [ffffe00000a9ac60-ffffe00000ab50c0] *** ERROR: Module load completed but symbols could not be loaded for fltdrv1.sys

     

       PreVolumeMount           : fffff80000f1bd78  fltdrv1+0xbd78

       PostVolumeMount          : fffff80000f1c354  fltdrv1+0xc354

       FilterUnload             : 0000000000000000  (null)

       InstanceSetup            : fffff80000f1c528  fltdrv1+0xc528

       InstanceList             : (ffffe000011f1068) <<<< all filter instances this filter creates.

          FLT_INSTANCE: ffffe000011e6640 "FLTDRV1" "388920"

          …

          FLT_INSTANCE: ffffe0000173d010 "FLTDRV1" "388920"

          FLT_INSTANCE: ffffe00001793010 "FLTDRV1" "388920"

          FLT_INSTANCE: ffffe00001787010 "FLTDRV1" "388920"

     

    A side note about Filter Frame, for interoperability with legacy filter drivers the filter manager can attach filter device objects to a file system I/O stack in more than one location. Each of the filter manager's filter device objects is called a frame. From the perspective of a legacy filter driver, each filter manager frame is just another legacy filter driver. If there are no legacy filters present, there is only one frame, "Frame 0". If there are some legacy filters present on the system (attached on top of Frame 0), FltMgr might create more frames. We can view all frames Filter Manager manages by running !fltkd.frames

     

    In !fltkd.instance output,  it also displays the volume object filter instance is attached to. We can also dump out the Filter Volume and see all filter instances attached to it. All filter instances are displayed in order of altitude (number next to the filter instance name). Filter manager will process the I/O from filter with higher altitude to one with lower altitude. The filter I figured out just now is the top one and its filter instance altitude implies FLTDRV1 is a sort of FSFilter Activity Monitor.

     

    3: kd> !fltkd.volume ffffe00002501010

     

    FLT_VOLUME: ffffe00002501010 "\Device\HarddiskVolume4"

       FLT_OBJECT: ffffe00002501010  [04000000] Volume

          RundownRef               : 0x000000000000000a (5)

          PointerCount             : 0x00000001

          PrimaryLink              : [ffffe0000165a020-ffffe0000254f020]

       Frame                    : ffffe00000ab5010 "Frame 0"

       Flags                    : [00000064] SetupNotifyCalled EnableNameCaching FilterAttached

       FileSystemType           : [00000002] FLT_FSTYPE_NTFS

       VolumeLink               : [ffffe0000165a020-ffffe0000254f020]

       DeviceObject             : ffffe00002501df0

       DiskDeviceObject         : ffffe000011009b0

       FrameZeroVolume          : ffffe00002501010

       VolumeInNextFrame        : 0000000000000000

       Guid                     : "\??\Volume{19df169b-2f74-11e4-80bb-0050568127c8}"

       CDODeviceName            : "\Ntfs"

       CDODriverName            : "\FileSystem\Ntfs"

       TargetedOpenCount        : 0

       Callbacks                : (ffffe00002501120)

       ContextLock              : (ffffe00002501508)

       VolumeContexts           : (ffffe00002501510)  Count=0

       StreamListCtrls          : (ffffe00002501518)  rCount=4

       FileListCtrls            : (ffffe00002501598)  rCount=0

       NameCacheCtrl            : (ffffe00002501618)

       InstanceList             : (ffffe000025010a0) <<<< all filter instances attached to particular volume.

          FLT_INSTANCE: ffffe0000165b010 "FLTDRV1" "388920"

          FLT_INSTANCE: ffffe0000165daa0 "Fltdrv2 Instance" "366992"

          FLT_INSTANCE: ffffe00001661600 "V-locity cache Instance" "366789"

          FLT_INSTANCE: ffffe0000343c010 "DKRtWrt Instance" "137100"

    3: kd> !devstack ffffe00002501df0

      !DevObj           !DrvObj            !DevExt          ObjectName

    > ffffe00002501df0  \FileSystem\FltMgr ffffe00002501f40 

      ffffe00000244030  \FileSystem\Ntfs   ffffe00000244180  <<<< this volume is a NTFS volume.

     

    Let’s recap a bit before proceeding, we know the original query is all about a file open against \AV-REDIRECT-F3392236-BA8C-4AD9-8AEA-48F099A8EE9C\ffffe0003c8dae20.ini on HarddiskVolume4 which happens to be NTFS volume.

     

    It appears to be a normal query from I/O manager and nothing stands out yet.  Let’s carry on the investigation.

     

    On the call stack, 3rd party functions are black box, but the third party driver calls filter manager API FLTMGR!FltQueryInformationFile to retrieve some file information. So we pick up the analysis from that call.

     

    3: kd> knL

      *** Stack trace for last set context - .thread/.cxr resets it

    # Child-SP          RetAddr           Call Site

    00 ffffd000`23f18090 fffff800`00d00813 Ntfs!NtfsCommonQueryInformation+0xa7

    01 ffffd000`23f18170 fffff800`00d01156 Ntfs!NtfsFsdDispatchSwitch+0xd3

    02 ffffd000`23f181f0 fffff800`00402f3e Ntfs!NtfsFsdDispatchWait+0x47

    03 ffffd000`23f18440 fffff800`00406986 FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x25e

    04 ffffd000`23f184c0 fffff800`0042ef2a FLTMGR!FltPerformSynchronousIo+0x2b6

    05 ffffd000`23f18570 fffff800`01041843 FLTMGR!FltQueryInformationFile+0x52

    06 ffffd000`23f185b0 fffff800`010417bb av+0x3843

    07 ffffd000`23f18600 fffff800`00f1c06d av+0x37bb

    08 ffffd000`23f18630 fffff800`00401eea fltdrv1+0xc06d

    09 ffffd000`23f18720 fffff800`00405c13 FLTMGR!FltpPerformPreCallbacks+0x31a

     

    Here is how function FltQueryInformationFile is defined.

     

    NTSTATUS FltQueryInformationFile(

      _In_       PFLT_INSTANCE Instance,

      _In_       PFILE_OBJECT FileObject,

      _Out_      PVOID FileInformation,

      _In_       ULONG Length,

      _In_       FILE_INFORMATION_CLASS FileInformationClass,

      _Out_opt_  PULONG LengthReturned

    );

     

    First parameter represents an attachment to specific volume and second parameter FileObject is the target file to be queried. We need to figure out what instance and file object 3rd party driver were passed into FltQueryInformationFile. As there is no public symbol information for FltQueryInformationFile, we can decode the assembly code directly.

     

    3: kd> u FltQueryInformationFile

    FLTMGR!FltQueryInformationFile:

    fffff800`0042eed8 48895c2408      mov     qword ptr [rsp+8],rbx

    fffff800`0042eedd 4889742410      mov     qword ptr [rsp+10h],rsi

    fffff800`0042eee2 57              push    rdi

    fffff800`0042eee3 4883ec30        sub     rsp,30h

    fffff800`0042eee7 498bf0          mov     rsi,r8

    fffff800`0042eeea 4c8d442420      lea     r8,[rsp+20h]

    fffff800`0042eeef 418bf9          mov     edi,r9d

    fffff800`0042eef2 e8b970fdff      call    FLTMGR!FltAllocateCallbackData (fffff800`00405fb0)

     

    Because FltQueryInformationFile does not populate the rcx and rdx registers (used for the first to parameters), we can assume that it passes its first two parameters through to FltAllocateCallbackData. Let’s see what FltAllocateCallbackData does and what we may obtain from this hint.

     

    NTSTATUS FltAllocateCallbackData(

      _In_      PFLT_INSTANCE Instance,

      _In_opt_  PFILE_OBJECT FileObject,

      _Out_     PFLT_CALLBACK_DATA *RetNewCallbackData

    );

     

    It takes Instance and FileObject as the first two parameters and an instance of PFLT_CALLBACK_DATA as the third parameter. Keep in mind the FLT_CALLBACK_DATA structure represents an I/O operation. The Filter Manager and minifilters use this structure to initiate and process I/O operations, so it should contain the information pertaining to the query a 3rd party driver issues.

     

    Looking back the assembly code before calling FltAllocateCallbackData, 3rd output parameter RetNewCallbackData is obtained by this instruction:

     

    fffff800`0042eeea 4c8d442420      lea     r8,[rsp+20h]

     

    So let’s dump out the PFLT_CALLBACK_DATA from the stack pointer.

     

    3: kd> .frame /r 5

    05 ffffd000`23f18570 fffff800`01041843 FLTMGR!FltQueryInformationFile+0x52

    rax=ffffe0003c8dae20 rbx=ffffe0003c8d20e8 rcx=0000000000000018

    rdx=ffffe0003c84bc60 rsi=0000000009bcd870 rdi=0000000000000018

    rip=fffff8000042ef2a rsp=ffffd00023f18570 rbp=ffffd00023f18690

    r8=0000000100000000  r9=0000000000000001 r10=0000000000000004

    r11=ffffd00023f18168 r12=0000000000000000 r13=0000000000000000

    r14=0000000003000000 r15=0000000000000000

    iopl=0         nv up ei ng nz na pe nc

    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010282

    FLTMGR!FltQueryInformationFile+0x52:

    fffff800`0042ef2a 488b4c2468      mov     rcx,qword ptr [rsp+68h] ss:0018:ffffd000`23f185d8=ffffd00023f18618

    3: kd> dt poi(ffffd00023f18570+20) _FLT_CALLBACK_DATA

    FLTMGR!_FLT_CALLBACK_DATA

       +0x000 Flags            : 0x10009

       +0x008 Thread           : 0xffffe000`0375c880 _KTHREAD

       +0x010 Iopb             : 0xffffe000`3c8d2140 _FLT_IO_PARAMETER_BLOCK

    3: kd> dt 0xffffe000`3c8d2140 _FLT_IO_PARAMETER_BLOCK

    FLTMGR!_FLT_IO_PARAMETER_BLOCK

       +0x000 IrpFlags         : 0

       +0x004 MajorFunction    : 0x5 ''

       +0x005 MinorFunction    : 0 ''

       +0x006 OperationFlags   : 0 ''

       +0x007 Reserved         : 0 ''

       +0x008 TargetFileObject : 0xffffe000`3c8dae20 _FILE_OBJECT

       +0x010 TargetInstance   : 0xffffe000`0165daa0 _FLT_INSTANCE

     

    Now we’ve got the target instance and file object the 3rd party driver hopes is querying. Let’s look at the target file object first.

     

    3: kd> !fileobj 0xffffe000`3c8dae20

     

    \$RECYCLE.BIN\S-1-5-21-589703330-573742762-725482543-110873\desktop.ini

     

    Device Object: 0xffffe000010f1ca0   \Driver\volmgr

    Vpb: 0xffffe000010cc9a0

    Access: Read SharedRead SharedWrite SharedDelete

     

    Flags:  0x62

          Synchronous IO

          Sequential Only

          Cache Supported

     

    FsContext: 0xffffc00003dba120 FsContext2: 0xffffc00003dba2c0

    CurrentByteOffset: 0

    Cache Data:

      Section Object Pointers: ffffe0003c887380

      Shared Cache Map: 00000000

    3: kd> !devstack 0xffffe000010f1ca0  

      !DevObj           !DrvObj            !DevExt          ObjectName

      ffffe00001142040  \Driver\volsnap    ffffe00001142190 

    > ffffe000010f1ca0  \Driver\volmgr     ffffe000010f1df0  HarddiskVolume5 <<<< volume changes.

    !DevNode ffffe00001148770 :

      DeviceInst is "STORAGE\Volume\{19df16e3-2f74-11e4-80bb-0050568127c8}#0000000000100000"

      ServiceName is "volsnap"

     

    The biggest difference between the original query and current query is not only the file name but also the target volume, now it is against the file \$RECYCLE.BIN\S-1-5-21-589703330-573742762-725482543-110873\desktop.ini on HarddiskVolume5.

     

    Another note, is about the FsContext 0xffffc00003dba120 in the file object, that is the address I ran !pool with and it is from ReFS. It seems we’ve nailed down where the ReFS things come from.

     

    Let’s continue to dump all of the volumes managed by Filter Manager. Apparently hardiskvolume5 is an ReFS volume instead of NTFS.

     

    3: kd> !fltkd.volumes

     

    Volume List: ffffe00000ab5140 "Frame 0"

       FLT_VOLUME: ffffe000010e5010 "\Device\Mup"

          FLT_INSTANCE: ffffe000011e6640 "FLTDRV1" "388920"

    ..

       FLT_VOLUME: ffffe00002501010 "\Device\HarddiskVolume4"

          FLT_INSTANCE: ffffe0000165b010 "FLTDRV1" "388920"

          FLT_INSTANCE: ffffe0000165daa0 "Fltdrv2 Instance" "366992"

          FLT_INSTANCE: ffffe00001661600 "V-locity cache Instance" "366789"

          FLT_INSTANCE: ffffe0000343c010 "DKRtWrt Instance" "137100"

       FLT_VOLUME: ffffe0000169e010 "\Device\HarddiskVolume5"

          FLT_INSTANCE: ffffe00001694190 "FLTDRV1" "388920"

    3: kd> !fltkd.volume ffffe0000169e010

     

    FLT_VOLUME: ffffe0000169e010 "\Device\HarddiskVolume5"

       FLT_OBJECT: ffffe0000169e010  [04000000] Volume

          RundownRef               : 0x0000000000000004 (2)

          PointerCount             : 0x00000001

          PrimaryLink              : [ffffe00001693020-ffffe000016aa020]

       Frame                    : ffffe00000ab5010 "Frame 0"

       Flags                    : [00000064] SetupNotifyCalled EnableNameCaching FilterAttached

       FileSystemType           : [0000001c]

       VolumeLink               : [ffffe00001693020-ffffe000016aa020]

       DeviceObject             : ffffe000016a1df0

       DiskDeviceObject         : ffffe000010f1ca0

       FrameZeroVolume          : ffffe0000169e010

       VolumeInNextFrame        : 0000000000000000

       Guid                     : "\??\Volume{19df173a-2f74-11e4-80bb-0050568127c8}"

       CDODeviceName            : "\Refs"

       CDODriverName            : "\FileSystem\ReFS"

       TargetedOpenCount        : 0

       Callbacks                : (ffffe0000169e120)

       ContextLock              : (ffffe0000169e508)

       VolumeContexts           : (ffffe0000169e510)  Count=0

       StreamListCtrls          : (ffffe0000169e518)  rCount=2

       FileListCtrls            : (ffffe0000169e598)  rCount=0

       NameCacheCtrl            : (ffffe0000169e618)

       InstanceList             : (ffffe0000169e0a0)

          FLT_INSTANCE: ffffe00001694190 "FLTDRV1" "388920"

    3: kd> !devstack ffffe000016a1df0

      !DevObj           !DrvObj            !DevExt          ObjectName

    > ffffe000016a1df0  \FileSystem\FltMgr ffffe000016a1f40 

      ffffe00002259030  \FileSystem\ReFS   ffffe00002259180 

     

    So the target file object that 3rd party driver would like to query is on a ReFS volume. Let’s dump out the target instance now.

     

    3: kd> !fltkd.instance 0xffffe000`0165daa0

     

    FLT_INSTANCE: ffffe0000165daa0 "Fltdrv2 Instance" "366992" <<<< instance changes.

       FLT_OBJECT: ffffe0000165daa0  [01000000] Instance

          RundownRef               : 0x0000000000000000 (0)

          PointerCount             : 0x00000002

          PrimaryLink              : [ffffe00001661610-ffffe0000165b020]

       OperationRundownRef      : ffffe0000165f140

    Could not read field "Number" of fltmgr!_EX_RUNDOWN_REF_CACHE_AWARE from address: ffffe0000165f140

       Flags                    : [00000000]

       Volume                   : ffffe00002501010 "\Device\HarddiskVolume4" <<<< volume remains the same

       Filter                   : ffffe00000a9ac50 "DKTLFSMF" <<<< another filter.

       TrackCompletionNodes     : ffffe0000165ceb0

       ContextLock              : (ffffe0000165db10)

       Context                  : ffffe00000d096d0

       CallbackNodes            : (ffffe0000165db30)

       VolumeLink               : [ffffe00001661610-ffffe0000165b020]

       FilterLink               : [ffffe00000a9ad10-ffffe00002553cc0]

     

    We can notice that instance changes from original FLT_INSTANCE: ffffe0000165b010 "FLTDRV1" "388920" to FLT_INSTANCE:  0xffffe000`0165daa0 "Fltdrv2 Instance" "366992".  However, the instance is attached to HarddiskVolume4 which is bound to NTFS. After initializing the parameters of the callback data structure returned by FltAllocateCallbackData, FltQueryInformationFile initiates the I/O operation by passing the structure to FltPerformSynchronousIo or FltPerformAsynchronousIo. These routines send the I/O operation only to the minifilter driver instances attached below the initiating instance (specified in the Instance parameter) and to the file system i.e. NTFS.

     

    So the whole story is, the original query is all about a file open against \AV-REDIRECT-F3392236-BA8C-4AD9-8AEA-48F099A8EE9C\ffffe0003c8dae20.ini on HarddiskVolume4 which happens to be NTFS volume. The query is passed to 3rd party preoperation call back routine. The 3rd party driver continues the query by calling FltQueryInformationFile for a different file open that is on a ReFS volume but the filter instance attached to original NTFS volume.

     

    It is not hard to image the potential corruption if NTFS is instructed to operate on something pertaining to ReFS system. Bugcheck is one of the payoffs.

     

    Actually the 3rd party drivers’ behavior breaks the minifilter callback rule. Here is except from MSDN:

     

    A minifilter driver can modify the parameters for an I/O operation. For example, a minifilter driver's preoperation callback routine can redirect an I/O operation to a different volume by changing the target instance for the operation. The new target instance must be an instance of the same minifilter driver at the same altitude on another volume.

     

    Conclusion:

     

    The fltdrv1.sys driver is violating the minifilter callback rule by redirecting I/O to an instance of a different minifilter driver.

     

    In this post, we go through the dump analysis to look for the useful information about filter manager and minifilter, and solve a case with public debugging extension fltkd.dll and public filter manger symbol. Hope you enjoyed it!

  • Ntdebugging Blog

    Disk Performance Internals

    • 3 Comments

    Abstract:

    My name is Ran Jiang. I am from the Platforms Global Escalation Services team in China. Storage is the slowest component of most computer systems. As such, storage is often a performance bottleneck. Today I want to discuss the disk performance kernel provider, partition manager.  By understanding how the disk performance provider works we can understand how disk performance is tracked internally in Windows and how disk related counters are calculated, which will be helpful for diagnosing storage performance issues.

     

    Disk Performance Architecture

    There are two sets of public interfaces to query performance counter data – PDH (Performance Data Helper) or the registry interface. The registry interface to the performance data is older than the PDH interface and has more extensive functionality. However, the PDH interface is easier to use for most performance data collection tasks. The PDH interface is essentially a higher-level abstraction of the functionality that the registry interface provides.

    Windows Disk Performance Architecture

    Windows performance monitor leverages the PDH interface to get performance data. The performance data helper (PDH) interface calls the registry interface.

     

    Perflib is one key component integrated in the registry interface, which is responsible for translating the request from the application and calling the collect procedure exported by a performance extension DLL. The extension DLL does the real work of data collection and returns a standard data format to perflib.

     

    Extension DLLs should expose Open, Collect, and Close functions to be called by perflib. We can find these functions’ name by checking the registry:

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\<Service Name>\Performance

    Value Name: Close, Collect, Open

     

    Performance Registry

     

    Here is a user mode call stack when an application uses the registry API RegQueryValueEx() to collect performance data:

    0017f7ec 004e0000 perfdisk!CollectDiskObjectData+0xf8

    0017f964 7702eaa9 advapi32!QueryExtensibleData+0x577

    0017fd48 7702e962 advapi32!PerfRegQueryValue+0x5d8

    0017fe38 770576f5 advapi32!LocalBaseRegQueryValue+0x313

    0017fe9c 004011fc advapi32!RegQueryValueExW+0xa2

    0017fec8 00401153 getperfdata!GetPerformanceData+0x3c

    0017ff38 0040322a getperfdata!wmain+0x93

    0017ff88 773deccb getperfdata!__tmainCRTStartup+0x15e

    0017ff94 7798d24d kernel32!BaseThreadInitThunk+0xe

    0017ffd4 7798d45f ntdll!__RtlUserThreadStart+0x23

    0017ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

     

    Disk performance kernel device stack

    Figure 2 shows the I/O manager stack to gather disk performance statistics. The volume manager underneath the file system driver gathers Logical Disk statistics. On Windows 2008 or above, volmgr.sys handles Logical Disk statistics for both dynamic and basic disks. The partition manager, partmgr.sys, gathers physical disk statistics.  These statistics are measured and collected for each request that passes through the I/O manager stack.

    Windows IO Manager

    Physical Disk Statistics

    Partition manager (partmgr.sys) saves performance information in the device extension’s counter context.

     

    Logical Disk statistics

    Volume manager (volmgr.sys) also saves performance metrics in its device extension.

     

    How to track disk performance?

    Performance information is tracked in the read and write dispatch routines and in the IO completion routines. There are 5 kinds of counter data tracked by partition manager:

     

    1. Queue depth - Total concurrent IOs still in process and not yet completed.

    2. Total counts of read and write requests.

    3. Total read and write time for all IO requests.  For example: There are total 2 write IOs completed since disk counter is enabled, one takes 1 sec and the other takes 2 sec. Then, this write counter will be 1 sec + 2 sec = 3 sec.

    4. Total Idle time.

    5. Total split IO (fragmented IO).

     

    Let’s talk about them separately.

     

    Queue depth:

    When a new IRP is sent to partition manager it will increment the queue depth. Partition manager will decrement the queue depth when completing an IRP. Therefore, the value indicates how many concurrent IOs are still in process.

     

    Total read and write count:

    When any read or write IO has been completed the partition manager IRP completion routine will get called. Then the read or write counter will be incremented. Note we only track completed IOs here.

     

    Total read and write time for IOs:

    When any read or write IO starts, partition manager’s dispatch routine will record the current time stamp in the IO stack location of the IRP. When an IRP is completed the completion routine will use this time stamp and the current system time to calculate the time taken to complete this IO.  Partition manager will then add this value to the appropriate counter in the device extension.

     

    Total Idle time:

    When completing an IRP and decrementing the queue depth partition manager will check if the queue depth reaches 0. If yes, it indicates the disk state has been transitioned from busy to idle. Then it will save the time stamp to Last Idle Clock in the counter context.

     

    When a new IRP is sent to partition manager it will increment the queue depth and will check if queue depth reaches 1. If yes, it indicates the disk state has been transitioned from idle to busy. Then Idle time counter will be increased by (current time stamp – Last Idle Clock).

     

    Total split IO (fragmented IO):

    When completing an IRP, partition manager will check if the IRP is marked as IRP_ASSOCIATED_IRP. This flag is usually set by the file system driver when a large IO is split into multiple smaller IOs. Typically, when an IO contains several runs and each run will contain continuous block of data, NTFS will create an associated IRP for each run and send this IRP to the lower level driver. Therefore, this counter usually can be used to track fragmented IOs.

     

    Note: Disk performance statistics are saved to an array whose index corresponds to each processor. Most of the counters are saved to the index corresponding to the processor the IRP was completed on.

     

    How to convert to performance counter?

    Now we understand how the kernel keeps tracking of these metrics. We need to map those metrics in kernel to the performance counter as shown in performance monitor. The counters visible in performance monitor are calculated based on the metrics from kernel. Each counter has a counter type and each counter type has a different calculation. The counter type determines how the counter data is calculated, averaged, and displayed.

     

    For example, Avg. Disk sec/Transfer has counter type of PERF_AVERAGE_TIMER. The formula of PERF_AVERAGE_TIMER is: ((N1 - N0) / F) / (D1 - D0), where the numerator (N) represents the number of ticks counted during the last sample interval, F represents the frequency of the ticks, and the denominator (D) represents the number of reads and writes completed during the last sample interval. N1 - N0 are returned from kernel as ReadTime + WriteTime in ticks. D1 and D0 are returned from partition manager or volume manager as read counts + write counts.

     

    Avg. Disk Transfer/sec:

    Counter type: PERF_COUNTER_COUNTER

    Formula: (N1- N0) / ( (D1-D0) / F), where N1- N0 are returned from partition manager or volume manager as read counts + write counts. D1-D0 are the number of ticks counted during the last sample interval. F represents the frequency of the ticks.

     

    Avg. Disk Queue Length:

    Counter type: PERF_COUNTER_LARGE_QUEUELEN_TYPE

    Formula: (N1 - N0) / (D1 - D0), where the numerator (N) represents queue depth and the denominator (D) represents the time elapsed during the sample interval.

     

    Current Disk Queue Length:

    Counter type: PERF_COUNTER_RAWCOUNT

    Formula: None. Shows raw data as collected. It’s Instantaneous value of queue depth.

     

    Disk Bytes/sec:

    Counter type: PERF_COUNTER_BULK_COUNT

    Formula: (N1 - N0) / ( (D1 - D0) / F, where the numerator (N) represents the total ReadBytes + WriteBytes, the denominator (D) represents the number of ticks elapsed during the last sample interval, and F is the frequency of the ticks.

     

    % Idle Time

    Counter type: PERF_PRECISION_100NS_TIMER

    Formula: NX – N0 / D1 – D0, where the numerator (N) represents the Total IdleTime and the denominator (D) is the value of the private timer. The private timer has the same frequency as the 100 ns timer.

     

    Note: Programmers should avoid calculating counters manually and should instead use pdh.dll.  An example of what can go wrong when calculating this data manually is described in Performance Monitor Averages, the Right Way and the Wrong Way.

     

    How to measure disk performance?

    In this section we are going to discuss which counters are the key to measuring disk performance. Generally we have 4 counters used for performance measurement: Disk Bytes/sec, % Idle Time, Disk sec/Transfer and Avg. Disk Queue Length.

     

    Disk Bytes/sec

    From the formula, Disk Bytes/sec is actually how many bytes have been completed in every second. There are two things could impact this counter value:

     

    1. How much stress is generated to the disk or volume?

    Let’s assume if there are no problems with disk performance and stress has not reached the storage bottleneck. Then, this counter value will be determined the stress IO load generated by the application such as a stress tool.

     

    2. Disk performance

    If the IO load has exceeded the storage bottleneck, this counter value will not be able to be increased with load increasing.

     

    Conclusion: Since this counter value could be affected by IO load from an application we cannot use it as the key to determine disk performance.

     

    % Idle Time

    This counter value indicates how long the disk is in idle status without outstanding IO. It can help to determine how busy the disk is. However, even if the disk is busy with 0% Idle Time, we cannot say it suffers from a performance issue as it may still be able to complete all IOs in time.

     

    Avg. Disk Queue Length

    This counter indicates on average how many IOs are outstanding. If the disk can always complete IO immediately, the value should be 0. Therefore, it’s also a value to determine how busy the storage is. But it does not impact the application directly as the application does not care how many total IOs are outstanding. The application is concerned with how fast every IO can be completed. In practice, if we see the queue depth is more than 10 we may say the storage is busy and could delay the IO in the queue. However if every IO can be completed fast there will be no impact to the application, which means the delay is still acceptable.

     

    Disk sec/Transfer

    This counter indicates how fast the IO is completed on average. This is one of the keys to an application’s performance as discussed above.

     

    Dynamic counter loading feature

    On Windows 2008 or above the disk counter in the kernel provider can be dynamically enabled or disabled. If there is no one open handle to HKEY_PERFORMANCE_DATA the kernel provider will disable IO performance trace by setting a flag in the device extension. Here is the Call stack when the counter is being dynamically disabled:

     

    8f14d970 volmgr!VmWmiFunctionControl

    8f14d9e0 WMILIB!WmiSystemControl+0x3b9

    8f14da00 volmgr!VmWmi+0x8d

    8f14da18 nt!IofCallDriver+0x63

    8f14da20 volsnap!VolSnapDefaultDispatch+0x2b

    8f14da38 nt!IofCallDriver+0x63

    8f14da60 nt!WmipForwardWmiIrp+0x18b

    8f14da8c nt!WmipSendWmiIrp+0x56

    8f14dabc nt!WmipDeliverWnodeToDS+0x22

    8f14dc28 nt!WmipSendEnableDisableRequest+0x10e

    8f14dc4c nt!WmipDoDisableRequest+0x26

    8f14dc64 nt!WmipDisableCollectOrEvent+0x35

    8f14dc8c nt!WmipDeleteMethod+0x25

    8f14dca8 nt!ObpRemoveObjectRoutine+0x13d

    8f14dcd0 nt!ObfDereferenceObject+0xa1

    8f14dd14 nt!ObpCloseHandleTableEntry+0x24e

    8f14dd44 nt!ObpCloseHandle+0x73

    8f14dd58 nt!NtClose+0x20

    8f14dd58 nt!KiFastCallEntry+0x12a

    0012fda4 ntdll!KiFastSystemCallRet

    0012fda8 ntdll!NtClose+0xc

    0012fde0 ADVAPI32!WmiCloseBlock+0x33

    0012fe58 ADVAPI32!PerfRegCloseKey+0x175

    0012fe68 ADVAPI32!BaseRegCloseKeyInternal+0x81

    0012fe7c ADVAPI32!ClosePredefinedHandle+0x7c

    0012feb8 ADVAPI32!RegCloseKey+0x67

    0012fed0 ReadTest!GetPerformanceData+0xe5

    0012ff38 ReadTest!wmain+0xae

    0012ff88 ReadTest!__tmainCRTStartup+0x15e

    0012ff94 kernel32!BaseThreadInitThunk+0xe

    0012ffd4 ntdll!__RtlUserThreadStart+0x23

    0012ffec ntdll!_RtlUserThreadStart+0x1b

     

    Since the sample app from MSDN tries to close the handle every time after calling RegQueryValueEx(), it will disable and enable the disk counter intermittently. The impact to any app using registry API will be that some IO is started with counter disabled with no time stamp recorded and later completed with counter enabled, then generate a huge time difference for such an IO and charge to sec/transfer. KB 2470949 was released to address this issue on Windows 2008 R2.

     

    Resources

    Disk Subsystem Performance Analysis for Windows

    http://download.microsoft.com/download/e/b/a/eba1050f-a31d-436b-9281-92cdfeae4b45/subsys_perf.doc

     

    How to Calculate Your Disk I/O Requirements

    http://technet.microsoft.com/en-us/library/bb125019(EXCHG.65).aspx

     

    Disk Partition Alignment Best Practices for SQL Server

    http://msdn.microsoft.com/en-us/library/dd758814(SQL.100).aspx

     

    Counter Types

    http://technet.microsoft.com/en-us/library/cc785636(v=WS.10).aspx

  • Ntdebugging Blog

    Driver Object Corruption Triggers Bugcheck 109

    • 2 Comments

    My name is Victor Mei, I am an Escalation Engineer in Platforms Global Escalation Services in GCR.  Some customers I worked with have strong interests in debugging; but usually they got frustrated when I told them “To find the cause from this dump, you have to get the code and understand the design behind it”.

     

    This time I am going to talk about one crash dump, on which we can use basic debugging commands and knowledge of the Windows kernel to find out the root cause:

     

    1: kd> !analyze -v

    *******************************************************************************

    *                                                                             *

    *                        Bugcheck Analysis                                    *

    *                                                                             *

    *******************************************************************************

    CRITICAL_STRUCTURE_CORRUPTION (109)

    This bugcheck is generated when the kernel detects that critical kernel code or

    data have been corrupted. There are generally three causes for a corruption:

    1) A driver has inadvertently or deliberately modified critical kernel code

    or data. See http://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx

    2) A developer attempted to set a normal kernel breakpoint using a kernel

    debugger that was not attached when the system was booted. Normal breakpoints,

    "bp", can only be set if the debugger is attached at boot time. Hardware

    breakpoints, "ba", can be set at any time.

    3) A hardware corruption occurred, e.g. failing RAM holding kernel code or data.

    Arguments:

    Arg1: a3a01f5a3763f650, Reserved

    Arg2: b3b72be089e32ceb, Reserved

    Arg3: ffffe001a2894a20, Failure type dependent information

    Arg4: 000000000000001c, Type of corrupted region, can be

    0   : A generic data region

    1   : Modification of a function or .pdata

    2   : A processor IDT

    3   : A processor GDT

    4   : Type 1 process list corruption

    5   : Type 2 process list corruption

    6   : Debug routine modification

    7   : Critical MSR modification

    8   : Object type

    9   : A processor IVT

    a   : Modification of a system service function

    b   : A generic session data region

    c   : Modification of a session function or .pdata

    d   : Modification of an import table

    e   : Modification of a session import table

    f   : Ps Win32 callout modification

    10  : Debug switch routine modification

    11  : IRP allocator modification

    12  : Driver call dispatcher modification

    13  : IRP completion dispatcher modification

    14  : IRP deallocator modification

    15  : A processor control register

    16  : Critical floating point control register modification

    17  : Local APIC modification

    18  : Kernel notification callout modification

    19  : Loaded module list modification

    1a  : Type 3 process list corruption

    1b  : Type 4 process list corruption

    1c  : Driver object corruption

    1d  : Executive callback object modification

    1e  : Modification of module padding

    1f  : Modification of a protected process

    20  : A generic data region

    21  : A page hash mismatch

    22  : A session page hash mismatch

    102 : Modification of win32k.sys

     

    The stack only contains one frame:

     

    # Child-SP          RetAddr           Call Site

    00 ffffd000`223721c8 00000000`00000000 nt!KeBugCheckEx

     

    You will get disappointed if you attempted to find out who called KeBugCheckEx from the stack, because you will find KeBugCheckEx is the only function address on the stack.

     

    Since there is nothing more on the stack, let’s take a close look at what WinDBG tells about Bugcheck parameters:

     

    Arg3: ffffe001a2894a20, Failure type dependent information

    Arg4: 000000000000001c, Type of corrupted region, can be

    1c  : Driver object corruption

     

    Tip: Always use the latest version of WinDBG, the older versions may not tell you 1c is for Driver Object corruption.

     

    Arg4 indicates this is driver object corruption, so the type dependent information provided by Arg3 should be the Driver object, right? Let’s check the object:

     

    1: kd> !drvobj ffffe001a2894a20

    Driver object (ffffe001a2894a20) is for:

    ffffe001a2894a20: is not a driver object

     

    Let’s try !pool

     

    1: kd> !pool ffffe001a2894a20

    Pool page ffffe001a2894a20 region is Nonpaged pool

    ffffe001a2894000 size:  510 previous size:    0  (Allocated)  FMcr

    ffffe001a2894510 size:   50 previous size:  510  (Allocated)  Wmip

    ffffe001a2894560 size:   60 previous size:   50  (Allocated)  NtfJ

    ffffe001a28945c0 size:   60 previous size:   60  (Allocated)  EtwR

    ffffe001a2894620 size:   60 previous size:   60  (Allocated)  EtwR

    ffffe001a2894680 size:   60 previous size:   60  (Allocated)  EtwR

    ffffe001a28946e0 size:   60 previous size:   60  (Allocated)  EtwR

    ffffe001a2894740 size:  210 previous size:   60  (Allocated)  Devi

    *ffffe001a2894950 size:  200 previous size:  210  (Allocated) *Driv

         Pooltag Driv : Driver objects

    ffffe001a2894b50 size:  2b0 previous size:  200  (Allocated)  Devi

    ffffe001a2894e00 size:  200 previous size:  2b0  (Allocated)  Driv

     

    So the address does belong to a driver object, but what is the base address of NT!_Driver_Object? If you don’t have experience on it, a quick method is to refer to a known device object, for example:

     

    1: kd> !drvobj \driver\acpi

    Driver object (ffffe001a14df060) is for:

    \Driver\ACPI

    1: kd> !pool ffffe001a14df060

    Pool page ffffe001a14df060 region is Nonpaged pool

    *ffffe001a14df000 size:  200 previous size:    0  (Allocated) *Driv

         Pooltag Driv : Driver objects

    ffffe001a14df200 size:   10 previous size:  200  (Free)       Free

    1: kd> ?ffffe001a14df060-ffffe001a14df000

    Evaluate expression: 96 = 00000000`00000060

     

    So, looks like the offset is 0x60, let’s have another try:

     

    1: kd> !drvobj ffffe001a2894950+0x60

    Driver object (ffffe001a28949b0) is for:

    \FileSystem\Ntfs

     

    Great, we got the object.

     

    Arg3 is ffffe001a2894a20, offset 0x70 to the Driver Object.

     

    1: kd> ?ffffe001a2894a20-ffffe001a28949b0

    Evaluate expression: 112 = 00000000`00000070

     

    1: kd> dt nt!_DRIVER_OBJECT ffffe001a28949b0

       +0x000 Type             : 0n4

       +0x002 Size             : 0n336

       +0x008 DeviceObject     : 0xffffe001`a144c030 _DEVICE_OBJECT

       +0x010 Flags            : 0x92

       +0x018 DriverStart      : 0xfffff800`0d044000 Void

       +0x020 DriverSize       : 0x1f6000

       +0x028 DriverSection    : 0xffffe001`a142e2c0 Void

       +0x030 DriverExtension  : 0xffffe001`a2894b00 _DRIVER_EXTENSION

       +0x038 DriverName       : _UNICODE_STRING "\FileSystem\Ntfs"

       +0x048 HardwareDatabase : 0xfffff802`64b31580 _UNICODE_STRING "\REGISTRY\MACHINE\HARDWARE\DESCRIPTION\SYSTEM"

       +0x050 FastIoDispatch   : 0xfffff800`0d0ae640 _FAST_IO_DISPATCH

       +0x058 DriverInit       : 0xfffff800`0d06e280     long  Ntfs!GsDriverEntry+0

       +0x060 DriverStartIo    : (null)

       +0x068 DriverUnload     : 0xfffff800`0c8d5d24     void  +0

       +0x070 MajorFunction    : [28] 0xfffff800`0d126a10     long  Ntfs!NtfsFsdCreate+0

     

    The bugcheck code seems to be indicating that the MajorFunction table is corrupted, let’s look at the details:

     

    1: kd> !drvobj ffffe001a2894950+0x60 f

    Driver object (ffffe001a28949b0) is for:

    \FileSystem\Ntfs

    Driver Extension List: (id , addr)

     

    Device Object list:

    ffffe001a144c030  ffffe001a1449030  ffffe001a144f030  ffffe001a28947a0

     

    DriverEntry:   fffff8000d06e280  Ntfs!GsDriverEntry

    DriverStartIo: 00000000  

    DriverUnload:  fffff8000c8d5d24  vicm

    AddDevice:     00000000  

     

    Dispatch routines:

    [00] IRP_MJ_CREATE                      fffff8000d126a10 Ntfs!NtfsFsdCreate

    [01] IRP_MJ_CREATE_NAMED_PIPE           fffff802645809ac nt!IopInvalidDeviceRequest

    [02] IRP_MJ_CLOSE                       fffff8000d10b390 Ntfs!NtfsFsdClose

    [03] IRP_MJ_READ                        fffff8000d061590 Ntfs!NtfsFsdRead

    [04] IRP_MJ_WRITE                       fffff8000d05c3d0 Ntfs!NtfsFsdWrite

    [05] IRP_MJ_QUERY_INFORMATION           fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait

    [06] IRP_MJ_SET_INFORMATION             fffff8000d130290 Ntfs!NtfsFsdSetInformation

    [07] IRP_MJ_QUERY_EA                    fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait

    [08] IRP_MJ_SET_EA                      fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait

    [09] IRP_MJ_FLUSH_BUFFERS               fffff8000d0e9e94 Ntfs!NtfsFsdFlushBuffers

    [0a] IRP_MJ_QUERY_VOLUME_INFORMATION    fffff8000d1356b0 Ntfs!NtfsFsdDispatch

    [0b] IRP_MJ_SET_VOLUME_INFORMATION      fffff8000d1356b0 Ntfs!NtfsFsdDispatch

    [0c] IRP_MJ_DIRECTORY_CONTROL           fffff8000d12d2f0 Ntfs!NtfsFsdDirectoryControl

    [0d] IRP_MJ_FILE_SYSTEM_CONTROL         fffff8000d131898 Ntfs!NtfsFsdFileSystemControl

    [0e] IRP_MJ_DEVICE_CONTROL              fffff8000d0ed194 Ntfs!NtfsFsdDeviceControl

    [0f] IRP_MJ_INTERNAL_DEVICE_CONTROL     fffff802645809ac nt!IopInvalidDeviceRequest

    [10] IRP_MJ_SHUTDOWN                    fffff8000d1eb730 Ntfs!NtfsFsdShutdown

    [11] IRP_MJ_LOCK_CONTROL                fffff8000d046230 Ntfs!NtfsFsdLockControl

    [12] IRP_MJ_CLEANUP                     fffff8000d12bde0 Ntfs!NtfsFsdCleanup

    [13] IRP_MJ_CREATE_MAILSLOT             fffff802645809ac nt!IopInvalidDeviceRequest

    [14] IRP_MJ_QUERY_SECURITY              fffff8000d1356b0 Ntfs!NtfsFsdDispatch

    [15] IRP_MJ_SET_SECURITY                fffff8000d1356b0 Ntfs!NtfsFsdDispatch

    [16] IRP_MJ_POWER                       fffff802645809ac nt!IopInvalidDeviceRequest

    [17] IRP_MJ_SYSTEM_CONTROL              fffff802645809ac nt!IopInvalidDeviceRequest

    [18] IRP_MJ_DEVICE_CHANGE               fffff802645809ac nt!IopInvalidDeviceRequest

    [19] IRP_MJ_QUERY_QUOTA                 fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait

    [1a] IRP_MJ_SET_QUOTA                   fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait

    [1b] IRP_MJ_PNP                         fffff8000d158bac Ntfs!NtfsFsdPnp

     

    Fast I/O routines:

    FastIoCheckIfPossible                   fffff8000d1d4090 Ntfs!NtfsFastIoCheckIfPossible

    FastIoRead                              fffff8000d0f98e0 Ntfs!NtfsCopyReadA

    FastIoWrite                             fffff8000d12f160 Ntfs!NtfsCopyWriteA

    FastIoQueryBasicInfo                    fffff8000d1390c0 Ntfs!NtfsFastQueryBasicInfo

    FastIoQueryStandardInfo                 fffff8000d123bb0 Ntfs!NtfsFastQueryStdInfo

    FastIoLock                              fffff8000d0dd54c Ntfs!NtfsFastLock

    FastIoUnlockSingle                      fffff8000d0dd848 Ntfs!NtfsFastUnlockSingle

    FastIoUnlockAll                         fffff8000d1d3330 Ntfs!NtfsFastUnlockAll

    FastIoUnlockAllByKey                    fffff8000d1d35ac Ntfs!NtfsFastUnlockAllByKey

    ReleaseFileForNtCreateSection           fffff8000d062814 Ntfs!NtfsReleaseForCreateSection

    FastIoQueryNetworkOpenInfo              fffff8000d0f051c Ntfs!NtfsFastQueryNetworkOpenInfo

    AcquireForModWrite                      fffff8000d04b6d8 Ntfs!NtfsAcquireFileForModWrite

    MdlRead                                 fffff8000d0eb2c0 Ntfs!NtfsMdlReadA

    MdlReadComplete                         fffff80264588594 nt!FsRtlMdlReadCompleteDev

    PrepareMdlWrite                         fffff8000d0eb574 Ntfs!NtfsPrepareMdlWriteA

    MdlWriteComplete                        fffff802649289c8 nt!FsRtlMdlWriteCompleteDev

    FastIoQueryOpen                         ffffe001a17d4540 +0xffffe001a17d4540

    ReleaseForModWrite                      fffff8000d04b4d4 Ntfs!NtfsReleaseFileForModWrite

    AcquireForCcFlush                       fffff8000d06656c Ntfs!NtfsAcquireFileForCcFlush

    ReleaseForCcFlush                       fffff8000d066524 Ntfs!NtfsReleaseFileForCcFlush

     

    We found two potential issues here: DriverUnload and FastIoQueryOpen.

     

    Use FastIoQueryOpen as an example:

     

    1: kd> u ffffe001a17d4540

    ffffe001`a17d4540 4d8bc8          mov     r9,r8

    ffffe001`a17d4543 4c8bc2          mov     r8,rdx

    ffffe001`a17d4546 488bd1          mov     rdx,rcx

    ffffe001`a17d4549 48b900407da101e0ffff mov rcx,0FFFFE001A17D4000h

    ffffe001`a17d4553 48b83c57910c00f8ffff mov rax,offset vicm+0x6973c (fffff800`0c91573c)

    ffffe001`a17d455d ffe0            jmp     rax

     

    1: kd> u fffff800`0c91573c

    vicm+0x6973c:

    fffff800`0c91573c 48895c2408      mov     qword ptr [rsp+8],rbx

    fffff800`0c915741 48896c2410      mov     qword ptr [rsp+10h],rbp

    fffff800`0c915746 4889742418      mov     qword ptr [rsp+18h],rsi

    fffff800`0c91574b 57              push    rdi

    fffff800`0c91574c 4883ec20        sub     rsp,20h

     

    Obviously, FastIoQueryOpen has been modified to execute code in the module vicm.sys.  DriverUnload has been modified in a similar manner.

     

    Follow the description from “!analyze “1) A driver has inadvertently or deliberately modified critical kernel code or data. See http://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx”.  Kernel patch protection does not allow the MajorFunction table of certain drivers to be modified, if this data is modified the system will bugcheck as seen here.  It is time to remove the vicm.sys driver. The result is positive, the machine no longer crashes.

  • Ntdebugging Blog

    How to identify a driver that calls a Windows API leading to a pool leak on behalf of NT Kernel?

    • 1 Comments

    Hello my name is Gurpreet Singh Jutla and I would like to share information on how we can trace the caller which ends up allocating “Se  “ Pool tag. When we use the Windows debugger and investigate the pool allocation and the binary associated with this pool tag, we see NT Kernel responsible for the allocations. But is the NT Kernel really responsible for a pool leak associated with this pool tag?

     

    Issue at hand

    • On windows 2003 x86 we see that the paged pool has depleted and we are running into event id 333.
    • We can see the same behavior on later versions of the OS and even on x64
    • We see that the leaking pool tag is “Se  “ which is the pool tag for security objects. Is Microsoft component at fault or something is calling an API and using security objects on a large scale?
    • This is windows 2003 x86 and we have limited options to root cause the issue. We need to really find out why we end up having so many allocations for this tag.
    • We could enable Pool Tracking on the NT Kernel, but would that help?

     

    Step 1

    !vm 1  -> This tells us if there were any page pool allocation failures

     

    0: kd> !vm 1

     

    *** Virtual Memory Usage ***

          Physical Memory:     2096922 (   8387688 Kb)

          Page File: \??\D:\pagefile.sys

            Current:  16779264 Kb  Free Space:  16552492 Kb

            Minimum:  16779264 Kb  Maximum:     16779264 Kb

          Available Pages:     1607242 (   6428968 Kb)

          ResAvail Pages:      1991659 (   7966636 Kb)

          Locked IO Pages:         656 (      2624 Kb)

          Free System PTEs:     163671 (    654684 Kb)

          Free NP PTEs:          32766 (    131064 Kb)

          Free Special NP:           0 (         0 Kb)

          Modified Pages:        10775 (     43100 Kb)

          Modified PF Pages:     10728 (     42912 Kb)

          NonPagedPool Usage:     7881 (     31524 Kb)

          NonPagedPool Max:      65279 (    261116 Kb)

          PagedPool 0 Usage:     67074 (    268296 Kb)

          PagedPool 1 Usage:      3266 (     13064 Kb)

          PagedPool 2 Usage:      3282 (     13128 Kb)

          PagedPool 3 Usage:      3268 (     13072 Kb)

          PagedPool 4 Usage:      3214 (     12856 Kb)

          PagedPool Usage:       80104 (    320416 Kb)

          PagedPool Maximum:    134144 (    536576 Kb)

          Session Commit:        14832 (     59328 Kb)

          Shared Commit:         19969 (     79876 Kb)

          Special Pool:              0 (         0 Kb)

          Shared Process:        19362 (     77448 Kb)

          Pages For MDLs:          146 (       584 Kb)

          PagedPool Commit:      80140 (    320560 Kb)

          Driver Commit:          1602 (      6408 Kb)

          Committed pages:      520485 (   2081940 Kb)

          Commit limit:        6241313 (  24965252 Kb)

     

    !poolused /t10 4  -> This will list the top consumers of paged pool. It is “Se  “ in our case.

     

    0: kd> !poolused /t10 4

    ....

    Sorting by Paged Pool Consumed

     

                   NonPaged                  Paged

    Tag     Allocs         Used     Allocs         Used

     

    Se           0            0     172204    232720312  General security allocations , Binary: nt!se

    MmSt         0            0      15231     31835696  Mm section object prototype ptes , Binary: nt!mm

    Ntff         9         1872      10434      8514144  FCB_DATA , Binary: ntfs.sys

    WD         384      1251376         27      5591040  UNKNOWN pooltag 'WD  ', please update pooltag.txt

    UlHT         0            0          1      4198400  Hash Table , Binary: http.sys

    NtfF         0            0       3259      3050424  FCB_INDEX , Binary: ntfs.sys

    Toke         0            0        966      3034216  Token objects , Binary: nt!se

    NtFs     13717       551176      19678      1758008  StrucSup.c , Binary: ntfs.sys

    IoNm         0            0      12034      1737360  Io parsing names , Binary: nt!io

    FSim         0            0      11336      1451008  File System Run Time Mcb Initial Mapping

    CM16         0            0        293      1437696  Internal Configuration manager allocations

    Wmit         6        11392         23      1376912  Wmi Trace

    NtFU         0            0       8719      1237232  usnsup.c , Binary: ntfs.sys

    Obtb         0            0        397       995600  object tables via EX handle.c , Binary: nt!ob

    Key          0            0       7245       753432  Key objects

    MmSm         0            0      11393       729152  segments used to map data files , Binary: nt!mm

     

    TOTAL    192495     33751032     347461    310172416

     

    Step 2

    Once we know the pool tag we need to run the following command. Remember the pool tag has to be a 4 char case sensitive string so I would run the following command which tells me in which all modules we have the Pool tag “Se  “ used. Please note that I have added two spaces in the string during my search. This is because the !poolused shows Se tag has caused the issue. Not providing the spaces in the search string will give different results. The following command searches each module that is loaded on the system.

     

    !for_each_module s -a @#Base @#End "Se  "

     

    The result should be something like:

     

    0: kd> !for_each_module s -a @#Base @#End "Se  "

    8096a1ae  53 65 20 20 6a 0c 6a 01-e8 bd a9 f2 ff 3b c7 89  Se  j.j......;..

    8096a5e3  53 65 20 20 6a 0c 6a 01-e8 88 a5 f2 ff 3b c7 89  Se  j.j......;..

    8096af9e  53 65 20 20 50 6a 01 89-45 fc e8 cb 9b f2 ff 8b  Se  Pj..E.......

    8096c909  53 65 20 20 8d 1c 9d 10-00 00 00 53 6a 01 e8 5c  Se  .......Sj..\

    8096c9e3  53 65 20 20 53 6a 01 e8-89 81 f2 ff 8b f8 85 ff  Se  Sj..........

    8096ca3e  53 65 20 20 8d 04 85 10-00 00 00 50 6a 01 e8 27  Se  .......Pj..'

    8096caf2  53 65 20 20 8d 1c 9d 0c-00 00 00 53 6a 01 e8 73  Se  .......Sj..s

    8096cb3e  53 65 20 20 8d 1c 9d 0c-00 00 00 53 6a 01 e8 27  Se  .......Sj..'

    8096cb9d  53 65 20 20 56 6a 01 e8-cf 7f f2 ff 8b f0 3b f7  Se  Vj........;.

    8096cc44  53 65 20 20 6a 10 6a 01-e8 27 7f f2 ff 85 c0 0f  Se  j.j..'......

    8096cc65  53 65 20 20 6a 04 6a 01-e8 06 7f f2 ff 85 c0 0f  Se  j.j.........

    8096cc96  53 65 20 20 6a 04 6a 01-e8 d5 7e f2 ff 85 c0 0f  Se  j.j...~.....

    8096ccbe  53 65 20 20 6a 38 6a 01-e8 ad 7e f2 ff 8b f8 85  Se  j8j...~.....

    809718ac  53 65 20 20 74 42 8b 46-68 3b c3 74 0f 53 50 e8  Se  tB.Fh;.t.SP.

    80971b48  53 65 20 20 74 42 8b 46-68 3b c3 74 0f 53 50 e8  Se  tB.Fh;.t.SP.

    8097372b  53 65 20 20 6a 0c 6a 01-e8 40 14 f2 ff 85 c0 75  Se  j.j..@.....u

    80976c38  53 65 20 20 6a 0c 6a 01-e8 33 df f1 ff 3b c6 89  Se  j.j..3...;..

    80a20d1b  53 65 20 20 bf 00 01 00-00 57 56 e8 4d 3e e7 ff  Se  .....WV.M>..

    80a22698  53 65 20 20 c7 45 80 4f-62 43 6c c7 45 84 43 63  Se  .E.ObCl.E.Cc

     

    The above steps are also explained in detail at http://blogs.msdn.com/b/ntdebugging/archive/2012/08/31/troubleshooting-pool-leaks-part-3-debugging.aspx

     

    Step 3

    Now the tough part begins. We need to run “ln” command on each of the addresses shown by the output above. Example

     

    0: kd> ln 8096c909 

    (8096c8cc)   nt!SeQueryInformationToken+0x3d   |  (8096cdc0)   nt!SeCaptureObjectTypeList

     

    See the highlighted above. It is the nearest API to that address. Search on MSDN for all API’s listed. The ones that do not have an MSDN article can likely be ignored. SEQueryInformationToken is publically available and hence callable by any loaded driver. Example http://msdn.microsoft.com/en-in/library/windows/hardware/ff556690(v=vs.85).aspx

     

    At this point we will make the assumption that it is a 3rd party driver making the calls to this Windows API.  It is, of course, possible that a Microsoft module is doing this, but if such a problem exists in our code it usually becomes apparent very quickly when we get flooded by support calls.

     

    Step 4

    Method 1

    We can use the “!for_each_module” debugger command to search the address space of each loaded module for the entry point of the function found in step 3.

     

    0: kd> !for_each_module s-d @#Base @#End 8096c8cc

    b94b5938  8096c8cc 8096000a 80960182 80894b78  ............xK..

    ba7d50f4  8096c8cc 8081e5e0 8096b42e 8096b40c  ................

    ba83624c  8096c8cc 8081757e 8082b4f2 8094d438  ....~u......8...

    bab8103c  8096c8cc 80960122 80960158 8096010a  ...."...X.......

    f74e2950  8096c8cc 8089e180 8096091a 80960638  ............8...

    f74f0b68  8096c8cc 8089e180 80959546 80884150  ........F...PA..

    f75cf094  8096c8cc 80960122 80960158 8096010a  ...."...X.......

    f797c110  8096c8cc 808847f0 80966c60 80894b78  .....G..`l..xK..

    f7b643c4  8096c8cc 80960054 809e437c 808eadc6  ....T...|C......

     

    The first column in the above output can then be passed to the “ln” command to get the nearest function to the address listed.

     

    0: kd> ln b94b5938

    *** ERROR: Module load completed but symbols could not be loaded for ABCMiniFilter.sys

    0: kd> ln f797c110

    *** ERROR: Module load completed but symbols could not be loaded for XVhdBusxxx.sys

     

    Method 2

    This is a bit more complicated. Run the lm command to dump all modules. You will get the output as follows

     

    0: kd> lm

    start    end        module name

    80800000 80a5d000   nt        

    80a5d000 80a89000   hal       

    b476c000 b4775000   asyncmac   (deferred)            

    b7f55000 b7fb6000   ABCD   (deferred)            

    b817b000 b81da000   eeCtrl     (deferred)            

    b82b6000 b82b9300   xyz   (deferred)            

    b87a2000 b87cc000   Fastfat   (deferred)

    b8f04000 b8f0de80   ABCMiniFilter   (deferred)                       

    <snip>

     

    Ignore the unloaded modules and the modules you know are Microsoft binaries. For the rest of them you need to run the following command on each module. Remember the first column is the start of the module and the second column is the end of module.

     

    Use the !dh command for each header one by one and you will see the header information for the module.  I just picked one module for reference you will need to run the command for each third party module.

     

    0: kd> !dh b8f04000 -f

     

    File Type: EXECUTABLE IMAGE

    FILE HEADER VALUES

         14C machine (i386)

           7 number of sections

    51F7D252 time date stamp Tue Jul 30 20:18:50 2013

     

           0 file pointer to symbol table

           0 number of symbols

          E0 size of optional header

         102 characteristics

               Executable

               32 bit word machine

     

    OPTIONAL HEADER VALUES

         10B magic #

        9.00 linker version

        8800 size of code

        1200 size of initialized data

           0 size of uninitialized data

        86B2 address of entry point

         480 base of code

             ----- new -----

    00010000 image base

          80 section alignment

          80 file alignment

           1 subsystem (Native)

        6.01 operating system version

        6.01 image version

        5.01 subsystem version

        9E80 size of image

         480 size of headers

       11E27 checksum

    00040000 size of stack reserve

    00001000 size of stack commit

    00100000 size of heap reserve

    00001000 size of heap commit

           0  DLL characteristics

           0 [       0] address [size] of Export Directory

        8A28 [      50] address [size] of Import Directory

        9180 [     498] address [size] of Resource Directory

           0 [       0] address [size] of Exception Directory

        9E80 [    1FD0] address [size] of Security Directory

        9680 [     6C0] address [size] of Base Relocation Directory

        7990 [      1C] address [size] of Debug Directory

           0 [       0] address [size] of Description Directory

           0 [       0] address [size] of Special Directory

           0 [       0] address [size] of Thread Storage Directory

        7A68 [      40] address [size] of Load Configuration Directory

           0 [       0] address [size] of Bound Import Directory

        7880 [     108] address [size] of Import Address Table Directory

           0 [       0] address [size] of Delay Import Directory

           0 [       0] address [size] of COR20 Header Directory

           0 [       0] address [size] of Reserved Directory

     

    The highlighted line is important for us. Remember this is the offset of the start of Import Address Table directory and the end of the same. In our case 7880 offset is start and 108 offset is the end. We run the following command to dump the import table. Running the dps command from the start offset in the module till the end of the import table will list all the functions it imports

     

    0: kd> dps b8f04000+7880 b8f04000+7880+108

    b8f0b880  f7873892 fltMgr!FltGetStreamHandleContext

    b8f0b884  f7873988 fltMgr!FltAllocateContext

    b8f0b888  f787459c fltMgr!FltSetStreamHandleContext

    <snip>

    b8f0b938  8096c8cc nt!SeQueryInformationToken

    b8f0b93c  8096000a nt!RtlValidSid

    <snip>

     

    You see the module we searched in will have the API in its import table. Running the above command on each module, dumping the import tables will help us identify which modules have the API in its import table. However it cannot be confirmed if the driver calls the API and causes the leak. This is really tedious and time consuming but will help us identify each binary that calls the above API which leads to pool tag “Se  “ allocation on behalf of the NT Kernel.

     

    Conclusion

    We can disable the binaries that call the SEQueryInformationToken API one by one and see if the issue still persists. Please note that the boot/System drivers should not be disabled as it can lead to a no boot situation. You can use Autoruns from Sysinternals to disable these drivers and if you run into a no boot situation, boot back into the “last known good configuration” to help recover the box.

  • Ntdebugging Blog

    Windows Troubleshooting – Stop 9E Explained

    • 2 Comments

    What to do if a stop 9E occurs.  How you can solve the issue yourself.

  • Ntdebugging Blog

    Windows Troubleshooting – Special Pool

    • 0 Comments

    The Windows Support team has a new YouTube channel, “Windows Troubleshooting”.  The first set of videos cover debugging blue screens.

    In this video, Bob Golding, Senior Escalation Engineer, describes how the Special Pool Windows diagnostics tool catches drivers that corrupt memory. Bob also introduces how memory is organized in the system for allocating memory for drivers.

  • Ntdebugging Blog

    Bugchecking a Computer on A Usermode Application Crash

    • 2 Comments

    Hello my name is Gurpreet Singh Jutla and I would like to share information on how we can bugcheck a box on any usermode application crash. Set the application as a critical process when the application crash is reproducible. We may sometimes need a complete memory dump to investigate the information from kernel mode on a usermode application crash or closure.

     

    We will use the operating system’s ability to mark a process as critical and cause the system to bugcheck when the critical process closes unexpectedly. This will generate either a CRITICAL_PROCESS_DIED or a CRITICAL_OBJECT_TERMINATION bugcheck.

     

    For this demonstration I will use the following code sample which waits for the user input and then causes an Access Violation. You can use the following steps to collect a complete memory dump for any application crash that launches fine but crashes under known repro conditions.

     

    Code Sample

    #include<conio.h>
    void main()
    {
          _getch();      //Wait for a key press
          *(char*)0xdeaddead ='B';      //Causes the Access Violation
    }

     

    Please follow the steps below

    1. Set the system for a complete memory dump by opening the “Advanced System settings” under System properties in control panel and then setting the value of “Write debugging information” under “Startup and recovery” options on the advanced tab.
           
      image001

      image002
           
    2. Also enable the debug mode by running the following command from a command prompt
      bcdedit -debug on
    3. To enable the “Complete memory dump” and debug mode you need to restart the box to ensure the changes are implemented.
    4. Run the application you want to setup as critical process but do not run the repro steps. I have compiled my test application as test.exe
    5. Download and install the Debugging Tools for Windows, part of SDK which you can download from http://msdn.microsoft.com/en-us/windows/desktop/bg162891.aspx. Note, when the installer launches you can uncheck every feature except Debugging Tools for Windows.
    6. We need to setup the debugger to use the public symbols. Create a folder c:\symbols. Run Windbg with admin privileges, choose “File” menu and then “Symbol file path”. Type SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
      For more details check http://support.microsoft.com/kb/311503/en-us
    7. Assuming you have the debugger installed and setup with the public symbols, launch the debugger with admin privileges.
    8. From the file menu select kernel debug and then choose the “Local” tab and hit Ok button. This will connect the windbg to the local kernel. You should see an “lkd>” prompt.
    9. Run the following command to get the process information in windbg. The below example uses both x64 and x86 architectures
    10.  

      x64
      0: kd> !process 0 0 test.exe

      PROCESS fffffa82fa924b30

          SessionId: 0  Cid: 036c    Peb: 7fffffda000  ParentCid: 02e4

         DirBase: 1085d76000  ObjectTable: fffff8a0042d7970  HandleCount: 11.

          Image: test.exe

      x86
      0: kd> !process 0 0 test.exe

      PROCESS 89038a08  SessionId: 0  Cid: 10f0    Peb: 7ffde000  ParentCid: 0f10

          DirBase: bfa19900  ObjectTable: e669b630  HandleCount: 11.

          Image: test.exe

       

    11. Take the process id from the output and run the following command. The following command shows the process flags. The output shows the flags as 144d0841 in the example for x64 and 0x44082d for x86.
    12.  

      x64
      0: kd> dt nt!_eprocess fffffa82fa924b30 flags

         +0x440 Flags : 0x144d0801

      x86
      0: kd> dt 89038a08 nt!_eprocess flags

         +0x240 Flags : 0x450801

       

    13. Run the ed command to edit the memory and set the process flags to mark the process critical. Adding the value 0x2000 marks the process critical.
    14.  

      x64
      0: kd> ed fffffa82fa924b30+0x440 0x144d0801+0x2000


      x86
      0: kd> ed 89038a08+0x228 0x450801+0x2000

       

    15. Now close the debugger and proceed with the repro steps to crash or close the application.
    16. In our case the test application with the code mentioned above should cause the machine to bugcheck as soon as any key is pressed.

     

    The complete memory dump will contain the process information as well as kernel data for investigation.

Page 1 of 25 (242 items) 12345»