• Ntdebugging Blog

    Disassemble the MBR

    • 1 Comments

     

    UPDATE:  This is why we *LOVE* our readers!  After posting a 32-bit disassembly of 16-bit MBR code, I was gently informed of my misstep by three sharp-eyed readers.  I’ve corrected the original post below.  Thanks go out to Ramon Sola, Jeroen Frijters and 'pedantic gnome' for keeping us honest.  Thanks guys!

     

     

    Hi Everyone,

     

    One of our readers, Pete, asked a very interesting question about disassembling the MBR and I'd like to take a moment to show you the quick & dirty way to do this.

     

First, dump the MBR to a file using a low-level sector editor such as the Microsoft Resource Kit utility DiskProbe.  Once you've saved the file, launch a program (like Notepad) and attach to it using WinDbg (the Windows Debugger). 

     

    Next, find a valid, but unoccupied range of memory (the default heap is a good candidate).  The memory range needs to be at least 512 (0x200) bytes.  For example, here's where I found mine:

     

    000120a0  00000000 00000000 00000000 00000000

    000120b0  00000000 00000000 00000000 00000000

    000120c0  00000000 00000000 00000000 00000000

    000120d0  00000000 00000000 00000000 00000000

    000120e0  00000000 00000000 00000000 00000000

    000120f0  00000000 00000000 00000000 00000000

    00012100  00000000 00000000 00000000 00000000

    00012110  00000000 00000000 00000000 00000000

    00012120  00000000 00000000 00000000 00000000

    <snip>...

     

    Once you've found a good memory range in your process, read the file contents into that memory location using the following command:

     

    0:001> .readmem c:\<<path>>\sector00.bin 120a0 120a0+0x1ff

     

    Tada!  You now have the MBR in memory and it is fully examinable just like any other assembly code. 

     

    0:001> db 000120a0 

    000120a0  33 c0 8e d0 bc 00 7c 8e-c0 8e d8 be 00 7c bf 00  3.....|......|..

    000120b0  06 b9 00 02 fc f3 a4 50-68 1c 06 cb fb b9 04 00  .......Ph.......

    000120c0  bd be 07 80 7e 00 00 7c-0b 0f 85 10 01 83 c5 10  ....~..|........

    000120d0  e2 f1 cd 18 88 56 00 55-c6 46 11 05 c6 46 10 00  .....V.U.F...F..

    000120e0  b4 41 bb aa 55 cd 13 5d-72 0f 81 fb 55 aa 75 09  .A..U..]r...U.u.

    000120f0  f7 c1 01 00 74 03 fe 46-10 66 60 80 7e 10 00 74  ....t..F.f`.~..t

    00012100  26 66 68 00 00 00 00 66-ff 76 08 68 00 00 68 00  &fh....f.v.h..h.

    00012110  7c 68 01 00 68 10 00 b4-42 8a 56 00 8b f4 cd 13  |h..h...B.V.....

     

    Since this is 16-bit assembly, you can unassemble using ‘ur’:

    0:001> ur 120a0
    000120a0 33c0            xor     ax,ax
    000120a2 8ed0            mov     ss,ax
    000120a4 bc007c          mov     sp,7C00h
    000120a7 8ec0            mov     es,ax
    000120a9 8ed8            mov     ds,ax
    000120ab be007c          mov     si,7C00h
    000120ae bf0006          mov     di,600h
    000120b1 b90002          mov     cx,200h
    <snip>...
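If you'd rather sanity-check the dump programmatically, here is a minimal Python sketch (my own illustration, not part of the original walkthrough) that validates the 0x55AA boot signature and decodes the four partition-table entries of a 512-byte MBR image such as sector00.bin. The field offsets follow the standard MBR layout.

```python
import struct

def parse_mbr(sector: bytes):
    """Validate the boot signature and decode the MBR partition table."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR: missing 0x55AA boot signature")
    partitions = []
    for i in range(4):                      # four 16-byte entries at offset 446
        off = 446 + i * 16
        boot_flag = sector[off]             # 0x80 = bootable/active
        part_type = sector[off + 4]         # 0x00 = empty slot
        lba_start, sector_count = struct.unpack_from("<II", sector, off + 8)
        if part_type:
            partitions.append({"bootable": boot_flag == 0x80,
                               "type": part_type,
                               "lba_start": lba_start,
                               "sectors": sector_count})
    return partitions
```

To run it against a real dump: parse_mbr(open(r"sector00.bin", "rb").read()).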

     

    Have fun Pete!

     

    DiskProbe:       http://technet.microsoft.com/en-us/library/bb457122.aspx

    WinDbg:           http://www.microsoft.com/whdc/devtools/debugging/default.mspx

     

      - David

  • Ntdebugging Blog

    Kernel Stack Overflows

    • 5 Comments

Hello, this is Omer, and today I would like to talk about a common error that we see in a lot of cases reported to us by customers. It involves drivers taking up too much space on the kernel stack, resulting in a kernel stack overflow, which then crashes the system with one of the following bugchecks:

    1. STOP 0x7F: UNEXPECTED_KERNEL_MODE_TRAP with Parameter 1 set to EXCEPTION_DOUBLE_FAULT, which is caused by running off the end of a kernel stack.

    2. STOP 0x1E: KMODE_EXCEPTION_NOT_HANDLED, 0x7E: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED, or 0x8E: KERNEL_MODE_EXCEPTION_NOT_HANDLED, with an exception code of STATUS_ACCESS_VIOLATION, which indicates a memory access violation.

    3. STOP 0x2B: PANIC_STACK_SWITCH, which usually occurs when a kernel-mode driver uses too much stack space.

    Kernel Stack Overview

Each thread in the system is allocated a kernel-mode stack. Code running on any kernel-mode thread (whether it is a system thread or a thread created by a driver) uses that thread's kernel-mode stack unless the code is a DPC, in which case it uses the processor's DPC stack on certain platforms.  The stack grows downward.  This means that the beginning (bottom) of the stack has a higher address than the end (top) of the stack.  For example, let's say the beginning of your stack is 0x80f1000 and this is where your stack pointer (ESP) is pointing.  If you push a DWORD value onto the stack, its address would be 0x80f0ffc.  The next DWORD value would be stored at 0x80f0ff8, and so on down to the limit (top) of the allocated stack.  The top of the stack is bordered by a guard page to detect overruns.
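To make the direction of growth concrete, here is a toy Python model (purely illustrative; the addresses come from the example above, and the 12K size matches the x86 kernel stack) of a downward-growing stack whose limit behaves like the guard page:

```python
class KernelStack:
    """Toy model: a stack that grows toward lower addresses, with a hard limit."""

    def __init__(self, base: int, size: int):
        self.base = base            # beginning (bottom): the highest address
        self.limit = base - size    # end (top): the lowest usable address
        self.sp = base              # the stack pointer starts at the base

    def push_dword(self) -> int:
        """Simulate pushing a DWORD; returns the address the value lands at."""
        self.sp -= 4                # a push moves the stack pointer DOWN
        if self.sp < self.limit:    # crossing the limit is like hitting
            raise MemoryError("kernel stack overflow")  # the guard page
        return self.sp

stack = KernelStack(base=0x80F1000, size=12 * 1024)   # 12K, as on x86
assert stack.push_dword() == 0x80F0FFC   # first DWORD, 4 bytes below the base
assert stack.push_dword() == 0x80F0FF8   # the next DWORD, 4 bytes lower still
```

Keep pushing past the limit and the model raises, which is the moment the real hardware takes the first fault of the double fault described below.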

    The size of the kernel-mode stack varies among different hardware platforms. For example:

    · On x86-based platforms, the kernel-mode stack is 12K.

    · On x64-based platforms, the kernel-mode stack is 24K. (x64-based platforms include systems with processors using the AMD64 architecture and processors using the Intel EM64T architecture).

    · On Itanium-based platforms, the kernel-mode stack is 32K with a 32K backing store. (If the processor runs out of registers from its register file, it uses the backing store to hold the contents of registers until the allocating function returns. This doesn't affect stack allocations directly, but the operating system uses more registers on Itanium-based platforms than on other platforms, which makes relatively more stack available to drivers.)

    The stack sizes listed above are hard limits that are imposed by the system, and all drivers need to use space conservatively so that they can coexist.

    Exception Overview

    So, now that we have discussed the kernel stack, let’s dive into how the double fault actually happens.

    When we reach the top of the stack, one more push instruction is going to cause an exception. This could be either a simple push instruction, or something along the lines of a call instruction which also pushes the return address onto the stack, etc.

    The push instruction is going to cause the first exception. This will cause the exception handler to kick in, which will then try to allocate the trap frame and other variables on the stack. This causes the second exception.

This time around, the operating system takes advantage of a special x86 structure called the Task State Segment (TSS).  The OS stores the state of the registers in the TSS and then stops.  The TSS can be accessed via an entry in the global descriptor table, and can be used to debug the memory dump that is created.

    The Usual Suspects

    Rogue drivers are usually guilty of one or more of the following design flaws:

1. Using the stack liberally. Instead of passing large amounts of data on the stack, driver writers should design functions to accept pointers to data structures. These data structures should be allocated out of system space memory (paged or non-paged pool). If you need to pass a large number of parameters from one function to another, group the parameters into a structure and pass a pointer to that structure.

    2. Calling functions recursively. Heavily nested or recursive functions that are passing large amounts of data on the stack will use too much space and will overflow. Try to design drivers that use a minimal number of recursive calls and nested functions.

Since the size of the stack is much smaller on x86 machines, you will run into these problems more frequently on x86 than on any other platform.

    For a more detailed description, please visit

    http://www.microsoft.com/whdc/Driver/tips/KMstack.mspx.

    Debugging Kernel Stack Overflows

    Full kernel dumps are usually enough to find the offending driver. The most common bugcheck code that appears in these dumps is UNEXPECTED_KERNEL_MODE_TRAP (0x7f), with the first argument being EXCEPTION_DOUBLE_FAULT (0x8).

When you get this dump, the first command that you should run is !analyze -v.

    0: kd> !analyze -v

    *******************************************************************************

    * Bugcheck Analysis *

    *******************************************************************************

    UNEXPECTED_KERNEL_MODE_TRAP (7f)

    This means a trap occurred in kernel mode, and it's a trap of a kind that the kernel isn't allowed to have/catch (bound trap) or that

    is always instant death (double fault). The first number in the bugcheck params is the number of the trap (8 = double fault, etc)

    Consult an Intel x86 family manual to learn more about what these traps are. Here is a *portion* of those codes:

    If kv shows a taskGate

    use .tss on the part before the colon, then kv.

    Else if kv shows a trapframe

    use .trap on that value

    Else

    .trap on the appropriate frame will show where the trap was taken(on x86, this will be the ebp that goes with the procedure KiTrap)

    Endif

    kb will then show the corrected stack.

    Arguments:

    Arg1: 00000008, EXCEPTION_DOUBLE_FAULT

    Arg2: 80042000

    Arg3: 00000000

    Arg4: 00000000

    Debugging Details:

    ------------------

    BUGCHECK_STR: 0x7f_8

    TSS: 00000028 -- (.tss 0x28)

    eax=87b90328 ebx=87b90328 ecx=8aa3d8c0 edx=87b90328 esi=b8cb7138 edi=8084266a

    eip=f7159c53 esp=b8cb7000 ebp=b8cb7010 iopl=0 nv up ei pl nz na po nc

    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202

    Ntfs!NtfsInitializeIrpContext+0xc:

    f7159c53 57 push edi

    Resetting default scope

    DEFAULT_BUCKET_ID: DRIVER_FAULT

    PROCESS_NAME: System

    CURRENT_IRQL: 1

    TRAP_FRAME: b8cb8620 -- (.trap 0xffffffffb8cb8620)

    ErrCode = 00000000

    eax=c1587000 ebx=0000000e ecx=0000000f edx=00000000 esi=87dca350 edi=00000000

    eip=8093837b esp=b8cb8694 ebp=b8cb86d0 iopl=0 nv up ei ng nz ac po cy

    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010293

    nt!CcMapData+0x8c:

    8093837b 8a10 mov dl,byte ptr [eax] ds:0023:c1587000=??

    Resetting default scope

    LAST_CONTROL_TRANSFER: from f7158867 to f7159c53

Let's follow the instructions that the debugger is giving us. Since the debugger gave us a .tss command, let's run that. After that, run !thread to get the thread summary:

    0: kd> .tss 0x28

    eax=87b90328 ebx=87b90328 ecx=8aa3d8c0 edx=87b90328 esi=b8cb7138 edi=8084266a

    eip=f7159c53 esp=b8cb7000 ebp=b8cb7010 iopl=0 nv up ei pl nz na po nc

    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202

    Ntfs!NtfsInitializeIrpContext+0xc:

    f7159c53 57 push edi

    0: kd> !thread

    THREAD 87dca350 Cid 0420.0990 Teb: 7ffdf000 Win32Thread: efdbe430 RUNNING on processor 0

    IRP List:

    89cba088: (0006,01fc) Flags: 00000404 Mdl: 00000000

    Not impersonating

    DeviceMap e10008d8

    Owning Process 8ab8e238 Image: System

    Wait Start TickCount 7260638 Ticks: 0

    Context Switch Count 17 LargeStack

    UserTime 00:00:00.000

    KernelTime 00:00:00.015

    Start Address 0x4a6810ea

    Stack Init b8cba000 Current b8cb7c64 Base b8cba000 Limit b8cb7000 Call 0

    Priority 14 BasePriority 13 PriorityDecrement 0

We are looking for the kernel stack limits (the Base and Limit values in the !thread output above). For this particular stack we see that the stack starts at b8cba000 and ends at b8cb7000. If you look at the ESP register in the .tss output above, you will see that we have reached the stack limit. The current instruction being attempted is a push, which overflows the stack and causes the bugcheck.

    Now that we have determined we do have a stack overflow, let’s find out what caused this, and who the offending driver is.

    The first thing that I do is dump the stack. You might need to increase the number of frames displayed to see the whole stack.

    0: kd> kb

    *** Stack trace for last set context - .thread/.cxr resets it

    ChildEBP RetAddr

    b8cb7010 f7158867 Ntfs!NtfsInitializeIrpContext+0xc

    b8cb71bc 8083f9c0 Ntfs!NtfsFsdRead+0xb7

    b8cb71d0 f7212c53 nt!IofCallDriver+0x45

    b8cb71f8 8083f9c0 fltmgr!FltpDispatch+0x6f

    b8cb720c ba547bcc nt!IofCallDriver+0x45

    WARNING: Stack unwind information not available. Following frames may be wrong.

    b8cb7214 8083f9c0 tmpreflt!TmpAddRdr+0x7b8

    b8cb7228 ba4e08be nt!IofCallDriver+0x45

    b8cb7430 ba4e09d3 DRIVER_A+0x28be

    b8cb7450 b85fa306 DRIVER_A+0x29d3

    b8cb763c b85fa50d DRIVER_B+0x8306

    b8cb765c 8082f0d7 DRIVER_B+0x850d

    b8cb7674 8082f175 nt!IoPageRead+0x109

    b8cb76f8 80849cd5 nt!MiDispatchFault+0xd2a

    b8cb7754 80837d0a nt!MmAccessFault+0x64a

    b8cb7754 8093837b nt!KiTrap0E+0xdc

    b8cb781c f718c0ac nt!CcMapData+0x8c

    b8cb783c f718c6e6 Ntfs!NtfsMapStream+0x4b

    b8cb78b0 f718c045 Ntfs!NtfsReadMftRecord+0x86

    b8cb78e8 f718c0f4 Ntfs!NtfsReadFileRecord+0x7a

    b8cb7920 f7155c3c Ntfs!NtfsLookupInFileRecord+0x37

    b8cb7a30 f715746a Ntfs!NtfsLookupAllocation+0xdd

    b8cb7bfc f7157655 Ntfs!NtfsPrepareBuffers+0x25d

    b8cb7dd8 f715575e Ntfs!NtfsNonCachedIo+0x1ee

    b8cb7ec4 f71588de Ntfs!NtfsCommonRead+0xaf5

    b8cb8070 8083f9c0 Ntfs!NtfsFsdRead+0x113

    b8cb8084 f7212c53 nt!IofCallDriver+0x45

    b8cb80ac 8083f9c0 fltmgr!FltpDispatch+0x6f

    b8cb80c0 ba547bcc nt!IofCallDriver+0x45

    b8cb80c8 8083f9c0 tmpreflt!TmpAddRdr+0x7b8

    b8cb80dc ba4e08be nt!IofCallDriver+0x45

    b8cb82e4 ba4e09d3 DRIVER_A+0x28be

    b8cb8304 b85fa306 DRIVER_A+0x29d3

    b8cb84f0 b85fa50d DRIVER_B+0x8306

    b8cb8510 8082f0d7 DRIVER_B+0x850d

    b8cb8528 8082f175 nt!IoPageRead+0x109

    b8cb85ac 80849cd5 nt!MiDispatchFault+0xd2a

    b8cb8608 80837d0a nt!MmAccessFault+0x64a

    b8cb8608 8093837b nt!KiTrap0E+0xdc

    b8cb86d0 f718c0ac nt!CcMapData+0x8c

    b8cb86f0 f718ef1b Ntfs!NtfsMapStream+0x4b

    b8cb8720 f7186aa7 Ntfs!ReadIndexBuffer+0x8f

    b8cb8894 f7187042 Ntfs!NtfsUpdateFileNameInIndex+0x62

    b8cb8990 f7186059 Ntfs!NtfsUpdateDuplicateInfo+0x2b0

    b8cb8b98 f7186302 Ntfs!NtfsCommonCleanup+0x1e82

    b8cb8d08 8083f9c0 Ntfs!NtfsFsdCleanup+0xcf

    b8cb8d1c f7212c53 nt!IofCallDriver+0x45

    b8cb8d44 8083f9c0 fltmgr!FltpDispatch+0x6f

    b8cb8d58 ba54809a nt!IofCallDriver+0x45

    b8cb8d80 ba54d01d tmpreflt!TmpQueryFullName+0x454

    b8cb8d90 8083f9c0 tmpreflt!TmpQueryFullName+0x53d7

    b8cb8da4 ba4e08be nt!IofCallDriver+0x45

    b8cb8fac ba4e09d3 DRIVER_A+0x28be

    b8cb8fcc b85fa306 DRIVER_A+0x29d3

    b8cb91b8 b85fa50d DRIVER_B+0x8306

    b8cb91d8 80937f75 DRIVER_B+0x850d

    b8cb9208 8092add4 nt!IopCloseFile+0x2ae

    b8cb9238 8092af7a nt!ObpDecrementHandleCount+0x10a

    b8cb9260 8092ae9e nt!ObpCloseHandleTableEntry+0x131

    b8cb92a4 8092aee9 nt!ObpCloseHandle+0x82

    b8cb92b4 80834d3f nt!NtClose+0x1b

    b8cb92b4 8083c0fc nt!KiFastCallEntry+0xfc

    b8cb9330 bf835765 nt!ZwClose+0x11

    b8cb9608 bf8aa2dd win32k!bCreateSection+0x2ad

    b8cb9660 bf826b45 win32k!EngMapFontFileFDInternal+0xc6

    b8cb96c0 bf82784a win32k!PUBLIC_PFTOBJ::bLoadFonts+0x17f

    b8cb991c bf9bcb67 win32k!PUBLIC_PFTOBJ::bLoadAFont+0x77

    b8cb9af0 bf9bcb16 win32k!bInitOneStockFontInternal+0x42

    b8cb9b0c bf9bb0e8 win32k!bInitOneStockFont+0x3f

    b8cb9cf4 bf9ba845 win32k!bInitStockFontsInternal+0x12a

    b8cb9cfc bf8246ad win32k!bInitStockFonts+0xa

    b8cb9d48 bf8242d5 win32k!InitializeGreCSRSS+0x149

    b8cb9d50 80834d3f win32k!NtUserInitialize+0x66

    b8cb9d50 7c82ed54 nt!KiFastCallEntry+0xfc

    0015fdb0 00000000 0x7c82ed54

The next step is to calculate how much space each frame is taking up. This can be done by walking the stack manually: for each frame, subtract its ChildEBP from the ChildEBP of the frame listed below it, then add up the space used by each module.

Module      Stack Usage (bytes)    Percentage
Ntfs        4152                   36%
DRIVER_A    1572                   14%
win32k      2592                   22%
DRIVER_B    1656                   14%
tmpreflt    72                     1%
fltmgr      120                    1%
nt          1420                   12%
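The arithmetic behind this table can be sketched in a few lines of Python (a simplified tally of my own, run here over just the first few frames of the kb output above; each frame is charged the gap between its ChildEBP and the following frame's):

```python
from collections import defaultdict

def tally_stack_usage(frames):
    """frames: (child_ebp, module) pairs in kb order (deepest frame first).
    Each frame is charged the gap up to the following frame's ChildEBP."""
    usage = defaultdict(int)
    for (ebp, module), (next_ebp, _next_mod) in zip(frames, frames[1:]):
        usage[module] += next_ebp - ebp
    return dict(usage)

# The first few frames of the kb output above:
frames = [
    (0xB8CB7010, "Ntfs"),    # Ntfs!NtfsInitializeIrpContext+0xc
    (0xB8CB71BC, "Ntfs"),    # Ntfs!NtfsFsdRead+0xb7
    (0xB8CB71D0, "nt"),      # nt!IofCallDriver+0x45
    (0xB8CB71F8, "fltmgr"),  # fltmgr!FltpDispatch+0x6f
    (0xB8CB720C, "nt"),      # nt!IofCallDriver+0x45
]
```

Running the same tally over every frame of the stack reproduces the per-module totals in the table.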

It would be easy to blame NTFS since it is the top stack consumer, but look closer.  Even though NTFS is using the most space in our example, this is due to both DRIVER_A and DRIVER_B making repeated calls into NTFS to access data.  Alone, it is likely that neither driver would have caused a problem, but the two drivers combined resulted in a bugcheck.  Conscientious driver writing and efficient stack usage would have prevented this problem.  Both drivers need to optimize the number of calls they make to NTFS.

    Further reading

    http://www.microsoft.com/whdc/driver/kernel/mem-mgmt.mspx

    http://www.microsoft.com/whdc/Driver/tips/KMstack.mspx

    http://support.microsoft.com/kb/186775

    For more information on Task State Segments, please see the Intel and AMD Processor manuals.

  • Ntdebugging Blog

    WE ARE HIRING

    • 1 Comments

Want to debug the most difficult problems against the largest, most complex piece of software ever written? Are you fluent in C and assembly? Do you carry a spare debugger cable in your bag? Can you represent Microsoft in critical, time-sensitive solution delivery? Writing code is easy; finding bugs in someone else’s code you’ve never seen before is a real challenge.

     

    Want to hear more?

     

    http://blogs.msdn.com/jobsblog/archive/2007/01/26/jobcast-life-as-an-escalation-engineer.aspx

     

    http://blogs.msdn.com/jobsblog/archive/2007/11/21/life-as-an-escalation-engineer-part-ii.aspx

     

     

Candidates must be strong critical thinkers who enjoy solving very difficult problems (often involving code-level analysis).  Escalation Engineers within GES are frequently involved with the highest-profile issues, and therefore must be able to handle both pressure and complex situations. Candidates must also have strong customer service, accurate logical problem solving, and communication skills, and the ability to work in a team environment. Programming and debugging skills are required, preferably in C or C++.  The ability to read and analyze network traces and perform solid protocol analysis is a plus. The ideal candidate has a four-year degree in C.S. or Engineering and a minimum of four years of product support experience, or the equivalent in work experience. Prior knowledge of the product to be supported, and of other networking products and/or networking operating systems, is required. If you enjoy being the problem-solver in the spotlight of critical problems, then this position will excite and challenge you.

     

     

Applicants who possess the experience and desire will be considered and are encouraged to apply using the link below. After applying, please submit a copy of your resume to benjammc@microsoft.com.

     

Job Details - the Microsoft Careers job code is 209923; the posting can be found on the Microsoft Careers page.

     

     

  • Ntdebugging Blog

    Too Much Cache?

    • 28 Comments

    Cache is used to reduce the performance impact when accessing data that resides on slower storage media.  Without it your PC would crawl along and become nearly unusable.  If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page.  If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page.  Access to physical RAM is about 1 million times faster than to a hard drive.  It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous.  Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source). 

     

    Since we are limited with how much RAM we can stick in a box, we have to make the most of it.  We have to share this crucial physical resource with all running processes, the kernel and the file system cache.  You can read more about how this works here:

    http://blogs.msdn.com/ntdebugging/archive/2007/10/10/the-memory-shell-game.aspx

     

    The file system cache resides in kernel address space.  It is used to buffer access to the much slower hard drive.  The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand.  The file system cache operates like a process working set.  You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter.  This value will only show you the system cache's current working set.  Once a page is removed from the cache's working set it is placed on the standby list.  You should consider the standby pages from the cache manager as a part of your file cache.  You can also consider these standby pages to be available pages.  This is what the pre-Vista Task Manager does.  Most of what you see as available pages is probably standby pages for the system cache.  Once again, you can read more about this in "The Memory Shell Game" post.

     

    Too Much Cache is a Bad Thing

The memory manager works on a demand-based algorithm.  Physical pages are given to wherever the current demand is.  If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them, and send them to help meet the growing demand.  Just like any process, the system file cache can consume physical memory if there is sufficient demand. 

    Having a lot of cache is generally not a bad thing, but if it is at the expense of other processes it can be detrimental to system performance.  There are two different ways this can occur - read and write I/O.

     

    Excessive Cached Write I/O

Applications and services can dump lots of write I/O to files through the system file cache.  The system cache's working set will grow as it buffers this write I/O.  System threads will start flushing these dirty pages to disk.  Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache.  At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager.  It does this to prevent applications from overtaking physical RAM with write I/O.  There are, however, some isolated scenarios where this throttle doesn't work as well as we would expect.  This could be due to bad applications or drivers, or to not having enough memory.  Fortunately, we can tune the amount of dirty pages allowed before the system starts throttling cached write I/O.  This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739

     

    Excessive Cached Read I/O

While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache.  If an application or driver opens many files and actively reads from them continuously through the cache manager, then the memory manager will move more physical pages to the cache manager.  If this demand continues to grow, the cache manager can grow to consume physical memory, and other processes (with less memory demand) will get paged out to disk.  This read I/O demand may be legitimate or may be due to poor application scalability.  The memory manager doesn't know whether the demand is due to bad behavior or not, so pages are moved simply because there is demand for them.  On a 32 bit system, the file system cache working set is essentially limited to 1 GB.  This is the maximum size that we blocked off in the kernel for the system cache working set.  Since most systems have more than 1 GB of physical RAM today, having the system cache working set consume physical RAM with read I/O is less likely. 

This scenario, however, is more prevalent on 64 bit systems.  With the increase in pointer length, the kernel's address space is greatly expanded.  The system cache's working set limit can, and typically does, exceed how much memory is installed in the system.  It is much easier for applications and drivers to load up the system cache with read I/O.  If the demand is sustained, the system cache's working set can grow to consume physical memory.  This will push other process and kernel resources out to the page file and can be very detrimental to system performance.

    Fortunately we can also tune the server for this scenario.  We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize().  We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically.  I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs.  The source code can be compiled using the Windows DDK, or you can use the included binaries.  The 32 bit version is limited to setting the cache working set to a maximum of 4 GB.  The 64 bit version does not have this limitation.  The sample code and included binaries are completely unsupported.  It is just a quick and dirty implementation with little error handling.
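For a rough idea of what such a tool does, the two APIs can be reached from Python via ctypes on Windows. This sketch is mine, not the posted sample; it only queries the current limits, and the flag constant shown is the documented FILE_CACHE_MAX_HARD_ENABLE value:

```python
import ctypes
import sys

FILE_CACHE_MAX_HARD_ENABLE = 0x1   # treat the maximum size as a hard limit

def mb(nbytes: int) -> float:
    """Helper: bytes to megabytes, for readable output."""
    return nbytes / (1024 * 1024)

def query_file_cache():
    """Call kernel32!GetSystemFileCacheSize; returns (min, max, flags)."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    cmin, cmax = ctypes.c_size_t(), ctypes.c_size_t()
    flags = ctypes.c_uint32()
    ok = kernel32.GetSystemFileCacheSize(ctypes.byref(cmin),
                                         ctypes.byref(cmax),
                                         ctypes.byref(flags))
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return cmin.value, cmax.value, flags.value

if sys.platform == "win32":           # the call only exists on Windows
    lo, hi, fl = query_file_cache()
    print(f"cache working set: min {mb(lo):.0f} MB, max {mb(hi):.0f} MB, flags {fl:#x}")
```

Setting a limit works the same way through SetSystemFileCacheSize(min, max, flags); passing FILE_CACHE_MAX_HARD_ENABLE makes the maximum a hard cap on the cache working set.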

  • Ntdebugging Blog

    DNS Slow to Start?

    • 7 Comments

Hi, my name is Jason. I’m an escalation engineer on the Microsoft Critical Problem Resolution Platform team. We recently dealt with an interesting issue that I would like to share with you, and I'll show how we went about discovering the cause. 

The customer stated that when his server (Windows Server 2003 SP1) boots up, the DNS Server service does not become available/responsive for ~5 minutes. The customer also stated he has other DCs in this domain that are the same model server (same specs too) that do not have the problem. So I asked, "What do you mean, not responsive?" During this 5 minute period, the administrator cannot use the DNS MMC to administer the server, and DNS lookups to this server fail with a timeout error. The customer can ping the server, connect to file shares, etc.

    Troubleshooting

So my initial question was, "What is DNS doing for 5 minutes?" In order to determine where dns.exe was hung or spinning its wheels, I replaced dns.exe with the checked build.

    Walkthrough:

    1.     Downloaded the checked build of SP1

    2.     Extracted and expanded the checked build of dns.exe (What is a checked build?)

    3.     Worked through the process of replacing the free build of DNS.exe (Booting into safe mode in order to bypass WFP (Windows file protection))

    4.     Created the dnsdebug file (a text file with no extension under %windir%\system32\dns). The checked build of DNS looks for this file, and reads the contents to determine the level of output. 

5.     In the file I set the debug flags for DNS_DEBUG_INIT (0x00000010) and DNS_DEBUG_INIT2 (0x01000000), and set the output to go to the debugger (0x00000002), for a net result of 0x01000012.
     

6.     Instead of using the debugger, I decided to use DebugView from Sysinternals to view the output. I could have attached a debugger to dns.exe to see the output, but in this case DebugView was sufficient.
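The net result in step 5 is just the bitwise OR of the three flag values. (The first two names come from the post; the name for the 0x00000002 "output to the debugger" bit is my own placeholder.)

```python
DNS_DEBUG_INIT = 0x00000010
DNS_DEBUG_INIT2 = 0x01000000
OUTPUT_TO_DEBUGGER = 0x00000002   # hypothetical name for the output bit

flags = DNS_DEBUG_INIT | DNS_DEBUG_INIT2 | OUTPUT_TO_DEBUGGER
assert flags == 0x01000012        # the value written into the dnsdebug file
```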

     

    With all this in place we rebooted. We launched DebugView and did a "net start dns". Here is a screen scrape of what the output looked like:

[Screenshot: DebugView capturing the debug output from the checked build of dns.exe]

     

    I have highlighted the debug flag I have set.

    From this output we were able to determine that the Server was going through the following process:

1.     DNS gets started by the Service Control Manager (a.k.a. SCM) and then DNS communicates back to SCM (across the NtControlPipe)

2.     DNS loads the zones from the registry (HKLM\Software\Microsoft\Windows NT\CurrentVersion\DNS Server\Zones)

3.     DNS pulls the zones from Active Directory (as this is an AD-integrated zone)

     

During these phases, the ability to respond to administrative requests and name lookup requests is blocked (this is by design). So based on this, I needed to figure out where we were spending most of our 5 minutes.

    1.     For the SCM portion, I used Filemon monitoring only Named Pipes for dns.exe and services.exe.

    2.     For the registry portion, I used Regmon with a filter for dns.exe

    3.     For the AD portion, I used perfmon with the NTDS>LDAP Searches/Sec counter

    With these running we were able to isolate the delay to the point where DNS was pulling the zones from AD. Here is a scrape of me running this test on my VM:

[Screenshot: Filemon, Regmon and Performance Monitor output captured during DNS startup]

     

You can see (top left) the service communicating back to SCM across the NtControlPipe. Let's take a moment here and have a short discussion of how starting a service works.

1. SCM reads a registry value that is used as a seed to build a named pipe name (HKLM\System\CurrentControlSet\Control\ServiceCurrent\(default) = <a number>).
2. The result is incremented by 1 and then returned back to SCM.
3. SCM then creates a named pipe using NtControlPipe as the prefix, followed by the numerical value from the registry. So when starting the first service, the pipe name would be NtControlPipe1. (To get a peek at these pipes, download PipeList from Sysinternals.)
4. SCM then starts the service process.
5. The service then goes to the same registry value and uses this seed to know which named pipe it needs to talk to in order to convey progress back to SCM.
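The naming scheme above can be sketched in a couple of lines. (The starting seed and the \\.\pipe\ prefix form are illustrative; the real seed lives under HKLM\System\CurrentControlSet\Control\ServiceCurrent.)

```python
def next_control_pipe(seed: int):
    """Model SCM's scheme: bump the seed, then build the control pipe name."""
    seed += 1
    return seed, rf"\\.\pipe\NtControlPipe{seed}"

seed, name = next_control_pipe(0)            # hypothetical starting seed
assert name.endswith("NtControlPipe1")       # first service -> NtControlPipe1
```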
     

    We can then see (top right) the service pulling in the dns zone information from the registry. Finally we see (bottom) the small spike in LDAP searches/sec (highlighted) where the service was pulling the zone information from the AD. Until the pull of the zones from the registry and AD is complete, the DNS service does not respond to MMC administration or DNS requests.

The majority of the 5 minute delta fell into the last bucket (the LDAP queries to pull the zone data). So begins our next discussion: what causes LDAP to be slow?

So, LDAP lives in LSASS.EXE. There are two LDAP interfaces in AD: the entire directory is available over TCP port 389, and the subset of AD that is published (a.k.a. the Global Catalog) is available over TCP port 3268. The DNS service pulls the DNS zones from the "entire directory" interface.

    Lsass memory usage on domain controllers has two components:

• Data structures, which are like those in other processes (i.e. threads, heaps, and stacks)
• Database buffer cache, which consists of database pages and index pages for the directory (this is where the in-RAM copy of the AD database lives)

       

    In Windows Server 2003, there is no limit to how large the database buffer cache can grow (outside of the limits of the 2GB of virtual address space that can be modified via the /3GB switch). This helps explain the behavior of lsass.exe normally being the largest value in the "Mem Usage" category in task manager on a DC.

Q. What can affect the amount of database buffer cache that LSASS is maintaining in RAM?
A. Memory pressure. This could be caused by configuration (pagefile settings, etc.) or by one or more other applications using and paging a lot of memory on the machine.

lsass.exe on the affected machine was using 300 MB in the "Mem Usage" column in Task Manager. The machine that was not affected was closer to 450 MB. This means there was 150 MB more data in the database buffer cache on the healthy server. Because of that, it can respond much faster to LDAP requests, since it can satisfy more of them from the RAM cache, which limits the time spent pulling data into the cache from the DIT file on disk.

    The solution is to tune the pagefile settings to match the better performing server.

By simply increasing the pagefile from ~1 GB to ~3 GB, the DNS Server service became available within ~30 seconds.

    So what are we (Microsoft) doing about this?

    Excerpt from the "Book of Longhorn"

    What new DNS functionality is provided by this feature in Windows Server "Longhorn"?

    The DNS Server service in Windows Server "Longhorn" includes a number of new and enhanced features compared to the DNS Server service that was available in the Microsoft Windows NT® Server, Windows 2000 Server, and Windows Server® 2003 operating systems. The following sections describe these features.

     

    Background zone loading

    Very large organizations with extremely large zones that store their DNS data in AD DS sometimes discover that restarting a DNS server can take an hour or more while the DNS data is retrieved from the directory service. The result is that the DNS server is effectively unavailable to service client requests for the entire time that it takes to load AD DS-based zones.

    A DNS server running Windows Server "Longhorn" now loads zone data from AD DS in the background while it restarts so that it can respond to requests for data from other zones. When the DNS server starts, it:

    • Enumerates all zones to be loaded.
    • Loads root hints from files or AD DS storage.
    • Loads all file-backed zones, that is, zones that are stored in files rather than in AD DS.
    • Begins responding to queries and remote procedure calls (RPCs).
    • Spawns one or more threads to load the zones that are stored in AD DS.

    Because the task of loading zones is performed by separate threads, the DNS server is able to respond to queries while zone loading is in progress. If a DNS client requests data for a host in a zone that has already been loaded, the DNS server responds with the data (or, if appropriate, a negative response) as expected. If the request is for a node that has not yet been loaded into memory, the DNS server reads the node's data from AD DS and updates the node's record list accordingly.

    Why is this functionality important?

    The DNS server can use background zone loading to begin responding to queries almost immediately when it restarts, instead of waiting until its zones are fully loaded. The DNS server can respond to queries for the nodes that it has loaded or that can be retrieved from AD DS. This functionality also provides another advantage when zone data is stored in AD DS rather than in a file: AD DS can be accessed asynchronously and immediately when a query is received, while file-based zone data can be accessed only through a sequential read of the file.

    I hope you find this information helpful.

  • Ntdebugging Blog

    The Memory Shell Game

    • 12 Comments

    Hello, this is Somak.  Today I’d like to drop some Memory Manager info on the blog that I’ve used to communicate in brief (believe it or not) how the system deals with memory.  If you are ever faced with checking how much Available Memory you have (or don’t have), poor system performance, questions about page faults, a Working Set trimming performance issue, or you just want a primer on how Windows manages memory, be not afraid and read on!

    How this memory stuff works

    The fastest and most expensive memory is built into the CPU.  The next fastest and less expensive is physical RAM.  Next we have the hard drive, followed by remote storage or backup.  Each step down the ladder adds significantly to access time.  For example, physical RAM can be almost one million times faster than a hard disk.  Since RAM is volatile and costs considerably more than hard disk space, we are limited in how much we can put in the system.  This limit is far less than the much slower, non-volatile hard disk.  Due to these design constraints we have a tiered memory structure in the computer.  To achieve the fastest response times and best overall performance, an operating system must use this tiered memory structure efficiently.  It must do its best to reduce the need to retrieve data from a slower storage medium like the hard disk, while juggling the memory and I/O demands of all running processes.  The following paragraphs are an overview of how the Memory Manager achieves this in Windows.  A more detailed description can be found in Chapter 7 of Microsoft Windows Internals, Fourth Edition (ISBN: 0-7356-1917-4), a great source for how stuff works in Windows.

    First let’s lay out some definitions.  They will be useful later on when I talk about the interactions.  These definitions are high level to maintain brevity.

    Virtual Memory – This is the memory that an operating system can address.  Regardless of the amount of physical RAM or hard drive space, this number is limited by your processor architecture.  On a 32 bit processor you are limited to 4 GB of addressable virtual memory (2^32).  With a default installation on a 32 bit box (not using /3GB) the kernel reserves 2GB for itself.  Applications are left with 2GB of addressable virtual memory.  When applications execute they are only presented with this 2GB of addressable memory.  Each application gets its own 2GB virtual memory to play with.  If you have 50 processes running, you’ll have 50 independent 2GB Virtual Memory address spaces and one 2GB Virtual Address space for the kernel.  This is possible because Virtual Memory always exists, but doesn't really exist (hence the term virtual).  We basically lie to the application and say: here is 2GB of memory address space for you to use.  This memory isn’t allocated until the application explicitly uses it.  Once the application uses a page, it becomes committed.  A virtual memory page is then translated to a physical memory page, and from this translation the virtual page can reside in physical RAM or on the hard disk.

    Physical Memory –  This is the physical storage media.  It can be physical RAM, the hard disk, optical disks, tape backups, etc.  This is anything that can store data.  Most times when people talk about Physical Memory, they refer to physical RAM (the memory sticks on your motherboard), but with virtual page translation, physical memory can also be on the hard drive (in your paging file).  Physical RAM is limited by your processor architecture.

    Committed Memory –  When an application touches a virtual memory page (reads/writes/programmatically commits) the page becomes a committed page.  It is now backed by a physical memory page.  This will usually be a physical RAM page, but could eventually be a page in the page file on the hard disk, or it could be a page in a memory mapped file on the hard disk.  The memory manager handles the translations from the virtual memory page to the physical page.   A virtual page could be located in physical RAM, while the page next to it could be on the hard drive in the page file.

    Commit Limit –  This is the maximum amount of memory that all your applications and the OS can commit.  If you had 50 applications fully allocate their 2 GB of virtual address space, you would need 100GB of commit limit (ignore kernel memory usage to keep the numbers simple).  So 100GB of committed pages can be backed by physical RAM or the hard drive.  If you have 100 GB of RAM, you could handle this memory load.   In most cases, 100 GB of RAM isn't economically feasible so the Commit Limit is comprised of physical RAM and the page file.  If you have 2 GB of physical RAM and 98 GB of page file, then your commit limit would be 100 GB.
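    The commit-limit arithmetic above can be sketched as a tiny helper (illustrative only; the real commit accounting subtracts kernel overhead and is more involved):

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Illustrative sketch: the commit limit is (roughly) physical RAM
    // plus the total page file size, both in GB here for simplicity.
    std::uint64_t CommitLimitGB(std::uint64_t ramGB, std::uint64_t pageFileGB)
    {
        return ramGB + pageFileGB;
    }

    int main()
    {
        // 50 processes fully committing their 2 GB user address space
        // would need 100 GB of commit limit...
        std::uint64_t demandGB = 50 * 2;

        // ...which 2 GB of physical RAM plus a 98 GB page file can back.
        assert(CommitLimitGB(2, 98) == demandGB);
        return 0;
    }
    ```
    
    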

    Page file –   This is the storage area for virtual memory that has been committed.  It is located on the hard drive.  Since hard drive space is cheaper than physical RAM, it is an inexpensive way to increase the commit limit.

    Working Set – This is the set of a process' virtual memory pages (that are committed) which are located in physical RAM.  These pages fully belong to the process.  A working set is like a "currently/recently working on these pages" list.

    Modified pages - Once a virtual memory page leaves the process's working set, it is moved to another list.  If the page has been modified, it is placed on the modified page list.  This page is in physical RAM.  A thread will then write the page to the page file and move it to the standby list.

    Standby pages - This is a page that has left the process' working set.  This page is in physical RAM.  A standby page is like a cache for virtual memory pages.  It is still associated with the process, but not in its working set.  If the process touches the page, it is quickly faulted back into the working set.  That page also has one foot out the door.  If another process or cache needs more memory, the process association is broken and it is moved to the free page list.  Most of the pages in available memory are actually standby pages.  This makes sense when you realize that these pages can be quickly given to another process (hence available), but you should also understand that they are page caches for working sets and can be quickly given back if the process touches the page again.  The vast majority of available memory is not wasted or empty memory.

    Free pages - When a page is taken off of the standby page list, it is moved to the Free page list.  This page is in physical RAM.  These pages are not associated with any process.   When a process exits, all of its pages are then dumped onto this list. Typically, very few (if any) free pages are hanging around physical RAM.

    Zeroed pages - When a free page is zeroed out, it is placed on the Zero page list.  This page is in physical RAM.  These are the pages that are given to processes that are making memory allocations.  Due to C2 security requirements, all pages must be scrubbed before being handed to a new process.  When the system is idle, a thread will scrub free pages and put them on this list.  Only a small amount of zero pages is required to handle the typical small memory allocations of processes.  Once this list is depleted, and if there is demand for more pages, we pull pages off of the Free page list and scrub them on the fly.  If the Free page list is depleted, then we pull pages off of the standby list, scrub them on the fly and hand them to the new process.
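    The page-list transitions described in the last few definitions can be modeled as a simple state machine. This is a toy model to make the flow concrete, not the real Memory Manager code:

    ```cpp
    #include <cassert>

    // Toy model of the page-list transitions described above.
    enum class PageState { WorkingSet, Modified, Standby, Free, Zeroed };

    // A page leaves a working set: modified pages go to the modified list,
    // clean pages go straight to the standby list.
    PageState Trim(bool dirty)
    {
        return dirty ? PageState::Modified : PageState::Standby;
    }

    // A writer thread saves a modified page to the page file,
    // then moves it to the standby list.
    PageState WriteToPageFile(PageState s)
    {
        return s == PageState::Modified ? PageState::Standby : s;
    }

    // Demand for memory breaks the process association:
    // standby -> free -> zeroed (scrubbed for C2 requirements).
    PageState Repurpose(PageState s)
    {
        if (s == PageState::Standby) return PageState::Free;
        if (s == PageState::Free)    return PageState::Zeroed;
        return s;
    }

    int main()
    {
        PageState p = Trim(true);        // dirty page trimmed from a working set
        assert(p == PageState::Modified);
        p = WriteToPageFile(p);          // saved to the page file
        assert(p == PageState::Standby);
        p = Repurpose(Repurpose(p));     // reclaimed, then scrubbed
        assert(p == PageState::Zeroed);
        return 0;
    }
    ```

    Note that a standby page can also be soft-faulted straight back into the working set, which is the path this sketch leaves out.
    
    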

    What is Task Manager telling me?

    Prior to Windows Vista, Task Manager reported memory usage using accounting methods that you probably are not expecting.  It is because of these accounting practices that we rarely use Task Manager to gauge system performance and memory usage.  We typically use it for a quick overview or to kill processes.  I highly recommend using Performance Monitor (perfmon.msc) for investigating performance issues.  Here's the breakdown of the numbers on the Performance tab:

    Physical Memory

    Total - This is the total physical RAM installed in the system.

    Available - This is the total of the Standby, Free and Zeroed lists.  Free and Zeroed make sense, but Standby seems odd at first.  Standby pages were added to this number because they can be quickly scrubbed and given to a new process.  So they are technically available (with minimal effort).

    System Cache - This is the total of the Standby list and the size of the system working set (which includes the file cache).  Standby pages are added to this number because they are cached pages for working sets.

    PF Usage - This is the total number of committed pages on the system.  It does not tell you how many are actually written to the page file.  It only tells you how much of the page file would be used if all committed pages had to be written out to the page file at the same time.

    Commit Charge

    Total - This is the total virtual memory that has been committed.  This includes all committed memory for all processes and the kernel.

    Limit - This is the maximum amount of committed memory this system can handle.  This is a combination of physical RAM and the page file.

    Peak – This is the highest amount of memory committed thus far on this system, since boot.
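    The Available and System Cache figures above are just sums of the page lists. A quick sketch with hypothetical numbers (in KB, chosen for illustration only):

    ```cpp
    #include <cassert>

    int main()
    {
        // Hypothetical page-list sizes, in KB.
        long standby = 800000;
        long free_   = 2000;
        long zeroed  = 5000;
        long systemWorkingSet = 150000;   // includes the file cache

        // Available = Standby + Free + Zeroed.
        long available = standby + free_ + zeroed;
        assert(available == 807000);

        // System Cache = Standby + system working set.
        long systemCache = standby + systemWorkingSet;
        assert(systemCache == 950000);
        return 0;
    }
    ```

    Note that the Standby list is counted in both figures, which is why Available plus System Cache can exceed installed RAM.
    
    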

    How does this work?

    The memory manager optimizes physical RAM usage across the entire system.  Since physical RAM is a finite resource, it has to balance sharing this critical resource amongst all processes, the kernel and file I/O.   It tries to keep disk I/O to a minimum, which results in a more responsive system.  It does this by moving pages around to meet the demand of the system.

    Typically, large sections of physical RAM are used for file cache.  This cache is necessary to improve disk performance.  Without it, disk I/O would make the system crawl along at a nearly unusable pace.  The file system cache is just like a working set for a process.  Pages removed from the file cache are moved to the standby or modified page list.  Many of the standby pages in RAM are probably file cache pages that were removed from the cache's working set.  For example, on a file server, if you see 8 GB of available memory, most of these pages are probably standby pages for the file cache.  The file cache's working set could be 500 MB, but the 8 GB of standby pages should also be considered part of the file cache.

    Now let's take a look at how the memory manager handles processes.  While an application is working with its virtual memory pages, the memory manager keeps the pages in the process' working set.  Since the vast majority of applications do not use all of their memory all the time, some pages will age.  Old pages are removed from the working set.  If they are modified, they are moved to the modified list, the page is saved to the page file, and it is then moved to the standby list.  If the page hasn't been modified, it is moved directly to the standby list.  These pages remain on the standby page list until there is demand for them.

    If the application touches the page again, it is soft faulted back into the process' working set.  If the process doesn't use the page for a very long time, or if the demand for the page is greater elsewhere, the page is moved off of the standby list.  It is disassociated from the process and moved to the free page list.  From the free page list, the page is scrubbed (on demand or lazily) and placed on the zero page list.  It is from the zero page list that other processes, the kernel and the file cache get new pages.

    If after a very long time the application once again needs a page that is not in its working set, the memory manager will handle the memory fault.  If the page is on the standby list, it is quickly put back into the process' working set.  If the page is no longer on the standby list, a hard fault occurs.  The memory manager issues I/O to the hard disk to read the page(s) from the page file.  Once the I/O completes, the page is placed back into the process' working set.

    All of this is done to keep physical RAM highly utilized and disk I/O to a minimum.  We don't want to allow a process to hoard physical RAM for pages that are rarely used.  The physical RAM must be shared with other processes, the kernel and the file cache.  If you see lots of available memory on your system, rest assured that it is not going to waste.  The vast majority is on standby lists for processes and the file cache.

    Also note that page file usage isn't that bad.  The page file allows the Memory Manager to save modified pages before placing the page on the standby list.  The page is still in physical RAM and can be quickly faulted back into the process.  This method gives the process a chance to reclaim an old page and it allows the page to be quickly used if there is demand elsewhere.

    The best way to see the totals of these lists is to use a kernel debugger (live or postmortem).  Use the !memusage command and you'll get an output like this:

    0: kd> !memusage

    loading PFN database

    loading (100% complete)

    Compiling memory usage data (99% Complete).

                 Zeroed:     414 (   1656 kb)

                   Free:       2 (      8 kb)

                Standby:  864091 (3456364 kb)

               Modified:     560 (   2240 kb)

        ModifiedNoWrite:      30 (    120 kb)

           Active/Valid:  182954 ( 731816 kb)

             Transition:       2 (      8 kb)

                    Bad:       0 (      0 kb)

                Unknown:       0 (      0 kb)

                  TOTAL: 1048053 (4192212 kb)

    Of the 4GB of physical RAM, only 1.6 MB is on the Zeroed or Free lists.  731 MB is in process, system and file cache working sets.  2 MB is on the modified page list.  The vast majority, 3.4 GB, is on the standby list.  On this server, most people would see 3.4 GB of wasted physical RAM, but you will know better.
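    You can sanity-check the bucket math from the !memusage output above (page counts come from the dump; each page is 4 KB):

    ```cpp
    #include <cassert>

    int main()
    {
        // Page counts taken from the !memusage output above.
        long zeroed = 414, free_ = 2, standby = 864091, modified = 560;
        long modNoWrite = 30, active = 182954, transition = 2;
        long bad = 0, unknown = 0;

        long totalPages = zeroed + free_ + standby + modified + modNoWrite
                        + active + transition + bad + unknown;
        assert(totalPages == 1048053);   // matches the TOTAL line

        // At 4 KB per page, that is the ~4 GB of physical RAM.
        assert(totalPages * 4 == 4192212);

        // "Available" in Task Manager terms: Zeroed + Free + Standby.
        long availableKB = (zeroed + free_ + standby) * 4;
        assert(availableKB == 3458028);  // ~3.3 GB, almost all of it standby
        return 0;
    }
    ```
    
    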

    What should I be worried about?

    Typically you shouldn't be worried about these things until you have a performance problem.  If your system is sluggish or slow to respond, or you are getting out-of-memory errors, then you need to rely on this information.  You will need to collect a performance monitor log of the problem time.  If the counter list is daunting, use the Performance Monitor Wizard to configure the performance monitor log.

    Once the log is collected, you'll need to analyze several counters.  I'm not going into detail about how to review performance monitor logs this time.  I'll save that lengthy topic for another time.  For now I'll focus on the counters relevant to the Memory Manager.

    One of the biggest causes of slow performance and sluggish system responsiveness is a disk bottleneck.  Look at the Physical Disk\% Idle Time, Avg. Disk sec/Read and Avg. Disk sec/Write counters.  If your system drive is under 50% idle, or your disk response times are way above your drive's specifications, then you need to investigate further.  Next look at Memory\Available MBytes.  You should have a couple hundred MB of Available Memory.  This is one of the most important performance monitor counters.  If this number drops too low, your standby lists, process working sets and cache will be greatly reduced.  You'll need to find out whether a process is consuming physical RAM.  Check for large process working sets or a large file cache.

    You will also need to see if paging is really affecting system performance.  Take a look at Memory\Pages Input/sec and correlate that to Physical Disk\Avg. Disk sec/Read.  Pages Input/sec is the number of pages being read in from disk.  These are the hard faults (when the page wasn't on the standby list).  If your Avg. Disk sec/Read is close to your drive's specification and the drive's idle time is high, then paging really isn't a problem.  Small numbers of hard faults are expected, as applications will occasionally re-touch an old page.  As long as this I/O is not constant and the disk can keep up, you probably will not notice any impact.

    You can also look at Memory\Pages Output/sec and Physical Disk\Avg. Disk sec/Write.  These counters show modified pages being written to the page file after being removed from a process' working set.  As long as the disk can keep up with the writes, this shouldn't be a problem.  Remember that once the page is saved to the page file, it is placed on the standby list.  If there isn't great demand for new pages, it can remain on the standby list for a very long time.  When the process touches the old virtual page again, it can be soft faulted back into the working set.  If there is great demand for memory, you'll see process working sets aggressively being trimmed.  Unless there is memory pressure, this is all done with lazy I/O, so you should not see much impact from this activity.

    The Memory Manager works to meet current demand and prepares for future demand when possible.  You need to look at a performance monitor log to see if there is memory pressure on the system.  You'll see this in low Available MBytes and reductions in process working sets.  You'll be able to correlate this with increased disk I/O to the page file.  If you have established that there is memory pressure on the box, you need to figure out where that demand is coming from.  Check for working set increases from processes, file cache growth, or bottlenecked disk I/O to data drives.

  • Ntdebugging Blog

    LPC CASE2 – When things are not rosy

    • 1 Comments

    Hello, this is Roy again.  In this case we will discuss a scenario where tracing a hung client thread is not possible through the LPC data structures and extensions. Instead, we will use hints from the LPC message about the operation that was taking place, along with other heuristics, to arrive at a possible reason for the hang.

     

    The issue is that whenever an encrypted file is opened, the server locks up. The initial analysis is to determine the reason for the server hang: whether it is a low-resource condition, a deadlock, or something else. After dumping all the threads and glancing over them, I chose to pick the following thread, which looked abnormal. Below is the thread that is trying to do a CreateFile() and has been stuck for over 4 hours waiting for an LPC reply. The file being created is an encrypted file. KSecDD.sys is the Kernel Security Device Driver, which handles key encryption and decryption tasks. In order to decrypt the file, KSecDD makes an LPC request to LsaAuthenticationPort.

     

    2: kd> !thread fffffade6c9f1390

    THREAD fffffade6c9f1390  Cid 04b4.0790  Teb: 000007fffffd8000 Win32Thread: fffffa8008ed1d80 WAIT: (WrLpcReply) KernelMode Non-Alertable

        fffffade6c9f1710  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 01561d96:

    Current LPC port fffffa800380cc40

    IRP List:

        fffffade6d389010: (0006,03a0) Flags: 00000884  Mdl: 00000000

        fffffade6bf55c60: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa8002f5b620 (Level Impersonation)

    Owning Process            fffffade6f97c2b0       Image:         svchost.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      11                 LargeStack

    UserTime                  00:00:00.015

    KernelTime                00:00:00.000

    Win32 Start Address 0x0000000003855c60

    Start Address 0x0000000078d6b630

    Stack Init fffffade56ea1e00 Current fffffade56e9f930

    Base fffffade56ea2000 Limit fffffade56e9b000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`56e9f970 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`56e9faf0 fffff800`0105457f nt!KiSwapThread+0x376

    fffffade`56e9fb50 fffff800`0125555c nt!KeWaitForSingleObject+0x58e

    fffffade`56e9fbd0 fffffade`5b19e471 nt!LpcpRequestWaitReplyPort+0x672

    fffffade`56e9fc60 fffffade`5b19f88a KSecDD!EfspDecryptFek+0x131

    fffffade`56e9fec0 fffffade`5b0a4f46 KSecDD!EfsDecryptFek+0x2a

    fffffade`56e9ff10 fffffade`5b0a4b9d Ntfs!EFSPostCreate+0x323

    fffffade`56ea0060 fffffade`5b119d70 Ntfs!EFSFilePostCreate+0x1a5

    fffffade`56ea00e0 fffffade`5b1f2daa Ntfs!NtfsFsdCreate+0x425

    fffffade`56ea0270 fffffade`59976284 fltMgr!FltpCreate+0x34a

    fffffade`56ea0300 fffffade`5997ed4c SomeFSFilter+0x9284

    fffffade`56ea0360 fffffade`5997637f SomeFSFilter+0x11d4c

    fffffade`56ea0390 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea0400 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea0470 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea0500 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea06b0 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea07c0 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea09b0 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea0b40 fffffade`599e2a67 nt!IoCreateFile+0x12f

    fffffade`56ea0c20 fffffade`599e3c91 SomeFSFilter+0x47a67

    fffffade`56ea0ce0 fffffade`599e62e9 SomeFSFilter+0x48c91

    fffffade`56ea0d10 fffffade`599aacd3 SomeFSFilter+0x4b2e9

    fffffade`56ea0e00 fffffade`599aad6e SomeFSFilter+0xfcd3

    fffffade`56ea0ea0 fffffade`599bea96 SomeFSFilter+0xfd6e

    fffffade`56ea0ed0 fffffade`599ae142 SomeFSFilter+0x23a96

    fffffade`56ea0f90 fffffade`599ae7f1 SomeFSFilter+0x13142

    fffffade`56ea1070 fffffade`5999c0e3 SomeFSFilter+0x137f1

    fffffade`56ea10a0 fffffade`599ae848 SomeFSFilter+0x10e3

    fffffade`56ea10d0 fffffade`599ae8c3 SomeFSFilter+0x13848

    fffffade`56ea1120 fffffade`599a9a96 SomeFSFilter+0x138c3

    fffffade`56ea1160 fffffade`599a782f SomeFSFilter+0xea96

    fffffade`56ea11a0 fffffade`599a7a02 SomeFSFilter+0xc82f

    fffffade`56ea11e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`56ea1230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`56ea1260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`56ea1290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`56ea12c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`56ea12f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea1360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea13d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea1460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea1610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea1720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea1910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea1aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`56ea1b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`56ea1c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

    00000000`00bbf258 00000000`00000000 0x78ef17ba

     

     

    Using the message ID shown in the "!thread" output, we dump the LPC message. The output has two anomalies:

    1. No server thread is listed as processing the client request.
    2. Both the message queue and the LpcDataInfoChain are empty.

     

     

    The answers to both questions are very specific to LSASS. Most types of client requests received on LsaAuthenticationPort do not carry extra data. As mentioned in CASE 1, server threads dequeue the message and chain it to LpcDataInfoChainHead only if the message contains extra information. If the message does not contain any extra data, the server does not need to store the message anywhere; it only needs to remember the kind of request the message carried.

     

    !lpc message searches the ETHREAD.LpcReceiveMessageId field to determine which server thread is processing the message. The way LSASS handles LPC requests is that a single thread in LSASS dequeues all messages from the connection port message queue and enqueues them to an LSASS-specific internal queue. LSASS has a pool of worker threads that wait for requests on the internal queue. The worker threads dequeue a request, process it and reply back to the client. Hence the single LPC server thread in LSASS has its LpcReceiveMessageId overwritten by a new one every time it dequeues a message and enqueues it to the internal queue. That is why !lpc message fails to find a server thread for the LPC message sent to LsaAuthenticationPort.
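    A toy sketch of that dispatcher pattern shows why the receive thread's last-received message ID keeps getting overwritten. These are hypothetical structures for illustration, not the real LSASS code:

    ```cpp
    #include <cassert>
    #include <queue>

    // Toy model: one LPC receive thread dequeues every message and forwards
    // it to an internal queue serviced by worker threads. Only the receive
    // thread's "last received" field is updated, and each new message
    // overwrites it.
    struct ReceiveThread
    {
        unsigned lastReceivedId = 0;         // analogous to ETHREAD.LpcReceiveMessageId
        std::queue<unsigned> internalQueue;  // LSASS-internal work queue

        void Dequeue(unsigned messageId)
        {
            lastReceivedId = messageId;      // overwritten on every message
            internalQueue.push(messageId);   // handed off to a worker thread
        }
    };

    int main()
    {
        ReceiveThread rx;
        rx.Dequeue(0x01561d96);   // our stuck client's message...
        rx.Dequeue(0x01561d97);   // ...followed by two later requests
        rx.Dequeue(0x01561d98);

        // A scan of LpcReceiveMessageId fields (what !lpc message does)
        // no longer finds the original ID, so no server thread appears
        // to own the message.
        assert(rx.lastReceivedId != 0x01561d96);
        assert(rx.internalQueue.size() == 3);
        return 0;
    }
    ```
    
    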

     

     

     

     

    2: kd> !LPC MESSAGE 01561d96

    Searching message 1561d96 in threads ...

    Client thread fffffade6c9f1390 waiting a reply from 1561d96                         

    Searching thread fffffade6c9f1390 in port rundown queues ...

     

    Server communication port 0x033fc660

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xfffffa800380cc40      Server connection port: 0xfffffa800282ca40

     

    Client communication port 0xfffffa800380cc40

        Handles: 0   References: 1

        The LpcDataInfoChainHead queue is empty

     

    Server connection port fffffa800282ca40  Name: LsaAuthenticationPort

        Handles: 1   References: 83

        Server process  : fffffade6ead0040 (lsass.exe)

        Queue semaphore : fffffade6e9513c0

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

    Done.  

     

     

     

    So how do you find the server thread responsible for replying to the client, and why isn't it replying? There are many ways to do so, depending upon the scenario. The approach I chose was to look at all threads in LSASS that either had function calls on the stack related to the KSecDD request or offered other hints. Only threads in LSASS were checked because we knew the request was sent to LsaAuthenticationPort and would be processed within LSASS. The hint was in the thread's wait tick count: I found a thread with exactly the same Wait Start TickCount as the client thread that was waiting for a reply. Further investigation confirmed that it was the thread responsible for processing the request from KSecDD.

     

     

     

    2: kd> !thread fffffade6fa01460

    THREAD fffffade6fa01460  Cid 03a4.0100  Teb: 000007fffff8e000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

        fffffade6f665820  Semaphore Limit 0x1

        fffffade6f665840  NotificationEvent

    IRP List:

        fffffade6bb9a830: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa80044d11c0 (Level Impersonation)

    Owning Process            fffffade6ead0040       Image:         lsass.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      509109            

    UserTime                  00:01:54.500

    KernelTime                00:00:33.765

    Win32 Start Address 0x0000000000c21aa0

    Start Address 0x0000000078d6b630

    Stack Init fffffade5a894e00 Current fffffade5a893e40

    Base fffffade5a895000 Limit fffffade5a88f000 Call 0

    Priority 13 BasePriority 9 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`5a893e80 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`5a894000 fffff800`0105518f nt!KiSwapThread+0x376

    fffffade`5a894060 fffffade`599a8b28 nt!KeWaitForMultipleObjects+0x66e

    fffffade`5a8940f0 fffffade`599a7808 SomeFSFilter+0xdb28

    fffffade`5a8941a0 fffffade`599a7a02 SomeFSFilter+0xc808

    fffffade`5a8941e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`5a894230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`5a894260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`5a894290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`5a8942c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`5a8942f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`5a894360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`5a8943d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`5a894460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`5a894610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`5a894720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`5a894910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`5a894aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`5a894b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`5a894c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

     

     

    Having all the facts makes understanding the problem easier. SomeFSFilter.sys is an anti-virus driver that scans every file that is opened. When an encrypted file is opened, it tries to scan the file before the open completes. To do so, it asks KSecDD to decrypt the file, and KSecDD sends an LPC request to LsaAuthenticationPort for key decryption. LSASS, as part of processing the request, tries to open a file, which for some reason is blocked by SomeFSFilter.sys. So we have a classic deadlock: SomeFSFilter.sys is waiting on LSASS to reply to the LPC message, while LSASS is waiting on SomeFSFilter.sys to open a file.

     

    In this scenario the LPC debugger extension was of only minimal help because of the way LSASS handles LPC requests. In such situations, other heuristics are required to debug the problem.

     

    Finally, our journey through the world of LPC comes to an end. LPC has been the backbone of local inter-process communication since the NT 4.0 days, but everything has a shelf life, and so did LPC. It has been replaced by ALPC (Asynchronous LPC) in Vista and Server 2008; the LPC APIs are now implemented on top of the new ALPC APIs to maintain compatibility. Although the LPC APIs are not exported, and there is no requirement to call them directly, an understanding of the API still helps in debugging, as we have seen in these posts.

    Hope this information helps. Happy debugging!!!

     

  • Ntdebugging Blog

    TalkBack Video: Understanding handle leaks and how to use !htrace to find them

    • 2 Comments

    Written by Jeff Dailey

     

    Hello, my name is Jeff Dailey; I’m an Escalation Engineer on the Global Escalation Services Platforms team.  I’d like to show you how to debug and find leaking handles within your application or another process.  We can do this with the !htrace command in Windbg, the Microsoft Windows Debugger most of us in GES/CPR use for debugging.

     

    Handles are values used in user mode that, when passed to a call that transitions to kernel mode, are used as an offset into your process’s handle table to reference kernel-mode objects.  Kernel-mode objects are generally allocated from pool.  If you are having pool consumption problems and seeing events such as 2020 or 2019 reported, there is a good chance you have a handle leak associated with them.  This is generally caused by not calling CloseHandle() on a handle when you have finished using it.
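    As a loose illustration of that table-lookup idea, here is a toy model of a per-process handle table. This is not the real kernel layout (actual handle values encode more than a raw index); it only shows why a handle that is never closed keeps its kernel object alive:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy "kernel object" allocated when a handle is opened.
struct KernelObject { int refcount = 1; };

// Toy per-process handle table: a handle is just an index into it.
class HandleTable {
    std::vector<KernelObject*> slots_;
public:
    // Analogous to CreateEvent: allocate an object, return its slot index.
    size_t Open() {
        slots_.push_back(new KernelObject());
        return slots_.size() - 1;
    }
    // Analogous to CloseHandle: release the object behind the handle.
    // A handle that is never passed here keeps its object alive forever.
    void Close(size_t handle) {
        delete slots_.at(handle);
        slots_.at(handle) = nullptr;
    }
    // Count of still-live objects: this is what grows during a leak.
    size_t LiveObjects() const {
        size_t n = 0;
        for (auto* o : slots_) if (o) ++n;
        return n;
    }
};
```

    Opening two handles and closing only one leaves one live object behind, which is exactly the pattern a handle leak produces at scale.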

     

    You can view the Channel 9 "how to debug handle leaks" video here.

     

    The following is the sample source for a handle leak that we will be debugging in our demo video.

     

    // leakyhandles.cpp : Defines the entry point for the console application.

    //

     

    #include "stdafx.h"

    #include <windows.h>

     

     

    void fun1(void);

    void fun2(void);

    void fun3(void);

    void fun4(void);

     

    int main(int argc, char* argv[])

    {

          while(1)

          {

                fun1();

                fun2();

                Sleep(100);

          }

          return 0;

    }

     

    void fun1(void)

    {

          fun3();

    }

     

    void fun2(void)

    {

          fun4();

     

    }

    void fun3(void)

    {

          HANDLE hEvent;

     

          hEvent = CreateEvent(NULL,TRUE,TRUE,NULL);

          CloseHandle(hEvent);

    }

    void fun4(void)

    {

          HANDLE hEvent2;

          hEvent2 = CreateEvent(NULL,TRUE,TRUE,NULL);
          // Note: hEvent2 is never closed - this is the leak.

    }
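    fun4 never calls CloseHandle, so every iteration of the loop leaks one event handle. One common C++ remedy is an RAII wrapper that closes the handle when it goes out of scope. The sketch below is kept portable by using FILE* and fclose in place of HANDLE and CloseHandle, but the same pattern applies directly to Windows handles:

```cpp
#include <cassert>
#include <cstdio>

// Generic RAII holder: the closer runs exactly once when the object
// goes out of scope, so early returns and exceptions can't leak.
template <typename T, typename Closer>
class ScopedHandle {
    T handle_;
    Closer closer_;
public:
    ScopedHandle(T h, Closer c) : handle_(h), closer_(c) {}
    ~ScopedHandle() { if (handle_) closer_(handle_); }
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
    T get() const { return handle_; }
};

// fun4 rewritten with the wrapper; the FILE* stands in for the HANDLE.
void fun4_fixed(const char* path) {
    ScopedHandle<FILE*, int(*)(FILE*)> f(std::fopen(path, "w"), std::fclose);
    if (!f.get()) return;           // nothing opened, nothing to close
    std::fputs("work\n", f.get());  // ...use the resource...
}   // the destructor calls fclose here, on every exit path
```

    With this pattern the "forgot to close" bug class disappears, because releasing the resource is tied to scope exit rather than to remembering an explicit call.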

     

    Thank you.

    Jeff Dailey

    Escalation Engineer (Platforms core team)

  • Ntdebugging Blog

    TalkBack Video: Cumulative Time Zone Update for Windows

    • 2 Comments

    Hi, my name is Joel Schaeffer, and I am an Escalation Engineer on the Platforms Global Escalation Services (GES) team.   I created a video to discuss the August 2007 Windows Cumulative Time Zone Update.  It is hosted on Channel9, and can be found here.  It will explain how to install the patch, validate the install, manually update non-mainstream Windows versions, and how to update Windows Mobile clients.

     

    See the following knowledge base articles for more information about how to update time zone information on your Windows clients and servers:

     

    August 2007 cumulative time zone update for Microsoft Windows operating systems

    http://support.microsoft.com/?id=933360

     

    How to configure daylight saving time for Microsoft Windows operating systems

    http://support.microsoft.com/?id=914387

     

    An August 2007 daylight saving time update for Windows Mobile-based devices is available

    http://support.microsoft.com/?id=941898

     

    See http://www.microsoft.com/dst2007 for more details about how to prepare for the upcoming DST changes for all Microsoft products.

  • Ntdebugging Blog

    Talkback video: Desktop Heap

    • 3 Comments

    Hello, Matthew here again.  Starting today, my team will be bringing you content in the form of videos, as well as blog posts.  We’ll be hosting these videos on Channel 9, and we’ll link them from the ntdebugging blog.  One way that we’ll be using video is as a means of highlighting topics we’ve already covered, and then answering some of the common questions that we receive from you in the form of comments on the blog.  So if you leave a comment with an interesting question, we may answer it in an upcoming video.  We are also thinking about using video as a way to show various debugging techniques rather than just blog about them.

    For our first video, we’ve chosen desktop heap as a topic.  We’ve had a lot of traffic on the desktop heap posts, and we’ve had a lot of questions too.  So check out the video, and let us know what you think!

    http://channel9.msdn.com/Blogs/jeff_dailey/ntdebugging-blog-talk-about-on-desktop-heap
