• Ntdebugging Blog

    Understanding ARM Assembly Part 1

    • 10 Comments

    My name is Marion Cole, and I am a Sr. EE in Microsoft Platforms Serviceability group.  You may be wondering why Microsoft support would need to know ARM assembly.  Doesn’t Windows only run on x86 and x64 machines?  No.  Windows has ran on a variety of processors in the past.  Those include i860, Alpha, MIPS, Fairchild Clipper, PowerPC, Itanium, SPARC, 286, 386, IA-32, x86, x64, and the newest one is ARM.  Most of these processors are antiquated now.  The common ones now are IA-32, x86, x64.  However Windows has started supporting ARM processors in order to jump into the portable devices arena.  You will find them in the Microsoft Surface RT, Windows Phones, and other things in the future I am sure.  So you may be saying that these devices are locked, and cannot be debugged.  That is true from a live debug perspective, but you can get memory dumps and application dumps from them and those can be debugged.

     

    Processor

     

    There are limitations on ARM processors that Windows supports.  There are 3 System on Chip (SOC) vendors that are supported.  nVidia, Texas-Instruments, and Qualcomm. Windows only supports the ARMv7 (Cortex, Scorpion) architecture in ARMv7-A in (Application Profile) mode.  This implements a traditional ARM architecture with multiple modes and supporting a Virtual Memory System Architecture (VMSA) based on an MMU.  It supports the ARM and Thumb-2 instruction sets which allows for a mixture of 16 (Thumb) and 32 (ARM) bit opcodes.  So it will look strange in the assembly.  Luckily the debuggers know this and handle it for you.  This also helps to shrink the size of the assembly code in memory.  The processor also has to have the Optional ISA extensions of VFP (Hardware Floating Point) and NEON (128-bit SIMD Architecture).

     

    In order to understand the assembly that you will see you need to understand the processor internals.

     

    ARM is a Reduced Instruction Set Computer (RISC) much like some of the previous processors that Windows ran on.  It is a 32 bit load/store style processor.  It has a “Weakly-ordered” memory model: similar to Alpha and IA64, and it requires specific memory barriers to enforce ordering.  In ARM devices these as ISB, DSB, and DMB instructions.

     

    Registers

     

    The processor has 16 available registers r0 – r15.

    0: kd> r

    r0=00000001  r1=00000000  r2=00000000  r3=00000000  r4=e1820044  r5=e17d0580

    r6=00000001  r7=e17f89b9  r8=00000002  r9=00000000 r10=1afc38ec r11=e1263b78

    r12=e127813c  sp=e1263b20  lr=e16c12c3  pc=e178b6d0 psr=00000173 ----- Thumb

     

    r0, r1, r2, r3, and r12 are volatile registers.  Volatile registers are scratch registers presumed by the caller to be destroyed across a call.  Nonvolatile registers are required to retain their values across a function call and must be saved by the callee if used. 

     

    On Windows four of these registers have a designated purpose.  Those are:

    • PC (r15) – Program Counter (EIP on x86)
    • LR (r14) – Link Register.  Used as a return address to the caller.
    • SP (r13) – Stack Pointer (ESP on x86).
    • R11 – Frame Pointer (EBP on x86).
    • CPSR – Current Program Status Register (Flags on x86).

     

    In Windbg all but r11 will be labeled appropriately for you.  So you may be asking why r11 is not labeled “fp” in the debugger.  That is because r11 is only used as a frame pointer when you are calling a non-leaf subroutine.  The way it works is this: when a call to a non-leaf subroutine is made, the called subroutine pushes the value of the previous frame pointer (in r11) to the stack (right after the lr) and then r11 is set to point to this location in the stack, so eventually we end up with a linked list of frame pointers in the stack that easily enables the construction of the call stack. The frame pointer is not pushed to the stack in leaf functions.  Will discuss leaf functions later.

     

    CPSR (Current Program Status Register)

     

    Now we need to understand some about the CPSR register.  Here is the bit breakdown:

    31

    30

    29

    28

    27

    26

    25

    24

    23

    22

    21

    20

    19

    18

    17

    16

    15

    14

    13

    12

    11

    10

    9

    8

    7

    6

    5

    4

    3

    2

    1

    0

    N

    Z

    C

    V

    Q

    IT

    J

    Reserved

    GE

    IT

    E

    A

    I

    F

    T

    M

     

    • Bits [31:28] – Condition Code Flags
      • N – bit 31 – If this bit is set, the result was negative.  If bit is cleared the result was positive or zero.
      • Z – bit 30 – If set this bit indicates the result was zero or values compared were equal.  If it is cleared, the value is non-zero or the compared values are not equal.
      • C – bit 29 – If this bit is set the instruction resulted in a carry condition.  E.g. Adding two unsigned values resulted in a value too large to be strored.
      • V – bit 28 – If this bit is set then the instruction resulted in an overflow condition.  E.g. An overflow of adding two signed values.
    • Instructions variants ending with ‘s’ set the condition codes (mov/movs)
    • E – bit 9 – Endianness (big = 1/Little = 0)
    • T – bit 5 – Set if executing Thumb instructions
    • M – bits [4:0] – CPU Mode (User 10000/Supervisor 10011)

     

    So why do I need to know about the CPSR (Current Program Status Register)?  You will need to know where some of these bits are due to how some of the assembly instruction affect these flags.  Example of this is:

     

    ADD will add two registers together, or add an immediate value to a register.  However it will not affect the flags.

     

    ADDS will do the same as ADD, but it does affect the flags.

     

    MOV will allow you to move a value into a register, and a value between registers.  This is not like the x86/x64.  MOV will not let you read or write to memory.  This does not affect the flags.

     

    MOVS does the same thing as MOV, but it does affect the flags.

     

    I hope you are seeing a trend here.  There are instructions that will look the same.  However if they end in “S” then you need to know that this will affect the flags.  I am not going to list all of those assembly instructions here.  Those are already listed in the ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition at http://infocenter.arm.com/help/topic/com.arm.doc.ddi0406b/index.html.

     

    So now we have an idea of what can set the flags.  Now we need to understand what the flags are used for.  They are mainly used for branching instructions.  Here is an example:

    003a11d2 429a     cmp         r2,r3

    003a11d4 d104     bne         |MyApp!FirstFunc+0x28 (003a11e0)|

     

    The first instruction in this code (cmp) compares the value stored in register r2 to the value stored in register r3. This comparison instruction sets or resets the Z flag in the CPSR register. The second instruction is a branch instruction (b) with the condition code ne which means that if the result of the previous comparison was that the values are not equal (the CPSR flag Z is zero) then branch to the address MyApp!FirstFunc+0x28 (003a11e0). Otherwise the execution continues.

     

    There are a few compare instructions.  “cmp” subtracts two register values, sets the flags, and discards the result.  “cmn” adds two register values, sets the flags, and discards the results.  “tst” does a bit wise AND of two register values, sets the flags, and discards the results.  There is even an If Then (it) instruction.  I am not going to discuss that one here as I have never seen it in any of the Windows code.

     

    So is “bne” the only branch instruction?  No.  There is a lot of them.  Here is a table of things that can be seen beside “b”, and what they check the CPSR register:

    Mnemonic

    Extension

    Meaning (Integer)

    Condition Flags (in CPSR)

    EQ

    Equal

    Z==1

    NE

    Not Equal

    Z==0

    MI

    Negative (Minus)

    N==1

    PL

    Positive or Zero (Plus)

    N==0

    HI

    Unsigned higher

    C==1 and Z==0

    LS

    Unsigned lower or same

    C==0 or Z==1

    GE

    Signed greater than or equal

    N==V

    LT

    Signed less than

    N!=V

    GT

    Signed greater than

    Z==0 and N==V

    LE

    Signed less than or equal

    Z==1 or N!=V

    VS

    Overflow

    V==1

    VC

    No overflow

    V==0

    CS

    Carry set

    C==1

    CC

    Carry clear

    C==0

    None (AL)

    Execute always

     

     

    Floating Point Registers

     

    As mentioned earlier the processor also has to have the ISA extensions of VFP (Hardware Floating Point) and NEON (128-bit SIMD Architecture).  Here is what they are.

    Floating Point

     

    As you can see this is 16 – 64bit regiters (d0-d15) that is overlaid with 32 – 32bit registers (s0-s31).  There are varieties of the ARM processor that has 32 – 64bit registers and 64 – 32bit registers.  Windows 8 will support both 16 and 32 register variants.  You have to be careful when using these, because if you access unaligned floats you may cause an exception.

     

    SIMD (NEON)

     

    As you can see here the SIMD (NEON) extension adds 16 – 128 bit registers (q0-q15) onto the floating point registers.  So if you reference Q0 it is the same as referencing D0-D1 or S0-S1-S2-S3.

     

    In part 2 we will discuss how Windows utilizes this processor.

  • Ntdebugging Blog

    Closing the Loop: CPU Spike in Winlogon.exe

    • 3 Comments

    We recently dealt with an interesting issue that I would like to share, hope you enjoy. - Jason

    Issue Summary

    Customer reports that within their Citrix server farm (running on Windows Server 2003), when a user logs into or out of a session (seems more pronounced on logoff), ALL users connected to the server experience a ~5-20 second hang. The hang is described as follows:

    • Application in the session (i.e. Outlook and Word) stop accepting keyboard input. When the hang subsides, the typed characters show up all at once.
    • If applications are moved, they do not redraw
    • Cannot click the start menu
    • If the user was running performance monitor (in an attempt to diagnose), there would be a gap in the perfmon data that directly correlated to the hang duration

    Customer has found that during the timeframe of the hang, Winlogon.exe (in session 0) is showing 25% CPU usage.

    Troubleshooting & Analysis

    Where to start… to begin, given the details above, we can assume we are not experiencing a hard hang where the server is completely unresponsive. The user can still move windows and the connection to the server is still active. Keyboard input is working, but seems to be buffered during the issue. The windows not redrawing indicates the message pump for each of the affected applications is stuck waiting for something.

    In this scenario, I focused on the CPU usage for Winlogon under the premise that the other symptoms were side effects of whatever the CPU was working so hard on. On to the 25%; this is an interesting number to spike at. Generally speaking a thread has work to do, or it doesn't. If there was work to do it would normally take the CPU to 100%. So why 25%? In this scenario, each of the servers within the server farm had 4 processors. So we pegged 1 out of the 4 @ 100% resulting in task manager displaying 25% CPU utilization for the Winlogon in question.

    So now we have a CPU spike in Winlogon. Why only the Winlogon in session 0? Going back to the issue summary, the customer reported that if ANY user logs off, ALL users experience a hang, and Winlogon in session 0 spikes. First, let's talk about what winlogon does for a living.

    Sidebar: Winlogon on Windows Server 2003.

    Any time you ask what a feature is or how it works, you should begin your research with the Windows Technical Reference (try a Live search like "winlogon site:http://technet2.microsoft.com"). In this case I focused on the "How Interactive Logon Works" article.

    This article starts with:

    The Windows Server 2003 interactive logon architecture includes the following components:

    • Winlogon
    • Graphical Identification and Authentication (GINA) dynamic-link library (DLL)
    • Local Security Authority (LSA)
    • Authentication packages (NTLM and Kerberos)

    So Winlogon has something to do with Interactive logon.

    What else can we find out about the responsibilities of Winlogon

    • Registers the SAS (the secure attention sequence, a.k.a CTRL-ALT-DEL) during the boot process to keep other programs and processes from using it
    • SAS routine dispatching - When Winlogon recognizes a SAS event or the GINA delivers a SAS, Winlogon calls one of the SAS processing functions of the GINA.
    • Desktop lockdown - Winlogon helps prevent unauthorized user's from gaining access to system resources by locking down the computer desktop. At any time, Winlogon is in one of three possible states: logged on, logged off, or workstation locked
    • User profile loading - After a successful logon, Winlogon loads user profiles into the HKEY_CURRENT_USER registry key.
    • Screen saver control - Winlogon monitors keyboard and mouse activity to determine when to activate screen savers.
    • Multiple network provider support - If there are multiple network providers installed on a Windows-based system, they can be included in the authentication process and in password-updating operations

    Looking at this list of tasks performed by Winlogon, we need to try to determine which task aligns to our symptoms (hang during logon and logoff). I decided to focus on User Profiles due to the relationship between profiles and login and logoff.

    What is a User Profile? Back to Live with "user profiles site:http://technet2.microsoft.com". Out of the results, I went with the "User Profile Structure" link. The page provides the following detail:

    A user profile consists of:

    • A registry hive. The registry is a database used to store computer- and user-specific settings. Portions of the registry can be saved as files, called hives. These hives can then be reloaded for use as necessary. User profiles take advantage of the hive feature to provide roaming profile functionality. The user profile registry hive is the NTuser.dat in file form, and is mapped to the HKEY_CURRENT_USER portion of the registry when the user logs on. The NTuser.dat hive maintains the users environment preferences when the user is logged on. It stores those settings that maintain network connections, Control Panel configurations unique to the user (such as the desktop color and mouse), and application-specific settings. The majority of the settings stored in the registry are opaque to user profiles settings are owned and maintained by individual applications and operating system components.
    • A set of profile folders stored in the file system. User profile files are stored in the file system in the Documents and Settings directory, in a per user folder. The user profile folder is a container for applications and other operating system components to populate with subfolders and per-user data, such as shortcut links, desktop icons, startup applications, documents, configuration files and so forth. Windows Explorer uses the user profile folders extensively for special folders such as the users desktop, start menu and my documents folder.

    With this we can now look at the Winlogon that is spiking the CPU. My step when looking at a CPU spike for a process is to determine if the time spent is in User-mode or Kernel mode. In task manager on the Performance tab you can monitor CPU usage. If you select the View menu and chose to "Show Kernel Times" you can get additional detail showing you if the time is associated with a User-mode module or a Kernel module. In this case the Winlogon spike showed the spike to be in Kernel time. This means the application (Winlogon) asked the OS to do something and it is trying to get it done.

    My second step is to determine what user-mode request led to the spike in Kernel time. To answer this on a production system I chose to use Sysinternal's Process Explorer. I downloaded the tool and set it up so it would be able to obtain symbols from the public MS symbols server. During the repro I monitored the thread CPU time in Winlogon. Here is a screen shot of what Process Explorer looks when looking at the thread activity of an idle Winlogon.

    Figure 1 – Process Properties in Process Explorer

    When the CPU spike occurred we looked at the stack for the spiked thread (double click on the thread with the high CPU time or click the "Stack" button with the thread highlighted):

    ntdll.dll!KiFastSystemCall+0x3
    ntdll.dll!KiFastSystemCallRet
    ntdll.dll!ZwUnloadKey+0xc
    ADVAPI32.dll!LocalBaseRegUnLoadKey+0x51
    ADVAPI32.dll!RegUnLoadKeyW+0x73
    USERENV.dll!MyRegUnLoadKey+0x6d
    USERENV.dll!CUserProfile::UnloadUserProfileP+0x2a2
    USERENV.dll!UnloadUserProfileI+0x198

    RPCRT4.dll!Invoke+0x30
    RPCRT4.dll!NdrStubCall2+0x299
    RPCRT4.dll!NdrServerCall2+0x19
    RPCRT4.dll!DispatchToStubInCNoAvrf+0x38
    RPCRT4.dll!RPC_INTERFACE::DispatchToStubWorker+0x11f
    RPCRT4.dll!RPC_INTERFACE::DispatchToStub+0xa3
    RPCRT4.dll!LRPC_SCALL::DealWithRequestMessage+0x42c
    RPCRT4.dll!LRPC_ADDRESS::DealWithLRPCRequest+0x127
    RPCRT4.dll!LRPC_ADDRESS::ReceiveLotsaCalls+0x430
    RPCRT4.dll!RecvLotsaCallsWrapper+0xd
    RPCRT4.dll!BaseCachedThreadRoutine+0x9d
    RPCRT4.dll!ThreadStartRoutine+0x1b
    kernel32.dll!BaseThreadStart+0x34

    Just to make sure, we repro'd again and looked at the call stack

    ntdll.dll!KiFastSystemCall+0x3
    ntdll.dll!KiFastSystemCallRet
    ntdll.dll!ZwUnloadKey+0xc
    ADVAPI32.dll!LocalBaseRegUnLoadKey+0x51
    ADVAPI32.dll!RegUnLoadKeyW+0x73
    USERENV.dll!MyRegUnLoadKey+0x6d
    USERENV.dll!CUserProfile::UnloadUserProfileP+0x2a2
    USERENV.dll!UnloadUserProfileI+0x198

    RPCRT4.dll!Invoke+0x30
    RPCRT4.dll!NdrStubCall2+0x299
    RPCRT4.dll!NdrServerCall2+0x19
    RPCRT4.dll!DispatchToStubInCNoAvrf+0x38
    RPCRT4.dll!RPC_INTERFACE::DispatchToStubWorker+0x11f
    RPCRT4.dll!RPC_INTERFACE::DispatchToStub+0xa3
    RPCRT4.dll!LRPC_SCALL::DealWithRequestMessage+0x42c
    RPCRT4.dll!LRPC_ADDRESS::DealWithLRPCRequest+0x127
    RPCRT4.dll!LRPC_ADDRESS::ReceiveLotsaCalls+0x430
    RPCRT4.dll!RecvLotsaCallsWrapper+0xd
    RPCRT4.dll!BaseCachedThreadRoutine+0x9d
    RPCRT4.dll!ThreadStartRoutine+0x1b
    kernel32.dll!BaseThreadStart+0x34

    We can see here that both call stacks for the thread in question are RPC threads waiting for incoming requests. The request that came in was from userenv module running in Winlogon associated with the session of the user logging off and came into the Winlogon in session 0. The request is to unload the User Profile (highlighted above). This led to a call to RegUnloadKey call which removes a hive from the registry but does not modify the file containing the registry information.

    We now know the issue is related to unloading the user profile and specifically the registry portion of the user profile. Now we need to determine where we are spending the time in kernel mode.

    Kernrate

    Kernrate is a tool included with the Windows 2003 resource kit. This will let us dig into the kernel mode side of what is going on. I collected the following data from kernrate during the hang condition (Summarized):

    Parsing the output, we see that 72.66% of the Kernel time was spent in the idle process. This is attributed to the 3 idle threads that are on the 3 processors not being used by the Winlogon thread. We see that the Winlogon process accounts for 20.39% of the kernel time (the CPU spike on the 4th processor). If we move into the module summary, you can see the correlated CPU times for the modules, INTELPPM (CPU driver) running on the idle processors and NTOSKRNL running on the 4th processor (this is where we will focus). In the final drill down, we see the function summary for NTOSKRNL and we can identify the function in use 82% of the time was CmpCleanUpKCBCacheTable. The Cm function prefix lets us know this is related to Configuration Manager (the Registry – see Table 2-7 from Microsoft Windows Internals 4th Edition for Commonly used Kernel Prefixes).

    What do we know? We know the issue manifests most during user logoff. We know during the logoff we are trying to unload the profile. We know that in the process of doing this we are trying to unload a registry hive. We know that leads us to spend a lot of CPU time in the Kernel doing CmpCleanUpKCBCacheTable.

    Why does this hang the machine? The registry is protected / synchronized with an ERESOURCE named CmpRegistryLock. While this is held exclusively during this cleanup function, all registry access is blocked (both read and write). This explains all of our symptoms. Applications freeze and do not redraw due to operations that need registry access being done on the GUI thread of a process. Perfmon cannot report due to its dependency on the registry.

    Resolution

    Now that we know the exact behavior, we could align it to a known issue that was introduced with the release of Windows Sever 2003 SP1 (applies to both SP1 and SP2)

    KB927182

    From the KB cause section:

    This problem occurs because of the way that Windows Server 2003 Service Pack 1 (SP1) cleans up user registry information. When a user logs off, there is a five-second delay before Windows Server 2003 SP1 flushes the user registry hive. If the terminal server experiences heavy registry activity, the cached registry information on the terminal server may increase significantly. For example, this behavior may occur when a program scans the user profile during the logoff process. When the terminal server experiences heavy registry activity, it takes Windows longer to flush the data.

    The fix applies to Windows Server 2003 with SP1 or SP2.

  • Ntdebugging Blog

    Understanding NTFS Compression

    • 5 Comments

    As our group’s file systems expert, I see most of the common problems associated with the use/misuse of NTFS compression.  Before you check the “Compress contents to save disk space” checkbox, it might be good to understand how this affects the happy little bits, bytes, & nibbles running around on the disk.

    Often backup applications receive ERROR_DISK_FULL errors attempting to back up compressed files and this causes quite a bit of confusion when there are still several gigabytes of free space on the drive.  Other issues may also occur when copying compressed files.  The goal of this blog is to give the reader a more thorough understanding of what really happens when you compress NTFS files.

    Compression Units…

    NTFS uses a parameter called the “compression unit” to define the granularity and alignment of compressed byte ranges within a data stream.  The size of the compression unit is based entirely on NTFS cluster size (refer to the table below for details).  In the descriptions below, the abbreviation "CU" is used to describe a Compression Unit and/or its size.

    The default size of the CU is 16 clusters, although the actual size of the CU really depends on the cluster size of the disk.  Below is a chart showing the CU sizes that correspond to each of the valid NTFS cluster sizes.

    Cluster Size Compression Unit Compression Unit (hex bytes)
    512 Bytes 8 KB 0x2000
    1 KB 16 KB 0x4000
    2 KB 32 KB 0x8000
    4 KB 64 KB 0x10000
    8 KB 64 KB 0x10000
    16 KB 64 KB 0x10000
    32 KB 64 KB 0x10000
    64 KB 64 KB 0x10000

    Native NTFS compression does not function on volumes where the cluster size is greater than 4KB, but sparse file compression can still be used.

    NTFS Sparse Files…

    The Sparse Files features of NTFS allow applications to create very large files consisting mostly of zeroed ranges without actually allocating LCNs (logical clusters) for the zeroed ranges.

    For the code-heads in the audience, this can be done by calling DeviceIoControl with the FSCTL_SET_SPARSE IO control code as shown below.

    BOOL SetSparse(HANDLE hFile)
         {
               DWORD Bytes;
               return DeviceIoControl(hFile, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &Bytes, NULL);
         }

    To specify a zeroed range, your application must then call the DeviceIoControl with the FSCTL_SET_ZERO_DATA IO control code.

    BOOL ZeroRange(HANDLE hFile, LARGE_INTEGER RangeStart, LONGLONG RangeLength)
         {
                FILE_ZERO_DATA_INFORMATION  FileZeroData;
                DWORD Bytes;
                FileZeroData.FileOffset.QuadPart = RangeStart.QuadPart;
                FileZeroData.BeyondFinalZero.QuadPart = RangeStart.QuadPart + RangeLength + 1;
                return DeviceIoControl(  hFile,
                                         FSCTL_SET_ZERO_DATA,
                                         &FileZeroData,
                                         sizeof(FILE_ZERO_DATA_INFORMATION),
                                         NULL,
                                         0,
                                         &Bytes,
                                         NULL);
         }

    Because sparse files don't actually allocate space for zeroed ranges, a sparse file can be larger than the parent volume.  To do this, NTFS creates a placeholder VCN (virtual cluster number) range with no logical clusters mapped to them. 

    Any attempt to access a “sparsed” range would result in NTFS returning a buffer full of zeroes.  Accessing an allocated range would result in a normal read of the allocated range.  When data is written to a sparse file, an allocated range is created which is exactly aligned with the compression unit boundaries containing the byte(s) written.  Refer to the example below.  If a single byte write occurs for virtual cluster number VCN 0x3a, then all of Compression Unit 3 (VCN 0x30 - 0x3f) would become an allocated LCN (logical cluster number) range.  The allocated LCN range would be filled with zeroes and the single byte would be written to the appropriate byte offset within the target LCN.

    [...] - ALLOCATED
    (,,,) - Compressed
    {   } - Sparse (or free) range
       / 00000000000000000000000000000000000000000000000000000000000000000000000000000000
    VCN  00000000000000001111111111111111222222222222222233333333333333334444444444444444
       \ 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
                CU0             CU1            CU2             CU3             CU4
         |++++++++++++++||++++++++++++++||++++++++++++++||++++++++++++++||++++++++++++++|
         {                                              }[..............]{              }
    Extents  {
        VCN = 0x0 LEN = 0x30                        CU0 - CU2
        VCN = 0x30 LEN = 0x10: LCN = 0x2a21f        CU3
        VCN = 0x10 LEN = 0x10                       CU4
    }

    Below is a screen shot of a 2GB file that was created using the sparse file API’s.

    NewVolume

    Notice that this volume is only 76.9 MB, yet it has a 2 GB file in the root folder.

    NewVolumeProperties

    If we attempt to duplicate this sparse file with the COPY command, it fails.  This is because COPY doesn’t know how to duplicate the sparse attributes on the file, so it attempts to create a real 2GB file in the root of D:.  This can occur in production when attempting to move a large database file from one volume to another.  If you have a database application that uses sparse attributes, then it is a good practice to use the database software’s backup / restore features when moving the database to a different volume.

    CmdPromptCopy

    Now let’s look at the file’s properties with the COMPACT utility.  Notice that the file shows up as compressed, and it has a huge compression ratio.

    CmdPromptCompact

    If you go back and look at the file properties from EXPLORER, you will notice that there is no compression checkbox (or any other indication that the file is compressed).  This is because the shell does not check for the sparse bit on the file.

    AdvancedAttributes

    In short, use caution when moving sparse files from one location to another.  Applications tell the file system the offsets of zeroed ranges, so you should always leave the management of sparse files to the application that created them.  Moving or copying the sparse files manually may cause unexpected results.

    NTFS Compression…

    Now that we have discussed sparse files, we will move on to conventional NTFS compression.

    NTFS compresses files by dividing the data stream into CU’s (this is similar to how sparse files work).  When the stream contents are created or changed, each CU in the data stream is compressed individually.  If the compression results in a reduction by one or more clusters, the compressed unit will be written to disk in its compressed format.  Then a sparse VCN range is tacked to the end of the compressed VCN range for alignment purposes (as shown in the example below).  If the data does not compress enough to reduce the size by one cluster, then the entire CU is written to disk in its uncompressed form.

    This design makes random access very fast since only one CU needs to be decompressed in order to access any single VCN in the file.  Unfortunately, large sequential access will be relatively slower since decompression of many CU’s is required to do sequential operations (such as backups).

    In the example below, the compressed file consists of six sets of mapping pairs (encoded file extents).  Three allocated ranges co-exist with three sparse ranges.  The purpose of the sparse ranges is to maintain VCN alignment on compression unit boundaries.  This prevents NTFS from having to decompress the entire file if a user wants to read a small byte range within the file.  The first compression unit (CU0) is compressed by 12.5% (which makes the allocated range smaller by 2 VCNs).  An additional free VCN range is added to the file extents to act as a placeholder for the freed LCNs at the tail of the CU.  The second allocated compression unit (CU1) is similar to the first except that the CU compressed by roughly 50%.

    NTFS was unable to compress CU2 and CU3, but part of CU4 was compressible by 69%.  For this reason, CU2 & CU3 are left uncompressed while CU4 is compressed from VCNs 0x40 to 0x44.  Thus, CU2, CU3, and CU4 are a single run, but the run contains a mixture of compressed & uncompressed VCNs.

    NOTE:  Each set of brackets represents a contiguous run of allocated or free space.  One set of NTFS mapping pairs describes each set of brackets.

    [...] - ALLOCATED
    (,,,) - Compressed
    {   } - Sparse (or free) range
       / 00000000000000000000000000000000000000000000000000000000000000000000000000000000
    VCN  00000000000000001111111111111111222222222222222233333333333333334444444444444444
       \ 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
               CU0             CU1             CU2             CU3             CU4
         |++++++++++++++||++++++++++++++||++++++++++++++||++++++++++++++||++++++++++++++|
         (,,,,,,,,,,,,){}(,,,,,,){      }[...............................,,,,){         }
    Extents  {
        VCN = 0x0 LEN = 0xe : LCN = 0x29e32d        CU0
        VCN = 0xe LEN = 0x2                         CU0
        VCN = 0x10 LEN = 0x8 : LCN = 0x2a291f       CU1
        VCN = 0x18 LEN = 0x8                        CU1
        VCN = 0x20 LEN = 0x25 : LCN = 0x39dd49      CU2 - CU4
        VCN = 0x45 LEN = 0xb                        CU4
    }

    Now we come to the part where we describe limitations of this design.  Below are some examples of what happens when things go wrong while reading / writing compressed files.

    Disk full error during a backup read operation or file copy…

    1. NTFS determines which compression unit is being accessed.
    2. The compression unit’s entire allocated range is read.
    3. If the unit is not compressed, then we skip to step 5.  Otherwise, NTFS would attempt to reserve (but not allocate) the space required to write the decompressed CU back to disk.  If insufficient free space exists on the disk, then the application might get an ERROR_DISK_FULL during the read.
    4. The CU would be decompressed in memory.
    5. The decompressed byte range would be mapped into cache and returned to the requesting application.
    6. If part of the CU is altered in cache…
      1. The reserved disk space from step 3 would become allocated space.
      2. The CU contents would be compressed and flushed back to the newly allocated region (the LCN location will usually not change).
      3. Any recoverable disk space within the CU would be freed.

    Failure to copy a large file to a compressed folder…

    This is the most common problem seen with compression, and currently the solution is to educate users about limitations.  NTFS compression creates approximately one file fragment for every 16 clusters of data.  Because the max cluster size allowed for standard compression is 4K, the largest compression unit allowed is 64KB.  In order to compress a 100 GB file into 64KB sections, you could potentially end up with 1,638,400 fragments.  Encoding 1,638,400 fragments becomes problematic for the file system and can cause a failure to create the compressed file.  On Vista and later, the file copy will fail with STATUS_FILE_SYSTEM_LIMITATION.  On earlier releases, the status code would be STATUS_INSUFFICIENT_RESOURCES.  If possible, avoid using compression on files that are large, or critical to system performance.

    I received feedback from the NTFS Principal Development Lead about this blog.  Fortunately, most of the feedback was good, but he asked that I add a maximum size recommendation.  According to our development team’s research, 50-60 GB is a “reasonable size” for a compressed file on a volume with a 4KB cluster size.  This “reasonable size” goes down sharply for volumes with smaller cluster sizes.

    Compressing .VHD (Virtual Hard Disk) files causes slow virtual machine performance…

    A popular hammer company received a call from a customer complaining “It sure hurts when I hit my thumb with your hammers!”  Likewise, if you compress .VHD files, it is going to be a painful experience, so please compress responsibly.

    Best Regards,

    Dennis Middleton “The NTFS Doctor”

  • Ntdebugging Blog

    How Windows Starts Up (Part the second)

    • 7 Comments

    Howdy folks, David here again with part two of How Windows Starts Up.  Today we’ll be covering the Boot Loader Phase.  Let’s take a moment to recap where we are at this point.  So far, the computer has completed POST, executed the MBR code, located the active partition, executed the Boot Sector code in that partition, found NTLDR and loaded it into memory.  So what happens next?  During this phase, we’ll be discussing the following files located in the root of your system volume:  NTLDR, ntdetect.com and boot.ini.

     

    The Boot Loader Phase begins when NTLDR executes:

     

    1.    NTLDR switches the processor from real mode to 32-bit flat memory mode.

    How does it do this you ask?  NTLDR contains two parts – a 16-bit component which handles switching the processor to Protected Mode and a 32-bit component which handles everything else. 

     

    When NTLDR first begins execution, the processor is in an x86-operating mode called Real-Mode.  In Real-Mode, no virtual to physical translation of memory addresses occurs, which means that all memory addresses use the 16-bit Segment:Offset format, and only the first 1MB of the computer's physical memory is accessible.  However, we don’t have to concern ourselves with this too much because the first action NTLDR takes is to switch the CPU to Protected-Mode. 

     

    *Trivia Alert*  before we enter Protected-Mode, we enable the A20 Line – you IBM guys know what I’m talking about… 

     

    Still no virtual-to-physical translation occurs at this point in the boot process, but a full 32 bits of memory addressing becomes accessible, and now NTLDR can access up to 4GB of physical memory. 

     

    2.    NTLDR creates enough page tables to make memory below 16MB accessible with paging turned on.  This number has been increased with Windows Server 2003. 

     

    3.    NTLDR enables paging.  “Protected-Mode with Paging Enabled” is the mode in which Windows executes during normal operation.

     

    4.    NTLDR then starts the FAT and NTFS mini file system drivers contained within its own code.  Unlike the Boot Sector's file system code, NTLDR's code is sophisticated enough to read subdirectories.

     

    Note:   All disk access from this point until the Kernel Phase begins uses BIOS Interrupt 13h, Extended Interrupt 13h or NTBOOTDD.SYS

     

    5.    NTLDR locates and parses Hiberfil.sys to determine if it contains a valid hibernator file.  If a valid hibernator file is found, then that file is loaded into memory and execution continues from the point the computer was placed into hibernation.

     

    Hibernation files are interesting.  The file either contains a full memory image, or it contains a pointer (ARC path) to the boot partition of the last operating system that entered hibernation.  That boot partition will contain another Hiberfil.sys file, which contains a full memory image of the hibernating operating system.  If the Hiberfil.sys file is a pointer, the file will contain an ARC path using one of the following syntaxes:

     

                             - linkmulti()disk()rdisk()partition()

                             - linksignature()disk()rdisk()partition()

                             - linkscsi()disk()rdisk()partition()

     

    6.    If Hiberfil.sys is not found, NTLDR parses the Boot.ini file.  If the Boot.ini contains a reference to more than one operating system or boot option, then the boot loader screen is displayed. 

     

    This is a sample of a default Boot.ini file from a Windows 2000 Server-based computer:

     

    [boot loader]

    timeout=30

    default=multi(0)disk(0)rdisk(0)partition(1)\WINNT

    [operating systems]

    multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows 2000 Server" /fastdetect

     

    Let’s talk a bit about what’s in the Boot.ini file.  Boot.ini contains a list of ARC paths.  Ok, quick – five bonus points to the first person who can tell me what ‘ARC’ stands for…  come on…  Yes, you there in the red shirt… correct!  It stands for ‘Advanced RISC Computing’; now beam down with the captain.  The rest of you come with me…

     

    ARC paths may use the “Multi”, “SCSI” or “Signature” syntax.

     

    The “Multi” syntax instructs Windows to rely on the system BIOS to load system files.  This means that NTLDR will be using INT13 BIOS calls to find and load NTOSKRNL and any other files on the boot partition.

     

    The “SCSI” syntax tells Windows to load a SCSI device driver to access the boot partition.  A copy of the driver must be renamed to NTBOOTDD.SYS and placed in the root of the system partition.

     

    The “Signature” syntax is equivalent to the SCSI syntax, but is used instead to support the Plug and Play architecture in Windows 2000 or later.  Because Windows 2000 is a PnP OS, the SCSI controller number instance might vary each time you start Windows 2000, especially if you add new SCSI controller hardware after setup.

     

    The Signature syntax instructs NTLDR to locate the drive with a disk signature that matches the value in the parentheses, regardless of which SCSI controller number that the drive is connected to.  Keep in mind that the Signature syntax also requires the presence of an Ntbootdd.sys file in the root of the system partition.  If you recall from the previous post, the disk signature value is located in the MBR.

     

    Example:

    signature(8b467c12)disk(1)rdisk(0)partition(2)\winnt="Windows 2000"

     

          Signature() syntax is used only if one of the following conditions exists:

     

    ·         The System BIOS or controller hosting the boot partition on which Windows is installed, does not support INT13 Extensions, or has the INT13 Extensions option disabled, and the partition on which you installed Windows is either larger than ~7.8 gigabytes (GB) in size, or the ending cylinder number for that partition is higher than 1024.

    ·         The drive on which you installed Windows is connected to a SCSI controller with SCSI BIOS disabled, so INT13 BIOS calls cannot be used during the boot process.

     

                            227704 Windows May Use Signature() Syntax in the Boot.ini File

                            http://support.microsoft.com/default.aspx?scid=kb;EN-US;227704

     

    7.    If Boot.ini is not found, NTLDR attempts to continue the boot process by assuming that Windows was installed in the default location on the same partition (c:\windows).  If the c:\windows directory is not found, NTLDR displays this error message:  Windows could not start because the following file is missing or corrupt: \winnt root\system32\ntoskrnl.exe

     

    8.    If the boot loader screen is displayed, the user must select an operating system to boot.

    a.    If the user chooses to run an operating system other than Windows 2003/XP such as DOS or Windows 95/98, NTLDR loads and executes the Bootsect.dos file located in the root of the boot drive.

    (1)  If the Bootsect.dos file is found, NTLDR loads it into RAM and then passes control to it.

    (2)  Bootsect.dos then loads IO.sys, MSDOS.sys, Command.com, etc., as necessary and proceeds to boot the downlevel operating system.

    (3)  If the file is not found in the root, NTLDR displays:

    I/O Error accessing boot sector file multi(0)disk(0)rdisk(0)partition(1): \bootsect.dos

    b.    If the user selects a Windows NT-based operating system, NTLDR executes Ntdetect.com.

    (1)  If the ARC path points to a hard drive or partition that is not on the system, the following is displayed:

    Windows could not start because of a computer disk hardware configuration problem.  Could not read from the selected boot disk.  Check boot path and disk hardware.  Please check the Windows documentation about hardware disk configuration and your hardware reference manuals for additional information.

    (2)  The above message may also be displayed if you are using Signature() ARC path notation and the disk signature in the MBR is overwritten (for example, by a virus) or changed (corrupted) so that it no longer matches the Signature() value in the Boot.ini file.  The following error message may also be displayed when there are problems with the Signature () value:

    Windows could not start because of the following ARC firmware boot configuration problem: did not properly generate ARC name for HAL and system paths. Please check the Windows documentation about ARC configuration options and your hardware reference manuals for additional information.

     

    9.    Ntdetect.com initializes and detects all supported hardware that it can find.  (Troubleshooting tip:  There is an Ntdetect.chk on the Windows Resource Kit CD in the Debug.cab file, which can be renamed and used in place of Ntdetect.com to view what Ntdetect.com is detecting during startup.  You know you want to – go ahead, I’ll wait…)

     

    Ntdetect.com detects the following components:

     

    a)    Computer ID

    b)    Bus/adapter type

    c)    Video adapter

    d)    Keyboard

    e)    Communications ports

    f)     Floppy disks

    g)    Mouse or other pointing devices

    h)    Parallel ports

     

    10.  Ntdetect.com passes its list of hardware components to NTLDR.

     

    11.  If the /SOS switch was used in the Boot.ini for the OS that is loading, NTLDR will display the files it loads into memory from this point on.  Keep in mind – none of these files are being initialized.  They are just being loaded into RAM at this point.  I can’t tell you how many times someone has incorrectly concluded that there is a problem with MUP.SYS because it’s the last file displayed on this screen before their computer hangs.  Poor MUP – always getting a bad rap.  This list can be useful, however; it can help you identify potential malware that may be loading.

     

    12.  NTLDR loads Ntoskrnl.exe and Hal.dll into memory.  Again, *loading*, not initializing.

     

    13.  NTLDR loads the System registry hive into memory.  If Windows cannot load the registry hive, it may be necessary to restore the System hive from the ERD or by using the Windows Recovery Console to copy the files from the Windows\Repair or Windows\Repair\RegBack folder.

     

    14.  NTLDR reads the HKLM\System\Select registry key.

    a.    If the Last Known Good option was not selected, NTLDR looks for the “Default” value in the SELECT key to select which ControlSet to use.  (0x1 = ControlSet001, 0x2 = ContolSet002, 0x3 = ControlSet003) 

    b.    If the Last Known Good option was selected, NTLDR looks for the “LastKnownGood” value in the SELECT key to select the ControlSet to use.

     

    15.  NTLDR sets the “Current” value in the HKLM\System\Select key to the ControlSet being used.

     

    Note:  At this point, NTLDR has determined which ControlSet it needs to use, but we don’t actually create the “CurrentControlSet” symbolic link until the kernel initializes (more on that in our next installment).  Therefore, for the purposes of this discussion, I’ll use *ActiveControlSet* to refer to whichever registry ControlSet has been identified as the one NTLDR needs to use.

     

    16.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\Nls\Codepage for the “Acp”, “Oemcp”, and “Oemhal” values.  These entries are pointers to the table of codepage information stored in the same key.  The table has the filenames that correspond to the codepage numbers.  NTLDR loads these files into memory.  The defaults are C_1252.nls, C_437.nls, and Vgaoem.fon.

     

    A code page is a character set specific to an operating system or hardware platform.  It maps character codes to individual characters.  Different code pages include different special characters, typically customized for a language or a group of languages.  The system uses code pages to translate keyboard input into character values for non-Unicode based applications, and to translate character values into characters for non-Unicode based output displays.

     

    17.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\Nls\Language for the “Default” value.  This entry is a pointer to the table of language information stored in the same key.  NTLDR loads this file into memory.  The default is L_intl.nls.

     

    18.  NTLDR scans HKLM\System\*ActiveControlSet*\Services for device drivers with a “Start” value of 0x0.  It finds their “Group” and “Tag” values also.

     

    19.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\ServiceGroupOrder for the order that service groups should be loaded.

     

    20.  NTLDR sorts the drivers by “Group” and then by their “Tag” value.  Each service or driver has a particular group to which it belongs.  All drivers within a specific group are ordered by tag. 

     

    21.  NTLDR loads the 0x0 device drivers into memory.  It does not initialize or start these drivers.

     

    22.  The board is set, and the pieces are moving.  NTLDR executes Ntoskrnl.exe.

     

    Troubleshooting the Boot Loader Phase

     

    Okay, now that we’ve seen what is supposed to happen, what could possibly go wrong?  Well, several things…

     

    First, all file-system activity during this phase uses BIOS Int13, Extended Int13 or Ntbootdd.sys. 

     

    1.    If your computer uses Extended Int 13, verify that Ext. Int13 is enabled in your system’s BIOS or SCSI BIOS as appropriate.

     

    2.    If you are using a SCSI driver, verify that you are using the correct SCSI driver (Ntbootdd.sys).  If you are using "SCSI()" syntax in the Boot.ini file, copy the correct device driver for the SCSI controller to the root and then rename it to Ntbootdd.sys.  This is the driver that is stored in the Winnt.sbs\System32\Drivers folder (for example, the Aic78xx.sys file).  If you use "MULTI()" syntax in the Boot.ini file, you do not need to do this.  Use the Windows Recovery Console or a Windows NT boot floppy to complete this step.

     

                301680 HOW TO: Create a Boot Disk for an NTFS or FAT Partition

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;301680

     

    307654 How to install and use the Recovery Console in Windows XP

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;307654

     

     

    Significant Errors during this phase:

     

    ·         Windows could not start because of a computer disk hardware configuration problem.  Could not read from the selected boot disk.  Check boot path and disk hardware.  Please check the Windows documentation about hardware disk configuration and your hardware reference manuals for additional information.

     

    This error indicates that the ARC path is pointing to a hard drive or partition that is not on the system.  This message may also be displayed if you are using Signature() ARC path notation and the disk signature in the MBR is overwritten (for example, by a virus) or changed (corrupted) so that it no longer matches the Signature() value in the Boot.ini file. 

     

    ·         Windows could not start because of the following ARC firmware boot configuration problem: did not properly generate ARC name for HAL and system paths. Please check the Windows documentation about ARC configuration options and your hardware reference manuals for additional information.

     

    This error message may be displayed when there are problems with the Signature () ARC path syntax.

     

    Ntoskrnl.exe missing or corrupt.  This error could be the result of an incorrect ARC path in the Boot.ini file.  Use the information in the following knowledge base articles to correct the Boot.ini information.

     

                102873 BOOT.INI and ARC Path Naming Conventions and Usage

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;102873

     

                155222 How to Determine the ARC Path

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;155222

     

                130921 Creating an FT Boot Disk With Scsi( ) and Multi( ) Identifiers

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;130921

     

    It is also possible to use the Windows XP Recovery Console to create a new Boot.ini for a Windows NT, Windows 2000 or .NET installation.  Please see the following knowledge base article for a description of this process:

     

                291980 A Discussion About the Bootcfg Command and Its Uses

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q291980

     

    Specify the /SOS switch in the Boot.ini file and monitor the output to the screen.  If the system stops responding before anything is output to the screen, then the problem may be with hardware detection.  Replace Ntdetect.com with Ntdetect.chk to view the hardware detected by the system during this phase.

     

                103659 Setup Hangs While Inspecting Hardware; How to Use Ntdetect.com

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;103659

     

                320040 HOW TO: Use Ntdetect.chk to Identify Hardware Problems

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;320040

     

     

    Incorrect HAL

    If after specifying the /SOS option, you see that an incorrect version of the HAL.DLL is loading, use the following KB articles to replace the HAL or troubleshoot other HAL-related issues:

     

                314477 "Hardware Configuration Problem" Err Msg Starting Windows

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;314477

     

                237556 Troubleshooting Windows 2000 Hardware Abstraction Layer Issues

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;237556

     

    330184  "Invalid Boot.ini" or "Windows could not start" error messages

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;330184

     

     

    Unable to load System hive

    Several problems could happen here.  First, consider that the amount of memory we’re dealing with is only 16MB.  We have to squeeze NTLDR, the System Registry Hive, the kernel, the HAL, all the boot drivers and the PFN database into a relatively small area.  So, with Windows Server 2003, we’ve given ourselves a bit more room. 

     

                302594 The System hive memory limitation is improved in Windows Server 2003

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;302594

     

    If Windows is unable to load the System registry hive into memory, then it may be necessary to restore the System hive from the ERD or use the Windows Recovery Console to copy the files from the Winnt\Repair or Winnt\Repair\RegBack folder whichever version is most recent.

     

                307545 How to Recover from a Corrupted Registry

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;307545

     

     

    Table of Startup Type Values

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services

    0x0 = Boot

    Loaded into RAM by the Boot loader (NTLDR) during Boot Loader Phase, then initialized by the Kernel (NTOSKRNL) during Kernel Phase

    0x1 = System

    Loaded and initialized by the Kernel during Kernel Phase

    0x2 = Auto

    Loaded or started automatically at system startup

    0x3 = Manual

    Driver is manually started by the user or another process

    0x4 = Disabled

    Driver is not to be started (There are exceptions, such as NTFS)

     

     

    Conclusion

    Well folks, we certainly have covered a lot of ground so far.  Any questions?  Yes, you there in the front…  Ah, why have we not discussed any blue screens?  Very good question.  We haven’t seen any BSOD’s up to this point in the boot sequence because the kernel isn’t running yet and STOP errors are generated by the kernel.  No kernel, no blue screens. 

     

    That’s all the time we have this week boys & girls.  Join us next time for How Windows Starts Up, The Kernel Phase.

     

  • Ntdebugging Blog

    Challenges of Debugging Optimized x64 Code

    • 7 Comments

    If you have not had the luxury of debugging optimized x64 code as of yet, don’t wait much longer and fall behind the times!  Due to the x64 fastcall-like calling convention coupled with the abundance of general purpose registers, finding variable values at arbitrary points in a call stack can be very tricky indeed.

    In this article, I’d like to detail some of my favorite techniques for debugging optimized x64 code.  But before digging into these techniques, let’s first have a quick overview of the x64 calling convention.

    The x64 Calling Convention

    Those of you familiar with the fastcall calling convention on x86 platforms will recognize the similarities to the x64 calling convention.  Whereas you typically have to maintain knowledge of multiple calling conventions on x86 platforms, on x64 platforms there is currently just one.  (Of course, I’m excluding the case of no calling convention which one can achieve with __declspec(naked) or by coding in straight assembly.)

    I won’t go into all of the various nuances of the x64 calling convention, therefore I recommend you check out the following link (http://msdn.microsoft.com/en-us/library/ms794533.aspx).  But commonly, the first four parameters into a function are passed via the registers rcx, rdx, r8, and r9.  If the function accepts more than four parameters, those parameters are passed on the stack.  (Those of you familiar with the x86 fastcall calling convention where the first two parameters are passed in ecx and edx will recognize the similarities).

    To help illustrate how the x64 calling convention works, I have created some simple example code.  Although the code is contrived and far from real-world code, it demonstrates some scenarios that are likely to encounter in the real word.  The code is shown below.

    #include <stdlib.h>

    #include <stdio.h>

    #include <windows.h>

    __declspec(noinline)

    void

    FunctionWith4Params( int param1, int param2, int param3,

                         int param4 )

    {

        size_t lotsOfLocalVariables1 = rand();

        size_t lotsOfLocalVariables2 = rand();

        size_t lotsOfLocalVariables3 = rand();

        size_t lotsOfLocalVariables4 = rand();

        size_t lotsOfLocalVariables5 = rand();

        size_t lotsOfLocalVariables6 = rand();

        DebugBreak();

        printf( "Entering FunctionWith4Params( %X, %X, %X, %X )\n",

                param1, param2, param3, param4 );

        printf( "Local variables: %X, %X, %X, %X, %X, %X \n",

                lotsOfLocalVariables1, lotsOfLocalVariables2,

                lotsOfLocalVariables3, lotsOfLocalVariables4,

                lotsOfLocalVariables5, lotsOfLocalVariables6 );

    }

    __declspec(noinline)

    void

    FunctionWith5Params( int param1, int param2, int param3,

                         int param4, int param5 )

    {

        FunctionWith4Params( param5, param4, param3, param2 );

        FunctionWith4Params( rand(), rand(), rand(), rand() );

    }

    __declspec(noinline)

    void

    FunctionWith6Params( int param1, int param2, int param3,

                         int param4, int param5, int param6 )

    {

        size_t someLocalVariable1 = rand();

        size_t someLocalVariable2 = rand();

        printf( "Entering %s( %X, %X, %X, %X, %X, %X )\n",

                "FunctionWith6Params",

                param1, param2, param3, param4, param5, param6 );

        FunctionWith5Params( rand(), rand(), rand(),

                             param1, rand() );

        printf( "someLocalVariable1 = %X, someLocalVariable2 = %X\n",

                someLocalVariable1, someLocalVariable2 );

    }

    int

    main( int /*argc*/, TCHAR** /*argv*/ )

    {

        // I use the rand() function throughout this code to keep

        // the compiler from optimizing too much.  If I had used

        // constant values, the compiler would have optimized all

        // of these away.

        int params[] = { rand(), rand(), rand(),

                         rand(), rand(), rand() };

        FunctionWith6Params( params[0], params[1], params[2],

                             params[3], params[4], params[5] );

        return 0;

    }

     

    Cut and paste this code into a cpp file (such as example.cpp).  I used the Windows SDK (specifically the Windows SDK CMD Shell) to compile this code as C++ code by using the following command line:

    cl /EHa /Zi /Od /favor:INTEL64 example.cpp /link /debug

    Notice the /Od switch.  This disables all optimizations.  Later on, I’ll enable maximum optimization and that’s when the fun begins!

    Once you have the executable module built (mine is named example.exe), then you can fire it up in the debugger as follows:

    windbg -Q -c "bu example!main;g;" example.exe

    The command above will launch the application in windbg, set a breakpoint on the main() routine, and then go to that breakpoint.

    Now, let’s have a look at a diagram of what the stack looks like when FunctionWith6Params() gets called.  The diagram shown below illustrates the stack when the instruction pointer is at the beginning of the code for FunctionWith6Params() but before the prolog code has executed:

    CallStack1

    Notice that the caller, in this case main(), allocated enough space on the stack for all six parameters to FunctionWith6Params() even though the first four parameters are passed in via registers.  The extra space on the stack is commonly referred to as the “home space” for the register parameters.  In the previous diagram, I have shown those slots filled with xxxxxxxx to indicate that the values within there are virtually random at this point.  That’s because the caller, main(), does not initialize these slots.  The called function, at its discretion, may store the first four parameters in this space for safe keeping.  This is exactly what happens in non-optimized builds and is a huge debugging convenience since you can easily find the contents of the first four parameters on the stack if you need to.  Additionally, windbg stack commands such as kb and kv which show these first few parameters will report true results.

    With all of that said, here is what the stack looks like after the prolog code in FunctionWith6Params() executes:

    CallStack2

    The prolog assembly code for FunctionWith6Params() is shown below:

    0:000> uf .

    example!FunctionWith6Params [c:\temp\blog_entry\sample_code\example.cpp @ 28]:

       41 00000001`40015900 mov     dword ptr [rsp+20h],r9d
       41 00000001`40015905 mov     dword ptr [rsp+18h],r8d
       41 00000001`4001590a mov     dword ptr [rsp+10h],edx
       41 00000001`4001590e mov     dword ptr [rsp+8],ecx
       41 00000001`40015912 push    rbx
       41 00000001`40015913 push    rsi
       41 00000001`40015914 push    rdi
       41 00000001`40015915 sub     rsp,50h

    You can see that the first four instructions save the first four parameters on the stack in the home space allocated by main().  Then, the prolog code saves any non-volatile registers that FunctionWith6Params() plans to use during its execution.  The saved registers’ states are restored in the function epilog code prior to returning to the caller.  Finally the prolog code reserves some space on the stack, in this case, for 0x50 bytes.

    What is this space reserved on the top of the stack for?  First, space is created for any local variables.  In this case, FunctionWith6Params() has two.  However, those two local variables only account for 0x10 bytes.  What’s the deal with the rest of the space created on the top of the stack?

    On the x64 platform, when code prepares the stack for calling another function, it does not use push instructions to put the parameters on the stack as is commonly the case in x86 code.  Instead, the stack pointer typically remains fixed for a particular function.  The compiler looks at all of the functions the code in the current function calls, it finds the one with the maximum number of parameters, and then creates enough space on the stack to accommodate those parameters.  In this example, FunctionWith6Params() calls printf() passing it 8 parameters.  Since that is the called function with the maximum number of parameters, the compiler creates 8 slots on the stack.  The top four slots on the stack will then be the home space used by any functions FunctionWith6Params() calls.

    A handy side effect of the x64 calling convention is that once you are inside the bracket of the prolog and epilog of a function, the stack pointer does not change while the instruction pointer is in that function.  This eliminates the need for a base pointer which is common in x86 calling conventions.  When the code in FunctionWith6Params() prepares to call a child function, it simply puts the first four parameters into the required registers and, if there are more than 4 parameters, it uses mov instructions to place the remaining parameters in the allocated stack space but making sure to skip the first four parameter slots on the stack.

    Debugging Optimized x64 code (The Nightmare Begins)

    Why is debugging x64 optimized code so tricky?  Well, remember that home space that the caller creates on the stack for the callee to save the first four parameters?  It turns out that the calling convention does not require the callee to use that space!  And you can certainly bet that optimized x64 code will not use that space unless it is necessary and convenient for its optimization purposes.  Moreover, when optimized code does use the home space, it could use it to store non-volatile registers rather than the first four parameters to the function.

    Go ahead and recompile the example code using the following command line:

    cl /EHa /Zi /Ox /favor:INTEL64 example.cpp /link /debug

    Notice the use of the /Ox switch.  This turns on maximum optimization.  Debug symbols are still turned on so we can debug the optimized code easily.  Always build your release product with debug information turned on so you can debug your release builds!

    Let’s look at how the prolog assembly code for FunctionWith6Params() has changed:

       41 00000001`400158e0 mov     qword ptr [rsp+8],rbx
       41 00000001`400158e5 mov     qword ptr [rsp+10h],rbp
       41 00000001`400158ea mov     qword ptr [rsp+18h],rsi
       41 00000001`400158ef push    rdi
       41 00000001`400158f0 push    r12
       41 00000001`400158f2 push    r13
       41 00000001`400158f4 sub     rsp,40h
       41 00000001`400158f8 mov     ebx,r9d
       41 00000001`400158fb mov     edi,r8d
       41 00000001`400158fe mov     esi,edx
       41 00000001`40015900 mov     r12d,ecx

    The optimized code is significantly different!  Let’s itemize the changes below:

    ·         The function uses the home space on the stack, however, it does not store the first four parameters there.  Instead it uses the space to store some non-volatile registers it must restore later in the epilog code.  This optimized code is going to make use of more processor registers, therefore it must save more of the non-volatile registers.

    ·         It still pushes three non-volatile registers onto the stack for safe keeping along with the other three it stored in the home space.

    ·         It then creates space on the stack.  However, it’s less space than in the non-optimized code, and is only 0x40 bytes.  That’s because the optimized code uses registers to represent the local variables someLocalVariable1 and someLocalVariable2.  Therefore, it only has to create space for the 8 slots needed to call the function with the maximum number of parameters, printf().

    ·         It then stores the first four parameters into non-volatile registers rather than in the home space. (Don’t count on this behavior.  An optimized function may make no copies of the contents of rcx, rdx, r8, and r9.  It all depends on the structure of the code)

    Now step through FunctionWith6Params() to the source line just after the first printf() call.  The output generated from the printf() call on my machine is as follows:

    Entering FunctionWith6Params( 29, 4823, 18BE, 6784, 4AE1, 3D6C )

    A common version of the stack command in windbg is kb, which also displays the first few parameters to each function in the frame.  In reality, it is displaying the first few positions of the stack.  The output for the kb command is as follows:

    0:000> kb
    RetAddr           : Args to Child                                                           : Call Site
    00000001`4001593b : 00000000`00004ae1 00000000`00004823 00000000`000018be 00000000`007e3570 : example!FunctionWith6Params+0x6a [c:\temp\blog_entry\sample_code\example.cpp @ 37]
    00000001`40001667 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : example!main+0x5b [c:\temp\blog_entry\sample_code\example.cpp @ 57]
    00000000`76d7495d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : example!__tmainCRTStartup+0x15b
    00000000`76f78791 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
    00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d

    Notice that not all of the first four parameters of FunctionWith6Params() match what is shown by the kb command!  Of course, this is a side effect of the optimization.  You simply cannot trust the output displayed by kb and kv in optimized code.  This is the biggest reason why optimized x64 code is so difficult to debug.  Trust me when I say that it’s just pure luck that the second and third slots in the kb output above match the actual parameter values to FunctionWith6Params().  It’s because FunctionWith6Params() stores non-volatile registers in those slots and it just so happens that main() put those values in those non-volatile registers prior to calling FunctionWith6Params().

    Parameter Sleuthing -- Technique 1 (Down the Call Graph)

    Now, let’s look at some techniques for finding elusive function parameters to functions in the call stack while running x64 code.  I have placed a DebugBreak() call in FunctionWith4Params() to illustrate.  Go ahead and let the code run in windbg until it hits this breakpoint.  Now, imagine what you are looking at is actually not a live debugging scenario but rather a dump file from a customer of yours and this is the point where your application has crashed.  So, you take a look at the stack and it looks like the following:

    0:000> kL
    Child-SP          RetAddr           Call Site
    00000000`0012fdc8 00000001`40015816 ntdll!DbgBreakPoint
    00000000`0012fdd0 00000001`400158a0 example!FunctionWith4Params+0x66
    00000000`0012fe50 00000001`40015977 example!FunctionWith5Params+0x20
    00000000`0012fe80 00000001`40015a0b example!FunctionWith6Params+0x97
    00000000`0012fee0 00000001`4000168b example!main+0x5b
    00000000`0012ff20 00000000`7733495d example!__tmainCRTStartup+0x15b
    00000000`0012ff60 00000000`77538791 kernel32!BaseThreadInitThunk+0xd
    00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

    Now, let’s say that in order for you to figure out what went wrong, you need to know the first parameter to FunctionWith6Params().  Assume you have not seen the first parameter in the console output. No fair cheating!

    The first technique I would like to illustrate involves digging downward into the call graph  to find out what has happened to the contents of rcx (the first parameter) after entering FunctionWith6Params().  In this case, since the parameters are 32bit integers, we’ll be attempting to follow the contents of ecx, which is the lower half of rcx.

    Let’s start by looking at the assembly code within FunctionWith6Params() starting from the beginning up to the call into FunctionWith5Params()::

    0:000> u example!FunctionWith6Params example!FunctionWith6Params+0x97
    example!FunctionWith6Params [c:\temp\blog_entry\sample_code\example.cpp @ 41]:
    00000001`400158e0 mov     qword ptr [rsp+8],rbx
    00000001`400158e5 mov     qword ptr [rsp+10h],rbp
    00000001`400158ea mov     qword ptr [rsp+18h],rsi
    00000001`400158ef push    rdi
    00000001`400158f0 push    r12
    00000001`400158f2 push    r13
    00000001`400158f4 sub     rsp,40h
    00000001`400158f8 mov     ebx,r9d
    00000001`400158fb mov     edi,r8d
    00000001`400158fe mov     esi,edx
    00000001`40015900 mov     r12d,ecx
    00000001`40015903 call    example!rand (00000001`4000148c)
    00000001`40015908 movsxd  r13,eax
    00000001`4001590b call    example!rand (00000001`4000148c)
    00000001`40015910 lea     rdx,[example!`string'+0x68 (00000001`40020d40)]
    00000001`40015917 movsxd  rbp,eax
    00000001`4001591a mov     eax,dword ptr [rsp+88h]
    00000001`40015921 lea     rcx,[example!`string'+0x80 (00000001`40020d58)]
    00000001`40015928 mov     dword ptr [rsp+38h],eax
    00000001`4001592c mov     eax,dword ptr [rsp+80h]
    00000001`40015933 mov     r9d,esi
    00000001`40015936 mov     dword ptr [rsp+30h],eax
    00000001`4001593a mov     r8d,r12d
    00000001`4001593d mov     dword ptr [rsp+28h],ebx
    00000001`40015941 mov     dword ptr [rsp+20h],edi
    00000001`40015945 call    example!printf (00000001`400012bc)
    00000001`4001594a call    example!rand (00000001`4000148c)
    00000001`4001594f mov     edi,eax
    00000001`40015951 call    example!rand (00000001`4000148c)
    00000001`40015956 mov     esi,eax
    00000001`40015958 call    example!rand (00000001`4000148c)
    00000001`4001595d mov     ebx,eax
    00000001`4001595f call    example!rand (00000001`4000148c)
    00000001`40015964 mov     r9d,r12d
    00000001`40015967 mov     r8d,esi
    00000001`4001596a mov     edx,ebx
    00000001`4001596c mov     ecx,eax
    00000001`4001596e mov     dword ptr [rsp+20h],edi
    00000001`40015972 call    example!ILT+5(?FunctionWith5ParamsYAXHHHHHZ) (00000001`4000100a)

    FunctionWith6Params() copies ecx into r12d to preserve it for later use since the contents must be passed to multiple functions within the body of FunctionWith6Params().  Notice at the point where FunctionWith5Params() is called, the contents of ecx have been copied into both r12d and r9d, however, r9d is volatile so we must be careful with it since it could get overwritten prior to the next function call when FunctionWith5Params() calls FunctionWith4Params().  Armed with this information, let’s dig into the assembly code for FunctionWith5Params() that has executed up to this point:

    0:000> u example!FunctionWith5Params example!FunctionWith5Params+0x20
    example!FunctionWith5Params [c:\temp\blog_entry\sample_code\example.cpp @ 32]:
    00000001`40015880 mov     qword ptr [rsp+8],rbx
    00000001`40015885 mov     qword ptr [rsp+10h],rsi
    00000001`4001588a push    rdi
    00000001`4001588b sub     rsp,20h
    00000001`4001588f mov     ecx,dword ptr [rsp+50h]
    00000001`40015893 mov     eax,r9d
    00000001`40015896 mov     r9d,edx
    00000001`40015899 mov     edx,eax
    00000001`4001589b call    example!ILT+10(?FunctionWith4ParamsYAXHHHHZ) (00000001`4000100f)

    At the point where FunctionWith4Params() is called, the value we are after is now in eax, edx, and r12d.  Again, be careful with eax and edx as they are volatile.  However, since FunctionWith5Params() did not touch r12d, the contents of the parameter we are still after are still in r12d

    Now, let’s look at the code in FunctionWith4Params() that has executed so far:

    0:000> u example!FunctionWith4Params example!FunctionWith4Params+0x66
    example!FunctionWith4Params [c:\temp\blog_entry\sample_code\example.cpp @ 9]:
    00000001`400157b0 48895c2408      mov     qword ptr [rsp+8],rbx
    00000001`400157b5 48896c2410      mov     qword ptr [rsp+10h],rbp
    00000001`400157ba 4889742418      mov     qword ptr [rsp+18h],rsi
    00000001`400157bf 57              push    rdi
    00000001`400157c0 4154            push    r12
    00000001`400157c2 4155            push    r13
    00000001`400157c4 4156            push    r14
    00000001`400157c6 4157            push    r15
    00000001`400157c8 4883ec50        sub     rsp,50h
    00000001`400157cc 458be1          mov     r12d,r9d
    00000001`400157cf 458be8          mov     r13d,r8d
    00000001`400157d2 448bf2          mov     r14d,edx
    00000001`400157d5 448bf9          mov     r15d,ecx
    00000001`400157d8 e8afbcfeff      call    example!rand (00000001`4000148c)
    00000001`400157dd 4898            cdqe
    00000001`400157df 4889442448      mov     qword ptr [rsp+48h],rax
    00000001`400157e4 e8a3bcfeff      call    example!rand (00000001`4000148c)
    00000001`400157e9 4898            cdqe
    00000001`400157eb 4889442440      mov     qword ptr [rsp+40h],rax
    00000001`400157f0 e897bcfeff      call    example!rand (00000001`4000148c)
    00000001`400157f5 4863e8          movsxd  rbp,eax
    00000001`400157f8 e88fbcfeff      call    example!rand (00000001`4000148c)
    00000001`400157fd 4863f0          movsxd  rsi,eax
    00000001`40015800 e887bcfeff      call    example!rand (00000001`4000148c)
    00000001`40015805 4863f8          movsxd  rdi,eax
    00000001`40015808 e87fbcfeff      call    example!rand (00000001`4000148c)
    00000001`4001580d 4863d8          movsxd  rbx,eax
    00000001`40015810 ff15a24b0100    call    qword ptr [example!_imp_DebugBreak (00000001`4002a3b8)]

    We just found what we are looking for!  The red highlighted line shows r12 being saved on the stack because FunctionWith4Params() wants to reuse r12.  Since r12 is a non-volatile register, it must save the contents somewhere so it can restore the contents before the function exits.  All we have to do is locate that slot on the stack, and assuming that the stack has not been corrupted, we’ll have our prize.

    One technique for finding the slot is to start with the Child-SP value associated with the FunctionWith4Params() frame in the stack dump shown previously, which is 00000000`0012fdd0 in my build.  Using that value, let’s dump the stack content using the dps command:

    0:000> dps 00000000`0012fdd0 L10
    00000000`0012fdd0  00000001`00000001
    00000000`0012fdd8  00000001`40024040 example!_iob+0x30
    00000000`0012fde0  00000000`00000000
    00000000`0012fde8  00000001`40002f9e example!_getptd_noexit+0x76
    00000000`0012fdf0  00000000`00261310
    00000000`0012fdf8  00000001`40001a92 example!_unlock_file2+0x16
    00000000`0012fe00  00000000`00000001
    00000000`0012fe08  00000000`00004823
    00000000`0012fe10  00000000`000041bb
    00000000`0012fe18  00000000`00005af1
    00000000`0012fe20  00000000`00000000
    00000000`0012fe28  00000000`00000000
    00000000`0012fe30  00000000`00002cd6
    00000000`0012fe38  00000000`00000029
    00000000`0012fe40  00000000`00006952
    00000000`0012fe48  00000001`400158a0 example!FunctionWith5Params+0x20 [c:\temp\blog_entry\sample_code\example.cpp @ 34]

    I have highlighted the position that rsp points to when we enter FunctionWith4Params() in red.  Based on the prolog code shown for FunctionWith4Params() above, we can find the slot where our prize is stored.  I have highlighted it in green above and you can see the value on my machine is 0x29, which matches the value printf() sent to the console.  Additionally, I highlighted r14d in green in the assembly code for FunctionWith4Params() to indicate where the contents of edx (the second parameter) were copied to.  Since FunctionWith4Params() is virtually the top function on the stack (due to the fact that DebugBreak() takes no parameters), then r14d should also contain the value we are after.  Dumping the contents of r14 proves this as shown below:

    0:000> r r14
    r14=0000000000000029

    To sum up, when you are chasing register-passed parameter values down through a call graph, look for places where the value is copied into.  Specifically, if the value is copied into a non-volatile register, that can be a good thing.  If a downstream function wants to reuse that non-volatile register, it must first save the contents (usually on the stack) so it can restore it when it is done.  If you’re not that lucky, you may be able to trace a register it was copied into which has not been changed at the breakpoint.  Both conditions were shown above.

    Parameter Sleuthing -- Technique 2 (Up the Call Graph)

    The second technique I would like to demonstrate is very similar to the first technique except that we walk the stack/call-graph in the opposite direction as before, that is, up the call graph.  Unfortunately, none of these techniques are fool proof and guaranteed to bear fruit.  So, it’s nice to have multiple techniques to employ even though all of them may strike out.

    We know that when FunctionWith6Params() gets called, ecx contains the value we are after.  Therefore, if we look at the code for main(), maybe we can find the source from which the ecx register was filled prior to the function call.  Let’s have a look as the assembly code in main():

    0:000> u example!main example!main+0x5b
    example!main [c:\temp\blog_entry\sample_code\example.cpp @ 58]:
    00000001`400159b0 48895c2408      mov     qword ptr [rsp+8],rbx
    00000001`400159b5 48896c2410      mov     qword ptr [rsp+10h],rbp
    00000001`400159ba 4889742418      mov     qword ptr [rsp+18h],rsi
    00000001`400159bf 48897c2420      mov     qword ptr [rsp+20h],rdi
    00000001`400159c4 4154            push    r12
    00000001`400159c6 4883ec30        sub     rsp,30h
    00000001`400159ca e8bdbafeff      call    example!rand (00000001`4000148c)
    00000001`400159cf 448be0          mov     r12d,eax
    00000001`400159d2 e8b5bafeff      call    example!rand (00000001`4000148c)
    00000001`400159d7 8be8            mov     ebp,eax
    00000001`400159d9 e8aebafeff      call    example!rand (00000001`4000148c)
    00000001`400159de 8bf0            mov     esi,eax
    00000001`400159e0 e8a7bafeff      call    example!rand (00000001`4000148c)
    00000001`400159e5 8bf8            mov     edi,eax
    00000001`400159e7 e8a0bafeff      call    example!rand (00000001`4000148c)
    00000001`400159ec 8bd8            mov     ebx,eax
    00000001`400159ee e899bafeff      call    example!rand (00000001`4000148c)
    00000001`400159f3 448bcf          mov     r9d,edi
    00000001`400159f6 89442428        mov     dword ptr [rsp+28h],eax
    00000001`400159fa 448bc6          mov     r8d,esi
    00000001`400159fd 8bd5            mov     edx,ebp
    00000001`400159ff 418bcc          mov     ecx,r12d
    00000001`40015a02 895c2420        mov     dword ptr [rsp+20h],ebx
    00000001`40015a06 e8fab5feff      call    example!ILT+0(?FunctionWith6ParamsYAXHHHHHHZ) (00000001`40001005)

    We see that ecx was copied from the contents of r12d.  This is helpful since r12d is a non-volatile register, and if it is reused by a function further down the call stack, it must be preserved and that preservation usually means putting a copy on the stack.  It would have been nice if ecx were filled with a value from the stack, at which point we would be virtually done.  But in this case, we just need to start our journey back downwards again.

    We don’t have to look very far.  Let’s have another look at the prolog code for FunctionWith6Params():

    example!FunctionWith6Params [c:\temp\blog_entry\sample_code\example.cpp @ 41]:
       41 00000001`400158e0 mov     qword ptr [rsp+8],rbx
       41 00000001`400158e5 mov     qword ptr [rsp+10h],rbp
       41 00000001`400158ea mov     qword ptr [rsp+18h],rsi
       41 00000001`400158ef push    rdi
       41 00000001`400158f0 push    r12
       41 00000001`400158f2 push    r13
       41 00000001`400158f4 sub     rsp,40h
       41 00000001`400158f8 mov     ebx,r9d
       41 00000001`400158fb mov     edi,r8d
       41 00000001`400158fe mov     esi,edx
       41 00000001`40015900 mov     r12d,ecx

    r12 is reused in FunctionWith6Params(), which means that our prize will be on the stack.  Let’s start by looking at the Child-SP for this frame which is at 00000000`0012fe80 by using the dps command:

    0:000> dps 00000000`0012fe80 L10
    00000000`0012fe80  00000000`00001649
    00000000`0012fe88  00000000`00005f90
    00000000`0012fe90  00000000`00000029
    00000000`0012fe98  00000000`00004823
    00000000`0012fea0  00000000`00006952
    00000000`0012fea8  00000001`00006784
    00000000`0012feb0  00000000`00004ae1
    00000000`0012feb8  00000001`00003d6c
    00000000`0012fec0  00000000`00000000
    00000000`0012fec8  00000000`00000029
    00000000`0012fed0  00000000`00006784
    00000000`0012fed8  00000001`4000128b example!main+0x5b [c:\temp\blog_entry\sample_code\example.cpp @ 72]

    I have highlighted in red the slot rsp points to when we enter FunctionWith6Params().  At this point, it is a simple matter to walk the assembly code and find the slot where the value is stored.  I have highlighted it in green above.

    Parameter Sleuthing -- Technique 3 (Inspecting Dead Space)

    The final technique I would like to demonstrate involves a little more trickery and involves looking at “dead” or previously used slots on the stack that are not used by the current function call.  To demonstrate, let’s say that after the DebugBreak() is hit, we need to know what the contents of param4 that were passed to FunctionWith6Params().  Let’s have another look at the assembly that has executed for FunctionWith6Params() and this time, let’s follow r9d, the fourth parameter:

    0:000> u example!FunctionWith6Params example!FunctionWith6Params+0x97
    example!FunctionWith6Params [c:\temp\blog_entry\sample_code\example.cpp @ 41]:
    00000001`400158e0 mov     qword ptr [rsp+8],rbx
    00000001`400158e5 mov     qword ptr [rsp+10h],rbp
    00000001`400158ea mov     qword ptr [rsp+18h],rsi
    00000001`400158ef push    rdi
    00000001`400158f0 push    r12
    00000001`400158f2 push    r13
    00000001`400158f4 sub     rsp,40h
    00000001`400158f8 mov     ebx,r9d
    00000001`400158fb mov     edi,r8d
    00000001`400158fe mov     esi,edx
    00000001`40015900 mov     r12d,ecx
    00000001`40015903 call    example!rand (00000001`4000148c)
    00000001`40015908 movsxd  r13,eax
    00000001`4001590b call    example!rand (00000001`4000148c)
    00000001`40015910 lea     rdx,[example!`string'+0x68 (00000001`40020d40)]
    00000001`40015917 movsxd  rbp,eax
    00000001`4001591a mov     eax,dword ptr [rsp+88h]
    00000001`40015921 lea     rcx,[example!`string'+0x80 (00000001`40020d58)]
    00000001`40015928 mov     dword ptr [rsp+38h],eax
    00000001`4001592c mov     eax,dword ptr [rsp+80h]
    00000001`40015933 mov     r9d,esi
    00000001`40015936 mov     dword ptr [rsp+30h],eax
    00000001`4001593a mov     r8d,r12d
    00000001`4001593d mov     dword ptr [rsp+28h],ebx
    00000001`40015941 mov     dword ptr [rsp+20h],edi
    00000001`40015945 call    example!printf (00000001`400012bc)
    00000001`4001594a call    example!rand (00000001`4000148c)
    00000001`4001594f mov     edi,eax
    00000001`40015951 call    example!rand (00000001`4000148c)
    00000001`40015956 mov     esi,eax
    00000001`40015958 call    example!rand (00000001`4000148c)
    00000001`4001595d mov     ebx,eax
    00000001`4001595f call    example!rand (00000001`4000148c)
    00000001`40015964 mov     r9d,r12d
    00000001`40015967 mov     r8d,esi
    00000001`4001596a mov     edx,ebx
    00000001`4001596c mov     ecx,eax
    00000001`4001596e mov     dword ptr [rsp+20h],edi
    00000001`40015972 call    example!ILT+5(?FunctionWith5ParamsYAXHHHHHZ) (00000001`4000100a)

    Notice that r9d is first moved into ebx.  But also, notice that it copied the contents into a slot on the stack at rsp+0x28.  What is this slot?  It’s the sixth parameter to the following printf() call.  Remember that the compiler looks at all of the functions the code calls and finds the function with the maximum number of parameters and then allocates enough space for that function.  As the code prepares to call printf(), it is moving the value we are after into the sixth parameter slot in that reserved stack space.  But what use is this information?

    If you examine FunctionWith6Params(), you see that every function called after printf() takes less than six parameters.  Specifically, the call to FunctionWith5Params() only uses five of those slots and just leaves the remaining three with junk in them.  This junk is actually our treasure!  From examining the code, it’s guaranteed that nobody has overwritten the slot represented by rsp+28.

    To find this slot, let’s again start by getting the Child-SP value for the frame we’re talking about as shown below:

    0:000> kL
    Child-SP          RetAddr           Call Site
    00000000`0012fdc8 00000001`40015816 ntdll!DbgBreakPoint
    00000000`0012fdd0 00000001`400158a0 example!FunctionWith4Params+0x66
    00000000`0012fe50 00000001`40015977 example!FunctionWith5Params+0x20
    00000000`0012fe80 00000001`40015a0b example!FunctionWith6Params+0x97
    00000000`0012fee0 00000001`4000168b example!main+0x5b
    00000000`0012ff20 00000000`7733495d example!__tmainCRTStartup+0x15b
    00000000`0012ff60 00000000`77538791 kernel32!BaseThreadInitThunk+0xd
    00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

    We can then take the highlighted value above and use the same offset in the code to find our value:

    0:000> dd 000000000012fe80+28 L1
    00000000`0012fea8  00006784

    As expected, the “dead” slot on the stack contains the value we are after.  You can compare the value to the output shown on the console to verify.

    A Non-volatile Register Shortcut

    Now that I have shown you the theory behind finding these elusive values passed around in registers, let me show you a shortcut that will make life a little bit easier.  The shortcut relies upon the /r option of the .frame command.  When using .frame /r, the debugger has the smarts to track non-volatile registers.  But as with any technique, always have multiple tools in your pocket in case you need to use all of them to verify a result.

    To demonstrate, let’s consider Technique 2 described previously where we look up the call graph and we want to know what r12 was prior to main() calling FunctionWith6Params().  Go ahead and re-launch the application in windbg and let it run until it hits the DebugBreak().  Now, let’s take a look at the stack including the frame numbers:

    0:000> knL
     # Child-SP          RetAddr           Call Site
    00 00000000`0012fdc8 00000001`40015816 ntdll!DbgBreakPoint
    01 00000000`0012fdd0 00000001`400158a0 example!FunctionWith4Params+0x66
    02 00000000`0012fe50 00000001`40015977 example!FunctionWith5Params+0x20
    03 00000000`0012fe80 00000001`40015a0b example!FunctionWith6Params+0x97
    04 00000000`0012fee0 00000001`4000168b example!main+0x5b
    05 00000000`0012ff20 00000000`7748495d example!__tmainCRTStartup+0x15b
    06 00000000`0012ff60 00000000`775b8791 kernel32!BaseThreadInitThunk+0xd
    07 00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

    Based on our previous analysis of the assembly in main(), we know that the first parameter to FunctionWith6Params() was also stored in the non-volatile register r12 in main() prior to calling FunctionWith6Params().  Now, check out what we get when we use the .frame /r command to set the current frame to 4.

    0:000> .frame /r 4
    04 00000000`0012fee0 00000001`4000168b example!main+0x5b [c:\temp\blog_entry\sample_code\example.cpp @ 70]
    rax=0000000000002ea6 rbx=0000000000004ae1 rcx=0000000000002ea6
    rdx=0000000000145460 rsi=00000000000018be rdi=0000000000006784
    rip=0000000140015a0b rsp=000000000012fee0 rbp=0000000000004823
     r8=000007fffffdc000  r9=0000000000001649 r10=0000000000000000
    r11=0000000000000246 r12=0000000000000029 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up ei pl nz na pe nc
    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
    example!main+0x5b:
    00000001`40015a0b 488b5c2440      mov     rbx,qword ptr [rsp+40h] ss:00000000`0012ff20=0000000000000000

    As you can see, .frame /r shows the register contents as they were in main() prior to calling FunctionWith6Params().  Beware!  You can only trust the non-volatile registers when using this command!  Be sure to check out the following link to see which registers are considered volatile: Register Usage for x64 64-Bit.

    .frame /r can spare you the time spent manually digging around on the stack to find saved volatile registers.  In my experiments, .frame /r even works where there is no symbol information available.  However, it never hurts to know how to do it manually in case you’re faced with a situation where .frame /r breaks down.

    Conclusion

    The x64 calling convention and the abundance of general purpose registers in the processor bring many opportunities for optimization to the table.  However, when all of those optimizations are in play, they can certainly make debugging difficult.  After giving a brief overview of the x64 calling convention, I demonstrated three techniques one can use to find parameter values to various functions in the call stack.  I also showed you a shortcut you can use to see non-volatile registers for a particular frame in the call stack.  I hope you find these techniques useful in your debugging adventures.  Additionally, I urge you to become more familiar with all of the nuances of the x64 calling convention.

     

  • Ntdebugging Blog

    How the Clipboard Works, Part 2

    • 1 Comments

    Last time, we discussed how applications place data on the clipboard, and how to access that data using the debugger. Today, we'll take a look at how an application can monitor the clipboard for changes.  Understanding this is important because it is a place where Windows allows 3rd-party code to "hook" into the system.  If you experience unexpected behavior with copying and pasting, a program using these hooks may be misbehaving.  We’ll start by covering the hooking mechanisms for clipboard, and then review how to identify what applications, if any, are using these hooks in the debugger.

     

    There are three ways to monitor the clipboard for changes - clipboard viewers, clipboard format listeners, and querying the clipboard sequence number.  We will focus on the first two as these allow an application to register for notifications whenever the clipboard is updated.  The third method simply allows an application to check and see if a change has occurred, and should not be used in a polling loop.

     

    The Clipboard Viewer functionality has been around since Windows 2000, if not earlier.  The way it works is pretty simple - an application interested in receiving clipboard change notifications calls SetClipboardViewer and passes a handle to its window.  Windows then stores that handle in a per-session win32k global, and anytime the clipboard is changed Windows sends a WM_DRAWCLIPBOARD message to the registered window.

     

    Of course, multiple applications are allowed to register their windows as clipboard viewers - so how does Windows handle that?  Well, if an application calls SetClipboardViewer and another clipboard viewer was already registered, Windows returns the handle value of the previous viewer's window to the new viewer.  It is then the responsibility of the new viewer to call SendMessage every time it receives a WM_DRAWCLIPBOARD to notify the next viewer in the chain. Each clipboard viewer also needs to handle the WM_CHANGECBCHAIN message, which notifies  all viewers when one of the viewers in the chain is removed, and specifies what the next viewer in the chain is.  This allows the chain to be maintained.

     

    An obvious problem with this design is it relies on each clipboard viewer application to behave correctly, not to terminate unexpectedly, and to generally be a good citizen.  If any viewer decided not to be friendly, it could simply skip notifying the next viewer in line about an update, rendering the next viewer and all subsequent viewers impotent.

     

    To address these problems, the Clipboard Format Listener mechanism was added in Windows Vista. This works in much the same way as the clipboard viewer functionality except in this case Windows maintains the list of listeners, instead of depending on each application to preserve a chain.

     

    If an application wishes to become a clipboard format listener,it calls the AddClipboardFormatListener function and passes in a handle to its window. After that, its window message handler will receive WM_CLIPBOARDUPDATE messages.  When the application is ready to exit or no longer wishes to receive notifications, it can call RemoveClipboardFormatListener.

     

    Now that we've covered the ways to register a viewer/listener, let's take a look at how to identify them using the debugger.  First, you'll need to identify a process in the session you are interested in checking for clipboard monitors.  It can be any win32 process in that session -we just need to use it to locate a pointer to the Window Station.  In this case, I'll use the Notepad window I used in part 1: 

    kd> !process 0 0 notepad.exe

    PROCESS fffff980366ecb30

        SessionId: 1  Cid: 0374   Peb: 7fffffd8000  ParentCid: 0814

        DirBase: 1867e000  ObjectTable: fffff9803d28ef90  HandleCount: 52.

        Image: notepad.exe

     

    If you are doing this in a live kernel debug, you'll need to change context into the process interactively (using .process /I< address> then hit g and wait for the debugger to break back in).  Now DT the process address as an _EPROCESS and look for the Win32Process field: 

    kd> dt _EPROCESS fffff980366ecb30 Win32Process

    nt!_EPROCESS

       +0x258 Win32Process : 0xfffff900`c18c0ce0 Void

     

    Now DT the Win32Process address as a win32k!tagPROCESSINFO and identify the rpwinsta value: 

    kd> dt win32k!tagPROCESSINFO 0xfffff900`c18c0ce0 rpwinsta

       +0x258 rpwinsta : 0xfffff980`0be2af60 tagWINDOWSTATION

     

    This is our Window Station. Dump it using dt: 

    kd> dt 0xfffff980`0be2af60 tagWINDOWSTATION

    win32k!tagWINDOWSTATION

       +0x000 dwSessionId      : 1

       +0x008 rpwinstaNext     : (null)

       +0x010 rpdeskList       : 0xfffff980`0c5e2f20 tagDESKTOP

       +0x018 pTerm            : 0xfffff960`002f5560 tagTERMINAL

       +0x020 dwWSF_Flags      : 0

       +0x028 spklList         : 0xfffff900`c192cf80 tagKL

       +0x030 ptiClipLock      : (null)

       +0x038 ptiDrawingClipboard: (null)

       +0x040 spwndClipOpen    : (null)

       +0x048 spwndClipViewer  : 0xfffff900`c1a4ca70 tagWND

       +0x050 spwndClipOwner   : 0xfffff900`c1a3ef70 tagWND

       +0x058 pClipBase        : 0xfffff900`c5512fa0 tagCLIP

       +0x060 cNumClipFormats  : 4

       +0x064 iClipSerialNumber : 0x16

       +0x068 iClipSequenceNumber : 0xc1

       +0x070 spwndClipboardListener : 0xfffff900`c1a53440 tagWND

       +0x078 pGlobalAtomTable: 0xfffff980`0bd56c70 Void

       +0x080 luidEndSession   : _LUID

       +0x088 luidUser         : _LUID

       +0x090 psidUser         : 0xfffff900`c402afe0 Void

     

    Note the spwndClipViewer, spwndClipboardListener, and spwndClipOwnerfields.  spwndClipViewer is the most-recently-registered window in the clipboard viewer chain.  Similarly, spwndClipboardListener is the most recent listener in our Clipboard Format Listener list.  spwndClipOwner is the window that set the content in the clipboard.

     

    Given the window, it is just a few steps to determine the process.  This would work forspwndClipViewer, spwndClipboardListener, and spwndClipOwner.  First, dt the value as a tagWND.  We'll use the spwndClipViewer for this demonstration: 

    kd> dt 0xfffff900`c1a4ca70 tagWND

    win32k!tagWND

       +0x000 head             : _THRDESKHEAD

       +0x028 state            : 0x40020008

       +0x028 bHasMeun         : 0y0

       +0x028 bHasVerticalScrollbar : 0y0

     

    We only care about the head - so since it is at offset 0, dt the same address as a _THRDESKHEAD: 

    kd> dt 0xfffff900`c1a4ca70 _THRDESKHEAD

    win32k!_THRDESKHEAD

       +0x000 h                : 0x00000000`000102ae Void

       +0x008 cLockObj         : 6

       +0x010 pti              : 0xfffff900`c4f26c20tagTHREADINFO

       +0x018 rpdesk           : 0xfffff980`0c5e2f20 tagDESKTOP

       +0x020 pSelf            : 0xfffff900`c1a4ca70  "???"

     

    Now, dt the address in pti as a tagTHREADINFO: 

    kd> dt 0xfffff900`c4f26c20 tagTHREADINFO

    win32k!tagTHREADINFO

       +0x000 pEThread         : 0xfffff980`0ef6cb10 _ETHREAD

       +0x008 RefCount         : 1

       +0x010 ptlW32           : (null)

       +0x018 pgdiDcattr       : 0x00000000`000f0d00 Void

     

    Here, we only care about the value of pEThread, which we can pass to !thread: 

    kd> !thread 0xfffff980`0ef6cb10 e

    THREAD fffff9800ef6cb10 Cid 087c.07ec  Teb: 000007fffffde000 Win32Thread: fffff900c4f26c20 WAIT: (WrUserRequest) UserModeNon-Alertable

        fffff9801c01efe0  SynchronizationEvent

    Not impersonating

    DeviceMap                 fffff980278a0fc0

    Owning Process            fffff98032e18b30       Image:         viewer02.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount     5435847        Ticks: 33 (0:00:00:00.515)

    Context Switch Count     809            IdealProcessor: 0                 LargeStack

    UserTime                  00:00:00.000

    KernelTime                00:00:00.062

    Win32 Start Address 0x000000013f203044

    Stack Init fffff880050acdb0 Current fffff880050ac6f0

    Base fffff880050ad000 Limit fffff880050a3000 Call 0

    Priority 11 BasePriority 8 UnusualBoost 0 ForegroundBoost 2IoPriority 2 PagePriority 5

    Child-SP          RetAddr           Call Site

    fffff880`050ac730 fffff800`01488f32 nt!KiSwapContext+0x7a

    fffff880`050ac870 fffff800`0148b74f nt!KiCommitThreadWait+0x1d2

    fffff880`050ac900 fffff960`000dc8e7 nt!KeWaitForSingleObject+0x19f

    fffff880`050ac9a0 fffff960`000dc989 win32k!xxxRealSleepThread+0x257

    fffff880`050aca40 fffff960`000dafc0 win32k!xxxSleepThread+0x59

    fffff880`050aca70 fffff960`000db0c5 win32k!xxxRealInternalGetMessage+0x7dc

    fffff880`050acb50 fffff960`000dcab5 win32k!xxxInternalGetMessage+0x35

    fffff880`050acb90 fffff800`01482ed3 win32k!NtUserGetMessage+0x75

    fffff880`050acc20 00000000`77929e6a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`050acc20)

    00000000`002ffb18 00000000`00000000 0x77929e6a

     

    As you can see, we have a clipboard viewer registered from process viewer02.exe.  Because of viewer's process-maintained chain architecture, it isn't easy to see the next process in the chain.  However, we can do this for clipboard listeners.  Let's look back at our window station: 

    kd> dt 0xfffff980`0be2af60 tagWINDOWSTATION

    win32k!tagWINDOWSTATION

       +0x000 dwSessionId      : 1

       +0x008 rpwinstaNext     : (null)

       +0x010 rpdeskList       : 0xfffff980`0c5e2f20 tagDESKTOP

       +0x018 pTerm            : 0xfffff960`002f5560 tagTERMINAL

       +0x020 dwWSF_Flags      : 0

       +0x028 spklList         : 0xfffff900`c192cf80 tagKL

       +0x030 ptiClipLock      : (null)

       +0x038 ptiDrawingClipboard : (null)

       +0x040 spwndClipOpen    : (null)

       +0x048 spwndClipViewer  : 0xfffff900`c1a4ca70tagWND

       +0x050 spwndClipOwner   : 0xfffff900`c1a3ef70tagWND

       +0x058 pClipBase        : 0xfffff900`c5512fa0 tagCLIP

       +0x060 cNumClipFormats  : 4

       +0x064 iClipSerialNumber : 0x16

       +0x068 iClipSequenceNumber : 0xc1

       +0x070 spwndClipboardListener: 0xfffff900`c1a53440 tagWND

       +0x078 pGlobalAtomTable: 0xfffff980`0bd56c70 Void

       +0x080 luidEndSession   : _LUID

       +0x088 luidUser         : _LUID

       +0x090 psidUser         : 0xfffff900`c402afe0 Void

     

    If we dt the spwndClipboardListener, there is a field that shows the next listener named spwndClipboardListenerNext: 

    kd> dt 0xfffff900`c1a53440 tagWND spwndClipboardListenerNext

    win32k!tagWND

       +0x118 spwndClipboardListenerNext : 0xfffff900`c1a50080 tagWND

     

    When you reach the last clipboard format listener's tagWND, its spwndClipboardListenerNext value will be null: 

    kd> dt 0xfffff900`c1a50080 tagWND spwndClipboardListenerNext

    win32k!tagWND

       +0x118 spwndClipboardListenerNext : (null)

     

    Using this window address, we can go through the same steps as above to identify this listener's process name. As mentioned earlier, since tagWND is a kernel structure, the OS is maintaining these spwndClipboardListener/spwndClipboardListenerNext pointers, so they aren't susceptible to the chain problems of clipboard viewers.

     

    That wraps up our clipboard coverage. I hope you found it informative.  Want to learn more about monitoring the clipboard?  This MSDN article is a good resource.

     

    -Matt Burrough

  • Ntdebugging Blog

    Windows Hotfixes and Updates - How do they work?

    • 13 Comments

    Today I would like to talk about some of the work the Windows Serviceability (WinSE) team does regarding servicing Windows and releasing updates.

    The operating system is divided into multiple components. Each component can consist of one or more files, registry keys, configuration settings, etc.  WinSE releases updates based on components rather than the entire operating system. This reduces a lot of overhead with having to install updates to components that have not changed. Depending on the severity and applicability of the problem, there are different kinds of release mechanisms. Keep in mind, though, the actual fix still remains the same.

    1.       Updates and Security Updates

    These Updates are typically available on Windows Update. They frequently contain security fixes, and from time to time also contain reliability rollup packages. These updates are thoroughly tested and Microsoft highly recommends that you update your computer with these releases. In fact, most are automatically downloaded to your machine if you have Windows Update turned on. In most cases, Update releases are also available as standalone downloads from the download center.

     

    2.       Hotfixes

    When an individual customer reports a bug to Microsoft for a specific scenario, the WinSE team releases Hotfixes to address these problems. Hotfixes are not meant to be widely distributed and go through a limited amount of testing due to the customer's need for an urgent fix.  Hotfixes are developed in a separate environment than the regular Updates.  This allows Microsoft to release Updates that do not include the Hotfix files, thereby minimizing risk for the customer.

    Once the Hotfix is ready and packaged by WinSE, a KB article is written describing the problem, with instructions on how to obtain the Hotfix.  Microsoft recommends that only customers experiencing the particular problem install the Hotfix for that problem.

    Note: Hotfixes are also sometimes referred to as LDRs, or QFE's (Quick Fix Engineering). The term QFE is an old term that is mostly no longer used in reference to current versions of Windows.

     

    3.       SP  - Service Pack

    The service pack is a major update in the life of an OS. It contains a wide variety of fixes as well as all the GDR and LDR fixes that were released since the previous service pack was shipped. This is a thoroughly tested release and highly recommended by Microsoft for installation. This is usually available as a standalone release, and is then released through Windows Update as well.

     

     

    GDR vs. LDR branches

    Now that we have described the different kinds of updates, let's take a deeper look into how these fixes are built. When a new OS or service pack is released, 2 branches are created from the release code tree -a GDR (general distribution release) branch and a LDR (limited distribution release) branch. Hotfixes are built solely from the LDR branch, while Updates for broad release are built from the GDR branch.

    Service Packs are built from a third branch that contains all Updates , Hotfixes and additional fixes.  This way the new service pack is shipped with all the fixes from both branches.

    Note – Once the new service pack is shipped, the code branches from the previous release are still active and serviced as necessary.

    Installing a Hotfix

    By default, all components on Windows systems start on the GDR branch following each major release. When you install updates from Windows Update for a GDR component, it gets upgraded with the GDR version.

    When you install a specific Hotfix, the files and components in the Hotfix package are migrated to the LDR branch. At this point, that particular component is marked as a LDR component. If you install a newer Update over this component, the Windows servicing technology will automatically install the appropriate latest version from the LDR branch for you. This is possible because each Update package ships with both the GDR and LDR versions of the component.

    Once a component is marked as a LDR component, the only way to move back to the GDR branch is to uninstall all Hotfixes for that component, or move to the next available service pack.

     

    What would happen if a user installed a Hotfix, and then sometime later installed the next service pack? Well, in that case it depends on the Hotfix and when it was built.

    1.       If the Hotfix was built before the service pack, then the component will be moved to the GDR version contained in the service pack.

    2.       If the Hotfix was built after the service pack, the component will be migrated to the post-service pack version of the component, and will stay on the same branch that it was originally on.

     

    In order to make this work, these packages contain both the RTM GDR version, the RTM Hotfix branch, and the SP1 Hotfix and GDR version of each binary.

     

    All fixes built for Windows are cumulative in nature by branch, i.e. a new update will contain the new fix, as well as all the previous fixes for that branch. Referencing the chart above, installing fix #4 can get you fixes #2 and #4 on the GDR branch. If the component is on the LDR branch, then the user would get fixes #1-4.

     

    Finally, the servicing technology has to handle the case where you need the functionality of an older Hotfix (e.g. “Fix #1” in the diagram above) but you may already have installed “Fix #4” which might be a critical security update.  What happens is that when the GDR branch of a fix is installed, it also places a copy of the Hotfix version of the same fix on the system.  When you run the installer for Hotfix #1, it detects that a newer version of the file is already installed, but it also detects that it needs to migrate it to the Hotfix version of the binary that was previously stored on the system. The result is that you end up with the Hotfix binary for Fix #4, which has both the Hotfix you need plus the cumulative set of security fixes.

     

    Stay tuned for more, in the next blog entry, I will talk about the staging mechanism that Windows uses to install Updates and Hotfixes as well as the uninstall process. Also, I will talk about how to determine the branch a file is built from.

     

    - Omer 

     

    More Information

    Description of the standard terminology that is used to describe Microsoft software updates

    Description of the contents of Windows XP Service Pack 2 and Windows Server 2003 software update packages

     

  • Ntdebugging Blog

    Tracking down MmSt paged pool usage

    • 6 Comments

     

    A trend that I’ve noticed recently are cases involving paged pool depletion with high MmSt tag usage that remains after trying KB304101 (PoolUsageMaximum). These pool allocations are used by the memory manager for section object prototype PTEs. There are generally only two options when this happens: 1) upgrade to a 64-bit platform, or 2) reduce the size of the volumes. But we may want to know what mapped files are using this memory. Here is how it can be done. Start with !memusage.

     

    5: kd> !memusage

     loading PFN database

    loading (100% complete)

    Compiling memory usage data (99% Complete).

                 Zeroed:  19073 ( 76292 kb)

                   Free:      0 (     0 kb)

                Standby: 1468824 (5875296 kb)

               Modified:    368 (  1472 kb)

        ModifiedNoWrite:   1927 (  7708 kb)

           Active/Valid: 605772 (2423088 kb)

             Transition:      0 (     0 kb)

                    Bad:      0 (     0 kb)

                Unknown:      0 (     0 kb)

                  TOTAL: 2095964 (8383856 kb)

      Building kernel map

      Finished building kernel map

    Scanning PFN database - (100% complete)

     

    Following this you will see the list of mapped files and their control areas.

     

      Usage Summary (in Kb):

    Control Valid Standby Dirty Shared Locked PageTables  name

    8c62a638  1108  945868  3064     0     0     0  mapped_file( $Mft )

     

    The control area is the address at the far left and has a Segment field that contains the total number of PTEs.

     

    5: kd> dt 8c62a638 _CONTROL_AREA Segment->TotalNumberOfPtes

    nt!_CONTROL_AREA

       +0x000 Segment                    :

          +0x004 TotalNumberOfPtes          : 0x1e8b00

     

    The MmSt allocations contain these PTEs so all we need to do is multiply this by the size of a PTE to get the total size of the MmSt allocations for this control area. Note that there may be multiple allocations for this control area, but this number will reflect the total size all these allocations.

     

    5: kd> ?? 0x1e8b00 * sizeof(nt!_MMPTE)

    unsigned int 0xf45800

     

    So now we know the MmSt size in bytes for a single control area, or mapped file. What if we would like to see the totals for all mapped files from the !memusage output? First, place the !memusage output in a text file and remove all header information. You will also need to remove all tail information including the page file and process summaries. Every line should look like these.

     

    8c62a638  1108  945868  3064     0     0     0  mapped_file( $Mft )

    8b06ac18   516      0     0     0     0     0    No Name for File

     

    We want to include the “No Name for File” entries since those are valid mapped files even though the name could not be located. Next strip out everything but the control area address. You can use Excel or any other tool that allows you to select and delete columns in a text file. Now we have a file with a single column of all the control areas on the system. The following debugger command script can be used to process this file.

     

    $$ countptes.txt script

    r $t2 = 0;

     

    $$ Replace the memusage.txt file name with your file name.

    .foreach /f (ca "memusage.txt") {

        r $t1 = @@c++(((nt!_CONTROL_AREA *)(0x${ca}))->Segment->TotalNumberOfPtes);

        .printf "Control Area %p : %d\n", ${ca}, @$t1;

        r $t2 = @$t2 + @$t1;

    }

     

    .printf "Total PTEs : %d\n", @$t2;

    .printf "MmSt size  : %d bytes\n", (@$t2 * @@c++(sizeof(nt!_MMPTE)));

     

    The following command will execute the script.

     

    5: kd> $$><countptes.txt

     

    This will show the number of PTEs for each control area, followed by a summary.

     

    Total PTEs : 62790244

    MmSt size  : 502321952 bytes

     

    A common high user of MmSt allocations is $Mft. The cache manager will hold the MmSt allocations for these file system metadata files at a cost of up to 4 files per PTE. This technique can be used to determine how much $Mft is using MmSt pool memory by first using findstr at a command prompt to isolate just those values from the !memusage output.

     

    C:\Projects>findstr /c:"$Mft" memusage.txt >mftusage.txt

     

    After stripping out the control area addresses with Excel and running the command script you’ll have the size of the MmSt allocations for just the $Mft files. If this is consuming most of the MmSt bytes then you are limited to the options mentioned at the beginning of this article. There may be other options if something else is the primary user but it will likely involve reducing some heavy load on the system.

     

    -Bryan

  • Ntdebugging Blog

    Part 3: ETW Methods of Tracing

    • 1 Comments

    Introduction and Overview

     

    This is Ivan from the Platforms OEM team and this is the final installment of the ETW series. In this article, we are going to continue our exploration of the ETW tracing available in Windows. This post is going to cover some of the other methods available to enable and capture ETW logs. In previous posts we covered

     

    -ETW Introduction and Overview
    -Exploring and Decoding ETW Providers using Event Log Channels

    In this post we will explore some of the methods Microsoft support may use to enable and capture tracing, in order to troubleshoot issues on a box. In fact, some of these methods may ultimately be quite transparent when you use them; but we wanted to dig a bit into what his happening behind the scenes.

    Please keep in mind that the previous post, Exploring and Decoding ETW Providers using Event Log Channels, was geared more towards self-discovery and decoding. This post covers the more advanced scenarios and tools that you may see Microsoft support using.

    We plan on starting with some of what’s available in the box, then using a command line tool (logman) to enable further logging to a kernel buffer – both of these are useful for dumps. The second set of scenarios we are going to cover is using the command line (logman) and GUI to enable tracing to a sequential file. We will then finish up with some other special types of tracing available.

    Tracing already available (and likely running) in the box

     

    Starting with Vista/2008, there is a variety of tracing already running inbox and accessible via a kernel dump, or logging to an on-disk ETL file (limited tracing is available in 2003 & XP). This screenshot was taken from a default Vista box with no additional logging enabled or turned on. While we aren’t going to cover these ETW providers in detail, it’s good to know these are available for analysis. One logger in particular is a useful kernel logger, which is the Circular Kernel Context Logger or CKCL, a sort of In Flight Data Recorder (IFR). The CKCL is one of the two available kernel loggers, and is intended as a high-performance, high volume log to record the last 0.5-3 seconds (assuming high load) of low level kernel activity on the box. The CKCL is set to circular mode meaning newer events will overwrite older events, and with a 4MB total buffer space available.

    This can be very useful, because along with all the rich information a full kernel .dmp provides, the CKCL can enhance that and provide recent historical information about how the kernel arrived at the state it’s currently in. The CKCL is enabled by default in Vista and can be enabled in Server 2008 in the GUI or via the command line.

    image_thumb[3]

    The CKCL Keywords of 0x237f correspond to the following flags that are enabled on this box: process, thread, img, proccntr, cswitch, dpc, isr, disk, file, hf. The screenshot below shows information from the “NT Kernel Logger” session whose provider name is “Windows Kernel Trace”. “Windows Kernel Trace” is the provider for the CKCL and is shown here because it is the easiest/most reliable way to view what the kernel flags mean.

    The green boxes below highlight the flags that were automatically enabled on this Vista machine and available in dumps. You might notice that tools such as XPerf will modify and collect data from the “NT Kernel Logger” session. XPerf post processes/merges the trace (along with other potential providers) to add valuable information to be able read the trace in more human readable form as well as reliably view the trace offline. However, even without the post processing added by Xperf, traces can be pulled from kernel dump, decoded, and even viewed by Xperf (although they will be missing information such as process names, ISR names, symbol information, etc).

    To review how the kernel providers and sessions are related:

    Session                                                                                                 Provider                                               Comment____________________________

    Circular Kernel Context Logger(CKCL)                                     Windows Kernel Trace                   In-Flight Recorder for recent activity on the box
    NT Kernel Logger                                                                      Windows Kernel Trace                  Kernel Logger used by XPerf but can be enabled manually w/o XPerf

     

    Win7/2008R2:

    In order to cover the newer flags available, we illustrate the “NT Kernel Logger”, which is enabled because xperf was enabled with: xperf –on Base+Diag. To reiterate, the same flags are available in the CKCL, but whose configuration and purpose is suited toward an IFR scenario.

    image_thumb[34]

    image_thumb[35]

    image_thumb[36]

    Vista/2008:

    As covered in the picture above, the flags are very similar to Win7, with the exception that dispatcher and virtalloc are not available.

    Server 2003:
    ETW Tracing in 2003 is limited. You can see the Kernel Flags are much smaller and do not cover flags such as ISR or DPC.

    image_thumb[37]

    Using the !wmitrace.strdump debugger extension, we can find the Logger Id for the CKCL which is 2.

    image_thumb[38]

    And use !wmitrace.logsave to save the log to an ETL file

    image_thumb[39]

    Here we use Xperf to load the ETL file and are looking at ISR history over the lifetime of this particular trace (roughly last 250s of time before dump). This is a trace from a normally functioning box and is showing that ISR at 0x83b44b65 had the highest count over our selected ~100ms timeframe.

    image_thumb[40]

    As previously mentioned, you will need to hand decode much of the trace data, because the XPerf merge step is missing. Here we find that the ISR at 0x83b44b65 is from ndis!ndisMiniportIsr

    image_thumb[41]

    Finally, we can dump the log out in the debugger in plain old text format using !wmitrace.logdump, which we can decode because Windbg ships with a system.tmf which will decode most Kernel Events from the CKCL.

    image_thumb[42]

    Using Logman to enable tracing to a circular buffer

     

    Logman is a built in tool that ships with the OS in Vista/Win7, and can be used to enable and control ETW tracing. We are going to cover it first because it can be used to automate starting and stopping ETW logging without using the Computer Management GUI. It is also baked into the OS, making it ideal because no extra tools are required to start using or enabling ETW tracing. While logman is very useful, the syntax is often inconsistent, and it’s often difficult to figure out the right way to run a command, which is why we are going to cover some examples here.

    You may even receive a set of batch scripts that basically automate the starting and stopping of tracing. For example:

    -Start.bat <- Run this to start tracing
    -Stop.bat <- Run this to stop tracing

    Here we are using logman to start the tracing of the just previously mentioned “NT Kernel Logger” (used by XPerf) to enable the “Windows Kernel Trace” provider. This logging would be useful to enable kernel events that aren’t enabled by default in the CKCL, or if you want a larger time than the 4MB buffer the CKCL provides. The logging is flushed to disk after every 1MB buffer fills up.

    <Start.bat>

    REM – This creates a kernel trace using 20MB buffer in a circular file mode that logs to kernel.etl (and available in a memory dump as a circular buffer)

    logman start "NT Kernel Logger" -p "Windows Kernel Trace" (process,thread,disk,isr,dpc,net,img,registry,file) -bs 1024 -nb 20 20 -mode Circular -o kernel.etl -ct perf -max 20 -ets

    </Start.bat>

    <Stop.bat>

    logman stop "NT Kernel Logger" -ets

    </Stop.bat>

     

    Using Logman to enable tracing to a file

     

    You may receive logging scripts that are tuned specifically to the problem at hand, including the necessary providers, and within each provider, applicable Flags and Levels. In the previous blog posts, we already covered some information about how to find out providers, so we let’s assume we already have that information. Using logman to enable tracing in this manner allows for an ETW trace .etl to be captured and sent to Microsoft for decoding. Using the Microsoft-Windows-WinINet example, the tracing may look like this:

    <Start.bat>

    REM - This creates a circular file trace with all Flags and Levels on the Microsoft-Windows-WinINet provider, with a max size of 20MB

    logman start customTrace -o c:\temp\customTrace.etl -p "Microsoft-Windows-WinINet" 0xff 0xff -mode Circular -max 20 -ets

    REM This creates a file trace with all Flags and Levels on the Microsoft-Windows-WinINet provider, logging until stopped

    logman start customTrace -o c:\temp\customTrace.etl -p "Microsoft-Windows-WinINet" 0xff 0xff -ets

     </Start.bat>

    <Stop.bat>

    logman delete customTrace

    </Stop.bat>

    Using the Computer Management – Performance to enable tracing manually

     

    The Computer Management performance MMC GUI snap-in is useful for viewing and changing trace sessions. In previous blog posts, we briefly touched on the Event Trace Sessions GUI to view sessions auto-created when enabling a Windows Event log channel. Channels are useful because they don’t require much knowledge of the individual keywords/flags and levels to enable – they are pre-defined for a scenario. Most events that are delivered to a channel or decoded due to Event Manifests. However, there is another class of ETW providers called WPP Software Tracing, which was originally intended for driver tracing, but lacks a lot of the self-described meta information that manifest based providers give.

    This is where Computer Management -> Performance -> Data Collector Sets -> Event Trace Sessions, once again comes in handy. Not only can you view existing sessions created by the system (Autologgers/Start Event Trace Sessions), but you can modify tracing, and even create new tracing; which is what we are going to cover. This is the only way to create tracing using the GUI for WPP style ETW tracing, such as iSCSI.

    image_thumb[43]

    To create a new session, right click Event Trace Sessions and choose New -> Data Collector Set

    image_thumb[44]

    In our example, we already know what type of tracing to enable (you may be directed by Microsoft Support), or you may be given an .xml template to automatically setup the tracing. In our example, we are going to setup tracing for iSCSI, which is WPP based.

    image_thumb[45]

    The list of providers that displays unfortunately isn’t searchable, but is sorted alphabetically, so we can use our ‘logman query providers | findstr /I iscsi” to find providers named iscsi.

    image_thumb[46]

    Which we should find in the GUI of Event providers:

    image_thumb[47]

    Next we need to choose what Keywords (Any) are used for this provider. Keywords (Any) is an OR filter meaning any bit that is set on the Keyword/Flag mask will be traced. Keywords usually specify functional sub-components while Levels control at which level of detail (Error, Informational, Verbose) events are logged. Components are free to choose what Keywords/Flags they use, and what Levels the use; and don’t have to use both. We will need to set or check both Keywords(Any) and Level(s) in order to get any ETW tracing, otherwise events won’t be logged.

    image_thumb[48]

    When we edit Keywords (Any) to include these four flags, you should notice the mask is set automatically to 0xf, which will reflect under Keywords(Any). When we go to edit the level, we notice levels don’t seem to be exposed or used by the msisci_iScsi provider. That normally means a provider doesn’t use levels, but it is recommended to always be on the safe side, so to make sure not to lose events we can set the level anyways. Here we set the level to 0xff.

    image_thumb[49] image_thumb[50]

     

    image_thumb[51]

    What we should have now is a trace session called iSCSI with the msisci_iSCSI provider with Keywords(Any) of 0xf and Level of 0xf. There are just a couple of more items we need to check before starting to collect data on this provider, while is changing the Trace Buffer parameters, and checking stream mode and log location.

    image_thumb[52] image_thumb[53] image_thumb[54]

    Here we change the Buffer Size to 1MB, flushed to disk every second, with 10 buffers available in case there is so much event load that events can’t be flushed to disk fast enough. We simply check the Stream Mode and see it is set to File, and check the file name which is under %LOCALAPPDATA%\iSCSI.etl. Now we can start the ETW session and start collecting data!

    image_thumb[55]

    Using the Computer Management – Performance to enable tracing with a template

     

    Now, while it’s useful to know how to manually create sessions, either with the GUI or logman command line; you may only want to quickly enable tracing provided by someone else, such as Microsoft Support, or save off commonly used tracing. This is where templates come in handy. Continuing with the last session, we can right click on our iSCSI session and choose “Save Template”, and then save the template as iSCSI.xml.

    image_thumb[56] image_thumb[57]

    Now when creating a new trace with the GUI we can use a template to capture the previous saved settings, and quickly setup tracing.

    image_thumb[58]

    Using tracelog (DDK) to enable tracing to a file

     

    Tracelog is an ETW tracing tool that ships with the DDK. It has some more advanced features than logman that ships with the OS, with a disadvantage that you will need to download the WDK/DDK. You will find tracelog under %DDK_INSTALL_DIR%\tools\tracing\%PROCESSOR_ARCHITECTURE%\tracelog.exe, along with other useful tools such TraceView, and tracefmt.

    While you can use the built-in Computer Management GUI or logman to manipulate most tracing, a couple of key features that tracelog provides are:

                    -Realtime output to the kernel debugger (-kd switch)
                    -Private process heap tracing (-heap -pids <n> <pid1 pid2 …>). Heap tracing is also available in xperf, and is more useful due to stack walking (more on this later)
                    -Private process critical section tracing (-critsec -pids <n> <pid1 pid2 …>)
                    -Ability to create autologgers/traces that persist on reboots (-addautologger). Note: This ability does exist in logman, but is hidden. It can be done by appending “autosession\” to the name of session using logman.

    Here we are going to use tracelog to demonstrate a special type of logger in ETW called a private session logger, which uses private buffers inside a process (normally ETW buffers live in kernel mode). Private buffers are the only way to collect heap or critical section ETW tracing.

    Critical section private tracing

    First we start by finding the notepad process, for which we are going to collect critical section tracing.

    C:\temp>tasklist | findstr /i notepad

    Image Name                     PID Session Name        Session#    Mem Usage
    ========================= ======== ================ =========== ============
    notepad.exe                  7752 Console                    2     10,052 K

    Now we use tracelog with the –critsec switch to

    <Start.bat>

    tracelog -start critSecTrace -f c:\temp\critSecTrace.etl -critsec -pids 1 7752

    </Start.bat>

    <Stop.bat>

    logman -stop  critSecTrace

    </Stop.bat>

    The critical section trace can be decoded with the DDK tracefmt tool using the system.tmf in the DDK. Here is an example portion of the decoded log.

    image_thumb[59]

    Heap private tracing

     

    We can use tracelog in a similar manner to enable and decode heap tracing. The heap trace can also be decoded with the DDK tracefmt tool using the system.tmf in the DDK.

    <Start.bat>

    tracelog -start heapTrace -f c:\temp\heapTrace.etl -heap -pids 1 7752

    </Start.bat>

    <Stop.bat>

    logman -stop  heapTrace

    </Stop.bat>

    image_thumb[60]

    While the heap trace is useful in its own right, we can even get more useful info using XPerf. Xperf is a great way to view most types of ETW data (especially performance), and in this particular case is vital to get full stack traces that led up to the heap entries. Xperf can be downloaded from the Windows Performance Analysis Dev Center. Here we enable stacktraces for heap allocations and reallocations so we know who is creating the heaps. You may want to explore other heap stackwalking available with the xperf help -  ‘xperf -help stackwalk’

    image_thumb[32]

    <Start.bat>

    xperf -start HeapSession -heap -PidNewProcess "notepad.exe" -BufferSize 1024 -MinBuffers 128 -MaxBuffers 128 -stackwalk HeapAlloc+HeapRealloc

    </Start.bat>

    <Stop.bat>

    xperf -stop HeapSession -stop -d c:\temp\xperf_heap.etl

    </Stop.bat>

    Now we can load up the ETW trace in xperf, and view heap allocation information (in additional to kernel info such as CPU from the NT Kernel Logger), with stack traces at the time of allocation!

    image_thumb[61]

    Wrapping it up

     

    Hopefully, we dug further into ETW and explored some of the various methods to put the power of ETW to work. There is a lot of tools and power exposed with the built-in ETW tools, and some other scenarios that can be enabled with add-on tools from the DDK and XPerf.

     

    Share this post :

     

  • Ntdebugging Blog

    Debugging a Hang at “Applying Computer Settings”

    • 2 Comments

    Hi Everyone - My name is Aman. I'm an Escalation Engineer on the Microsoft GES (Global Escalation Services) team. We recently came across a critical situation where-in the following issues were reported by the customer:

     

    1.       DC login hangs at “applying computer settings” regardless of RDP or local console.

    2.       Connecting to the server using computer management console fails with "RPC Server is too busy to complete the operation"

    3.       Server responds to ping in the problem state but cannot be accessed through ports 445/139.

    4.       Reboot will fix the issue for a couple of days but the problem will return again.

     

    The case had already been troubleshot many times in the past without any results. So, to get to the root cause, it was escalated to GES and we took a full memory dump. In this blog I want to show you how to navigate between processes via LPC messages to drill down to root cause.

     

    Disclaimer – Not every hang at “Applying Computer Settings” requires a debug. Standard troubleshooting typically resolves this type of thing. 

     

    Now let's have a look at the Complete Memory Dump (note: a Kernel dump is not enough). Since the console session hangs at "applying computer settings", the target starting point of my debug will be the process called winlogon. Winlogon.exe is the executable responsible for initiating the logon process for users. If the console session is hung at "applying computer settings", winlogon has to have an explanation as to why it's unable to process the user logon successfully.

     

    So, let’s get the address of the winlogon process:

     

    kd> ! process 0 0 winlogon.exe

    PROCESS 85c63830  SessionId: 0  Cid: 01a4    Peb: 7ffdf000  ParentCid: 0158

        DirBase: 39f6a000  ObjectTable: e13d7928  HandleCount: 535.

        Image: winlogon.exe

     

    PROCESS 84f02020  SessionId: 1  Cid: 0344    Peb: 7ffdf000  ParentCid: 0158

        DirBase: 2d675000  ObjectTable: e35abc78  HandleCount: 313.

        Image: winlogon.exe

     

     

    So, why we have two winlogon processes executing? Simple, the customer tried to log into the domain controller via RDP and local console. Both methods resulted in the hang, hence creating two winlogon sessions. Let’s check the first one and see what's happening in there. I picked up only questionable threads for the sake of this session.

     

     

    kd> .process /p /r 85c63830

    Implicit process is now 85c63830

    Loading User Symbols

    ................................................................

    .........

    kd> !process 85c63830 7

    PROCESS 85c63830  SessionId: 0  Cid: 01a4    Peb: 7ffdf000  ParentCid: 0158

        DirBase: 39f6a000  ObjectTable: e13d7928  HandleCount: 535.

        Image: winlogon.exe

        VadRoot 85c599c8 Vads 157 Clone 0 Private 1266. Modified 1081. Locked 0.

        DeviceMap e1001150

        Token                             e1743a38

        ElapsedTime                       15:50:33.974

        UserTime                          00:00:00.187

        KernelTime                        00:00:02.125

        QuotaPoolUsage[PagedPool]         53444

        QuotaPoolUsage[NonPagedPool]      63864

        Working Set Sizes (now,min,max)  (1161, 50, 345) (4644KB, 200KB, 1380KB)

        PeakWorkingSetSize                2834

        VirtualSize                       49 Mb

        PeakVirtualSize                   52 Mb

        PageFaultCount                    10861

        MemoryPriority                    BACKGROUND

        BasePriority                      13

        CommitCharge                      1979

     

     

     

            THREAD 847318d0  Cid 01a4.58e4  Teb: 7ffdd000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                84731abc  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 00213bd7:

            Current LPC port e125d4e0

            Not impersonating

            DeviceMap                 e1001150

            Owning Process            85c63830       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3650157        Ticks: 525 (0:00:00:08.203)

            Context Switch Count      2            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init f5c98000 Current f5c97c20 Base f5c98000 Limit f5c95000 Call 0

            Priority 14 BasePriority 13 PriorityDecrement 0

            ChildEBP RetAddr  Args to Child             

            f5c97c38 808202b6 847318d0 84731978 00000000 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            f5c97c50 8081fb6e 84731abc 84731a90 847318d0 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            f5c97c94 80915355 84731abc 00000011 f5c97d01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            f5c97d50 8082350b 0000080c 00087ba0 00087ba0 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            f5c97d50 7c8285ec 0000080c 00087ba0 00087ba0 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f5c97d64)

            0075fab0 7c82787b 77c80a6e 0000080c 00087ba0 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            0075fab4 77c80a6e 0000080c 00087ba0 00087ba0 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            0075fb00 77c7fcf0 0075fb3c 0075fb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            0075fb0c 77c80673 0075fb3c 75822c38 0075ff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            0075fb20 77ce315a 0075fb68 00087c18 001125a8 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            0075ff08 75822b74 75822c38 75822b8a 0075ff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            0075ff20 75822b4f 00b81460 0075ff3c 0075ff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            0075ff30 75824e7a 0075ff3c 00100000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            0075ff60 0103917b 0075ff78 00000000 00000000 WlNotify!SensStopScreenSaverEvent+0x58 (FPO: [1,9,0])

            0075ffb8 77e64829 001125a8 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            0075ffec 00000000 010390b7 001125a8 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

    We found somewhere around 15 threads in the process above. Only the above thread looked suspicious since it has issued a LPC call and hasn't received a reply yet. Before we investigate further, let’s check the other winlogon process and find out about the threads in there as well....just in case if there is a commonality or not amongst both the processes...

     

    Note the long tick counts in these threads!

     

     

    kd> .process /p /r 84f02020

    Implicit process is now 84f02020

    Loading User Symbols

    ................................................................

    ................

    kd> !process 84f02020 7

    PROCESS 84f02020  SessionId: 1  Cid: 0344    Peb: 7ffdf000  ParentCid: 0158

        DirBase: 2d675000  ObjectTable: e35abc78  HandleCount: 313.

        Image: winlogon.exe

        VadRoot 85e86dd0 Vads 184 Clone 0 Private 656. Modified 722. Locked 0.

        DeviceMap e1001150

        Token                             e3a09030

        ElapsedTime                       04:51:33.657

        UserTime                          00:00:00.140

        KernelTime                        00:00:00.796

        QuotaPoolUsage[PagedPool]         49464

        QuotaPoolUsage[NonPagedPool]      10080

        Working Set Sizes (now,min,max)  (1099, 50, 345) (4396KB, 200KB, 1380KB)

        PeakWorkingSetSize                2193

        VirtualSize                       48 Mb

        PeakVirtualSize                   50 Mb

        PageFaultCount                    4312

        MemoryPriority                    BACKGROUND

        BasePriority                      13

        CommitCharge                      1329

     

            THREAD 847a93f0  Cid 0344.4ab0  Teb: 7ffd7000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                847a95dc  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0020b318:

            Current LPC port e3761290

            Impersonation token:  e3a92250 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3500419        Ticks: 150263 (0:00:39:07.859)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9d00000 Current b9cffc20 Base b9d00000 Limit b9cfd000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b9cffc38 808202b6 847a93f0 847a9498 00000310 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b9cffc50 8081fb6e 847a95dc 847a95b0 847a93f0 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b9cffc94 80915355 847a95dc 00000011 00000001 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b9cffd50 8082350b 00000364 000b9760 000b9760 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b9cffd50 7c8285ec 00000364 000b9760 000b9760 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9cffd64)

            0087fab0 7c82787b 77c80a6e 00000364 000b9760 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            0087fab4 77c80a6e 00000364 000b9760 000b9760 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            0087fb00 77c7fcf0 0087fb3c 0087fb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            0087fb0c 77c80673 0087fb3c 75822c38 0087ff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            0087fb20 77ce315a 0087fb68 000b9820 000a8bd8 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            0087ff08 75822b74 75822c38 75822b8a 0087ff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            0087ff20 75822b4f 000c1e30 0087ff3c 0087ff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            0087ff30 75824e19 0087ff3c 00080000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            0087ff60 0103917b 0087ff78 00000000 00000000 WlNotify!SensStartScreenSaverEvent+0x58 (FPO: [1,9,0])

            0087ffb8 77e64829 000a8bd8 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            0087ffec 00000000 010390b7 000a8bd8 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 847ad8d0  Cid 0344.4c7c  Teb: 7ffac000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                847adabc  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0020bf59:

            Current LPC port e3761290

            Impersonation token:  e39b1358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3513766        Ticks: 136916 (0:00:35:39.312)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9dc0000 Current b9dbfc20 Base b9dc0000 Limit b9dbd000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b9dbfc38 808202b6 847ad8d0 847ad978 00000341 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b9dbfc50 8081fb6e 847adabc 847ada90 847ad8d0 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b9dbfc94 80915355 847adabc 00000011 b9dbfd01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b9dbfd50 8082350b 00000364 000ba340 000ba340 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b9dbfd50 7c8285ec 00000364 000ba340 000ba340 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9dbfd64)

            00b7fab0 7c82787b 77c80a6e 00000364 000ba340 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            00b7fab4 77c80a6e 00000364 000ba340 000ba340 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            00b7fb00 77c7fcf0 00b7fb3c 00b7fb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            00b7fb0c 77c80673 00b7fb3c 75822c38 00b7ff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            00b7fb20 77ce315a 00b7fb68 000ba400 000a8b80 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            00b7ff08 75822b74 75822c38 75822b8a 00b7ff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            00b7ff20 75822b4f 000c1e30 00b7ff3c 00b7ff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            00b7ff30 75824e7a 00b7ff3c 00100000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            00b7ff60 0103917b 00b7ff78 00000000 00000000 WlNotify!SensStopScreenSaverEvent+0x58 (FPO: [1,9,0])

            00b7ffb8 77e64829 000a8b80 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            00b7ffec 00000000 010390b7 000a8b80 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 84989b40  Cid 0344.4c88  Teb: 7ffaa000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                84989d2c  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0020bf60:

            Current LPC port e3761290

            Impersonation token:  e3a48030 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3513766        Ticks: 136916 (0:00:35:39.312)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9c44000 Current b9c43c20 Base b9c44000 Limit b9c41000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b9c43c38 808202b6 84989b40 84989be8 00000741 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b9c43c50 8081fb6e 84989d2c 84989d00 84989b40 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b9c43c94 80915355 84989d2c 00000011 b9c43d01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b9c43d50 8082350b 00000364 000c18a8 000c18a8 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b9c43d50 7c8285ec 00000364 000c18a8 000c18a8 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9c43d64)

            00dffab0 7c82787b 77c80a6e 00000364 000c18a8 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            00dffab4 77c80a6e 00000364 000c18a8 000c18a8 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            00dffb00 77c7fcf0 00dffb3c 00dffb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            00dffb0c 77c80673 00dffb3c 75822c38 00dfff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            00dffb20 77ce315a 00dffb68 000c1968 000bcd80 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            00dfff08 75822b74 75822c38 75822b8a 00dfff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            00dfff20 75822b4f 000c1e30 00dfff3c 00dfff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            00dfff30 75824d57 00dfff3c 00020000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            00dfff60 0103917b 00dfff78 00000000 00000000 WlNotify!SensLockEvent+0x58 (FPO: [1,9,0])

            00dfffb8 77e64829 000bcd80 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            00dfffec 00000000 010390b7 000bcd80 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 847d7b18  Cid 0344.4c9c  Teb: 7ffa6000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                847d7d04  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0020bfb4:

            Current LPC port e3761290

            Impersonation token:  e423a358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3514154        Ticks: 136528 (0:00:35:33.250)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9c00000 Current b9bffc20 Base b9c00000 Limit b9bfd000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b9bffc38 808202b6 847d7b18 847d7bc0 00002601 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b9bffc50 8081fb6e 847d7d04 847d7cd8 847d7b18 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b9bffc94 80915355 847d7d04 00000011 b9bffd01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b9bffd50 8082350b 00000364 000ab970 000ab970 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b9bffd50 7c8285ec 00000364 000ab970 000ab970 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9bffd64)

            00fcfab0 7c82787b 77c80a6e 00000364 000ab970 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            00fcfab4 77c80a6e 00000364 000ab970 000ab970 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            00fcfb00 77c7fcf0 00fcfb3c 00fcfb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            00fcfb0c 77c80673 00fcfb3c 75822c38 00fcff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            00fcfb20 77ce315a 00fcfb68 000aba30 000bd858 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            00fcff08 75822b74 75822c38 75822b8a 00fcff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            00fcff20 75822b4f 000c1e30 00fcff3c 00fcff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            00fcff30 75824db8 00fcff3c 00040000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            00fcff60 0103917b 00fcff78 00000000 00000000 WlNotify!SensUnlockEvent+0x58 (FPO: [1,9,0])

            00fcffb8 77e64829 000bd858 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            00fcffec 00000000 010390b7 000bd858 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 8478e8d0  Cid 0344.53d0  Teb: 7ffab000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                8478eabc  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0020f7bb:

            Current LPC port e3761290

            Impersonation token:  e3811358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3572296        Ticks: 78386 (0:00:20:24.781)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b94bc000 Current b94bbc20 Base b94bc000 Limit b94b9000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b94bbc38 808202b6 8478e8d0 8478e978 00000310 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b94bbc50 8081fb6e 8478eabc 8478ea90 8478e8d0 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b94bbc94 80915355 8478eabc 00000011 b94bbd01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b94bbd50 8082350b 00000364 000880a8 000880a8 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b94bbd50 7c8285ec 00000364 000880a8 000880a8 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b94bbd64)

            00dbfab0 7c82787b 77c80a6e 00000364 000880a8 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            00dbfab4 77c80a6e 00000364 000880a8 000880a8 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            00dbfb00 77c7fcf0 00dbfb3c 00dbfb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            00dbfb0c 77c80673 00dbfb3c 75822c38 00dbff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            00dbfb20 77ce315a 00dbfb68 00088168 0008bd80 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            00dbff08 75822b74 75822c38 75822b8a 00dbff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            00dbff20 75822b4f 000c1e30 00dbff3c 00dbff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            00dbff30 75824e19 00dbff3c 00080000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            00dbff60 0103917b 00dbff78 00000000 00000000 WlNotify!SensStartScreenSaverEvent+0x58 (FPO: [1,9,0])

            00dbffb8 77e64829 0008bd80 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            00dbffec 00000000 010390b7 0008bd80 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 846e6660  Cid 0344.56a8  Teb: 7ffa7000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                846e684c  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 0021117f:

            Current LPC port e3761290

            Impersonation token:  e37a5358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3602966        Ticks: 47716 (0:00:12:25.562)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9500000 Current b94ffc20 Base b9500000 Limit b94fd000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b94ffc38 808202b6 846e6660 846e6708 00000241 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b94ffc50 8081fb6e 846e684c 846e6820 846e6660 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b94ffc94 80915355 846e684c 00000011 b94ffd01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b94ffd50 8082350b 00000364 0008ba78 0008ba78 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b94ffd50 7c8285ec 00000364 0008ba78 0008ba78 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b94ffd64)

            00f8fab0 7c82787b 77c80a6e 00000364 0008ba78 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            00f8fab4 77c80a6e 00000364 0008ba78 0008ba78 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            00f8fb00 77c7fcf0 00f8fb3c 00f8fb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            00f8fb0c 77c80673 00f8fb3c 75822c38 00f8ff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            00f8fb20 77ce315a 00f8fb68 0008bb38 0009bbf0 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            00f8ff08 75822b74 75822c38 75822b8a 00f8ff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            00f8ff20 75822b4f 000c1e30 00f8ff3c 00f8ff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            00f8ff30 75824e7a 00f8ff3c 00100000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            00f8ff60 0103917b 00f8ff78 00000000 00000000 WlNotify!SensStopScreenSaverEvent+0x58 (FPO: [1,9,0])

            00f8ffb8 77e64829 0009bbf0 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            00f8ffec 00000000 010390b7 0009bbf0 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 8470d428  Cid 0344.56b4  Teb: 7ffa4000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                8470d614  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 00211186:

            Current LPC port e3761290

            Impersonation token:  e4167358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3602966        Ticks: 47716 (0:00:12:25.562)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9378000 Current b9377c20 Base b9378000 Limit b9375000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b9377c38 808202b6 8470d428 8470d4d0 00002241 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b9377c50 8081fb6e 8470d614 8470d5e8 8470d428 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b9377c94 80915355 8470d614 00000011 b9377d01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b9377d50 8082350b 00000364 0013b340 0013b340 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b9377d50 7c8285ec 00000364 0013b340 0013b340 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9377d64)

            016dfab0 7c82787b 77c80a6e 00000364 0013b340 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            016dfab4 77c80a6e 00000364 0013b340 0013b340 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            016dfb00 77c7fcf0 016dfb3c 016dfb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            016dfb0c 77c80673 016dfb3c 75822c38 016dff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            016dfb20 77ce315a 016dfb68 0013b400 0007f808 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            016dff08 75822b74 75822c38 75822b8a 016dff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            016dff20 75822b4f 000c1e30 016dff3c 016dff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            016dff30 75824d57 016dff3c 00020000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            016dff60 0103917b 016dff78 00000000 00000000 WlNotify!SensLockEvent+0x58 (FPO: [1,9,0])

            016dffb8 77e64829 0007f808 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            016dffec 00000000 010390b7 0007f808 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

            THREAD 846c4b40  Cid 0344.56d0  Teb: 7ffa2000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

                846c4d2c  Semaphore Limit 0x1

            Waiting for reply to LPC MessageId 00211223:

            Current LPC port e3761290

            Impersonation token:  e3826358 (Level Impersonation)

            DeviceMap                 e1bca430

            Owning Process            84f02020       Image:         winlogon.exe

            Attached Process          N/A            Image:         N/A

            Wait Start TickCount      3603490        Ticks: 47192 (0:00:12:17.375)

            Context Switch Count      1            

            UserTime                  00:00:00.000

            KernelTime                00:00:00.000

            Win32 Start Address winlogon!WlpExecuteNotify (0x010390b7)

            Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

            Stack Init b9310000 Current b930fc20 Base b9310000 Limit b930d000 Call 0

            Priority 13 BasePriority 13 PriorityDecrement 0

            Kernel stack not resident.

            ChildEBP RetAddr  Args to Child             

            b930fc38 808202b6 846c4b40 846c4be8 00002301 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

            b930fc50 8081fb6e 846c4d2c 846c4d00 846c4b40 nt!KiSwapThread+0x83 (FPO: [0,2,0])

            b930fc94 80915355 846c4d2c 00000011 b930fd01 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

            b930fd50 8082350b 00000364 0013c440 0013c440 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

            b930fd50 7c8285ec 00000364 0013c440 0013c440 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b930fd64)

            0175fab0 7c82787b 77c80a6e 00000364 0013c440 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

            0175fab4 77c80a6e 00000364 0013c440 0013c440 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

            0175fb00 77c7fcf0 0175fb3c 0175fb20 77c80673 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [1,11,0])

            0175fb0c 77c80673 0175fb3c 75822c38 0175ff28 RPCRT4!I_RpcSendReceive+0x24 (FPO: [1,0,0])

            0175fb20 77ce315a 0175fb68 0013c500 00098ef8 RPCRT4!NdrSendReceive+0x2b (FPO: [2,0,0])

            0175ff08 75822b74 75822c38 75822b8a 0175ff28 RPCRT4!NdrClientCall2+0x22e (FPO: [SEH])

            0175ff20 75822b4f 000c1e30 0175ff3c 0175ff60 WlNotify!SensApip_RPC_SensNotifyWinlogonEvent+0x1c (FPO: [2,1,0])

            0175ff30 75824db8 0175ff3c 00040000 00000024 WlNotify!SensNotifyWinlogonEvent+0x1c (FPO: [1,0,0])

            0175ff60 0103917b 0175ff78 00000000 00000000 WlNotify!SensUnlockEvent+0x58 (FPO: [1,9,0])

            0175ffb8 77e64829 00098ef8 00000000 00000000 winlogon!WlpExecuteNotify+0xc4 (FPO: [SEH])

            0175ffec 00000000 010390b7 00098ef8 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

    Wow!....look at that...around 8 threads, and all waiting for replies to the LPC calls issued. So, both the winlogon processes are stuck waiting for a reply for the LPC calls they made.

     

    The next part is to find out what's happening at the server part of these LPC calls since winlogon is stuck until the processing of the LPC calls are finished.

     

    To find the server part of the LPC call, we pick the message ID from each thread. Look for (Waiting for reply to LPC MessageId "message_id":) in the threads above. Let's start with the first winlogon process to find out the server end of the LPC call it had in there. If you remember, we did find one thread in the first winlogon process waiting for a reply to the LPC call and message id of that LPC call was 00213bd7.

     

    kd> !lpc message 00213bd7

    Searching message 213bd7 in threads ...

    Client thread 847318d0 waiting a reply from 213bd7                         

    Searching thread 847318d0 in port rundown queues ...

     

    Server communication port 0xe365aa90

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe125d4e0      Server connection port: 0xe1c37920

     

    Client communication port 0xe125d4e0

        Handles: 1   References: 2

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e1c37920  Name: senssvc

        Handles: 1   References: 17

        Server process  : 85e021e0 (svchost.exe)

        Queue semaphore : 850fa890

        Semaphore state 16 (0x10)

            Messages in queue:

            0000 e3791a88 - Busy  Id=001fbe4f  From: 0f4c.335c  Context=80060002  [e1c37930 . e1281378]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000028 00000005 00010050 00d0fdac

            0000 e1281378 - Busy  Id=001fbe56  From: 0f4c.336c  Context=80060002  [e3791a88 . e15c5350]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00490065 0074006e 006c0065 00500000

            0000 e15c5350 - Busy  Id=00200326  From: 0f4c.3840  Context=80060002  [e1281378 . e14ba008]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 0008c080 00000000 0049004d 00440048

            0000 e14ba008 - Busy  Id=0020032a  From: 0f4c.3848  Context=80060002  [e15c5350 . e1120290]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 00000000 0000000c 00000000

            0000 e1120290 - Busy  Id=002049b3  From: 0f4c.3b08  Context=80060002  [e14ba008 . e40a48c8]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00600040 00600050 00900050 00600050

            0000 e40a48c8 - Busy  Id=00204e8d  From: 0f4c.09c8  Context=80060002  [e1120290 . e1ab41d0]

                       Length=00200008  Type=00000005 (LPC_PORT_CLOSED)

                       Data: fd48cde3 01c9700a

            0000 e1ab41d0 - Busy  Id=0020b318  From: 0344.4ab0  Context=80050001  [e40a48c8 . e1b35378]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000028 00000005 00010050 00d0fdac

            0000 e1b35378 - Busy  Id=0020bf59  From: 0344.4c7c  Context=80050001  [e1ab41d0 . e1bdc918]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 00000000 00000000 00000000

            0000 e1bdc918 - Busy  Id=0020bf60  From: 0344.4c88  Context=80050001  [e1b35378 . e1657ca0]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 0037002d 00350032 00340033 00350035

            0000 e1657ca0 - Busy  Id=0020bfb4  From: 0344.4c9c  Context=80050001  [e1bdc918 . e3a463e8]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 4ed07378 cf0ee4b3 5faeb176 0000000c

            0000 e3a463e8 - Busy  Id=0020f7bb  From: 0344.53d0  Context=80050001  [e1657ca0 . e1bcab08]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 00000000 00000000 00000000

            0000 e1bcab08 - Busy  Id=0021117f  From: 0344.56a8  Context=80050001  [e3a463e8 . e4190008]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 813a8198 00001601 00027011

            0000 e4190008 - Busy  Id=00211186  From: 0344.56b4  Context=80050001  [e1bcab08 . e40afca8]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 002e0070 0061006d 00720072 006f0069

            0000 e40afca8 - Busy  Id=00211223  From: 0344.56d0  Context=80050001  [e4190008 . e37de340]

                       Length=00c000a8  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 4bdd485c 5d18bba4 d4bb211b 0000000c

            0000 e37de340 - Busy  Id=00213bd7  From: 01a4.58e4  Context=80030000  [e40afca8 . e1c37930]

                       Length=00780060  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 811fc308 00001601 000152cb

        The message queue contains 15 messages

        The LpcDataInfoChainHead queue is empty

     

    Done!                                            

     

    Interesting, along with the message id, 00213bd7 , in the queue, we also found all other message ID's seen in the second winlogon process.

    Take a look at the messages in queue and you will see all of them in there.

     

    So, now the question arises as to what is this process doing and why its unable to process these LPC call's successfully.

     

    Let's switch context to the Server process  :  85e021e0 (svchost.exe)

     

    kd> .process /p /r 85e021e0

    Implicit process is now 85e021e0 

    Loading User Symbols

    ................................................................

    ...........................................

     

    When I dumped the threads in this process, the thread count turned out to be an alarming 4900 which is extremely bizarre. Around 1325 of those threads were waiting to acquire a common critical section. This became a very solid point of interest and almost all of them (around 4870) were in a wait state pending a single common SynchronizationEvent i.e. 85d1eeb0.

     

    These threads wait for the message to come through via the LPC port to the server process. That also explains why the server service (user mode component: srvsvc.dll) is unable to process any remote requests.

     

    Few of those threads, waiting on the common critical section, looked like this: (Don’t get bothered by the “Kernel stack not resident” as we know memory gets swapped to disk)

     

    kd> !thread 84832af8

    THREAD 84832af8  Cid 03d8.4760  Teb: 7ef49000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3474348        Ticks: 176334 (0:00:45:55.218)

    Context Switch Count      1            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init b9fd4000 Current b9fd3c78 Base b9fd4000 Limit b9fd1000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b9fd3c90 808202b6 84832af8 84832ba0 00000000 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

    b9fd3ca8 8081fb6e 00000000 00000000 00000000 nt!KiSwapThread+0x83 (FPO: [0,2,0])

    b9fd3cec 8090e64e 85d1eeb0 00000006 00000001 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

    b9fd3d50 8082350b 00000314 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [SEH])

    b9fd3d50 7c8285ec 00000314 00000000 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9fd3d64)

    3d6bfbc8 7c827d0b 7c83d236 00000314 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    3d6bfbcc 7c83d236 00000314 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    3d6bfc08 7c83d281 00000314 00000004 00000000 ntdll!RtlpWaitOnCriticalSection+0x1a3 (FPO: [2,7,4])

    3d6bfc28 7c82ee3b 7c8877a0 00000000 7ffd8000 ntdll!RtlEnterCriticalSection+0xa8 (FPO: [1,1,0])

    3d6bfcb8 7c82ec9f 3d6bfd28 3d6bfd28 00000000 ntdll!LdrpInitializeThread+0x68 (FPO: [SEH])

    3d6bfd14 7c8284c5 3d6bfd28 7c800000 00000000 ntdll!_LdrpInitialize+0x16f (FPO: [SEH])

    00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x25

     

    kd> !thread 847d8b40

    THREAD 847d8b40  Cid 03d8.4be0  Teb: 7ed63000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3549550        Ticks: 101132 (0:00:26:20.187)

    Context Switch Count      1            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init b98a4000 Current b98a3c78 Base b98a4000 Limit b98a1000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b98a3c90 808202b6 847d8b40 847d8be8 00000000 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

    b98a3ca8 8081fb6e 00000000 00000000 00000000 nt!KiSwapThread+0x83 (FPO: [0,2,0])

    b98a3cec 8090e64e 85d1eeb0 00000006 00000001 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

    b98a3d50 8082350b 00000314 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [SEH])

    b98a3d50 7c8285ec 00000314 00000000 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b98a3d64)

    4503fbc8 7c827d0b 7c83d236 00000314 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    4503fbcc 7c83d236 00000314 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    4503fc08 7c83d281 00000314 00000004 00000000 ntdll!RtlpWaitOnCriticalSection+0x1a3 (FPO: [2,7,4])

    4503fc28 7c82ee3b 7c8877a0 00000000 7ffd8000 ntdll!RtlEnterCriticalSection+0xa8 (FPO: [1,1,0])

    4503fcb8 7c82ec9f 4503fd28 4503fd28 00000000 ntdll!LdrpInitializeThread+0x68 (FPO: [SEH])

    4503fd14 7c8284c5 4503fd28 7c800000 00000000 ntdll!_LdrpInitialize+0x16f (FPO: [SEH])

    00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x25

     

    kd> !thread 847a8db0

    THREAD 847a8db0  Cid 03d8.4998  Teb: 7eed1000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3487819        Ticks: 162863 (0:00:42:24.734)

    Context Switch Count      1            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init b9dc4000 Current b9dc3c78 Base b9dc4000 Limit b9dc1000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child              

    b9dc3c90 808202b6 847a8db0 847a8e58 00000000 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

    b9dc3ca8 8081fb6e 00000000 00000000 00000000 nt!KiSwapThread+0x83 (FPO: [0,2,0])

    b9dc3cec 8090e64e 85d1eeb0 00000006 00000001 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

    b9dc3d50 8082350b 00000314 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [SEH])

    b9dc3d50 7c8285ec 00000314 00000000 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b9dc3d64)

    3f4bfbc8 7c827d0b 7c83d236 00000314 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    3f4bfbcc 7c83d236 00000314 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    3f4bfc08 7c83d281 00000314 00000004 00000000 ntdll!RtlpWaitOnCriticalSection+0x1a3 (FPO: [2,7,4])

    3f4bfc28 7c82ee3b 7c8877a0 00000000 7ffd8000 ntdll!RtlEnterCriticalSection+0xa8 (FPO: [1,1,0])

    3f4bfcb8 7c82ec9f 3f4bfd28 3f4bfd28 00000000 ntdll!LdrpInitializeThread+0x68 (FPO: [SEH])

    3f4bfd14 7c8284c5 3f4bfd28 7c800000 00000000 ntdll!_LdrpInitialize+0x16f (FPO: [SEH])

    00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x25

     

    kd> !thread 848e1db0

    THREAD 848e1db0  Cid 03d8.2c10  Teb: 7f67a000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3183097        Ticks: 467585 (0:02:01:46.015)

    Context Switch Count      1             

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init f3570000 Current f356fc78 Base f3570000 Limit f356d000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident

     

    kd> !thread 84890db0

    THREAD 84890db0  Cid 03d8.31cc  Teb: 7f4e3000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3244800        Ticks: 405882 (0:01:45:41.906)

    Context Switch Count      1            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init f2ee8000 Current f2ee7c78 Base f2ee8000 Limit f2ee5000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

     

    kd> !thread 848223f0

    THREAD 848223f0  Cid 03d8.3714  Teb: 7f3d8000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        85d1eeb0  SynchronizationEvent

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3283235        Ticks: 367447 (0:01:35:41.359)

    Context Switch Count      1            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address srvsvc!XsProcessApisWrapper (0x74ed39a9)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init f2ac4000 Current f2ac3c78 Base f2ac4000 Limit f2ac1000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

     

    kd> dt ntdll!_RTL_CRITICAL_SECTION 7c8877a0

       +0x000 DebugInfo        : 0x7c8877c0 _RTL_CRITICAL_SECTION_DEBUG

       +0x004 LockCount        : -19514

       +0x008 RecursionCount   : 2

       +0x00c OwningThread     : 0x00001634

       +0x010 LockSemaphore    : 0x00000314

       +0x014 SpinCount        : 0

     

    In Microsoft Windows Server 2003 Service Pack 1 and later versions of Windows, the LockCount field is parsed as follows: (also at link http://msdn.microsoft.com/en-us/library/cc267476.aspx)

    • The lowest bit shows the lock status. If this bit is 0, the critical section is locked; if it is 1, the critical section is not locked.
    • The next bit shows whether a thread has been woken for this lock. If this bit is 0, then a thread has been woken for this lock; if it is 1, no thread has been woken.
    • The remaining bits are the ones-complement of the number of threads waiting for the lock.

    In our case, the LockCount is -19514. The lowest bit can be determined in this way:

    kd> ? 0x1 & (-0n19514)

    Evaluate expression: 0 = 00000000

     

    kd>  (0x2 & (-0n19514)) >> 1

         ^ Syntax error in ' (0x2 & (-0n19514)) >> 1'

     

    kd> ?(0x2 & (-0n19514)) >> 1

    Evaluate expression: 1 = 00000001

     

    kd> ? ((-1) - (-0n19514)) >> 2

    Evaluate expression: 4878 = 0000130e

     

     

    So, there are 4878 threads waiting for this particular critical section. Seeing the number of above threads in the waiting list, it makes lot of sense.

    In RED above, we see the offender (0x00001634) who is not ready to leave the critical section.

     

    So now, at least we know which thread is causing the server to be unresponsive. Let’s go further and find out what’s happening in that particular thread.

     

    kd> !thread 85125020

    THREAD 85125020  Cid 03d8.1634  Teb: 7ffdb000 Win32Thread: e1754ea8 WAIT: (Unknown) UserMode Non-Alertable

        8512520c  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 001e295b:

    Current LPC port e38140c8

    IRP List:

        850eff68: (0006,0094) Flags: 00000800  Mdl: 00000000

    Not impersonating

    DeviceMap                 e1001150

    Owning Process            85e021e0       Image:         svchost.exe

    Attached Process          N/A            Image:         N/A

    Wait Start TickCount      3470768        Ticks: 179914 (0:00:46:51.156)

    Context Switch Count      93                 LargeStack

    UserTime                  00:00:00.000

    KernelTime                00:00:00.015

    Win32 Start Address RPCRT4!ThreadStartRoutine (0x77c7b0f5)

    Start Address kernel32!BaseThreadStartThunk (0x77e617ec)

    Stack Init f5424000 Current f5423c20 Base f5424000 Limit f5421000 Call 0

    Priority 10 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    f5423c38 808202b6 85125020 851250c8 00000500 nt!KiSwapContext+0x25 (FPO: [Uses EBP] [0,0,4])

    f5423c50 8081fb6e 8512520c 851251e0 85125020 nt!KiSwapThread+0x83 (FPO: [0,2,0])

    f5423c94 80915355 8512520c 00000011 80912701 nt!KeWaitForSingleObject+0x2e0 (FPO: [5,12,4])

    f5423d50 8082350b 00000a40 000e9dd8 000de270 nt!NtRequestWaitReplyPort+0x776 (FPO: [SEH])

    f5423d50 7c8285ec 00000a40 000e9dd8 000de270 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f5423d64)

    00a0c580 7c82787b 77e77342 00000a40 000e9dd8 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    00a0c584 77e77342 00000a40 000e9dd8 000de270 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    00a0c5bc 77e77214 01b1f978 7c840936 00a0c604 kernel32!BasepQueryModuleData+0x121 (FPO: [5,6,4])

    00a0c5dc 7c84086e 01b1f978 7c840936 00a0c604 kernel32!BaseQueryModuleData+0x3b (FPO: [5,1,0])

    00a0c82c 7c8343b7 0009ec88 0009ec88 00000000 ntdll!LdrpCheckAppDatabase+0x9c (FPO: [1,138,4])

    00a0c840 7c8333ba 0009ec88 0016ecb0 00a0ce98 ntdll!LdrpCheckNXCompatibility+0x27 (FPO: [1,1,4])

    00a0c93c 7c834311 00000000 00000000 777964d0 ntdll!LdrpRunInitializeRoutines+0x20f (FPO: [SEH])

    00a0cbd0 7c834065 00000000 0016ecb0 00a0ce98 ntdll!LdrpLoadDll+0x3cd (FPO: [SEH])

    00a0ce4c 77e41bf3 0016ecb0 00a0ce98 00a0ce78 ntdll!LdrLoadDll+0x198 (FPO: [SEH])

    00a0ceb4 776adc89 00a0cf30 00000000 00000008 kernel32!LoadLibraryExW+0x1b2 (FPO: [SEH])

    00a0ced8 776adb95 00a0cf30 00a0cefc 00a0cf00 ole32!CClassCache::CDllPathEntry::LoadDll+0xb0 (FPO: [4,1,4])

    00a0cf08 776ad2da 00a0cf30 00a0d208 00a0cf28 ole32!CClassCache::CDllPathEntry::Create_rl+0x37 (FPO: [3,3,4])

    00a0d154 776ad1fd 00000001 00a0d208 00a0d184 ole32!CClassCache::CClassEntry::CreateDllClassEntry_rl+0xd8 (FPO: [3,139,4])

    00a0d19c 776ad112 00000001 00105624 00a0d1c8 ole32!CClassCache::GetClassObjectActivator+0x1fe (FPO: [3,10,4])

    00a0d1cc 776a1b7f 00a0d208 00a0dce8 00a0d7a0 ole32!CClassCache::GetClassObject+0x24 (FPO: [1,1,0])

    00a0d240 776a19b5 77794964 00a0d7a0 00a0dce8 ole32!CServerContextActivator::GetClassObject+0xfd (FPO: [3,23,4])

    00a0d278 776a1c4e 00a0d7a0 00a0dce8 00a0d7a0 ole32!ActivationPropertiesIn::DelegateGetClassObject+0xf3 (FPO: [2,7,0])

    00a0d298 776a1bfa 77794978 00a0d7a0 00a0dce8 ole32!CApartmentActivator::GetClassObject+0x4d (FPO: [3,2,0])

    00a0d2b0 776acb27 77794978 00000001 00000000 ole32!CProcessActivator::GCOCallback+0x2b (FPO: [6,0,4])

    00a0d2d0 776acad8 77794960 00a0d5fc 00000000 ole32!CProcessActivator::AttemptActivation+0x2c (FPO: [7,0,0])

    00a0d30c 776a1ca1 77794960 00a0d5fc 00000000 ole32!CProcessActivator::ActivateByContext+0x4f (FPO: [6,3,0])

    00a0d334 776a19b5 77794960 00a0d7a0 00a0dce8 ole32!CProcessActivator::GetClassObject+0x48 (FPO: [3,1,4])

    00a0d36c 776a1925 00a0d7a0 00a0dce8 0073006d ole32!ActivationPropertiesIn::DelegateGetClassObject+0xf3 (FPO: [2,7,0])

    00a0d5b4 776a19b5 7779487c 00000001 00a0dce8 ole32!CClientContextActivator::GetClassObject+0x88 (FPO: [3,3,0])

    00a0d5ec 776a1a07 00a0d7a0 00a0dce8 7767c9b0 ole32!ActivationPropertiesIn::DelegateGetClassObject+0xf3 (FPO: [2,7,0])

    00a0dda4 776a187e 00a0ded8 00000001 00000000 ole32!ICoGetClassObject+0x3a5 (FPO: [7,52,0])

    00a0ddd4 776a1841 00a0ded8 00000002 00000000 ole32!CComActivator::DoGetClassObject+0xe0 (FPO: [6,0,4])

    00a0ddf4 772d7218 00a0ded8 00000001 00000000 ole32!CoGetClassObject+0x20 (FPO: [5,0,0])

    00a0de60 772d65ce 01ab48d8 01aeba70 00a0de80 urlmon!CProtMgr::FindFirstCF+0x11a (FPO: [4,17,0])

    00a0de84 772c4408 01ab4870 01aeba70 00000000 urlmon!COInetSession::CreateHandler+0x2d (FPO: [6,1,0])

    00a0deec 772d6b6c 00083ff8 01aeba70 000841b0 urlmon!CTransaction::LoadHandler+0xbd (FPO: [5,15,4])

    00a0df18 772c8302 00083ff8 0000000d 00000001 urlmon!CTransaction::DispatchReport+0x203 (FPO: [8,1,4])

    00a0df44 772c874b 00083ff8 01b092e8 00000000 urlmon!CTransaction::DispatchPacket+0x31 (FPO: [2,0,4])

    00a0df64 772c6681 00083ff8 00000000 00a0e014 urlmon!CTransaction::OnINetCallback+0x92 (FPO: [2,1,0])

    00a0df9c 7739b6e3 00c60048 00000465 00000000 urlmon!TransactionWndProc+0x33 (FPO: [SEH])

    00a0dfc8 7739b874 772c664e 00c60048 00000465 USER32!InternalCallWinProc+0x28

    00a0e040 7739ba92 00000000 772c664e 00c60048 USER32!UserCallWinProcCheckWow+0x151 (FPO: [SEH])

    00a0e0a8 773a16e5 00a0e0d0 00000001 00a0e10c USER32!DispatchMessageWorker+0x327 (FPO: [SEH])

    00a0e0b8 772de54a 00a0e0d0 00084100 00000000 USER32!DispatchMessageA+0xf (FPO: [1,0,0])

    00a0e10c 772ca8f5 00000000 00000000 01ad6588 urlmon!CTransaction::CompleteOperation+0x15a (FPO: [1,15,0])

    00a0e148 772c77ee 00083ff8 01ac6930 01ad64e0 urlmon!CTransaction::Start+0x52c (FPO: [6,9,4])

    00a0e1cc 772c744f 00a0e214 01aa2b30 772c9c00 urlmon!CBinding::StartBinding+0x4d8 (FPO: [6,22,4])

    00a0f264 772c7319 01aa7230 00000000 01aa2b30 urlmon!CUrlMon::StartBinding+0x1d8 (FPO: [6,1051,4])

    00a0f29c 772eb8f5 01aa7230 01aa2b30 00000000 urlmon!CUrlMon::BindToStorage+0x67 (FPO: [5,3,4])

    00a0f2e0 7732238b 001422e0 00a0f344 001422e0 urlmon!CBaseBSCB::KickOffDownload+0x191 (FPO: [2,8,0])

    00a0f2f4 7732296f 001422e0 00a0f344 0000000c urlmon!CFileDownload::KickOffDownload+0x2d (FPO: [2,0,4])

    00a0f30c 77322ce6 00000000 00a0f344 00a0f3c4 urlmon!URLDownloadToFileW+0x51 (FPO: [5,0,0])

    00a0f444 00a0f59d 00000000 00a0f63e 00a0f632 urlmon!URLDownloadToFileA+0xd8 (FPO: [5,68,4])

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    00a0f474 71bf396b 48544950 71bf3969 554a4649 0xa0f59d

     77430b8c 77e40000 00000000 00000004 0000ffff 000000b8 WS2HELP!WahOpenHandleHelper+0xf7 (FPO: [1,114,0])

     

    Now, looking at the thread, lets list down the observations and our interpretation from it:

     

    1)      So, by looking at API URLDownloadToFileW(), it seems this thread is busy downloading a file from the internet. The API and parameters are documented at  http://msdn.microsoft.com/en-us/library/ms836094.aspx Let’s find out which file are we trying to download here:

     

    kd> du 00a0f344

    00a0f344  http://10.162.13.115:31797/x

     

    kd> du 00a0f3c4

    00a0f3c4  "x.exe"

     

    So we are trying to download an executable here? It’s worth having the customer investigate this binary?

     

    2)      This thread is waiting for a reply to a LPC call issued (LPC MessageId 001e295b), so let’s find out who do we have at the server end of this LPC call:

     

    kd> !lpc message 001e295b

    Searching message 1e295b in threads ...

    Client thread 85125020 waiting a reply from 1e295b                         

    Searching thread 85125020 in port rundown queues ...

     

    Server connection port e1395030  Name: AELPort

        Handles: 1   References: 22

        Server process  : 85e021e0 (svchost.exe)

        Queue semaphore : 85c593f8

        Semaphore state 0 (0x0)

        The message queue is empty

     

    So, as you see above, the server piece of this LPC call is this process itself hence a straightforward deadlock situation. Also, please remember that critical section is process specific.

     

    3)      The stack below URLMON looks messed up so we shall have to look at the raw stack and reconstruct the correct stack. 

     

    kd> kv =00a0f8cc 0x00a0f8cc 71c49573

     

    ChildEBP RetAddr  Args to Child             

    00a0f8cc 71c49578 000c9620 000c93ac 00104100 NETAPI32!NetpwPathCanonicalize+0x76 (FPO: [6,0,4])

    00a0f8f4 74edc916 000c93ac 00104100 0000019a NETAPI32!NetpwPathCanonicalize+0x7b (FPO: [6,0,4])

    00a0f914 77c80193 000c9398 000c93ac 00104100 srvsvc!NetprPathCanonicalize+0x1c (FPO: [7,0,0])

    00a0f944 77ce33e1 74edc8fa 00a0fb28 00000007 RPCRT4!Invoke+0x30

    00a0fd44 77ce35c4 00000000 00000000 000b9fd8 RPCRT4!NdrStubCall2+0x299 (FPO: [SEH])

    00a0fd60 77c7ff7a 000b9fd8 000a80d8 000b9fd8 RPCRT4!NdrServerCall2+0x19 (FPO: [1,1,0])

    00a0fd94 77c8042d 74ed31d7 000b9fd8 00a0fe34 RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [SEH])

    00a0fde8 77c80353 0000001f 00000000 74ee3f0c RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x11f (FPO: [4,13,4])

    00a0fe0c 77c68e0d 000b9fd8 00000000 74ee3f0c RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [4,0,4])

    00a0fe40 77c68cb3 000b9f98 000002a4 000b9f60 RPCRT4!OSF_SCALL::DispatchHelper+0x149 (FPO: [0,4,4])

    00a0fe54 77c68c2b 00000000 0000001f 000c9370 RPCRT4!OSF_SCALL::DispatchRPCCall+0x10d (FPO: [2,0,4])

    00a0fe84 77c68b5e 000c9370 030002bc 0000001f RPCRT4!OSF_SCALL::ProcessReceivedPDU+0x57f (FPO: [3,5,4])

    00a0fea4 77c6e8db 00000000 000002bc 00000000 RPCRT4!OSF_SCALL::BeginRpcCall+0x194 (FPO: [2,0,4])

    00a0ff04 77c6e7b4 00000000 000c9370 000002bc RPCRT4!OSF_SCONNECTION::ProcessReceiveComplete+0x435 (FPO: [3,17,0])

    00a0ff18 77c7b799 0008cd68 0000000c 00000000 RPCRT4!ProcessConnectionServerReceivedEvent+0x21 (FPO: [7,0,0])

    00a0ff84 77c7b9b5 00a0ffac 77c8872d 0008cd68 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0x1b8 (FPO: [0,15,0])

    00a0ff8c 77c8872d 0008cd68 00000000 00000000 RPCRT4!ProcessIOEventsWrapper+0xd (FPO: [1,0,0])

    00a0ffac 77c7b110 0008b120 00a0ffec 77e64829 RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [1,2,4])

    00a0ffb8 77e64829 001262f0 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [1,0,0])

    00a0ffec 00000000 77c7b0f5 001262f0 00000000 kernel32!BaseThreadStart+0x34 (FPO: [SEH])

     

    Our first course of action was to install the latest and greatest netapi32.dll binary which cleared up the issue.

     

     

    I hope that helps!!

     

    Thanks and regards,

    Aman

Page 3 of 24 (235 items) 12345»