• Ntdebugging Blog

    The Memory Shell Game

    • 12 Comments

    Hello, this is Somak.  Today I’d like to drop some Memory Manager info on the blog that I’ve used to communicate in brief (believe it or not) how the system deals with memory.  If you are ever faced with checking how much Available Memory you have(or don’t have), poor system performance, questions about page faults, or having a Working Set Trimming performance issue, or just want a primer into how Windows manages memory on the system, be not afraid and read on!

    How this memory stuff works

    The fastest and most expensive memory is built into the CPU.  The next fastest and less expensive is physical RAM.  Next we have the hard drive, followed by remote storage or backup.  Each step down the ladder adds significantly to access time.  For example, physical RAM can be almost one million times faster than a hard disk.  Since it is volatile and cost considerably more than hard disk space, we are limited at how much we can put in the system.  This limit is far less than the much slower and non-volatile hard disk.  Due to these design constraints we have a tiered memory structure in the computer.  To achieve the fastest response times and best overall performance, an operating system must efficiently use this tiered memory structure.  It must do its best to reduce the need to retrieve data from a slower storage media like the hard disk.  It has to do this while juggling the memory and I/O demand from all running processes.  The following paragraphs are an overview of how the Memory Manager achieves this in Windows.  A more detailed description can be found in Chapter 7 of Microsoft Windows Internals, Fourth Edition (ISBN: 0-7356-1917-4).  This book is the great source for how stuff works in Windows.

    First let’s lay out some definitions.  They will be useful later on when I talk about the interactions.  These definitions are high level to maintain brevity.

    Virtual Memory – This is a memory that an operating system can address.  Regardless of the amount of physical RAM or hard drive space, this number is limited by your processor architecture.  On a 32 bit processor you are limited to 4 GB of addressable virtual memory (2^32).  With a default installation on a 32 bit box (not using /3GB) the kernel reserves 2GB for itself.  Applications are left with 2GB of addressable virtual memory.  When applications execute they are only presented with this 2GB of addressable memory.  Each application gets its own 2GB virtual memory to play with.  If you have 50 processes running, you’ll have 50 independent 2GB Virtual Memory address spaces and one 2GB Virtual Address space for kernel.  This is possible because Virtual Memory always exists, but doesn't really exist (hence the term virtual).  We basically lie to the application and say, here is 2GB of memory address space for you to use.  This memory isn’t allocated until the application explicitly uses it.  Once the application uses the page, it becomes committed.  A virtual memory page is then translated to a physical memory page.  From this translation, the virtual page can reside in physical RAM or on the hard disk.

    Physical Memory –  This is the physical storage media.  It can be physical RAM, the hard disk, optical disks, tape backups, etc.  This is anything that can store data.  Most times when people talk about Physical Memory, they refer to physical RAM (the memory sticks on your motherboard), but with virtual page translation, physical memory can also be on the hard drive (in your paging file).  Physical RAM is limited by your processor architecture.

    Committed Memory –  When an application touches a virtual memory page (reads/write/programmatically commits) the page becomes a committed page.  It is now backed by a physical memory page.  This will usually be a physical RAM page, but could eventually be a page in the page file on the hard disk, or it could be a page in a memory mapped file on the hard disk.  The memory manager handles the translations from the virtual memory page to the physical page.   A virtual page could be in located in physical RAM, while the page next to it could be on the hard drive in the page file.

    Commit Limit –  This is the maximum amount of memory that all your applications and the OS can commit.  If you had 50 applications fully allocate their 2 GB of virtual address space, you would need 100GB of commit limit (ignore kernel memory usage to keep the numbers simple).  So 100GB of committed pages can be backed by physical RAM or the hard drive.  If you have 100 GB of RAM, you could handle this memory load.   In most cases, 100 GB of RAM isn't economically feasible so the Commit Limit is comprised of physical RAM and the page file.  If you have 2 GB of physical RAM and 98 GB of page file, then your commit limit would be 100 GB.

    Page file –   This is the storage area for virtual memory that has been committed.  It is located on the hard drive.  Since hard drive space is cheaper than physical RAM, it is an inexpensive way to increase the commit limit.

    Working Set – This is a set of virtual memory pages (that are committed) for a process and are located in physical RAM.  These pages fully belong to the process.  A working set is like a "currently/recently working on these pages" list.

    Modified pages - Once a virtual memory page leaves the process's working set, it is moved to another list.  If the page has been modified, it is placed on the modified page list.  This page is in physical RAM.  A thread will then write the page to the page file and move it to the standby list.

    Standby pages - This is a page that has left the process' working set.  This page is in physical RAM.  A standby page is like a cache for virtual memory pages.  It is still associated with the process, but not in its working set.  If the process touches the page, it is quickly faulted back into the working set.  That page also has one foot out the door.  If another process or cache needs more memory, the process association is broken and it is moved to the free page list.  Most of the pages in available memory are actually standby pages.  This makes sense when you realize that these pages can be quickly given to another process (hence available), but you should also understand that they are page caches for working sets and can be quickly given back if the process touches the page again.  The vast majority of available memory is not wasted or empty memory.

    Free pages - When a page is taken off of the standby page list, it is moved to the Free page list.  This page is in physical RAM.  These pages are not associated with any process.   When a process exits, all of its pages are then dumped onto this list. Typically, there is a very small to no amount of free pages hanging around physical RAM. 

    Zeroed pages - When a free page is zeroed out, it is placed on the Zero page list.  This page is in physical RAM.  These are the pages that are given to processes that are making memory allocations.  Due to C2 security requirements, all pages must be scrubbed before handed to a new process.  When the system is idle, a thread will scrub free pages and put them on this list.  Only a small amount of zero pages are required to handle the typical small memory allocations of processes.  Once this list is depleted, and if there is demand for more pages, we pull pages off of the Free page list and scrub them on the fly.  If the Free page list is depleted, then we pull pages off of the standby list, scrub them on the fly and hand them to the new process.

    What is Task Manager telling me?

    Prior to Windows Vista, Task Manager reports memory usage using accounting methods that you probably are not expecting.  It is because of these accounting practices, that we rarely use Task Manager to gauge system performance and memory usage.  We typically use it for a quick overview or to kill processes.  I highly recommend using Performance Monitor (perfmon.msc) for investigating performance issues.  Here's the breakdown of the numbers on the Performance tab:

    Physical Memory

    Total - The is the total physical RAM installed in the system.

    Available - This is the total of the Standby, Free and Zeroed list.  Free and Zeroed makes sense, but Standby seems odd at first.  Standby pages were added to this number because they are available for a quick scrub and given to a new process.  So they are technically available (with minimal effort).

    System Cache- This is the total of the Standby list and the size of the system working set (which includes the file cache).  Standby pages are added to this list because they are cached pages for working sets.

    PF Usage - This is the total number of committed pages on the system.  It does not tell you how many are actually written to the page file.  It only tells you how much of the page file would be used if all committed pages had to be written out to the page file at the same time.

    Commit Charge

    Total - This is the total virtual memory that has been committed.  This includes all committed memory for all processes and the kernel.

    Limit - This is the maximum amount of committed memory this system can handle.  This is a combination of physical RAM and the page file.

    Peak – This is the highest amount of memory committed thus far on this system, since boot.

    How does this work?

    The memory manager optimizes physical RAM usage across the entire system.  Since physical RAM is a finite resource, it has to balance sharing this critical resource amongst all process, the kernel and file I/O.   It tries to keep disk I/O to a minimum, which results in a more responsive system.  It does this by moving pages around to meet the demand of the system.

    Typically, large sections of physical RAM are used for file cache.  This is cache is necessary to improve disk performance.  Without it, disk I/O would make the system crawl along at an nearly unusable pace.  The file system cache is just like a working set for a process.  Pages removed from the file cache are moved to the standby or modified page list.  Many of the standby pages in RAM are probably file cache pages that were removed from its working set.  For example, on a file server, if you see 8 GB of available memory, most of these pages are probably standby pages for the file cache.  The file cache's working set could be 500 MB, but the 8 GB of standby pages should also be considered part of the file cache.

    Now let's take a look at how the memory manager handles processes.  While an application is working with its virtual memory pages, the memory manager keeps the pages in the process' working set.  Since the vast majority of application do not use all of its memory all the time, some pages will age.  Old pages are removed from the working set.  If they are modified, they are moved to the modified list.  The page is saved to the page file and moved to the standby list.  If the page hasn't been modified, it is moved directly to the standby list.  These pages will remain on the standby page list until there is a demand for it.

    If the application touches the page again, it is soft faulted back into the process' working set.  If the process doesn't use the page for a very long time, or if the demand for the page is greater elsewhere, the page is moved off of the standby list.  It is disassociated with the process and moved to a the free page list.  From the free page list, the page is scrub on demand or lazily and placed on the zero page list.  It is from the zero page list that other processes or the kernel or the file cache will get a new page.

    If after a very long time the application once again needs a page that is not in its working set, the memory manager will handle the memory fault.  If the page is on the standby list, it is quickly put back into the process' working set.  If the page is no longer in the standby list, a hard fault occurs.  The memory manager issues I/O to the hard disk to read the page(s) from the page file.  Once the I/O complete, the page is placed back into the process' work set.

    All of this is done to keep physical RAM highly utilized and disk I/O to a minimum.  We don't want to allow process to horde physical RAM for pages that are rarely used.  The physical RAM must be shared with other processes, the kernel and the file cache.  If you see lots of available memory on your system, rest assured that it is not going to waste.  The vast majority is on standby lists for processes and the file cache.

    Also note that page file usage isn't that bad.  The page file allows the Memory Manager to save modified pages before placing the page on the standby list.  The page is still in physical RAM and can be quickly faulted back into the process.  This method gives the process a chance to reclaim an old page and it allows the page to be quickly used if there is demand elsewhere.

    The best way to see the totals of these lists is to use a kernel debugger (live or postmortem).  Use the !memusage command and you'll get an output like this:

    0: kd> !memusage

    loading PFN database

    loading (100% complete)

    Compiling memory usage data (99% Complete).

    Zeroed:    414 (  1656 kb)

                   Free:      2 (     8 kb)

    Standby: 864091 (3456364 kb)

    Modified:    560 (  2240 kb)

        ModifiedNoWrite:     30 (   120 kb)

    Active/Valid: 182954 (731816 kb)

             Transition:      2 (     8 kb)

                    Bad:      0 (     0 kb)

                Unknown:      0 (     0 kb)

    TOTAL: 1048053 (4192212 kb)

    Of the 4GB of physical RAM, only 1.6 MB are on Zeroed or free pages.  731 MB is in process, system and file cache working sets.  2 MB are on the modified page list.  The vast majority, 3.4 GB, is on the standby list.  On this server, most people will see 3.4 GB of wasted physical RAM, but you will know better.

    What should I be worried about?

    Typically you shouldn't be worried about these things, until you have a performance problem.  If your system is sluggish or slow to respond or you are getting errors about out of memory, then you need to rely on this information.  You will need to collect a performance monitor log of the problem time.  If the counter list is daunting, then use the Performance Monitor Wizard to configure the performance monitor log.

    Once the log is collected, you'll need to analyze several counters.  I'm not going into detail about how to review performance monitor logs this time.  I'll save that lengthy topic for another time.  For now I'll focus on the counters relevant to the Memory Manager.

    One of the biggest reasons for slow performance and sluggish system responsiveness is disk bottleneck.  Look at Physical Disk\% Idle Time, Avg. Disk sec/Read and Avg. Disk sec/Write counters.  If your system drive is under 50% idle or your disk response times are way above your drive specifications, then you need to investigate further.  Look at the Memory\Available Mbytes.  You should have a couple hundred Mbytes of Available Memory.  This is one of the most important performance monitor counters.  If this number drops too low, your standby lists, process working sets and cache will be greatly reduced.  You'll need to find out if a process is consuming physical RAM.  Check for large process working sets or for large file cache.

    You will also need to see if paging is really affecting system performance.  Take a look at Memory\Pages Input/sec and correlate that to Physical Disk\Avg. Disk sec/Read.  Pages Input/sec is the number of pages being read in from the page file.  These are the hard faults (when the page wasn't on the standby list).  If your Avg. Disk sec/Read is close to your drive's specification and the drive's idle time is high, than paging really isn't a problem.  Small amounts of hard faults are expected as applications will every once in a while re-touch an old page.  As long as this I/O is not consistent or the disk can't keep up, you probably will not notice this impact.

    You can also look at Memory\Pages Output/sec and Physical Disk\Avg. Disk sec/Write.  These are the page commits to the page file when a modified page is removed from a process' working set.  As long as the disk can keep up with the writes, this shouldn't be a problem.  Remember that once the page is saved to the page file, it is placed on the standby list.  If there isn't a great demand for new pages, it can remain on the standby list for a very long time.  When the process touches the old virtual page again, it can be soft faulted back into the working set.  If there is great demand for memory, you'll see process working sets aggressively being trimmed.  Unless there is memory pressure, this is all done with lazy I/O, so you should not see much of an impact from this activity.

    The Memory Manager works to meet current demand and prepares for future demand when possible.  You need to look at a performance monitor log to see if there is memory pressure on the system.  You'll see this in low Available Mbytes and reductions in process working sets.  You'll be able to correlate this to increase disk I/O to the page file.  If you have established that there is memory pressure on the box, you need to figure where that demand is coming from.  Check for working set increases from processes, file cache or bottlenecked disk I/O to data drives.

  • Ntdebugging Blog

    LPC CASE2 – When things are not rosy

    • 1 Comments

    Hello, this is Roy again.  In this case we will discuss a scenario where tracing a hung client thread is not possible through LPC data structures and extensions. We would rather use hints from the LPC message sent about the operation that was taking place and other heuristics to arrive at possible reason for the hang.

     

    The issue is that whenever an Encrypted File is opened the server locks up. The initial analysis is to determine the reason for the server hang, whether it is low resource condition, deadlock or any other reason. After dumping all the threads and glancing over them I chose to pickup the following thread which looked interestingly abnormal. Below is the thread which is trying to do a CreateFile( ) and is stuck for 4hrs for a LPC reply. The file attempted to be created is an encrypted file. KSecDD.sys is the Kernel Security Device Driver which is used for handling key encryption, decryption tasks. In order to decrypt the file KSecDD makes a LPC request to LsaAuthenticationPort.

     

    2: kd> !thread fffffade6c9f1390

    THREAD fffffade6c9f1390  Cid 04b4.0790  Teb: 000007fffffd8000 Win32Thread: fffffa8008ed1d80 WAIT: (WrLpcReply) KernelMode Non-Alertable

        fffffade6c9f1710  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 01561d96:

    Current LPC port fffffa800380cc40

    IRP List:

        fffffade6d389010: (0006,03a0) Flags: 00000884  Mdl: 00000000

        fffffade6bf55c60: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa8002f5b620 (Level Impersonation)

    Owning Process            fffffade6f97c2b0       Image:         svchost.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      11                 LargeStack

    UserTime                  00:00:00.015

    KernelTime                00:00:00.000

    Win32 Start Address 0x0000000003855c60

    Start Address 0x0000000078d6b630

    Stack Init fffffade56ea1e00 Current fffffade56e9f930

    Base fffffade56ea2000 Limit fffffade56e9b000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`56e9f970 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`56e9faf0 fffff800`0105457f nt!KiSwapThread+0x376

    fffffade`56e9fb50 fffff800`0125555c nt!KeWaitForSingleObject+0x58e

    fffffade`56e9fbd0 fffffade`5b19e471 nt!LpcpRequestWaitReplyPort+0x672

    fffffade`56e9fc60 fffffade`5b19f88a KSecDD!EfspDecryptFek+0x131

    fffffade`56e9fec0 fffffade`5b0a4f46 KSecDD!EfsDecryptFek+0x2a

    fffffade`56e9ff10 fffffade`5b0a4b9d Ntfs!EFSPostCreate+0x323

    fffffade`56ea0060 fffffade`5b119d70 Ntfs!EFSFilePostCreate+0x1a5

    fffffade`56ea00e0 fffffade`5b1f2daa Ntfs!NtfsFsdCreate+0x425

    fffffade`56ea0270 fffffade`59976284 fltMgr!FltpCreate+0x34a

    fffffade`56ea0300 fffffade`5997ed4c SomeFSFilter+0x9284

    fffffade`56ea0360 fffffade`5997637f SomeFSFilter+0x11d4c

    fffffade`56ea0390 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea0400 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea0470 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea0500 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea06b0 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea07c0 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea09b0 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea0b40 fffffade`599e2a67 nt!IoCreateFile+0x12f

    fffffade`56ea0c20 fffffade`599e3c91 SomeFSFilter+0x47a67

    fffffade`56ea0ce0 fffffade`599e62e9 SomeFSFilter+0x48c91

    fffffade`56ea0d10 fffffade`599aacd3 SomeFSFilter+0x4b2e9

    fffffade`56ea0e00 fffffade`599aad6e SomeFSFilter+0xfcd3

    fffffade`56ea0ea0 fffffade`599bea96 SomeFSFilter+0xfd6e

    fffffade`56ea0ed0 fffffade`599ae142 SomeFSFilter+0x23a96

    fffffade`56ea0f90 fffffade`599ae7f1 SomeFSFilter+0x13142

    fffffade`56ea1070 fffffade`5999c0e3 SomeFSFilter+0x137f1

    fffffade`56ea10a0 fffffade`599ae848 SomeFSFilter+0x10e3

    fffffade`56ea10d0 fffffade`599ae8c3 SomeFSFilter+0x13848

    fffffade`56ea1120 fffffade`599a9a96 SomeFSFilter+0x138c3

    fffffade`56ea1160 fffffade`599a782f SomeFSFilter+0xea96

    fffffade`56ea11a0 fffffade`599a7a02 SomeFSFilter+0xc82f

    fffffade`56ea11e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`56ea1230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`56ea1260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`56ea1290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`56ea12c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`56ea12f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea1360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea13d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea1460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea1610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea1720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea1910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea1aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`56ea1b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`56ea1c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

    00000000`00bbf258 00000000`00000000 0x78ef17ba

     

     

    Using the message Id mentioned in the “!thread” output we dump the LPC message. The output has two anomalies.

    1. No server thread has been listed which is processing the client request.
    2. The message queue as well as the LpcDataInfoChain is empty.

     

     

    Answers to both questions are very specific to LSASS. Most types of client requests received on LsaAuthenticationPort do not have extra data. As mentioned in CASE 1, server threads dequeue the message and chain it to LpcDataInfoChainHead only if the message contains extra information. If the message does not contain any extra data then the server does not need to store the message anywhere. The server would only need to remember the kind of request carried by the message.

     

    !lpc message searches the ETHREAD.LpcReceiveMessageId field to determine which server thread is processing the message. The way LSASS handles LPC request is that there is a single thread in LSASS that dequeues all messages from the connection port message queue and enqueues them to an LSASS specific internal queue. LSASS has a pool of worker threads that wait for requests in the internal queue. The worker threads dequeue the request, process it and reply back to the client. Hence the single LPC server thread in LSASS has its LpcReceiveMessageId overwritten by a new one every time it dequeues a message and enqueues it to the internal queue. That is the reason !lpc message fails to find a server thread for the LPC message sent to LsaAuthenticationPort.

     

     

     

     

    2: kd> !LPC MESSAGE 01561d96

    Searching message 1561d96 in threads ...

    Client thread fffffade6c9f1390 waiting a reply from 1561d96                         

    Searching thread fffffade6c9f1390 in port rundown queues ...

     

    Server communication port 0x033fc660

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xfffffa800380cc40      Server connection port: 0xfffffa800282ca40

     

    Client communication port 0xfffffa800380cc40

        Handles: 0   References: 1

        The LpcDataInfoChainHead queue is empty

     

    Server connection port fffffa800282ca40  Name: LsaAuthenticationPort

        Handles: 1   References: 83

        Server process  : fffffade6ead0040 (lsass.exe)

        Queue semaphore : fffffade6e9513c0

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

    Done.  

     

     

     

    Then how do you find the server thread that is responsible for replying back to the client and why it isn’t replying? Well there are many ways to do so depending upon the scenario. The way I chose was to look at all threads in LSASS which either have function calls on the stack that relate to the KSecDD request made or other hints. The reason threads only in LSASS were checked is because we knew that the request was sent to LsaAuthenticationPort and will be processed within LSASS. The hint was in the Wait Tick Count of the thread. I found a thread which had exactly the same Wait Tick Count as the client thread that was waiting for a reply. Further investigation of the thread confirmed that it was the one responsible for processing the request from KSecDD.

     

     

     

    2: kd> !thread fffffade6fa01460

    THREAD fffffade6fa01460  Cid 03a4.0100  Teb: 000007fffff8e000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

        fffffade6f665820  Semaphore Limit 0x1

        fffffade6f665840  NotificationEvent

    IRP List:

        fffffade6bb9a830: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa80044d11c0 (Level Impersonation)

    Owning Process            fffffade6ead0040       Image:         lsass.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      509109            

    UserTime                  00:01:54.500

    KernelTime                00:00:33.765

    Win32 Start Address 0x0000000000c21aa0

    Start Address 0x0000000078d6b630

    Stack Init fffffade5a894e00 Current fffffade5a893e40

    Base fffffade5a895000 Limit fffffade5a88f000 Call 0

    Priority 13 BasePriority 9 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`5a893e80 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`5a894000 fffff800`0105518f nt!KiSwapThread+0x376

    fffffade`5a894060 fffffade`599a8b28 nt!KeWaitForMultipleObjects+0x66e

    fffffade`5a8940f0 fffffade`599a7808 SomeFSFilter+0xdb28

    fffffade`5a8941a0 fffffade`599a7a02 SomeFSFilter+0xc808

    fffffade`5a8941e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`5a894230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`5a894260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`5a894290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`5a8942c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`5a8942f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`5a894360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`5a8943d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`5a894460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`5a894610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`5a894720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`5a894910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`5a894aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`5a894b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`5a894c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

     

     

    Having all the facts makes understanding the problem easier. SomeFSFilter.sys is an anti-virus driver that scans every file opened. During the open of an encrypted file it tries to scan the file before it is opened. For doing so it requests KSecDD to decrypt the file. KSecDD sends a LPC request to LsaAuthenticationPort for key decryption. LSASS as part of processing the request tries to open a file which for some reason is blocked by SomeFSFilter.sys. So we have a classic deadlock situation where SomeFSFilter.sys is waiting on LSASS to reply to the LPC message while LSASS is waiting on SomeFSFilter.sys for opening a file.

     

    In this scenario we could only use minimal help from the LPC debugger extension because of the way LPC requests were handled by LSASS. In such situations other heuristics are required to debug the problem.

     

    Finally, our journey through the world of LPC comes to an end. LPC has been the backbone of local inter-process communication since the NT4.0 days but as everything has a shell life, so did LPC. LPC has been replaced by ALPC (Asynchronous LPC) in Vista and Server 2008. The LPC APIs are now implemented over the new ALPC APIs to maintain compatibility. Although the LPC APIs are not exported, neither is there a requirement to call them directly, nevertheless having an understanding of the API helps in debugging as we have seen in the posts.

    Hope this information helps. Happy debugging!!!

     

  • Ntdebugging Blog

    TalkBackVideo Understanding handle leaks and How to use !htrace to find them

    • 2 Comments

    Written by Jeff Dailey

     

    Hello, my name is Jeff Dailey, I’m an Escalation Engineer for the Global Escalation Services Platforms team.   I’d like to show you how to debug and find leaking handles within your application or other process.  We can do this with the !htrace command in windbg .  Windbg is the Microsoft Windows Debugger most of us use in GES/CPR  for debugging.   

     

    Handles are a value we use in user mode, that when passed to a call that transitions to kernel, are used as an offset in your handle table to reference kernel mode objects.  Kernel mode objects are generally allocated from pool.   If you are having pool consumption problems and seeing errors like 2020 or 2019’s reported there is a good chance you may have a handle leak associated with them.  This is generally due to not doing a CloseHandle() on the handle when you have finished using it.

     

    You can vide the channel9 "how to debug handle leaks" video here 

     

    The following is the sample source for a handle leak that we will be debugging in our demo video.

     

    // leakyhandles.cpp : Defines the entry point for the console application.

    //

     

    #include "stdafx.h"

    #include <windows.h>

     

     

    void fun1(void);

    void fun2(void);

    void fun3(void);

    void fun4(void);

     

    int main(int argc, char* argv[])

    {

          while(1)

          {

                fun1();

                fun2();

                Sleep(100);

          }

          return 0;

    }

     

    void fun1(void)

    {

          fun3();

    }

     

    void fun2(void)

    {

          fun4();

     

    }

    void fun3(void)

    {

          HANDLE hEvent;

     

          hEvent = CreateEvent(NULL,TRUE,TRUE,NULL);

          CloseHandle(hEvent);

    }

    void fun4(void)

    {

        HANDLE hEvent2;

     

          hEvent2 = CreateEvent(NULL,TRUE,TRUE,NULL);

    }

     

    Thank you.

    Jeff Dailey

    Escalation Engineer (Platforms core team)

  • Ntdebugging Blog

    TalkBack Video: Cumulative Time Zone Update for Windows

    • 2 Comments

    Hi, my name is Joel Schaeffer, and I am an Escalation Engineer on the Platforms Global Escalation Services (GES) team.   I created a video to discuss the August 2007 Windows Cumulative Time Zone Update.  It is hosted on Channel9, and can be found here.  It will explain how to install the patch, validate the install, manually update non-mainstream Windows versions, and how to update Windows Mobile clients.

     

    See the following knowledge base articles for more information about how to update time zone information on your Windows clients and servers:

     

    August 2007 cumulative time zone update for Microsoft Windows operating systems

    http://support.microsoft.com/?id=933360

     

    How to configure daylight saving time for Microsoft Windows operating systems

    http://support.microsoft.com/?id=914387

     

    An August 2007 daylight saving time update for Windows Mobile-based devices is available

    http://support.microsoft.com/?id=941898

     

    See http://www.microsoft.com/dst2007 for more details about how to prepare for the upcoming DST changes for all Microsoft products.

  • Ntdebugging Blog

    Talkback video: Desktop Heap

    • 3 Comments

    Hello, Matthew here again.  Starting today, my team will be bringing you content in the form of videos, as well blog posts.  We’ll be hosting these videos on Channel 9, and we’ll link them from the ntdebugging blog.  One way that we’ll be using video is as a means of highlighting topics we’ve already covered, and then answering some of the common question that we receive from you in the form of comments on the blog.  So if you leave a comment with an interesting question, we may answer it in an upcoming video.  We are also thinking about using video as a way to show various debugging techniques rather than just blog about them. 

    For our first video, we’ve chosen desktop heap as a topic.  We’ve had a lot of traffic on the desktop heap posts, and we’ve had a lot of questions too.  So check out the video, and let us know what you think!

    http://channel9.msdn.com/Blogs/jeff_dailey/ntdebugging-blog-talk-about-on-desktop-heap

  • Ntdebugging Blog

    LPC Part 3: Debugging a real world LPC scenario

    • 4 Comments

    Hello my name is Roy; I’m an EE on the Microsoft global escalation services / CPR team.   This blog post continues my previous posts on LPC. 

    Below I will discuss scenarios that are simulated to depict actual issues encountered by Microsoft Customer Support Services. The two following scenarios deal with client threads waiting for LPC reply and hence in some way leading to user experience of a hung system or component.

     

    CASE1 – When everything in LPC looks intact

     

    In this scenario tracing the reason for client threads wait is possible through LPC data structures and extensions.

     

    The issue is a hung SMS (Systems Management Server) service. The OS (operating system) itself is working fine. It is just SMS that has hung.

     

    Taking a user mode hang dump (adplus.vbs hang mode dump) does not help since all we see is that threads in SMS are making RPC calls and are stuck. We need to investigate where these RPC calls are stuck. From the user dump we can see one important thing about the RPC calls is that they are implemented over LPC. LPC is completely implemented in the kernel and hence taking a kernel dump would be the next step in this case.

     

    02e9e698 7c821c94 ntdll!KiFastSystemCallRet

    02e9e69c 77c72700 ntdll!NtRequestWaitReplyPort+0xc              <--- RPC implemented over LPC

    02e9e6f4 77c713ba RPCRT4!LRPC_CCALL::SendReceive+0x230

    02e9e700 77788056 RPCRT4!I_RpcSendReceive+0x24          <--- RPC calls made

    02e9e71c 77788011 ole32!ThreadSendReceive+0xf5

    02e9e738 77787ed7 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x112

    02e9e818 77695349 ole32!CRpcChannelBuffer::SendReceive2+0xc1

    02e9e884 776952e1 ole32!CAptRpcChnl::SendReceive+0xab

    02e9e8d8 77ce127e ole32!CCtxComChnl::SendReceive+0x113

    02e9e8f4 77ce13ca RPCRT4!NdrProxySendReceive+0x43

    02e9ecdc 77ce11bd RPCRT4!NdrClientCall2+0x206

    02e9ecfc 77c53a12 RPCRT4!ObjectStublessClient+0x8b

    02e9ed0c 77688fd2 RPCRT4!ObjectStubless+0xf

    02e9ed70 7768915a ole32!CRpcResolver::CreateInstance+0x14e

    02e9efbc 776a9b68 ole32!CClientContextActivator::CreateInstance+0xfa

    02e9effc 776a9a9a ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7

    02e9f7b0 776b1275 ole32!ICoCreateInstanceEx+0x3cc

    02e9f7d8 776b1244 ole32!CComActivator::DoCreateInstance+0x28

    02e9f7fc 74cf1a51 ole32!CoCreateInstanceEx+0x23

    02e9f830 74cf19fe wbemprox!CDCOMTrans::DoActualCCI+0x3d

    02e9f874 74cf16b1 wbemprox!CDCOMTrans::DoCCI+0x12d

    02e9f934 74cf152e wbemprox!CDCOMTrans::DoActualConnection+0x25f

    02e9f960 74cf22ab wbemprox!CDCOMTrans::DoConnection+0x25

    02e9f9a0 1806eaa9 wbemprox!CLocator::ConnectServer+0x7c

    WARNING: Stack unwind information not available. Following frames may be wrong.

    02e9fa50 186c31b6 ccmcore!CCM::Utility::Wmi::Namespace::Open+0x1a2

    02e9fb44 186c357f PolicyAgent!DllUnregisterServer+0x10d8

    02e9fbb0 18706a63 PolicyAgent!DllUnregisterServer+0x14a1

    02e9fc8c 18708899 PolicyAgentEndpoint!DllGetClassObject+0x73d2

    02e9fd6c 187097d9 PolicyAgentEndpoint!DllGetClassObject+0x9208

    02e9fde0 01068bb8 PolicyAgentEndpoint!DllGetClassObject+0xa148

    02e9fe6c 7c83b332 CcmExec+0x68bb8

    02e9fec8 7c8161d0 ntdll!RtlpWaitOrTimerCallout+0x74

    02e9feec 7c81a3c5 ntdll!RtlpAsyncWaitCallbackCompletion+0x37

    02e9ff44 7c8200fc ntdll!RtlpWorkerCallout+0x71

    02e9ff64 7c81a3fa ntdll!RtlpExecuteWorkerRequest+0x4f

    02e9ff78 7c82017f ntdll!RtlpApcCallout+0x11

    02e9ffb8 77e6608b ntdll!RtlpWorkerThread+0x61

    02e9ffec 00000000 kernel32!BaseThreadStart+0x34

     

    We examine threads belonging to ccmexec.exe from the kernel dump.

     

     

    0: kd> !process 0 0 ccmexec.exe

    PROCESS 895a6430  SessionId: 0  Cid: 01c8    Peb: 7ffde000  ParentCid: 0278

        DirBase: 6f3c7000  ObjectTable: e22d6b00  HandleCount: 832.

        Image: CcmExec.exe

     

    0: kd> .process /p /r 895a6430

    Implicit process is now 895a6430

    Loading User Symbols

    ............................................................................................

     

     

    0: kd> !thread 88e75b10

    THREAD 88e75b10  Cid 01c8.0de0  Teb: 7ffa6000 Win32Thread: e2365ea8 WAIT: (WrLpcReply) UserMode Non-Alertable

        88e75cfc  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 00039236:

    Current LPC port e2327022

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            895a6430       Image:         CcmExec.exe

    Wait Start TickCount      99831          Ticks: 46825 (0:00:12:11.640)

    Context Switch Count      14119                 LargeStack

    UserTime                  00:00:00.125

    KernelTime                00:00:00.046

    Win32 Start Address ntdll!RtlpWorkerThread (0x7c83ad38)

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b9425000 Current b9424c08 Base b9425000 Limit b9421000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

     

    ChildEBP RetAddr  Args to Child             

    b9424c20 8083e6a2 88e75b88 88e75b10 88e75bb8 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b9424c4c 8083f164 88e75cfc 88e75cd0 88e75b10 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b9424c94 8093983f 88e75cfc 00000011 b9424d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b9424d50 80834d3f 00000114 0017db40 0017db40 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b9424d50 7c82ed54 00000114 0017db40 0017db40 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b9424d64)

    02e9e698 7c821c94 77c72700 00000114 0017db40 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    02e9e69c 77c72700 00000114 0017db40 0017db40 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    02e9e6f4 77c713ba 00127fd4 02e9e71c 77788056 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    02e9e700 77788056 00127fd4 02e9e800 0009f488 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    02e9e71c 77788011 00000000 00000000 00000000 ole32!ThreadSendReceive+0xf5 (FPO: [Non-Fpo])

    02e9e738 77787ed7 02e9e800 02e9e900 0009f488 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x112 (FPO: [Non-Fpo])

    02e9e818 77695349 0009f488 02e9e910 02e9e900 ole32!CRpcChannelBuffer::SendReceive2+0xc1 (FPO: [Non-Fpo])

    02e9e884 776952e1 0009f488 02e9e910 02e9e900 ole32!CAptRpcChnl::SendReceive+0xab (FPO: [Non-Fpo])

    02e9e8d8 77ce127e 0009f488 02e9e910 02e9e900 ole32!CCtxComChnl::SendReceive+0x113 (FPO: [Non-Fpo])

    02e9e8f4 77ce13ca 000a1934 02e9e93c 0600016e RPCRT4!NdrProxySendReceive+0x43 (FPO: [Non-Fpo])

    02e9ecdc 77ce11bd 77676298 7767951e 02e9ed14 RPCRT4!NdrClientCall2+0x206 (FPO: [Non-Fpo])

    02e9ecfc 77c53a12 00000014 00000004 02e9ed70 RPCRT4!ObjectStublessClient+0x8b (FPO: [Non-Fpo])

    02e9ed0c 77688fd2 000a1934 00000000 02e9f1b4 RPCRT4!ObjectStubless+0xf

    02e9ed70 7768915a 02e9f1b4 02e9f704 00000010 ole32!CRpcResolver::CreateInstance+0x14e (FPO: [Non-Fpo])

    02e9efbc 776a9b68 7778fcd0 00000000 00000000 ole32!CClientContextActivator::CreateInstance+0xfa (FPO: [Non-Fpo])

    02e9effc 776a9a9a 02e9f1b4 00000000 02e9f704 ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7 (FPO: [Non-Fpo])

    02e9f7b0 776b1275 74cf1a8c 00000000 00000014 ole32!ICoCreateInstanceEx+0x3cc (FPO: [Non-Fpo])

    02e9f7d8 776b1244 74cf1a8c 00000000 00000014 ole32!CComActivator::DoCreateInstance+0x28 (FPO: [Non-Fpo])

    02e9f7fc 74cf1a51 74cf1a8c 00000000 00000014 ole32!CoCreateInstanceEx+0x23 (FPO: [Non-Fpo])

    02e9f830 74cf19fe 02e9f898 00000001 00000000 wbemprox!CDCOMTrans::DoActualCCI+0x3d (FPO: [Non-Fpo])

    02e9f874 74cf16b1 02e9f898 00000001 00000000 wbemprox!CDCOMTrans::DoCCI+0x12d (FPO: [Non-Fpo])

    02e9f934 74cf152e 01801974 00000000 00000000 wbemprox!CDCOMTrans::DoActualConnection+0x25f (FPO: [Non-Fpo])

    02e9f960 74cf22ab 01801974 00000000 00000000 wbemprox!CDCOMTrans::DoConnection+0x25 (FPO: [Non-Fpo])

    02e9f9a0 1806eaa9 00b594d0 01801974 00000000 wbemprox!CLocator::ConnectServer+0x7c (FPO: [Non-Fpo])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    02e9fa50 186c31b6 0062bdaa 00000001 00000000 ccmcore!CCM::Utility::Wmi::Namespace::Open+0x1a2

    02e9fb44 186c357f 02e9fb68 00000001 00000000 PolicyAgent!DllUnregisterServer+0x10d8

    02e9fbb0 18706a63 0056a6f0 00000001 00000000 PolicyAgent!DllUnregisterServer+0x14a1

    02e9fc8c 18708899 0063cd50 02e9fd0c 02e9fcec PolicyAgentEndpoint!DllGetClassObject+0x73d2

    02e9fd6c 187097d9 ffffffff 00000864 00000000 PolicyAgentEndpoint!DllGetClassObject+0x9208

    02e9fde0 01068bb8 00566a28 006af398 00000000 PolicyAgentEndpoint!DllGetClassObject+0xa148

    02e9fe6c 7c83b332 000008b6 0011c700 0011c7f0 CcmExec+0x68bb8

    02e9fec8 7c8161d0 01068a29 000008b6 0011c700 ntdll!RtlpWaitOrTimerCallout+0x74 (FPO: [Non-Fpo])

    02e9feec 7c81a3c5 0011c7f0 7c889880 01e89b18 ntdll!RtlpAsyncWaitCallbackCompletion+0x37 (FPO: [Non-Fpo])

    02e9ff44 7c8200fc 7c816199 0011c7f0 00000000 ntdll!RtlpWorkerCallout+0x71 (FPO: [Non-Fpo])

    02e9ff64 7c81a3fa 00000000 0011c7f0 01e89b18 ntdll!RtlpExecuteWorkerRequest+0x4f (FPO: [Non-Fpo])

     

    The !thread output contains some important information.

    1. WAIT: (WrLpcReply): This tells us that the thread is waiting for a LPC reply. This information called as the “Wait Reason” is passed to KeWaitForSingleObject( ) as a parameter. This is retrieved from the _KTHREAD.WaitReason field.
    2. Waiting for reply to LPC MessageId 00039236: This line gives the message ID of the LPC message that was sent to the server thread.
    3. Current LPC port e2327022: The port number listed is the client communication port.
    4. 88e75cfc  Semaphore Limit 0x1 : The semaphore that the thread is waiting for is signaled when a reply message is posted to the client communication ports message queue. NtRequestWaitReplyPort( ) waits on this semaphore.
    5. Ticks: 46825 (0:00:12:11.640): This thread has been waiting for 12mins for a reply from the server.

     

    The next step is to find the server thread that is processing this LPC message. We will use LPC debugger extensions to trace the server thread.

     

    0: kd> !lpc message 00039236

    Searching message 39236 in threads ...

        Server thread 89531020 is working on message 39236                        

    Client thread 88e75b10 waiting a reply from 39236                         

    Searching thread 88e75b10 in port rundown queues ...

     

    Server communication port 0xe2300a38

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe2327020      Server connection port: 0xe13734b0

     

    Client communication port 0xe2327020

        Handles: 1   References: 9

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e2365188 - Busy  Id=0003832f  From: 04a8.13f4  Context=80050003  [e1373530 . e297ab00]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00050242 421e5fa2 9b3b48b4 a695daf5 00000001

            0000 e297ab00 - Busy  Id=00039232  From: 01c8.14e4  Context=8021000d  [e2365188 . e12cbea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8d06b40 b8d06b00 000003f4 000010a0

            0000 e12cbea0 - Busy  Id=00039236  From: 01c8.0de0  Context=8021000d  [e297ab00 . e1297430]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 4195c844 8ff87eb3 2b70d0f6 001e001c

            0000 e1297430 - Busy  Id=0003c96d  From: 0204.0224  Context=8022000e  [e12cbea0 . e1534a48]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8bb1b40 b8bb1b00 000003f4 00000fbc

            0000 e1534a48 - Busy  Id=0003db3b  From: 01c8.10e0  Context=8021000d  [e1297430 . e2515ca0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e2515ca0 - Busy  Id=0003db3f  From: 01c8.0f5c  Context=8021000d  [e1534a48 . e29279e8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e29279e8 - Busy  Id=0003db5a  From: 01c8.1618  Context=8021000d  [e2515ca0 . e1061008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00000000 00000000 00000000 00000000

            0000 e1061008 - Busy  Id=0004617f  From: 01c8.1220  Context=8021000d  [e29279e8 . e295fa80]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

            0000 e295fa80 - Busy  Id=0004a38a  From: 01c8.1708  Context=8021000d  [e1061008 . e2966ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0001a41a 000001c8 cb786dae 60cdf739

            0000 e2966ea0 - Busy  Id=0004d0c2  From: 04a8.04d4  Context=80050003  [e295fa80 . e1261008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00020001 00030242 5415989d 3d8048c1 00070005 00000000

            0000 e1261008 - Busy  Id=000544d7  From: 01c8.0758  Context=8021000d  [e2966ea0 . e1373530]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

        The LpcDataInfoChainHead queue contains 11 messages

        Threads in RunDown queue :     0xe2364fc8    0xe297a940    0xe12cbce0    0xe1297270    0xe1534888    0xe2515ae0    0xe2927828    0xe1060e48    0xe295f8c0    0xe2966ce0    0xe1260e48

    Done.                     

     

    When a client thread sends a message to a server the message ID is stored in the clients ETHREAD.LpcReplyMessageId field. !lpc message searches the message ID in all threads in the system. Once client thread is identified ETHREAD.LpcWaitingOnPort will give the client communication port used for the LPC transaction. This has references to the server communication port and connection port.

     

    !lpc message dumps the message queue for all the port objects. Important thing to note here is that even though the message queue is empty for connection port there are 11 messages chained to LpcDataInfoChainHead. This tells us that 11 messages were dequeued from the server message queue but the server has not replied to any of the messages since once the server responds to a message it is unchained from the LpcDataInfoChainHead list and freed.

     

    Messaging Queue

    Figure [4] LPC Message Queuing stages

     

    The above figure shows messages are first queued to the MsgQueue and then chained to LpcDataInfoChainHead. This kind of chaining only happens when the message contains extra data. This does not happen for a LPC_CONNECTION_REQUEST or if the message does not contain extra data. In our scenario server thread has dequeued and started processing the hence we see it in the LpcDataInfoChainHead.

     

    We now dump the server thread to see why it has not responded to the message for 12mins.

     

    0: kd> .process /p /r 896e1450

    Implicit process is now 896e1450

    Loading User Symbols

    ............................................................................................................................

     

    0: kd> !thread 89531020

    THREAD 89531020  Cid 03f4.10a0  Teb: 7ffac000 Win32Thread: 00000000 WAIT: (WrLpcReply) UserMode Non-Alertable

        8953120c  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 0003923d:

    Current LPC port e22d29d8

    Impersonation token:  e15315d0 (Level Impersonation)

    Owning Process            896e1450       Image:         svchost1.exe

    Wait Start TickCount      144629         Ticks: 2027 (0:00:00:31.671)

    Context Switch Count      219            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address 0x00039236

    LPC Server thread working on message Id 39236

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b8d07000 Current b8d06c08 Base b8d07000 Limit b8d04000 Call 0

    Priority 10 BasePriority 8 PriorityDecrement 0

    ChildEBP RetAddr  Args to Child             

    b8d06c20 8083e6a2 89531098 89531020 895310c8 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b8d06c4c 8083f164 8953120c 895311e0 89531020 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b8d06c94 8093983f 8953120c 00000011 b8d06d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b8d06d50 80834d3f 0000056c 000f3fc0 000f3fc0 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b8d06d50 7c82ed54 0000056c 000f3fc0 000f3fc0 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b8d06d64)

    0097f0a8 7c821c94 77c72700 0000056c 000f3fc0 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0097f0ac 77c72700 0000056c 000f3fc0 000f3fc0 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    0097f118 77c713ba 0097f154 0097f138 77c72c7f RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    0097f124 77c72c7f 0097f154 76d320d8 0097f540 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    0097f138 77ce219b 0097f180 000db5f8 00000000 RPCRT4!NdrSendReceive+0x2b (FPO: [Non-Fpo])

    0097f520 76d41213 76d320d8 76d402c8 0097f540 RPCRT4!NdrClientCall2+0x22e (FPO: [Non-Fpo])

    0097f538 76d3fc12 000b5618 000ce248 000ce268 rpcss!LocalCreateInstance+0x1c (FPO: [Non-Fpo])

    0097f5cc 76d3f033 0097f7d8 0097f654 0097f824 rpcss!CServerListEntry::CallServer+0x127 (FPO: [Non-Fpo])

    0097f604 76d3bc7f 00000001 00000008 00000000 rpcss!CServerTableEntry::CallRunningServer+0x181 (FPO: [Non-Fpo])

    0097f670 76d3bac5 0097f7d8 00000000 000e415c rpcss!Activation+0x337 (FPO: [Non-Fpo])

    0097f6cc 76d3dd08 000e415c 0097f7bc 0097f71c rpcss!ActivateFromProperties+0x213 (FPO: [Non-Fpo])

    0097f6dc 76d3dd5c 76d738d0 00000000 000e415c rpcss!CScmActivator::CreateInstance+0x10 (FPO: [Non-Fpo])

    0097f71c 76d3c162 000e415c 00000000 0097f7bc rpcss!ActivationPropertiesIn::DelegateCreateInstance+0xf7 (FPO: [Non-Fpo])

    0097f764 76d3bdb6 000e415c 0097f7bc 00000000 rpcss!ActivateFromPropertiesPreamble+0x4e3 (FPO: [Non-Fpo])

    0097f7ac 76d3dcdd 00000002 000ce2ac 00000000 rpcss!PerformScmStage+0xbb (FPO: [Non-Fpo])

    0097f8c8 77c70f3b 000cdbb0 000ce248 000ce268 rpcss!SCMActivatorCreateInstance+0xa2 (FPO: [Non-Fpo])

    0097f8f8 77ce23f7 76d3dc3e 0097fae0 00000007 RPCRT4!Invoke+0x30

    0097fcf8 77ce26ed 00000000 00000000 000cdcdc RPCRT4!NdrStubCall2+0x299 (FPO: [Non-Fpo])

    0097fd14 77c709be 000cdcdc 00094ef0 000cdcdc RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])

    0097fd48 77c7093f 76d3156c 000cdcdc 0097fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0097fd9c 77c70865 00000004 00000000 76d73538 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0097fdc0 77c734b1 000cdcdc 00000000 76d73538 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0097fdfc 77c71bb3 000b44d8 0008f0e8 000dea40 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0097fe20 77c75458 0008f120 0097fe38 000b44d8 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0097ff84 77c5778f 0097ffac 77c5f7dd 0008f0e8 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0097ff8c 77c5f7dd 0008f0e8 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0097ffac 77c5de88 00084298 0097ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0097ffb8 77e6608b 000c0070 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

    0097ffec 00000000 77c5de6d 000c0070 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    The server thread is waiting for a reply to another LPC message it issued while servicing the client request. Tracing the LPC message as above -

     

    0: kd> !lpc message 0003923d

    Searching message 3923d in threads ...

    Client thread 89531020 waiting a reply from 3923d                         

        Server thread 898cd998 is working on message 3923d                         

    Searching thread 89531020 in port rundown queues ...

     

    Server communication port 0xe22d58c8

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe22d29d8      Server connection port: 0xe17a95f8

     

    Client communication port 0xe22d29d8

        Handles: 1   References: 15

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e17a95f8  Name: OLE71892D01F993498EA1C3757C71D5

        Handles: 1   References: 37

        Server process  : 89bedd88 (svchost.exe)

        Queue semaphore : 89718210

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e188c3b0 - Busy  Id=00039239  From: 03f4.111c  Context=80030001  [e17a9678 . e28b00f8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28b00f8 - Busy  Id=0003923d  From: 03f4.10a0  Context=80030001  [e188c3b0 . e2846008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2846008 - Busy  Id=0003c96e  From: 03f4.0df0  Context=80030001  [e28b00f8 . e27dcb70]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e27dcb70 - Busy  Id=0003db3c  From: 03f4.1158  Context=80030001  [e2846008 . e28c0788]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28c0788 - Busy  Id=0003db41  From: 03f4.0fbc  Context=80030001  [e27dcb70 . e2983650]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2983650 - Busy  Id=0003db5c  From: 03f4.0f40  Context=80030001  [e28c0788 . e2596ae0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2596ae0 - Busy  Id=00046180  From: 03f4.03d8  Context=80030001  [e2983650 . e28c3ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28c3ea0 - Busy  Id=0004a38d  From: 03f4.07c4  Context=80030001  [e2596ae0 . e294aea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e294aea0 - Busy  Id=0004d0c6  From: 03f4.1714  Context=80030001  [e28c3ea0 . e2977ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00030342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2977ea0 - Busy  Id=000544d8  From: 03f4.162c  Context=80030001  [e294aea0 . e241f008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e241f008 - Busy  Id=0006077e  From: 03f4.031c  Context=80030001  [e2977ea0 . e2940820]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2940820 - Busy  Id=000638da  From: 03f4.1754  Context=80030001  [e241f008 . e2576270]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2576270 - Busy  Id=000638f1  From: 03f4.1794  Context=80030001  [e2940820 . e261c268]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e261c268 - Busy  Id=000638fe  From: 03f4.17a8  Context=80030001  [e2576270 . e17a9678]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

        The LpcDataInfoChainHead queue contains 14 messages

        Threads in RunDown queue :     0xe188c1f0    0xe28aff38    0xe2845e48    0xe27dc9b0    0xe28c05c8    0xe2983490    0xe2596920    0xe28c3ce0    0xe294ace0    0xe2977ce0    0xe241ee48    0xe2940660    0xe25760b0    0xe261c0a8

    Done.        

     

     

    0: kd> .process /p /r 89bedd88

    Implicit process is now 89bedd88

    Loading User Symbols

    ............................................................................................................................

     

     

    0: kd> !thread 898cd998

    THREAD 898cd998  Cid 04a8.0508  Teb: 7ffd5000 Win32Thread: e23a6008 WAIT: (UserRequest) UserMode Non-Alertable

        89006340  SynchronizationEvent

    IRP List:

        88eb89e0: (0006,0094) Flags: 00000800  Mdl: 00000000

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            89bedd88       Image:         svchost.exe

    Wait Start TickCount      99831          Ticks: 46825 (0:00:12:11.640)

    Context Switch Count      7330                 LargeStack

    UserTime                  00:00:00.906

    KernelTime                00:00:00.531

    Win32 Start Address 0x0003923d

    LPC Server thread working on message Id 3923d

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b9a55000 Current b9a54c60 Base b9a55000 Limit b9a51000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b9a54c78 8083e6a2 898cda10 898cd998 898cda40 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b9a54ca4 8083f164 00000000 00000000 00000000 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b9a54cec 8092db70 89006340 00000006 80834d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b9a54d50 80834d3f 00000d64 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])

    b9a54d50 7c82ed54 00000d64 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b9a54d64)

    0109f248 7c822124 77e6bad8 00000d64 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0109f24c 77e6bad8 00000d64 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    0109f2bc 77e6ba42 00000d64 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    0109f2d0 77719486 00000d64 ffffffff 00000000 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    0109f2e4 7769c09b 00086803 00000000 00086660 ole32!CComApartment::WaitForAccess+0x5c (FPO: [0,0,0])

    0109f2f8 776a417d 0109f344 02163088 02145ac0 ole32!CComApartment::GetPreRegMOID+0x41 (FPO: [Non-Fpo])

    0109f310 776a40d2 0109f344 77790858 00000001 ole32!GetPreRegMOID+0x24 (FPO: [Non-Fpo])

    0109f358 776a48e3 00000001 00000003 0109f3c4 ole32!CIDObject::GetOrCreateStdID+0x11e (FPO: [Non-Fpo])

    0109f37c 776a4419 02163088 00000000 00086860 ole32!ObtainStdIDFromUnk+0x92 (FPO: [Non-Fpo])

    0109f3a4 776a43c1 0109f468 00162690 02b8bd30 ole32!StdMarshalObject+0x49 (FPO: [Non-Fpo])

    0109f400 776a3974 0109f42c 0109f468 00162690 ole32!CDestObjectWrapper::MarshalInterface+0x150 (FPO: [Non-Fpo])

    0109f43c 77686f83 0109f468 00162690 02b8bd30 ole32!CoMarshalInterface+0x93 (FPO: [Non-Fpo])

    0109f490 77686e38 02b8bd30 00162690 00162690 ole32!MarshalHelperMulti+0x48 (FPO: [Non-Fpo])

    0109f4bc 77686d46 58b10040 00000001 00162690 ole32!GetInstanceHelperMulti+0x4d (FPO: [Non-Fpo])

    0109f584 77c70f3b 000a7938 00000000 0218b774 ole32!CObjServer::CreateInstance+0x30d (FPO: [Non-Fpo])

    0109f5a8 77ce23f7 77686b34 0109f790 00000004 RPCRT4!Invoke+0x30

    0109f9a8 77ce1a95 000b4398 000b2074 05cc596c RPCRT4!NdrStubCall2+0x299 (FPO: [Non-Fpo])

    0109fa00 77788ec1 000b4398 05cc596c 000b2074 RPCRT4!CStdStubBuffer_Invoke+0xc6 (FPO: [Non-Fpo])

    0109fa44 77788e74 05cc596c 000b4428 00000000 ole32!SyncStubInvoke+0x33 (FPO: [Non-Fpo])

    0109fa8c 7769b2b7 05cc596c 000b4288 000b4398 ole32!StubInvoke+0xa7 (FPO: [Non-Fpo])

    0109fb64 7769b0b5 000b2074 00000000 000b4398 ole32!CCtxComChnl::ContextInvoke+0xe3 (FPO: [Non-Fpo])

    0109fb80 77789926 05cc596c 00000001 000b4398 ole32!MTAInvoke+0x1a (FPO: [Non-Fpo])

    0109fbb0 77788c33 d0908070 000b2074 000b4398 ole32!AppInvoke+0xa3 (FPO: [Non-Fpo])

    0109fc84 77789936 05cc5918 000b22f8 000b4410 ole32!ComInvokeWithLockAndIPID+0x2c5 (FPO: [Non-Fpo])

    0109fcd0 77c709be 020ccc24 000b4410 020ccc24 ole32!ThreadInvoke+0x1e0 (FPO: [Non-Fpo])

    0109fd04 77c7093f 777887e0 020ccc24 0109fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0109fd58 77c70865 00000000 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0109fd7c 77c657eb 020ccc24 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0109fdbc 77c71e26 020ccc24 020ccbdc 00000000 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0xc0 (FPO: [Non-Fpo])

    0109fdfc 77c71bb3 000c7db0 000b1498 00097c08 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0109fe20 77c75458 000b14d0 0109fe38 000c7db0 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0109ff84 77c5778f 0109ffac 77c5f7dd 000b1498 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0109ff8c 77c5f7dd 000b1498 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0109ffac 77c5de88 0008aa48 0109ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0109ffb8 77e6608b 000b1678 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

     

    The above LPC server thread is waiting for allocation of COM OID (Object Identifier) since we have run out of OIDs. The way OIDs are allocated is by calling CComApartment::CallTheResolver( ). Searching threads in the system with CComApartment::CallTheResolver on the stack we get the below thread within svchost.exe.

     

     

    0: kd> !thread 88d76db0 

    THREAD 88d76db0  Cid 04a8.13f4  Teb: 7ffd6000 Win32Thread: e295c758 WAIT: (Unknown) UserMode Non-Alertable

        88d76f9c  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 0003832f:

    Current LPC port e15791b8

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            89bedd88       Image:         svchost.exe

    Wait Start TickCount      99073          Ticks: 47583 (0:00:12:23.484)

    Context Switch Count      6947                 LargeStack

    UserTime                  00:00:00.859

    KernelTime                00:00:00.562

    Win32 Start Address 0x0003832e

    LPC Server thread working on message Id 3832e

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b8af6000 Current b8af5c08 Base b8af6000 Limit b8af2000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b8af5c20 8083e6a2 88d76e28 88d76db0 88d76e58 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b8af5c4c 8083f164 88d76f9c 88d76f70 88d76db0 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b8af5c94 8093983f 88d76f9c 00000011 b8af5d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b8af5d50 80834d3f 00000094 000dde10 000dde10 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b8af5d50 7c82ed54 00000094 000dde10 000dde10 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b8af5d64)

    0251ed2c 7c821c94 77c72700 00000094 000dde10 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0251ed30 77c72700 00000094 000dde10 000dde10 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    0251ed88 77c713ba 0251edc4 0251eda8 77c72c7f RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    0251ed94 77c72c7f 0251edc4 776720f8 0251f1b0 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    0251eda8 77ce219b 0251edf0 05c6b44c 0251f360 RPCRT4!NdrSendReceive+0x2b (FPO: [Non-Fpo])

    0251f190 7769c087 776720f8 77671df0 0251f1b0 RPCRT4!NdrClientCall2+0x22e (FPO: [Non-Fpo])

    0251f1a8 7769c04e 0008b0a8 0008bc30 0251f1f4 ole32!ServerAllocateOIDs+0x1c (FPO: [Non-Fpo])

    0251f1d8 7769c01b 0251f1f4 0251f368 0251f220 ole32!CRpcResolver::ServerAllocOIDs+0x2a (FPO: [Non-Fpo])

    0251f1fc 7769bf93 0251f368 0251f220 00000014 ole32!OXIDEntry::AllocOIDs+0x40 (FPO: [Non-Fpo])

    0251f36c 7769c0b2 00086803 00000000 00086660 ole32!CComApartment::CallTheResolver+0x6b (FPO: [Non-Fpo])

    0251f380 776a417d 0251f3cc 05c67490 05d59ee0 ole32!CComApartment::GetPreRegMOID+0x54 (FPO: [Non-Fpo])

    0251f398 776a40d2 0251f3cc 77790858 00000001 ole32!GetPreRegMOID+0x24 (FPO: [Non-Fpo])

    0251f3e0 776a48e3 00000001 00000003 0251f44c ole32!CIDObject::GetOrCreateStdID+0x11e (FPO: [Non-Fpo])

    0251f404 776a4419 05c67490 00000000 00086860 ole32!ObtainStdIDFromUnk+0x92 (FPO: [Non-Fpo])

    0251f42c 776a43c1 05e39b00 0251f504 02b30234 ole32!StdMarshalObject+0x49 (FPO: [Non-Fpo])

    0251f488 776a3974 0251f4b4 05e39b00 0251f504 ole32!CDestObjectWrapper::MarshalInterface+0x150 (FPO: [Non-Fpo])

    0251f4c4 77c544a3 05e39b00 0251f504 02b30234 ole32!CoMarshalInterface+0x93 (FPO: [Non-Fpo])

    0251f518 77c54e85 0251f5e8 02b30234 74ce6210 RPCRT4!NdrpInterfacePointerMarshall+0x9a (FPO: [Non-Fpo])

    0251f538 77c691ad 0251f5e8 000f2898 02b30234 RPCRT4!NdrpPointerMarshall+0x87 (FPO: [Non-Fpo])

    0251f550 77c690fc 0251f5e8 02b30234 74ce6210 RPCRT4!NdrPointerMarshall+0x30 (FPO: [Non-Fpo])

    0251f570 77c691ad 0251f5e8 00000000 00112e50 RPCRT4!NdrpPointerMarshall+0x116 (FPO: [Non-Fpo])

    0251f588 77c715ac 0251f5e8 00112e50 74ce620c RPCRT4!NdrPointerMarshall+0x30 (FPO: [Non-Fpo])

    0251f5ac 77ce249e 00000000 00000001 020ce8c8 RPCRT4!NdrpServerMarshal+0x82 (FPO: [Non-Fpo])

    0251f9a8 77ce1a95 020ce8c8 0011fa08 000f1454 RPCRT4!NdrStubCall2+0x374 (FPO: [Non-Fpo])

    0251fa00 77788ec1 020ce8c8 000f1454 0011fa08 RPCRT4!CStdStubBuffer_Invoke+0xc6 (FPO: [Non-Fpo])

    0251fa44 77788e74 000f1454 00137868 020debf0 ole32!SyncStubInvoke+0x33 (FPO: [Non-Fpo])

    0251fa8c 7769b2b7 000f1454 0014aaf8 020ce8c8 ole32!StubInvoke+0xa7 (FPO: [Non-Fpo])

    0251fb64 7769b0b5 0011fa08 00000000 020ce8c8 ole32!CCtxComChnl::ContextInvoke+0xe3 (FPO: [Non-Fpo])

    0251fb80 77789926 000f1454 00000001 020ce8c8 ole32!MTAInvoke+0x1a (FPO: [Non-Fpo])

    0251fbb0 77788c33 d0908070 0011fa08 020ce8c8 ole32!AppInvoke+0xa3 (FPO: [Non-Fpo])

    0251fc84 77789936 000f1400 020d3b48 00137850 ole32!ComInvokeWithLockAndIPID+0x2c5 (FPO: [Non-Fpo])

    0251fcd0 77c709be 0216202c 00137850 0216202c ole32!ThreadInvoke+0x1e0 (FPO: [Non-Fpo])

    0251fd04 77c7093f 777887e0 0216202c 0251fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0251fd58 77c70865 00000000 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0251fd7c 77c657eb 0216202c 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

     

     

    0: kd> !lpc message 0003832f

    Searching message 3832f in threads ...

        Server thread 8900edb0 is working on message 3832f                        

    Client thread 88d76db0 waiting a reply from 3832f                         

    Searching thread 88d76db0 in port rundown queues ...

     

    Server communication port 0xe188a6e8

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe15791b8      Server connection port: 0xe13734b0

     

    Client communication port 0xe15791b8

        Handles: 1   References: 5

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e2365188 - Busy  Id=0003832f  From: 04a8.13f4  Context=80050003  [e1373530 . e297ab00]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00050242 421e5fa2 9b3b48b4 a695daf5 00000001

            0000 e297ab00 - Busy  Id=00039232  From: 01c8.14e4  Context=8021000d  [e2365188 . e12cbea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8d06b40 b8d06b00 000003f4 000010a0

            0000 e12cbea0 - Busy  Id=00039236  From: 01c8.0de0  Context=8021000d  [e297ab00 . e1297430]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 4195c844 8ff87eb3 2b70d0f6 001e001c

            0000 e1297430 - Busy  Id=0003c96d  From: 0204.0224  Context=8022000e  [e12cbea0 . e1534a48]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8bb1b40 b8bb1b00 000003f4 00000fbc

            0000 e1534a48 - Busy  Id=0003db3b  From: 01c8.10e0  Context=8021000d  [e1297430 . e2515ca0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e2515ca0 - Busy  Id=0003db3f  From: 01c8.0f5c  Context=8021000d  [e1534a48 . e29279e8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e29279e8 - Busy  Id=0003db5a  From: 01c8.1618  Context=8021000d  [e2515ca0 . e1061008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00000000 00000000 00000000 00000000

            0000 e1061008 - Busy  Id=0004617f  From: 01c8.1220  Context=8021000d  [e29279e8 . e295fa80]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

            0000 e295fa80 - Busy  Id=0004a38a  From: 01c8.1708  Context=8021000d  [e1061008 . e2966ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0001a41a 000001c8 cb786dae 60cdf739

            0000 e2966ea0 - Busy  Id=0004d0c2  From: 04a8.04d4  Context=80050003  [e295fa80 . e1261008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00020001 00030242 5415989d 3d8048c1 00070005 00000000

            0000 e1261008 - Busy  Id=000544d7  From: 01c8.0758  Context=8021000d  [e2966ea0 . e1373530]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

        The LpcDataInfoChainHead queue contains 11 messages

        Threads in RunDown queue :     0xe2364fc8    0xe297a940    0xe12cbce0    0xe1297270    0xe1534888    0xe2515ae0    0xe2927828    0xe1060e48    0xe295f8c0    0xe2966ce0    0xe1260e48

    Done.                                             

     

     

    0: kd> !thread 8900edb0

    THREAD 8900edb0  Cid 03f4.0598  Teb: 7ffd4000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        89014cc0  NotificationEvent

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            896e1450       Image:         svchost1.exe

    Wait Start TickCount      144629         Ticks: 2027 (0:00:00:31.671)

    Context Switch Count      5892            

    UserTime                  00:00:00.765

    KernelTime                00:00:01.203

    Win32 Start Address 0x0003832f

    LPC Server thread working on message Id 3832f

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b95d1000 Current b95d0c60 Base b95d1000 Limit b95ce000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    ChildEBP RetAddr  Args to Child             

    b95d0c78 8083e6a2 8900ee28 8900edb0 8900ee58 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b95d0ca4 8083f164 00000000 00000000 00000000 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b95d0cec 8092db70 89014cc0 00000006 80935d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b95d0d50 80834d3f 000004e8 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])

    b95d0d50 7c82ed54 000004e8 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b95d0d64)

    0082f7a4 7c822124 77e6bad8 000004e8 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0082f7a8 77e6bad8 000004e8 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    0082f818 77e6ba42 000004e8 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    0082f82c 77cab1e0 000004e8 ffffffff 000bc314 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    0082f840 77cab333 0082f89c 000ad626 000bc314 RPCRT4!SWMRLock::LockSharedOrExclusiveOnLastWriter+0x32 (FPO: [Non-Fpo])

    0082f868 77c72e00 0082f89c 00000001 00000002 RPCRT4!SWMRLock::LockSharedOrExclusive+0xa6 (FPO: [Non-Fpo])

    0082f890 77c72f71 00000000 00000000 00000010 RPCRT4!NDRSContextUnmarshall2+0x1b9 (FPO: [Non-Fpo])

    0082f8bc 77c72eb6 4182f938 76d32c22 0082f938 RPCRT4!NdrServerContextNewUnmarshall+0xa6 (FPO: [Non-Fpo])

    0082f8d0 77c70f93 0082f938 0082fae4 76d32c22 RPCRT4!NdrUnmarshallHandle+0x52 (FPO: [Non-Fpo])

    0082f900 77ce2345 0082fae4 00094618 00000000 RPCRT4!NdrpServerUnMarshal+0x13d (FPO: [Non-Fpo])

    0082fcf8 77ce26ed 00000000 00000000 000bbd9c RPCRT4!NdrStubCall2+0x19f (FPO: [Non-Fpo])

    0082fd14 77c709be 000bbd9c 00094618 000bbd9c RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])

    0082fd48 77c7093f 76d3156c 000bbd9c 0082fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0082fd9c 77c70865 00000005 00000000 76d736c0 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0082fdc0 77c734b1 000bbd9c 00000000 76d736c0 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0082fdfc 77c71bb3 00099188 0008f0e8 000adfd8 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0082fe20 77c75458 0008f120 0082fe38 00099188 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0082ff84 77c5778f 0082ffac 77c5f7dd 0008f0e8 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0082ff8c 77c5f7dd 0008f0e8 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0082ffac 77c5de88 00084298 0082ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0082ffb8 77e6608b 000bc298 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

    0082ffec 00000000 77c5de6d 000bc298 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    The above thread belongs to the RPC service and is waiting for a lock to be acquired.

     

    0: kd> x /v svchost1!*service*name*

    pub global 01004078    0 svchost1!ServiceNames = <no type information>

     

    0: kd> du poi(01004078)

    000839f8  "RpcSs"

     

    To find why this thread is not able to acquire the lock, investigation of the threads within the RPC service is required. The reason the lock is not available is because there is a thread that has acquired the lock and is trying to reach a client that does not exist.

     

    The chain of events is summarized in the below figure. Note that all horizontal links were traversed using LPC data structures and debugger extensions. The vertical links were traversed by investigation beyond the scope of LPC. Thus the issue that started in a component of SMS was finally traced to RPC through the use of LPC data structures and debugger extension. LPC is one of the core inter-process communication mechanisms. Understanding LPC and learning to use its extensions helps a lot to debug issues where client threads are waiting for reply from the server.

     

    Chain of events

  • Ntdebugging Blog

    LPC part 2 Kernel Debugger Extensions

    • 1 Comments

    Hello my name is Roy, I’m an EE on the Microsoft global escalation services / CPR team.   This blog is a follow on to my first LPC blog.  We will be discussing debugger extensions that allow you to look at LPC related issues.

     

    Disclaimer: The purpose of this blog is to illustrate debugging techniques with LPC. Please do not rely on this information as documentation for the APIs when writing your own code. As always, this is subject to change in the future.  

     

    LPC Kernel Debugger Extensions

     

     

    Command

    Description

    !lpc                    

    Display the list and description of all the !lpc commands

    !lpc message [MessageId]

    Display the message with a given ID and all related information by attempting to match the given Message ID to the EHTREAD->LpcReceivedMessageId and to the ETHREAD->LpcReplyMessageId of all threads in the system.

    If the MessageId parameter is omitted then it attempts to display all the outstanding messages in the system by searching for the tag ‘LpcM’ in the pools.

    !lpc port [PortAddress] 

    Displays port information. If a server connection port address is specified then only information about that port is displayed. If either a client or server communication port is specified it prints information about the specified communication port, the peer communication port and the server connection port.

    If the PortAddress parameter is omitted then it attempts to walk the list of all objects of type “Port” and “WaitablePort” and display them. Note that for this feature to work the GFlags option “+otl” i.e. “Maintain a list of objects for each type” must be enabled.

    !lpc scan PortAddress   

    Displays port information. It attempts to walk the list of all objects of type “Port” and “WaitablePort” and display the one matching the specified port address. Note that for this feature to work the GFlags option “+otl” i.e. “Maintain a list of objects for each type” must be enabled.

    !lpc thread [ThreadAddr]

    If  ThreadAddr is specified it walks the list of threads in the ETHREAD-> LpcReplyChain to locate the list head i.e. a “Port” or “WaitablePort” object on which the thread is waiting for a reply.

    If the ThreadAddr parameter is omitted then it attempts to find all LPC server threads by looking for threads with a non-NULL EHTREAD->LpcReceivedMessageId and all client threads by looking for threads with a non-NULL ETHREAD->LpcReplyMessageId and displays them.

    !lpc PoolSearch

    Toggles a setting that controls whether the “lpc message” command will search for LPC message tag (‘LpcM’) in the kernel pools or not.

     

     

     

    LPC Kernel Debugger Extension Usage

     

     

    LPC Connection Port information from call stack

     

    On the call stack of any client or server threads blocked on LPC data transfer or LPC connection there will be a frame containing either one of the functions  NtRequestWaitReplyPort() or NtReplyWaitReceivePortEx() . The first parameter to either one of these functions is the handle to the port they are blocked on.

     

    kd> !thread 810de2a8 

    THREAD 810de2a8  Cid 01dc.01f4  Teb: 7ffde000 Win32Thread: 00000000 WAIT: (WrLpcReceive) UserMode Non-Alertable

        81131188  Semaphore Limit 0x7fffffff

        810de398  NotificationTimer

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810ddda0       Image:         rpclpcs.exe

    Wait Start TickCount      64555          Ticks: 402 (0:00:00:04.025)

    Context Switch Count      2            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address 0x77e76bf0

    Start Address 0x7c810856

    Stack Init f8e28000 Current f8e27c4c Base f8e28000 Limit f8e25000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 0

    ChildEBP RetAddr  Args to Child             

    f8e27c64 804dc6a6 810de318 810de2a8 804dc6f2 nt!KiSwapContext+0x2e (FPO: [Uses EBP] [0,0,4])

    f8e27c70 804dc6f2 e1084108 8055a540 e1084108 nt!KiSwapThread+0x46 (FPO: [0,0,0])

    f8e27c98 8056a50a 00000001 00000010 00000001 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])

    f8e27d48 804df06b 000007c4 002bff70 00000000 nt!NtReplyWaitReceivePortEx+0x3dc (FPO: [Non-Fpo])

    f8e27d48 7c90eb94 000007c4 002bff70 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f8e27d64)

    002bff80 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

     

    kd> !handle 7c4 3 810ddda0       

    processor number 0, process 810ddda0

    PROCESS 810ddda0  SessionId: 0  Cid: 01dc    Peb: 7ffd9000  ParentCid: 01b4

        DirBase: 058cd000  ObjectTable: e1a13278  HandleCount:  18.

        Image: rpclpcs.exe

     

    Handle table at e107d000 with 18 Entries in use

    07c4: Object: e1084108  GrantedAccess: 001f0001 Entry: e107df88

    Object: e1084108  Type: (812b5c80) Port

        ObjectHeader: e10840f0 (old version)

            HandleCount: 1  PointerCount: 4

            Directory Object: e14c72c8  Name: rpclpc

     

    kd> !lpc port e1084108

     

    Server connection port e1084108  Name: rpclpc

        Handles: 1   References: 4

        Server process  : 810ddda0 (rpclpcs.exe)

        Queue semaphore : 81131188

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

     

     

    LPC Messages that are waiting to be picked up by the server thread

     

    When a message is queued to the server connection port and the server thread has not yet not been signaled to pull out the message the ‘!lpc port <Port>” displays the following output that includes the message.

     

    kd> !lpc port e10d8388

     

    Client communication port 0xe10d8388

        Handles: 1   References: 2

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe106dc00      Server connection port: 0xe15512e0

     

    Server communication port 0xe106dc00

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e15512e0  Name: rpclpc

        Handles: 1   References: 9

        Server process  : ffbbebe8 (rpclpcs.exe)

        Queue semaphore : 81250848

        Semaphore state 0 (0x0)

            Messages in queue:

            0000 e10513f8 - Busy  Id=000048d8  From: 05f4.0108  Context=80020000  [e15512f0 . e15512f0]

                       Length=0054003c  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 00000000 f877bc04 804ec10e

        The message queue contains 1 messages

        The LpcDataInfoChainHead queue is empty

     

     

    LPC Messages being processed by the server

     

    When an LPC message has been queued by the client, dequeued by the sever and being worked upon by the server both the client and the server thread have the message ID associated with them

     

    kd> !lpc message 16fa

    Searching message 16fa in threads ...

        Server thread 810de2a8 is working on message 16fa                        

    Client thread 810dc930 waiting a reply from 16fa                         

    Searching thread 810dc930 in port rundown queues ...

     

    Server connection port e1084108  Name: rpclpc

        Handles: 1   References: 3

        Server process  : 810ddda0 (rpclpcs.exe)

        Queue semaphore : 81131188

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

    Done.                                             

     

    kd> !thread 810de2a8

    THREAD 810de2a8  Cid 01dc.01f4  Teb: 7ffde000 Win32Thread: 00000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810ddda0       Image:         rpclpcs.exe

    Wait Start TickCount      65732          Ticks: 10 (0:00:00:00.100)

    Context Switch Count      4            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.010

    Win32 Start Address 0x000016fa

    LPC Server thread working on message Id 16fa

    Start Address kernel32!BaseThreadStartThunk (0x7c810856)

    Stack Init f8e28000 Current f8e27728 Base f8e28000 Limit f8e25000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0 DecrementCount 0

    ChildEBP RetAddr  Args to Child             

    f8e27d64 7c90eb94 badb0d00 002bfe24 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f8e27d64)

    002bfe18 7c90e399 77e76703 000007c4 002bff70 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    002bfe1c 77e76703 000007c4 002bff70 00000000 ntdll!NtReplyWaitReceivePortEx+0xc (FPO: [5,0,0])

    002bff80 77e76c22 002bffa8 77e76a3b 00084540 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0xf4 (FPO: [Non-Fpo])

    002bff88 77e76a3b 00084540 7c90ee18 0006fb10 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    002bffa8 77e76c0a 00083f20 002bffec 7c80b50b RPCRT4!BaseCachedThreadRoutine+0x79 (FPO: [Non-Fpo])

    002bffb4 7c80b50b 00084718 7c90ee18 0006fb10 RPCRT4!ThreadStartRoutine+0x1a (FPO: [Non-Fpo])

    002bffec 00000000 77e76bf0 00084718 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

     

    kd> !thread 810dc930

    THREAD 810dc930  Cid 01e4.01ec  Teb: 7ffde000 Win32Thread: 00000000 READY

    Waiting for reply to LPC MessageId 000016fa:

    Pending LPC Reply Message:

        e16b8538: [e16b8538,e16b8538]

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810e9860       Image:         rpclpcc.exe

    Wait Start TickCount      65731          Ticks: 11 (0:00:00:00.110)

    Context Switch Count      26            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.020

    Win32 Start Address rpclpcs!pre_c_init (0x01001a91)

    Start Address kernel32!BaseProcessStartThunk (0x7c810867)

    Stack Init fbe0e000 Current fbe0dc28 Base fbe0e000 Limit fbe0b000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 16

    ChildEBP RetAddr  Args to Child             

    fbe0dc40 804ea3a4 00000000 810dc930 e107f388 nt!KiUnlockDispatcherDatabase+0x77 (FPO: [Uses EBP] [0,0,4])

    fbe0dc58 80586255 81131188 00000001 00000001 nt!KeReleaseSemaphore+0x70 (FPO: [Non-Fpo])

    fbe0dd10 80598c58 000847b0 0006f8c0 0006f8c8 nt!NtSecureConnectPort+0x635 (FPO: [Non-Fpo])

    fbe0dd3c 804df06b 000847b0 0006f8c0 0006f8c8 nt!NtConnectPort+0x24 (FPO: [Non-Fpo])

    fbe0dd3c 7c90eb94 000847b0 0006f8c0 0006f8c8 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ fbe0dd64)

    0006f8d0 7c93040b ffffffff 00000000 7c859e48 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0006f9b0 7c8107fd 00000000 00084718 7c90e642 ntdll!DbgPrint+0x1b (FPO: [Non-Fpo])

    0006fe68 77ea11a8 00000022 00084540 00000000 kernel32!CreateRemoteThread+0x284 (FPO: [Non-Fpo])

    000847a8 00000000 00000001 00000000 00000000 RPCRT4!LRPC_ADDRESS::CompleteListen+0x84 (FPO: [Non-Fpo])

     

     

    Finding all threads in the system involved in LPC communication

     

    To get a consolidated list of all client threads in the system waiting on LPC replies and a list of server threads working on received LPC requests use the “!lpc threads” command without any parameters

     

    0: kd> !lpc thread

    Searching message 0 in threads ...

    Client thread 89a4cdb0 waiting a reply from 15f6                         

        Server thread 898c6a50 is working on message 63f09                        

        Server thread 89671020 is working on message 63f75                        

        Server thread 89904a48 is working on message 63f10                        

        Server thread 88d6bdb0 is working on message 3d6b1                        

        Server thread 89973db0 is working on message 63f32                         

        Server thread 896d32b8 is working on message 3454                        

        Server thread 8995d020 is working on message 63f76                        

        Server thread 8960f020 is working on message 63f77                         

        Server thread 898cd350 is working on message 61bd9                        

        Server thread 8900edb0 is working on message 3832f                        

        Server thread 8900fbf0 is working on message 33c2e                        

        Server thread 88d539a8 is working on message 3343f                        

    Client thread 89be4020 waiting a reply from 6077e                         

    Client thread 89012990 waiting a reply from 39239                         

        Server thread 89012990 is working on message 39232                        

    Client thread 89531020 waiting a reply from 3923d                         

        Server thread 89531020 is working on message 39236                        

    Client thread 88d13b40 waiting a reply from 3c96e                            

     

     

     

    Quick Tips

     

    8 No thread in the system is doing a NtListenPort( ) does that mean no LPC connection  port is open for connections?

     

    !stacks 2 NtListenPort does not return any thread because NtListenPort( ) calls NtRequestWaitReplyPort( ) to wait for an incoming LPC_CONNECTION_REQUEST message. Instead of calling NtListenPort( ) for listening for incoming connections most LPC connection threads directly call NtRequestWaitReplyPort( ).

     

    8 !lpc message does not show any message?

     

    !poolfind LpcM 1 should do the trick.

     

     

    0: kd> !lpc message

     

    Scanning large pool allocation table for Tag: LpcM (8974e000 : 8976e000)

     

    Searching Paged pool (e1000000 : f7000000) for Tag: LpcM

     

     

    0: kd> !poolfind LpcM 1

     

    Scanning large pool allocation table for Tag: LpcM (8974e000 : 8976e000)

     

     

    Searching Paged pool (e1000000 : f7000000) for Tag: LpcM

     

    e1061000 size:  168 previous size:    0  (Allocated) LpcM

    e1090758 size:  168 previous size:    8  (Allocated) LpcM

    e10a7a40 size:  168 previous size:    8  (Allocated) LpcM

    e1188468 size:    8 previous size:   20  (Free)      LpcM

    e1261000 size:  168 previous size:    0  (Allocated) LpcM

    e1297428 size:  168 previous size:    8  (Allocated) LpcM

     

    Once you have the pool blocks listed with the “LpcM” tag dump the ones that are “Allocated”. On a 32-bit machine add 8bytes to the pool address listed on the left most column. On a 64-bit machine add 16bytes i.e. 0x10. Dump the message as below to retrieve the message id.

     

    0: kd> dt nt!_LPCP_MESSAGE e1061000+8

       +0x000 Entry            : _LIST_ENTRY [ 0xe295fa80 - 0xe29279e8 ]

       +0x000 FreeEntry        : _SINGLE_LIST_ENTRY

       +0x004 Reserved0        : 0xe29279e8

       +0x008 SenderPort       : 0xe2327020

       +0x00c RepliedToThread  : (null)

       +0x010 PortContext      : 0x8021000d

       +0x018 Request          : _PORT_MESSAGE

     

    0: kd> dt nt!_LPCP_MESSAGE e1061000+8 Request.

       +0x018 Request  :

          +0x000 u1       : __unnamed

          +0x004 u2       : __unnamed

          +0x008 ClientId : _CLIENT_ID

          +0x008 DoNotUseThisField : 9.8460604703041575e-311

          +0x010 MessageId : 0x4617f

          +0x014 ClientViewSize : 0

          +0x014 CallbackId : 0

     

    0: kd> !lpc message 0x4617f

    Searching message 4617f in threads ...

        Server thread 88f91c08 is working on message 4617f                        

    Client thread 88f0f5f8 waiting a reply from 4617f                         

    Searching thread 88f0f5f8 in port rundown queues ...

     

    Server communication port 0xe2300a38

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe2327020      Server connection port: 0xe13734b0

     

    Client communication port 0xe2327020

        Handles: 1   References: 9

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

     

    8   The number of server threads shown by !lpc thread processing requests is much more than the number of client threads waiting for reply.

     

    !lpc thread interprets stale data in ETHREAD structure as valid LPC information. An indication of the stale data is the ETHREAD.LpcExitThreadCalled will be set. This field is only set when the thread is exiting.

     

    0: kd> !lpc thread

    Searching message 0 in threads ...

    Client thread 89a4cdb0 waiting a reply from 15f6                         

        Server thread 898c6a50 is working on message 63f09                        

        Server thread 89671020 is working on message 63f75                         

        Server thread 89904a48 is working on message 63f10                        

        Server thread 88d6bdb0 is working on message 3d6b1                        

        Server thread 89973db0 is working on message 63f32                        

        Server thread 896d32b8 is working on message 3454                        

        Server thread 8995d020 is working on message 63f76                        

        Server thread 8960f020 is working on message 63f77                        

        Server thread 898cd350 is working on message 61bd9                        

        Server thread 8900edb0 is working on message 3832f                        

        Server thread 8900fbf0 is working on message 33c2e                        

        Server thread 88d539a8 is working on message 3343f     

     

    0: kd> !lpc message 3454

    Searching message 3454 in threads ...

        Server thread 896d32b8 is working on message 3454                        

    Done. 

     

     

    0: kd> dt nt!_ETHREAD 896d32b8 Lpc.

       +0x1c0 LpcReplyChain :  [ 0x896d3478 - 0x896d3478 ]

          +0x000 Flink : 0x896d3478 _LIST_ENTRY [ 0x896d3478 - 0x896d3478 ]

          +0x004 Blink : 0x896d3478 _LIST_ENTRY [ 0x896d3478 - 0x896d3478 ]

       +0x1ec LpcReplySemaphore :

          +0x000 Header : _DISPATCHER_HEADER

          +0x010 Limit : 1

       +0x200 LpcReplyMessage :

       +0x200 LpcWaitingOnPort :

       +0x220 LpcReceivedMessageId : 0x3454

       +0x234 LpcReplyMessageId : 0

       +0x248 LpcReceivedMsgIdValid : 0x1 ''

       +0x248 LpcExitThreadCalled : 0x1 ''

     

    8 Finding all client threads in the system waiting for LPC reply.

     

    !lpc thread

  • Ntdebugging Blog

    LPC (Local procedure calls) Part 1 architecture

    • 5 Comments

     

    Hello, my name is Roy. I’m an Escalation Engineer in the CPR platforms team.  I’ll be doing a four part series on LPC over the coming month.  You’re sure to find this interesting. That being said let’s get started.

     

    Disclaimer: The purpose of this blog is to illustrate debugging techniques with LPC. Please do not rely on this information as documentation for the LPC APIs when writing your own code. As always, this is subject to change in the future. 

    What is LPC?

     

    LPC (Local Inter-Process Communication and NOT Local Procedure Calls) is a high speed message based communication mechanism implemented in the NT kernel. LPC can be used for communication between two user mode processes, between a user mode process and a kernel mode driver or between two kernel mode drivers.  One example would be two user mode processes communicating via LPC.  Like CSRSS.exe talking to SMSS.exe over SmssWinStationApiPort while creating a logon session or any process talking to LSASS.exe over LsaAuthenticationPort for security reasons. Another example of a user mode process communicating with a kernel mode driver would be KSecDD.sys talking to LSASS.exe for EFS key encryption and decryption during read/write of an encrypted file.

     

    LPC uses two different mechanisms for passing data between the client and the server process. It uses the LPC message buffer (for data size less than 304 bytes) or it uses a shared memory Section mapped to both client and server address space (for data size more than 304 bytes).

     

    Apart from being used as the protocol of choice of Remote Procedure Calls between processes running on the same system, LPC is also used throughout the system e.g. for Win32 Application’s communication with CSRSS.exe, Security Reference Monitor’s communication with LSASS, WinLogon’s communication with LSASS etc.

     

    LPC enforces synchronous communication model between the client and the server processes. Vista deprecates the use of LPC using a new mechanism called Asynchronous Local Inter-Process Communication (ALPC). ALPC has an inherent advantage over LPC in that all calls from client to the server are asynchronous i.e. the client does not need to block/wait for the server to respond to a message. In Vista, legacy application calls to LPC APIs are automatically redirected to the newer ALPC APIs.

     

     

    LPC APIs

     

    LPC APIs are native APIs i.e. they are exported in user mode by NTDLL.dll and in kernel mode by NTOSKRNL.exe. LPC APIs are not exposed at the Win32 level hence win32 applications cannot use the LPC facility directly. Win32 applications can however use LPC indirectly when using RPC by specifying LPC as its underling transport via the protocol sequence “ncalrpc”. All LPC APIs end in the word "Port" which implies an LPC communication endpoint.

     

    API

    Description

    NtCreatePort

    Used by server to create a connection port

    NtConnectPort

    Used by client to connect to a connection port

    NtListenPort

    Used by server to listen for connection requests on the connection port.

    NtAcceptConnectPort

    Used by server to accept connection requests on the connection port

    NtCompleteConnectPort

    Used by server to complete the acceptance of a connection request

    NtRequestPort

    Used to send a datagram message that does not have a reply

    NtRequestWaitReplyPort

    Used to send a message and wait for a reply

    NtReplyPort

    Used to send a reply to a particular message

    NtReplyWaitReplyPort

    Used to send a reply to a particular message and wait for a reply to a previous message

    NtReplyWaitReceivePort

    Used by server to send a reply to the client and wait to receive a message from the client

    NtImpersonateClientOfPort

    Used by server thread to temporarily borrow the security context of a client thread

     

     

    The following diagram illustrates the steps taken by an LPC server process to listen to inbound connection request from potential client and the steps taken by clients to connect to listening servers.

     

     

    Figure [1] LPC Client Server Connection Establishment Sequence

    NOTE: Many server processes use the NtReplyWaitReceivePort( ) API instead of NtListenPort( ). NtListenPort( ) drops all LPC messages except connection requests. Hence NtListenPort( ) can only be used for the first connection. For later connection requests NtReplyWaitReceivePort( ) is used.

     

    The following diagram illustrates the steps taken by an LPC client to send a request to an LPC server that is has already established a connection to and the steps taken by the server to responds to the message.

     

     

     

    Figure [2] LPC Client Server Data Transfer Sequence

     

     

    LPC Data Structures

     

     

    LPC Port Data Structure

     

    LPC endpoints are referred to as ports. LPC implementation uses the same port structure to represent the various types of ports. The ports used by LPC are Server Connection Ports which are named ports created by the server process to accept incoming connections from clients.  Client Communication Ports which are created by the client process to connect to a server process and Server Communication Port created by the server process.

     

    Relationships 

      

     

    Figure [3] LPC Port types and their relationships

     

     

    LPCP_PORT_OBJECT is the internal data structure used by LPC to represent a LPC port. LPCP_PORT_OBJECTs are allocated out of paged pool with the tag ‘Port’.

     

    kd> dt nt!_LPCP_PORT_OBJECT

       +0x000 ConnectionPort   : Ptr32 _LPCP_PORT_OBJECT

       +0x004 ConnectedPort    : Ptr32 _LPCP_PORT_OBJECT

       +0x008 MsgQueue         : _LPCP_PORT_QUEUE

       +0x018 Creator          : _CLIENT_ID

       +0x020 ClientSectionBase : Ptr32 Void

       +0x024 ServerSectionBase : Ptr32 Void

       +0x028 PortContext      : Ptr32 Void

       +0x02c ClientThread     : Ptr32 _ETHREAD

       +0x030 SecurityQos      : _SECURITY_QUALITY_OF_SERVICE

       +0x03c StaticSecurity   : _SECURITY_CLIENT_CONTEXT

       +0x078 LpcReplyChainHead : _LIST_ENTRY

       +0x080 LpcDataInfoChainHead : _LIST_ENTRY

       +0x088 ServerProcess    : Ptr32 _EPROCESS

       +0x088 MappingProcess   : Ptr32 _EPROCESS

       +0x08c MaxMessageLength : Uint2B

       +0x08e MaxConnectionInfoLength : Uint2B

       +0x090 Flags            : Uint4B

       +0x094 WaitEvent        : _KEVENT

     

     

    Field

    Description

    ConnectedPort

    Points to the Server Communication Port

    ConnectionPort

    Points to the Server Connection Port

    MsgQueue.Semaphore

    Used to signal the server thread about the presence of  a message in MsgQueue.ReceiveHead

    MsgQueue.ReceiveHead

    Head of a doubly linked list containing all the messages that are waiting to be dequeued by the server.

    MsgQueue.NonPagedPortQueue

    Points to the LPCP_NONPAGED_PORT_QUEUE structure for the client communication port for tracking lost replies.

    LpcReplyChainHead

    Head of a doubly linked list containing all the threads that are waiting for replies to messages sent to this port.

     

     

     

    LPC Message Data Structure

     

    LPC message are data structures that carry information from the LPC client to the LPC server and can be of various types like connection, request, close etc.

     

    LPCP_MESSAGE is the internal data structure used by LPC to represent a message. LPCP_MESSAGE structures are allocated from a system wide lookaside list with the tag ‘LpcM’.

     

    kd> dt nt!_LPCP_MESSAGE

       +0x000 Entry            : _LIST_ENTRY

       +0x000 FreeEntry        : _SINGLE_LIST_ENTRY

       +0x004 Reserved0        : Uint4B

       +0x008 SenderPort       : Ptr32 Void

       +0x00c RepliedToThread  : Ptr32 _ETHREAD

       +0x010 PortContext      : Ptr32 Void

       +0x018 Request          : _PORT_MESSAGE

     

     

    Field

    Description

    Request.MessageId

    Generated from the value of a global epoch LpcpNextMessageId. Used to uniquely identify a message.

    SenderPort

    Points to the LPCP_PORT_OBJECT of the client communication port

    Entry

    Is the list entry that is used to queue the message to the Server Communication/Connection Port’s MsgQueue.ReceiveHead

    Request

    Is a copy of the message buffer that was provided as the Request parameter to the call to NtRequestWaitReplyPort() or a copy the message buffer that was provided as the Reply parameter to NtReplyWaitRecivePortEx().

     

     

    LPC related fields in Thread Data Structure

     

    kd> dt nt!_ETHREAD -y Lpc

       +0x1c8 LpcReplyChain : _LIST_ENTRY

       +0x1f4 LpcReplySemaphore : _KSEMAPHORE

       +0x208 LpcReplyMessage : Ptr32 Void

       +0x208 LpcWaitingOnPort : Ptr32 Void

       +0x228 LpcReceivedMessageId : Uint4B

       +0x23c LpcReplyMessageId : Uint4B

       +0x250 LpcReceivedMsgIdValid : Pos 0, 1 Bit

       +0x250 LpcExitThreadCalled : Pos 1, 1 Bit

     

    The following table describes the fields of the ETHREAD data structure that are used for communication between LPC Client and Server process.

     

     

    Field

    Description

    LpcReplyMessageId

    Contains the ID of the message sent to the server while the client thread is waiting on the response.

    LpcWaitingOnPort

    Used by the LPC Client Thread to stores the LPC client communication port on which it is waiting for a reply.

    LpcReplySemaphore

    Used to block the LPC Client Thread while the server is processing the LPC message.

    LpcExitThreadCalled

    Used by the Lpc APIs to determine if the thread is currently in the process of exiting and if so returns STATUS_THREAD_IS_TERMINATING from the API call.

    LpcReplyChain

    Used to queue the thread to the list of threads waiting for replies from the server communication/connection port. The list head of this list is in LPCP_PORT_OBJECT->LpcReplyChainHead.

     

     

     

    Important global LPC connection ports

     

    Field

    Description

    LsaAuthenticationPort

     Used for the following tasks:

    1. Security package management like Kerberos, NTLM etc…
    2. Credential management
    3. User Account management
    4. EFS (Encrypted File System) key encryption, decryption

     

    SmssWinStationApiPort

     

     This port is used by the CSRSS to manage Window Stations i.e. multiple sessions.

     

    ErrorLogPort

     

     This port is used by IO manager and other components to send error log entries to the EventLog service.

     

     

     

     

    Important LPC port fields in the Process

    (nt!_EPROCESS)

     

    Field

    Description

    SecurityPort

    LSA uses this port for EFS (encrypted file system) and authentication.

     

    ExceptionPort

     When a thread does not catch an exception an LPC_EXCEPTION message is sent to this port. A subsystem might register an LPC port in this field to receive second chance exception information of process running in the subsystem. Default action of CSRSS is to terminate the process.

     

    DebugPort

     The kernel dispatches user mode exceptions to the process debugger on this port. Debug outputs using OutputDebugString( ) are passed to this port as DBG_PRINTEXCEPTION_C.

     

     

     

     

    Important LPC port fields in the Thread

    (nt!_ETHREAD)

     

    Field

    Description

    TerminationPort

    This is not a single port rather a chain of ports registered with the Process Manager. The process manager notifies all ports when the thread terminates.

     

    LpcWaitingOnPort

    This field has the address of the client communication port when the client thread waits for a reply to a LPC message from a server thread.

     

     

     

     

    Stay tuned...in the next post we will roll our sleeves up and dig into the debug extensions for LPC and some common states where  you may find these calls stuck.

     

  • Ntdebugging Blog

    Desktop Heap, part 2

    • 21 Comments

     

    Matthew here again – I want to provide some follow-up information on desktop heap.   In the first post I didn’t discuss the size of desktop heap related memory ranges on 64-bit Windows, 3GB, or Vista.   So without further ado, here are the relevant sizes on various platforms...

     

     

    Windows XP (32-bit)

     

    ·         48 MB = SessionViewSize (default registry value, set for XP Professional, x86)

    ·         20 MB = SessionViewSize (if no registry value is defined) 

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Server 2003 (32-bit)

     

    ·         48 MB = SessionViewSize (default registry value)

    ·         20 MB = SessionViewSize (if no registry value is defined; this is the default for Terminal Servers)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Server 2003 booted with 3GB (32-bit)

     

    ·         20 MB = SessionViewSize (registry value has no effect)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

    You may also see reduced heap sizes when running 3GB.  During the initialization of the window manager, an attempt is made to reserve enough session view space to accommodate the expected number of desktops heaps for a given session.  If the heap sizes specified in the SharedSection registry value have been increased, the attempt to reserve session view space may fail.  When this happens, the window manager falls back to a pair of “safe” sizes for desktop heaps (512KB for interactive, 128KB for non-interactive) and tries to reserve session space again, using these smaller numbers.  This ensures that even if the registry values are too large for the 20MB session view space, the system will still be able to boot.

     

     

     

    Windows Server 2003 (64-bit)

     

    ·         104 MB = SessionViewSize (if no registry value is defined; which is the default) 

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size

     

     

     

    Windows Vista RTM (32-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Vista (64-bit) and Windows Server 2008 (64-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size

     

     

    Windows Vista SP1 (32-bit) and Windows Server 2008 (32-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         12288 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size 

     

     

    Windows Vista introduced a new public API function: CreateDesktopEx, which allows the caller to specify the size of desktop heap.

    Additionally, GetUserObjectInformation now includes a new flag for retrieving the desktop heap size (UOI_HEAPSIZE).

     

     

    [Update: 3/20/2008 - Added Vista SP1 and Server 2008 info.  More information can be found here.]

  • Ntdebugging Blog

    How Windows Starts Up (Part the second)

    • 7 Comments

    Howdy folks, David here again with part two of How Windows Starts Up.  Today we’ll be covering the Boot Loader Phase.  Let’s take a moment to recap where we are at this point.  So far, the computer has completed POST, executed the MBR code, located the active partition, executed the Boot Sector code in that partition, found NTLDR and loaded it into memory.  So what happens next?  During this phase, we’ll be discussing the following files located in the root of your system volume:  NTLDR, ntdetect.com and boot.ini.

     

    The Boot Loader Phase begins when NTLDR executes:

     

    1.    NTLDR switches the processor from real mode to 32-bit flat memory mode.

    How does it do this you ask?  NTLDR contains two parts – a 16-bit component which handles switching the processor to Protected Mode and a 32-bit component which handles everything else. 

     

    When NTLDR first begins execution, the processor is in an x86-operating mode called Real-Mode.  In Real-Mode, no virtual to physical translation of memory addresses occurs, which means that all memory addresses use the 16-bit Segment:Offset format, and only the first 1MB of the computer's physical memory is accessible.  However, we don’t have to concern ourselves with this too much because the first action NTLDR takes is to switch the CPU to Protected-Mode. 

     

    *Trivia Alert*  before we enter Protected-Mode, we enable the A20 Line – you IBM guys know what I’m talking about… 

     

    Still no virtual-to-physical translation occurs at this point in the boot process, but a full 32 bits of memory addressing becomes accessible, and now NTLDR can access up to 4GB of physical memory. 

     

    2.    NTLDR creates enough page tables to make memory below 16MB accessible with paging turned on.  This number has been increased with Windows Server 2003. 

     

    3.    NTLDR enables paging.  “Protected-Mode with Paging Enabled” is the mode in which Windows executes during normal operation.

     

    4.    NTLDR then starts the FAT and NTFS mini file system drivers contained within its own code.  Unlike the Boot Sector's file system code, NTLDR's code is sophisticated enough to read subdirectories.

     

    Note:   All disk access from this point until the Kernel Phase begins uses BIOS Interrupt 13h, Extended Interrupt 13h or NTBOOTDD.SYS

     

    5.    NTLDR locates and parses Hiberfil.sys to determine if it contains a valid hibernator file.  If a valid hibernator file is found, then that file is loaded into memory and execution continues from the point the computer was placed into hibernation.

     

    Hibernation files are interesting.  The file either contains a full memory image, or it contains a pointer (ARC path) to the boot partition of the last operating system that entered hibernation.  That boot partition will contain another Hiberfil.sys file, which contains a full memory image of the hibernating operating system.  If the Hiberfil.sys file is a pointer, the file will contain an ARC path using one of the following syntaxes:

     

                             - linkmulti()disk()rdisk()partition()

                             - linksignature()disk()rdisk()partition()

                             - linkscsi()disk()rdisk()partition()

     

    6.    If Hiberfil.sys is not found, NTLDR parses the Boot.ini file.  If the Boot.ini contains a reference to more than one operating system or boot option, then the boot loader screen is displayed. 

     

    This is a sample of a default Boot.ini file from a Windows 2000 Server-based computer:

     

    [boot loader]

    timeout=30

    default=multi(0)disk(0)rdisk(0)partition(1)\WINNT

    [operating systems]

    multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows 2000 Server" /fastdetect

     

    Let’s talk a bit about what’s in the Boot.ini file.  Boot.ini contains a list of ARC paths.  Ok, quick – five bonus points to the first person who can tell me what ‘ARC’ stands for…  come on…  Yes, you there in the red shirt… correct!  It stands for ‘Advanced RISC Computing’; now beam down with the captain.  The rest of you come with me…

     

    ARC paths may use the “Multi”, “SCSI” or “Signature” syntax.

     

    The “Multi” syntax instructs Windows to rely on the system BIOS to load system files.  This means that NTLDR will be using INT13 BIOS calls to find and load NTOSKRNL and any other files on the boot partition.

     

    The “SCSI” syntax tells Windows to load a SCSI device driver to access the boot partition.  A copy of the driver must be renamed to NTBOOTDD.SYS and placed in the root of the system partition.

     

    The “Signature” syntax is equivalent to the SCSI syntax, but is used instead to support the Plug and Play architecture in Windows 2000 or later.  Because Windows 2000 is a PnP OS, the SCSI controller number instance might vary each time you start Windows 2000, especially if you add new SCSI controller hardware after setup.

     

    The Signature syntax instructs NTLDR to locate the drive with a disk signature that matches the value in the parentheses, regardless of which SCSI controller number that the drive is connected to.  Keep in mind that the Signature syntax also requires the presence of an Ntbootdd.sys file in the root of the system partition.  If you recall from the previous post, the disk signature value is located in the MBR.

     

    Example:

    signature(8b467c12)disk(1)rdisk(0)partition(2)\winnt="Windows 2000"

     

          Signature() syntax is used only if one of the following conditions exists:

     

    ·         The System BIOS or controller hosting the boot partition on which Windows is installed, does not support INT13 Extensions, or has the INT13 Extensions option disabled, and the partition on which you installed Windows is either larger than ~7.8 gigabytes (GB) in size, or the ending cylinder number for that partition is higher than 1024.

    ·         The drive on which you installed Windows is connected to a SCSI controller with SCSI BIOS disabled, so INT13 BIOS calls cannot be used during the boot process.

     

                            227704 Windows May Use Signature() Syntax in the Boot.ini File

                            http://support.microsoft.com/default.aspx?scid=kb;EN-US;227704

     

    7.    If Boot.ini is not found, NTLDR attempts to continue the boot process by assuming that Windows was installed in the default location on the same partition (c:\windows).  If the c:\windows directory is not found, NTLDR displays this error message:  Windows could not start because the following file is missing or corrupt: \winnt root\system32\ntoskrnl.exe

     

    8.    If the boot loader screen is displayed, the user must select an operating system to boot.

    a.    If the user chooses to run an operating system other than Windows 2003/XP such as DOS or Windows 95/98, NTLDR loads and executes the Bootsect.dos file located in the root of the boot drive.

    (1)  If the Bootsect.dos file is found, NTLDR loads it into RAM and then passes control to it.

    (2)  Bootsect.dos then loads IO.sys, MSDOS.sys, Command.com, etc., as necessary and proceeds to boot the downlevel operating system.

    (3)  If the file is not found in the root, NTLDR displays:

    I/O Error accessing boot sector file multi(0)disk(0)rdisk(0)partition(1): \bootsect.dos

    b.    If the user selects a Windows NT-based operating system, NTLDR executes Ntdetect.com.

    (1)  If the ARC path points to a hard drive or partition that is not on the system, the following is displayed:

    Windows could not start because of a computer disk hardware configuration problem.  Could not read from the selected boot disk.  Check boot path and disk hardware.  Please check the Windows documentation about hardware disk configuration and your hardware reference manuals for additional information.

    (2)  The above message may also be displayed if you are using Signature() ARC path notation and the disk signature in the MBR is overwritten (for example, by a virus) or changed (corrupted) so that it no longer matches the Signature() value in the Boot.ini file.  The following error message may also be displayed when there are problems with the Signature () value:

    Windows could not start because of the following ARC firmware boot configuration problem: did not properly generate ARC name for HAL and system paths. Please check the Windows documentation about ARC configuration options and your hardware reference manuals for additional information.

     

    9.    Ntdetect.com initializes and detects all supported hardware that it can find.  (Troubleshooting tip:  There is an Ntdetect.chk on the Windows Resource Kit CD in the Debug.cab file, which can be renamed and used in place of Ntdetect.com to view what Ntdetect.com is detecting during startup.  You know you want to – go ahead, I’ll wait…)

     

    Ntdetect.com detects the following components:

     

    a)    Computer ID

    b)    Bus/adapter type

    c)    Video adapter

    d)    Keyboard

    e)    Communications ports

    f)     Floppy disks

    g)    Mouse or other pointing devices

    h)    Parallel ports

     

    10.  Ntdetect.com passes its list of hardware components to NTLDR.

     

    11.  If the /SOS switch was used in the Boot.ini for the OS that is loading, NTLDR will display the files it loads into memory from this point on.  Keep in mind – none of these files are being initialized.  They are just being loaded into RAM at this point.  I can’t tell you how many times someone has incorrectly concluded that there is a problem with MUP.SYS because it’s the last file displayed on this screen before their computer hangs.  Poor MUP – always getting a bad rap.  This list can be useful, however; it can help you identify potential malware that may be loading.

     

    12.  NTLDR loads Ntoskrnl.exe and Hal.dll into memory.  Again, *loading*, not initializing.

     

    13.  NTLDR loads the System registry hive into memory.  If Windows cannot load the registry hive, it may be necessary to restore the System hive from the ERD or by using the Windows Recovery Console to copy the files from the Windows\Repair or Windows\Repair\RegBack folder.

     

    14.  NTLDR reads the HKLM\System\Select registry key.

    a.    If the Last Known Good option was not selected, NTLDR looks for the “Default” value in the SELECT key to select which ControlSet to use.  (0x1 = ControlSet001, 0x2 = ContolSet002, 0x3 = ControlSet003) 

    b.    If the Last Known Good option was selected, NTLDR looks for the “LastKnownGood” value in the SELECT key to select the ControlSet to use.

     

    15.  NTLDR sets the “Current” value in the HKLM\System\Select key to the ControlSet being used.

     

    Note:  At this point, NTLDR has determined which ControlSet it needs to use, but we don’t actually create the “CurrentControlSet” symbolic link until the kernel initializes (more on that in our next installment).  Therefore, for the purposes of this discussion, I’ll use *ActiveControlSet* to refer to whichever registry ControlSet has been identified as the one NTLDR needs to use.

     

    16.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\Nls\Codepage for the “Acp”, “Oemcp”, and “Oemhal” values.  These entries are pointers to the table of codepage information stored in the same key.  The table has the filenames that correspond to the codepage numbers.  NTLDR loads these files into memory.  The defaults are C_1252.nls, C_437.nls, and Vgaoem.fon.

     

    A code page is a character set specific to an operating system or hardware platform.  It maps character codes to individual characters.  Different code pages include different special characters, typically customized for a language or a group of languages.  The system uses code pages to translate keyboard input into character values for non-Unicode based applications, and to translate character values into characters for non-Unicode based output displays.

     

    17.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\Nls\Language for the “Default” value.  This entry is a pointer to the table of language information stored in the same key.  NTLDR loads this file into memory.  The default is L_intl.nls.

     

    18.  NTLDR scans HKLM\System\*ActiveControlSet*\Services for device drivers with a “Start” value of 0x0.  It finds their “Group” and “Tag” values also.

     

    19.  NTLDR scans HKLM\System\*ActiveControlSet*\Control\ServiceGroupOrder for the order that service groups should be loaded.

     

    20.  NTLDR sorts the drivers by “Group” and then by their “Tag” value.  Each service or driver has a particular group to which it belongs.  All drivers within a specific group are ordered by tag. 

     

    21.  NTLDR loads the 0x0 device drivers into memory.  It does not initialize or start these drivers.

     

    22.  The board is set, and the pieces are moving.  NTLDR executes Ntoskrnl.exe.

     

    Troubleshooting the Boot Loader Phase

     

    Okay, now that we’ve seen what is supposed to happen, what could possibly go wrong?  Well, several things…

     

    First, all file-system activity during this phase uses BIOS Int13, Extended Int13 or Ntbootdd.sys. 

     

    1.    If your computer uses Extended Int 13, verify that Ext. Int13 is enabled in your system’s BIOS or SCSI BIOS as appropriate.

     

    2.    If you are using a SCSI driver, verify that you are using the correct SCSI driver (Ntbootdd.sys).  If you are using "SCSI()" syntax in the Boot.ini file, copy the correct device driver for the SCSI controller to the root and then rename it to Ntbootdd.sys.  This is the driver that is stored in the Winnt.sbs\System32\Drivers folder (for example, the Aic78xx.sys file).  If you use "MULTI()" syntax in the Boot.ini file, you do not need to do this.  Use the Windows Recovery Console or a Windows NT boot floppy to complete this step.

     

                301680 HOW TO: Create a Boot Disk for an NTFS or FAT Partition

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;301680

     

    307654 How to install and use the Recovery Console in Windows XP

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;307654

     

     

    Significant Errors during this phase:

     

    ·         Windows could not start because of a computer disk hardware configuration problem.  Could not read from the selected boot disk.  Check boot path and disk hardware.  Please check the Windows documentation about hardware disk configuration and your hardware reference manuals for additional information.

     

    This error indicates that the ARC path is pointing to a hard drive or partition that is not on the system.  This message may also be displayed if you are using Signature() ARC path notation and the disk signature in the MBR is overwritten (for example, by a virus) or changed (corrupted) so that it no longer matches the Signature() value in the Boot.ini file. 

     

    ·         Windows could not start because of the following ARC firmware boot configuration problem: did not properly generate ARC name for HAL and system paths. Please check the Windows documentation about ARC configuration options and your hardware reference manuals for additional information.

     

    This error message may be displayed when there are problems with the Signature () ARC path syntax.

     

    Ntoskrnl.exe missing or corrupt.  This error could be the result of an incorrect ARC path in the Boot.ini file.  Use the information in the following knowledge base articles to correct the Boot.ini information.

     

                102873 BOOT.INI and ARC Path Naming Conventions and Usage

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;102873

     

                155222 How to Determine the ARC Path

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;155222

     

                130921 Creating an FT Boot Disk With Scsi( ) and Multi( ) Identifiers

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;130921

     

    It is also possible to use the Windows XP Recovery Console to create a new Boot.ini for a Windows NT, Windows 2000 or .NET installation.  Please see the following knowledge base article for a description of this process:

     

                291980 A Discussion About the Bootcfg Command and Its Uses

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q291980

     

    Specify the /SOS switch in the Boot.ini file and monitor the output to the screen.  If the system stops responding before anything is output to the screen, then the problem may be with hardware detection.  Replace Ntdetect.com with Ntdetect.chk to view the hardware detected by the system during this phase.

     

                103659 Setup Hangs While Inspecting Hardware; How to Use Ntdetect.com

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;103659

     

                320040 HOW TO: Use Ntdetect.chk to Identify Hardware Problems

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;320040

     

     

    Incorrect HAL

    If after specifying the /SOS option, you see that an incorrect version of the HAL.DLL is loading, use the following KB articles to replace the HAL or troubleshoot other HAL-related issues:

     

                314477 "Hardware Configuration Problem" Err Msg Starting Windows

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;314477

     

                237556 Troubleshooting Windows 2000 Hardware Abstraction Layer Issues

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;237556

     

    330184  "Invalid Boot.ini" or "Windows could not start" error messages

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;330184

     

     

    Unable to load System hive

    Several problems could happen here.  First, consider that the amount of memory we’re dealing with is only 16MB.  We have to squeeze NTLDR, the System Registry Hive, the kernel, the HAL, all the boot drivers and the PFN database into a relatively small area.  So, with Windows Server 2003, we’ve given ourselves a bit more room. 

     

                302594 The System hive memory limitation is improved in Windows Server 2003

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;302594

     

    If Windows is unable to load the System registry hive into memory, then it may be necessary to restore the System hive from the ERD or use the Windows Recovery Console to copy the files from the Winnt\Repair or Winnt\Repair\RegBack folder whichever version is most recent.

     

                307545 How to Recover from a Corrupted Registry

                http://support.microsoft.com/default.aspx?scid=kb;EN-US;307545

     

     

    Table of Startup Type Values

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services

    0x0 = Boot

    Loaded into RAM by the Boot loader (NTLDR) during Boot Loader Phase, then initialized by the Kernel (NTOSKRNL) during Kernel Phase

    0x1 = System

    Loaded and initialized by the Kernel during Kernel Phase

    0x2 = Auto

    Loaded or started automatically at system startup

    0x3 = Manual

    Driver is manually started by the user or another process

    0x4 = Disabled

    Driver is not to be started (There are exceptions, such as NTFS)

     

     

    Conclusion

    Well folks, we certainly have covered a lot of ground so far.  Any questions?  Yes, you there in the front…  Ah, why have we not discussed any blue screens?  Very good question.  We haven’t seen any BSOD’s up to this point in the boot sequence because the kernel isn’t running yet and STOP errors are generated by the kernel.  No kernel, no blue screens. 

     

    That’s all the time we have this week boys & girls.  Join us next time for How Windows Starts Up, The Kernel Phase.

     

Page 23 of 25 (242 items) «2122232425