• Ntdebugging Blog

    WE ARE HIRING

    • 1 Comments

    Want to debug the most difficult problems against the largest, most complex piece of software ever written? Are you fluent in C and assembly? Do you carry a spare debugger cable in your bag? Can you represent Microsoft in critical, time sensitive solution delivery? Writing code is easy, finding bugs in someone else’s code you’ve never seen before is a real challenge.

     

    Want to hear more?

     

    http://blogs.msdn.com/jobsblog/archive/2007/01/26/jobcast-life-as-an-escalation-engineer.aspx

     

    http://blogs.msdn.com/jobsblog/archive/2007/11/21/life-as-an-escalation-engineer-part-ii.aspx

     

     

    Candidate must be a strong critical thinker, and enjoy solving very difficult problems (often involving code level analysis).  Escalation Engineers within GES are frequently involved with the highest profile issues, and therefore must be able to handle both pressure and complex situations. Candidates must also have strong customer service, accurate logical problem solving and communication skills, and the ability to work in a team environment. Programming and debugging skills are required, preferably in C or C++.  Also, the ability to read and analyze network traces and solid protocol analysis is a plus. The ideal candidate may have a four year degree in C.S. or Engineering and a minimum of four years product support experience or the equivalent in work experience. Prior knowledge of the product to be supported, and other networking products and/or networking operating systems is required. If you enjoy being the problem-solver in the spotlight of critical problems, then this position will excite and challenge you.

     

     

    Applicants that possess the experience, and desire will be considered and are encouraged to apply using the below link. After applying, please submit a copy of your resume to benjammc@microsoft.com

     

    Job Details - Microsoft Careers job code is 209923 and can be found on the Microsoft Career page.

     

     

  • Ntdebugging Blog

    Too Much Cache?

    • 28 Comments

    Cache is used to reduce the performance impact when accessing data that resides on slower storage media.  Without it your PC would crawl along and become nearly unusable.  If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page.  If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page.  Access to physical RAM is about 1 million times faster than to a hard drive.  It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous.  Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source). 

     

    Since we are limited with how much RAM we can stick in a box, we have to make the most of it.  We have to share this crucial physical resource with all running processes, the kernel and the file system cache.  You can read more about how this works here:

    http://blogs.msdn.com/ntdebugging/archive/2007/10/10/the-memory-shell-game.aspx

     

    The file system cache resides in kernel address space.  It is used to buffer access to the much slower hard drive.  The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand.  The file system cache operates like a process working set.  You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter.  This value will only show you the system cache's current working set.  Once a page is removed from the cache's working set it is placed on the standby list.  You should consider the standby pages from the cache manager as a part of your file cache.  You can also consider these standby pages to be available pages.  This is what the pre-Vista Task Manager does.  Most of what you see as available pages is probably standby pages for the system cache.  Once again, you can read more about this in "The Memory Shell Game" post.

     

    Too Much Cache is a Bad Thing

    The memory manager works on a demand based algorithm.  Physical pages are given to where the current demand is.  If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them and send them to help meet the growing demand.  Just like any process, the system file cache can consume physical memory if there is sufficient demand. 

    Having a lot of cache is generally not a bad thing, but if it is at the expense of other processes it can be detrimental to system performance.  There are two different ways this can occur - read and write I/O.

     

    Excessive Cached Write I/O

    Applications and services can dump lots of write I/O to files through the system file cache.  The system cache's working set will grow as it buffers this write I/O.  System threads will start flushing these dirty pages to disk.  Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache.  At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager.  It does this to prevent applications from overtaking physical RAM with write I/O.  There are however, some isolated scenarios where this throttle doesn't work as well as we would expect.  This could be due to bad applications or drivers or not having enough memory.  Fortunately, we can tune the amount of dirty pages allowed before the system starts throttling cached write I/O.  This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739

     

    Excessive Cached Read I/O

    While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache.  If an application or driver opens many files and actively reads from them continuously through the cache manager, then the memory manger will move more physical pages to the cache manager.  If this demand continues to grow, the cache manager can grow to consume physical memory and other process (with less memory demand) will get paged out to disk.  This read I/O demand may be legitimate or may be due to poor application scalability.  The memory manager doesn't know if the demand is due to bad behavior or not, so pages are moved simply because there is demand for it.  On a 32 bit system, the file system cache working set is essentially limited to 1 GB.  This is the maximum size that we blocked off in the kernel for the system cache working set.  Since most systems have more than 1 GB of physical RAM today, having the system cache working set consume physical RAM with read I/O is less likely. 

    This scenario; however, is more prevalent on 64 bit systems.  With the increase in pointer length, the kernel's address space is greatly expanded.  The system cache's working set limit can and typically does exceed how much memory is installed in the system.  It is much easier for applications and drivers to load up the system cache with read I/O.  If the demand is sustained, the system cache's working set can grow to consume physical memory.  This will push out other process and kernel resources out to the page file and can be very detrimental to system performance.

    Fortunately we can also tune the server for this scenario.  We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize().  We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically.  I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs.  The source code can be compiled using the Windows DDK, or you can use the included binaries.  The 32 bit version is limited to setting the cache working set to a maximum of 4 GB.  The 64 bit version does not have this limitation.  The sample code and included binaries are completely unsupported.  It is just a quick and dirty implementation with little error handling.

  • Ntdebugging Blog

    DNS Slow to Start?

    • 7 Comments

    Hi my name is Jason. I’m an escalation engineer on the Microsoft critical problem resolutions platform team. We recently dealt with an interesting issue that I would like to share with you and show how we went about discovering the cause. 

    The customer stated that when his server (Windows Server 2003 SP1) boots up the DNS Server service does not become availabe/responsive for ~5 minutes. Customer also stated he has other DC's in this domain that are the same model server (same specs too) that do not have the problem. So I ask, "What do you mean not responsive?". During this 5 minute period, the administrator cannot use the DNS MMC to administer the server, and DNS lookups to this server fail with a timeout error. The customer can: Ping the server, connect to file shares, etc.

    Troubleshooting

    So my initial question was "What is DNS doing for 5 minutes?". In order to determine where dns.exe was hung up or spinning his wheels, I replaced dns.exe with the checked build.

    Walkthrough:

    1.     Downloaded the checked build of SP1

    2.     Extracted and expanded the checked build of dns.exe (What is a checked build?)

    3.     Worked through the process of replacing the free build of DNS.exe (Booting into safe mode in order to bypass WFP (Windows file protection))

    4.     Created the dnsdebug file (a text file with no extension under %windir%\system32\dns). The checked build of DNS looks for this file, and reads the contents to determine the level of output. 

    5.     In the file I set the debug flags for DNS_DEBUG_INIT (0x00000010 ) & DNS_DEBUG_INIT2 (0x01000000) and set the output to go to the debugger (0x00000002), net result (0x01000012). The image below shows contents of the file.
     

    6.     Instead of using the debugger I decided to use DebugView from Sysinternals to view the output. I could have attached a debugger to DNS.exe to see the output, but in this case the DebugView was sufficient.

     

    With all this in place we rebooted. We launched DebugView and did a "net start dns". Here is a screen scrape of what the output looked like:

    image

     

    I have highlighted the debug flag I have set.

    From this output we were able to determine that the Server was going through the following process:

    1.     DNS gets started by Service Control Manger (a.k.a. SCM) and then DNS communicates back to SCM (across the NtControlPipe)

    2.     DNS loads the zones from the registry (HKLM\Software\Microsoft\Windows NT\Current Version\DNS Server\Zones

    3.     DNS pulls this zones from the Active Directory (as this is an AD integrated zone)

     

    During these phases the ability to respond to administrative requests and name lookup requests is blocked (this is by design). So based on this I needed to figure out where we were spending the most of our 5 minutes.

    1.     For the SCM portion, I used Filemon monitoring only Named Pipes for dns.exe and services.exe.

    2.     For the registry portion, I used Regmon with a filter for dns.exe

    3.     For the AD portion, I used perfmon with the NTDS>LDAP Searches/Sec counter

    With these running we were able to isolate the delay to the point where DNS was pulling the zones from AD. Here is a scrape of me running this test on my VM:

    image

     

    You can see (top left) the service communicating back to SCM across the NtControlPipe. Lets take a moment here and have a short discussion how a service starting works.

    1. SCM goes to a registry value that will be used as a seed value to build a named pipe. (HKLM\System\CurrentControlSet\Control\ServiceCurrent\(default) = <a number>
    2. T
    he result is incremented by 1 and then returned back to SCM
    3. SCM then creates a named pipe using NtControlPipe as the prefix followed by the numerical value from the registry. So when starting the first service, the pipe name would be NtControlPipe1 (To get a peek at these pipes download PipeList from Sysinternals)
    4.
    The SCM then starts the service process
    5.
    The service then goes to the same registry value and uses this seed to know which named pipe he will need to talk to in order to convey progress back to SCM.
     

    We can then see (top right) the service pulling in the dns zone information from the registry. Finally we see (bottom) the small spike in LDAP searches/sec (highlighted) where the service was pulling the zone information from the AD. Until the pull of the zones from the registry and AD is complete, the DNS service does not respond to MMC administration or DNS requests.

    The majority of the 5 minute delta fell into the last bucket (the LDAP queries to pull the zone data). So begins our next discussion, What causes LDAP to be slow?

    So LDAP lives in LSASS.EXE. There are two LDAP interfaces in AD. The entire directory is available of TCP port 389, and then the subset of AD that is published (a.k.a. Global Catalog) over TCP port 3268. The DNS services is pulling the DNS zones from the "Entire Directory" interface.

    Lsass memory usage on domain controllers has two components:

    • Data structures, which are like those in other processes (i.e. threads, heaps, and stacks)
    • Database buffer cache, which consists of database pages and index pages for the directory ß This is where a "In-RAM" copy of AD database would be

       

    In Windows Server 2003, there is no limit to how large the database buffer cache can grow (outside of the limits of the 2GB of virtual address space that can be modified via the /3GB switch). This helps explain the behavior of lsass.exe normally being the largest value in the "Mem Usage" category in task manager on a DC.

    Q. What can affect the amount of Database buffer cache that LSASS is maintaining "In-RAM?
    A. Memory pressure. This could be caused by configuration (pagefile settings, etc.) or another application(s) using and paging a lot of memory on the machine.

    lsass.exe on the affected machine was using 300mb in the "Mem Usage" column in task manager. The machine that was not affected was closer to 450mb. This means there was 150mb more data in the Database buffer cache, and due to this, the process is able to respond a lot faster to LDAP requests since he can pull from the RAM cache more and this limits the amount of time spent pulling this data into the cache from the DIT file on the disk.

    The solution is to tune the pagefile settings to match the better performing server.

    By simply increasing the pagefile from ~1 gig to ~3gig the DNS Server can be available within ~30 seconds

    So what are we (Microsoft) doing about this?

    Excerpt from the "Book of Longhorn"

    What new DNS functionality is provided by this feature in Windows Server "Longhorn"?

    The DNS Server service in Windows Server "Longhorn" includes a number of new and enhanced features compared to the DNS Server service that was available in the Microsoft Windows NT® Server, Windows 2000 Server, and Windows Server® 2003 operating systems. The following sections describe these features.

     

    Background zone loading

    Very large organizations with extremely large zones that store their DNS data in AD DS sometimes discover that restarting a DNS server can take an hour or more while the DNS data is retrieved from the directory service. The result is that the DNS server is effectively unavailable to service client requests for the entire time that it takes to load AD DS-based zones.

    A DNS server running Windows Server "Longhorn" now loads zone data from AD DS in the background while it restarts so that it can respond to requests for data from other zones. When the DNS server starts, it:

    • Enumerates all zones to be loaded.
    • Loads root hints from files or AD DS storage.
    • Loads all file-backed zones, that is, zones that are stored in files rather than in AD DS.
    • Begins responding to queries and remote procedure calls (RPCs).
    • Spawns one or more threads to load the zones that are stored in AD DS.

    Because the task of loading zones is performed by separate threads, the DNS server is able to respond to queries while zone loading is in progress. If a DNS client requests data for a host in a zone that has already been loaded, the DNS server responds with the dat (or, if appropriate, a negative response) as expected. If the request is for a node that has not yet been loaded into memory, the DNS server reads the node's data from AD DS and updates the node's record list accordingly.

    Why is this functionality important?

    The DNS server can use background zone loading to begin responding to queries almost immediately when it restarts, instead of waiting until its zones are fully loaded. The DNS server can respond to queries for the nodes that it has loaded or that can be retrieved from AD DS. This functionality also provides another advantage when zone data is stored in AD DS rather than in a file: AD DS can be accessed asynchronously and immediately when a query is received, while file-based zone data can be accessed only through a sequential read of the file.

    I hope you find this information helpful.

  • Ntdebugging Blog

    The Memory Shell Game

    • 12 Comments

    Hello, this is Somak.  Today I’d like to drop some Memory Manager info on the blog that I’ve used to communicate in brief (believe it or not) how the system deals with memory.  If you are ever faced with checking how much Available Memory you have(or don’t have), poor system performance, questions about page faults, or having a Working Set Trimming performance issue, or just want a primer into how Windows manages memory on the system, be not afraid and read on!

    How this memory stuff works

    The fastest and most expensive memory is built into the CPU.  The next fastest and less expensive is physical RAM.  Next we have the hard drive, followed by remote storage or backup.  Each step down the ladder adds significantly to access time.  For example, physical RAM can be almost one million times faster than a hard disk.  Since it is volatile and cost considerably more than hard disk space, we are limited at how much we can put in the system.  This limit is far less than the much slower and non-volatile hard disk.  Due to these design constraints we have a tiered memory structure in the computer.  To achieve the fastest response times and best overall performance, an operating system must efficiently use this tiered memory structure.  It must do its best to reduce the need to retrieve data from a slower storage media like the hard disk.  It has to do this while juggling the memory and I/O demand from all running processes.  The following paragraphs are an overview of how the Memory Manager achieves this in Windows.  A more detailed description can be found in Chapter 7 of Microsoft Windows Internals, Fourth Edition (ISBN: 0-7356-1917-4).  This book is the great source for how stuff works in Windows.

    First let’s lay out some definitions.  They will be useful later on when I talk about the interactions.  These definitions are high level to maintain brevity.

    Virtual Memory – This is a memory that an operating system can address.  Regardless of the amount of physical RAM or hard drive space, this number is limited by your processor architecture.  On a 32 bit processor you are limited to 4 GB of addressable virtual memory (2^32).  With a default installation on a 32 bit box (not using /3GB) the kernel reserves 2GB for itself.  Applications are left with 2GB of addressable virtual memory.  When applications execute they are only presented with this 2GB of addressable memory.  Each application gets its own 2GB virtual memory to play with.  If you have 50 processes running, you’ll have 50 independent 2GB Virtual Memory address spaces and one 2GB Virtual Address space for kernel.  This is possible because Virtual Memory always exists, but doesn't really exist (hence the term virtual).  We basically lie to the application and say, here is 2GB of memory address space for you to use.  This memory isn’t allocated until the application explicitly uses it.  Once the application uses the page, it becomes committed.  A virtual memory page is then translated to a physical memory page.  From this translation, the virtual page can reside in physical RAM or on the hard disk.

    Physical Memory –  This is the physical storage media.  It can be physical RAM, the hard disk, optical disks, tape backups, etc.  This is anything that can store data.  Most times when people talk about Physical Memory, they refer to physical RAM (the memory sticks on your motherboard), but with virtual page translation, physical memory can also be on the hard drive (in your paging file).  Physical RAM is limited by your processor architecture.

    Committed Memory –  When an application touches a virtual memory page (reads/write/programmatically commits) the page becomes a committed page.  It is now backed by a physical memory page.  This will usually be a physical RAM page, but could eventually be a page in the page file on the hard disk, or it could be a page in a memory mapped file on the hard disk.  The memory manager handles the translations from the virtual memory page to the physical page.   A virtual page could be in located in physical RAM, while the page next to it could be on the hard drive in the page file.

    Commit Limit –  This is the maximum amount of memory that all your applications and the OS can commit.  If you had 50 applications fully allocate their 2 GB of virtual address space, you would need 100GB of commit limit (ignore kernel memory usage to keep the numbers simple).  So 100GB of committed pages can be backed by physical RAM or the hard drive.  If you have 100 GB of RAM, you could handle this memory load.   In most cases, 100 GB of RAM isn't economically feasible so the Commit Limit is comprised of physical RAM and the page file.  If you have 2 GB of physical RAM and 98 GB of page file, then your commit limit would be 100 GB.

    Page file –   This is the storage area for virtual memory that has been committed.  It is located on the hard drive.  Since hard drive space is cheaper than physical RAM, it is an inexpensive way to increase the commit limit.

    Working Set – This is a set of virtual memory pages (that are committed) for a process and are located in physical RAM.  These pages fully belong to the process.  A working set is like a "currently/recently working on these pages" list.

    Modified pages - Once a virtual memory page leaves the process's working set, it is moved to another list.  If the page has been modified, it is placed on the modified page list.  This page is in physical RAM.  A thread will then write the page to the page file and move it to the standby list.

    Standby pages - This is a page that has left the process' working set.  This page is in physical RAM.  A standby page is like a cache for virtual memory pages.  It is still associated with the process, but not in its working set.  If the process touches the page, it is quickly faulted back into the working set.  That page also has one foot out the door.  If another process or cache needs more memory, the process association is broken and it is moved to the free page list.  Most of the pages in available memory are actually standby pages.  This makes sense when you realize that these pages can be quickly given to another process (hence available), but you should also understand that they are page caches for working sets and can be quickly given back if the process touches the page again.  The vast majority of available memory is not wasted or empty memory.

    Free pages - When a page is taken off of the standby page list, it is moved to the Free page list.  This page is in physical RAM.  These pages are not associated with any process.   When a process exits, all of its pages are then dumped onto this list. Typically, there is a very small to no amount of free pages hanging around physical RAM. 

    Zeroed pages - When a free page is zeroed out, it is placed on the Zero page list.  This page is in physical RAM.  These are the pages that are given to processes that are making memory allocations.  Due to C2 security requirements, all pages must be scrubbed before handed to a new process.  When the system is idle, a thread will scrub free pages and put them on this list.  Only a small amount of zero pages are required to handle the typical small memory allocations of processes.  Once this list is depleted, and if there is demand for more pages, we pull pages off of the Free page list and scrub them on the fly.  If the Free page list is depleted, then we pull pages off of the standby list, scrub them on the fly and hand them to the new process.

    What is Task Manager telling me?

    Prior to Windows Vista, Task Manager reports memory usage using accounting methods that you probably are not expecting.  It is because of these accounting practices, that we rarely use Task Manager to gauge system performance and memory usage.  We typically use it for a quick overview or to kill processes.  I highly recommend using Performance Monitor (perfmon.msc) for investigating performance issues.  Here's the breakdown of the numbers on the Performance tab:

    Physical Memory

    Total - The is the total physical RAM installed in the system.

    Available - This is the total of the Standby, Free and Zeroed list.  Free and Zeroed makes sense, but Standby seems odd at first.  Standby pages were added to this number because they are available for a quick scrub and given to a new process.  So they are technically available (with minimal effort).

    System Cache- This is the total of the Standby list and the size of the system working set (which includes the file cache).  Standby pages are added to this list because they are cached pages for working sets.

    PF Usage - This is the total number of committed pages on the system.  It does not tell you how many are actually written to the page file.  It only tells you how much of the page file would be used if all committed pages had to be written out to the page file at the same time.

    Commit Charge

    Total - This is the total virtual memory that has been committed.  This includes all committed memory for all processes and the kernel.

    Limit - This is the maximum amount of committed memory this system can handle.  This is a combination of physical RAM and the page file.

    Peak – This is the highest amount of memory committed thus far on this system, since boot.

    How does this work?

    The memory manager optimizes physical RAM usage across the entire system.  Since physical RAM is a finite resource, it has to balance sharing this critical resource amongst all process, the kernel and file I/O.   It tries to keep disk I/O to a minimum, which results in a more responsive system.  It does this by moving pages around to meet the demand of the system.

    Typically, large sections of physical RAM are used for file cache.  This is cache is necessary to improve disk performance.  Without it, disk I/O would make the system crawl along at an nearly unusable pace.  The file system cache is just like a working set for a process.  Pages removed from the file cache are moved to the standby or modified page list.  Many of the standby pages in RAM are probably file cache pages that were removed from its working set.  For example, on a file server, if you see 8 GB of available memory, most of these pages are probably standby pages for the file cache.  The file cache's working set could be 500 MB, but the 8 GB of standby pages should also be considered part of the file cache.

    Now let's take a look at how the memory manager handles processes.  While an application is working with its virtual memory pages, the memory manager keeps the pages in the process' working set.  Since the vast majority of application do not use all of its memory all the time, some pages will age.  Old pages are removed from the working set.  If they are modified, they are moved to the modified list.  The page is saved to the page file and moved to the standby list.  If the page hasn't been modified, it is moved directly to the standby list.  These pages will remain on the standby page list until there is a demand for it.

    If the application touches the page again, it is soft faulted back into the process' working set.  If the process doesn't use the page for a very long time, or if the demand for the page is greater elsewhere, the page is moved off of the standby list.  It is disassociated with the process and moved to a the free page list.  From the free page list, the page is scrub on demand or lazily and placed on the zero page list.  It is from the zero page list that other processes or the kernel or the file cache will get a new page.

    If after a very long time the application once again needs a page that is not in its working set, the memory manager will handle the memory fault.  If the page is on the standby list, it is quickly put back into the process' working set.  If the page is no longer in the standby list, a hard fault occurs.  The memory manager issues I/O to the hard disk to read the page(s) from the page file.  Once the I/O complete, the page is placed back into the process' work set.

    All of this is done to keep physical RAM highly utilized and disk I/O to a minimum.  We don't want to allow process to horde physical RAM for pages that are rarely used.  The physical RAM must be shared with other processes, the kernel and the file cache.  If you see lots of available memory on your system, rest assured that it is not going to waste.  The vast majority is on standby lists for processes and the file cache.

    Also note that page file usage isn't that bad.  The page file allows the Memory Manager to save modified pages before placing the page on the standby list.  The page is still in physical RAM and can be quickly faulted back into the process.  This method gives the process a chance to reclaim an old page and it allows the page to be quickly used if there is demand elsewhere.

    The best way to see the totals of these lists is to use a kernel debugger (live or postmortem).  Use the !memusage command and you'll get an output like this:

    0: kd> !memusage

    loading PFN database

    loading (100% complete)

    Compiling memory usage data (99% Complete).

    Zeroed:    414 (  1656 kb)

                   Free:      2 (     8 kb)

    Standby: 864091 (3456364 kb)

    Modified:    560 (  2240 kb)

        ModifiedNoWrite:     30 (   120 kb)

    Active/Valid: 182954 (731816 kb)

             Transition:      2 (     8 kb)

                    Bad:      0 (     0 kb)

                Unknown:      0 (     0 kb)

    TOTAL: 1048053 (4192212 kb)

    Of the 4GB of physical RAM, only 1.6 MB are on Zeroed or free pages.  731 MB is in process, system and file cache working sets.  2 MB are on the modified page list.  The vast majority, 3.4 GB, is on the standby list.  On this server, most people will see 3.4 GB of wasted physical RAM, but you will know better.

    What should I be worried about?

    Typically you shouldn't be worried about these things, until you have a performance problem.  If your system is sluggish or slow to respond or you are getting errors about out of memory, then you need to rely on this information.  You will need to collect a performance monitor log of the problem time.  If the counter list is daunting, then use the Performance Monitor Wizard to configure the performance monitor log.

    Once the log is collected, you'll need to analyze several counters.  I'm not going into detail about how to review performance monitor logs this time.  I'll save that lengthy topic for another time.  For now I'll focus on the counters relevant to the Memory Manager.

    One of the biggest reasons for slow performance and sluggish system responsiveness is disk bottleneck.  Look at Physical Disk\% Idle Time, Avg. Disk sec/Read and Avg. Disk sec/Write counters.  If your system drive is under 50% idle or your disk response times are way above your drive specifications, then you need to investigate further.  Look at the Memory\Available Mbytes.  You should have a couple hundred Mbytes of Available Memory.  This is one of the most important performance monitor counters.  If this number drops too low, your standby lists, process working sets and cache will be greatly reduced.  You'll need to find out if a process is consuming physical RAM.  Check for large process working sets or for large file cache.

    You will also need to see if paging is really affecting system performance.  Take a look at Memory\Pages Input/sec and correlate that to Physical Disk\Avg. Disk sec/Read.  Pages Input/sec is the number of pages being read in from the page file.  These are the hard faults (when the page wasn't on the standby list).  If your Avg. Disk sec/Read is close to your drive's specification and the drive's idle time is high, than paging really isn't a problem.  Small amounts of hard faults are expected as applications will every once in a while re-touch an old page.  As long as this I/O is not consistent or the disk can't keep up, you probably will not notice this impact.

    You can also look at Memory\Pages Output/sec and Physical Disk\Avg. Disk sec/Write.  These are the page commits to the page file when a modified page is removed from a process' working set.  As long as the disk can keep up with the writes, this shouldn't be a problem.  Remember that once the page is saved to the page file, it is placed on the standby list.  If there isn't a great demand for new pages, it can remain on the standby list for a very long time.  When the process touches the old virtual page again, it can be soft faulted back into the working set.  If there is great demand for memory, you'll see process working sets aggressively being trimmed.  Unless there is memory pressure, this is all done with lazy I/O, so you should not see much of an impact from this activity.

    The Memory Manager works to meet current demand and prepares for future demand when possible.  You need to look at a performance monitor log to see if there is memory pressure on the system.  You'll see this in low Available Mbytes and reductions in process working sets.  You'll be able to correlate this to increase disk I/O to the page file.  If you have established that there is memory pressure on the box, you need to figure where that demand is coming from.  Check for working set increases from processes, file cache or bottlenecked disk I/O to data drives.

  • Ntdebugging Blog

    LPC CASE2 – When things are not rosy

    • 1 Comments

    Hello, this is Roy again.  In this case we will discuss a scenario where tracing a hung client thread is not possible through LPC data structures and extensions. We would rather use hints from the LPC message sent about the operation that was taking place and other heuristics to arrive at possible reason for the hang.

     

    The issue is that whenever an Encrypted File is opened the server locks up. The initial analysis is to determine the reason for the server hang, whether it is low resource condition, deadlock or any other reason. After dumping all the threads and glancing over them I chose to pickup the following thread which looked interestingly abnormal. Below is the thread which is trying to do a CreateFile( ) and is stuck for 4hrs for a LPC reply. The file attempted to be created is an encrypted file. KSecDD.sys is the Kernel Security Device Driver which is used for handling key encryption, decryption tasks. In order to decrypt the file KSecDD makes a LPC request to LsaAuthenticationPort.

     

    2: kd> !thread fffffade6c9f1390

    THREAD fffffade6c9f1390  Cid 04b4.0790  Teb: 000007fffffd8000 Win32Thread: fffffa8008ed1d80 WAIT: (WrLpcReply) KernelMode Non-Alertable

        fffffade6c9f1710  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 01561d96:

    Current LPC port fffffa800380cc40

    IRP List:

        fffffade6d389010: (0006,03a0) Flags: 00000884  Mdl: 00000000

        fffffade6bf55c60: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa8002f5b620 (Level Impersonation)

    Owning Process            fffffade6f97c2b0       Image:         svchost.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      11                 LargeStack

    UserTime                  00:00:00.015

    KernelTime                00:00:00.000

    Win32 Start Address 0x0000000003855c60

    Start Address 0x0000000078d6b630

    Stack Init fffffade56ea1e00 Current fffffade56e9f930

    Base fffffade56ea2000 Limit fffffade56e9b000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`56e9f970 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`56e9faf0 fffff800`0105457f nt!KiSwapThread+0x376

    fffffade`56e9fb50 fffff800`0125555c nt!KeWaitForSingleObject+0x58e

    fffffade`56e9fbd0 fffffade`5b19e471 nt!LpcpRequestWaitReplyPort+0x672

    fffffade`56e9fc60 fffffade`5b19f88a KSecDD!EfspDecryptFek+0x131

    fffffade`56e9fec0 fffffade`5b0a4f46 KSecDD!EfsDecryptFek+0x2a

    fffffade`56e9ff10 fffffade`5b0a4b9d Ntfs!EFSPostCreate+0x323

    fffffade`56ea0060 fffffade`5b119d70 Ntfs!EFSFilePostCreate+0x1a5

    fffffade`56ea00e0 fffffade`5b1f2daa Ntfs!NtfsFsdCreate+0x425

    fffffade`56ea0270 fffffade`59976284 fltMgr!FltpCreate+0x34a

    fffffade`56ea0300 fffffade`5997ed4c SomeFSFilter+0x9284

    fffffade`56ea0360 fffffade`5997637f SomeFSFilter+0x11d4c

    fffffade`56ea0390 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea0400 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea0470 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea0500 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea06b0 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea07c0 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea09b0 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea0b40 fffffade`599e2a67 nt!IoCreateFile+0x12f

    fffffade`56ea0c20 fffffade`599e3c91 SomeFSFilter+0x47a67

    fffffade`56ea0ce0 fffffade`599e62e9 SomeFSFilter+0x48c91

    fffffade`56ea0d10 fffffade`599aacd3 SomeFSFilter+0x4b2e9

    fffffade`56ea0e00 fffffade`599aad6e SomeFSFilter+0xfcd3

    fffffade`56ea0ea0 fffffade`599bea96 SomeFSFilter+0xfd6e

    fffffade`56ea0ed0 fffffade`599ae142 SomeFSFilter+0x23a96

    fffffade`56ea0f90 fffffade`599ae7f1 SomeFSFilter+0x13142

    fffffade`56ea1070 fffffade`5999c0e3 SomeFSFilter+0x137f1

    fffffade`56ea10a0 fffffade`599ae848 SomeFSFilter+0x10e3

    fffffade`56ea10d0 fffffade`599ae8c3 SomeFSFilter+0x13848

    fffffade`56ea1120 fffffade`599a9a96 SomeFSFilter+0x138c3

    fffffade`56ea1160 fffffade`599a782f SomeFSFilter+0xea96

    fffffade`56ea11a0 fffffade`599a7a02 SomeFSFilter+0xc82f

    fffffade`56ea11e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`56ea1230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`56ea1260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`56ea1290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`56ea12c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`56ea12f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`56ea1360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`56ea13d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`56ea1460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`56ea1610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`56ea1720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`56ea1910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`56ea1aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`56ea1b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`56ea1c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

    00000000`00bbf258 00000000`00000000 0x78ef17ba

     

     

    Using the message Id mentioned in the “!thread” output we dump the LPC message. The output has two anomalies.

    1. No server thread has been listed which is processing the client request.
    2. The message queue as well as the LpcDataInfoChain is empty.

     

     

    Answers to both questions are very specific to LSASS. Most types of client requests received on LsaAuthenticationPort do not have extra data. As mentioned in CASE 1, server threads dequeue the message and chain it to LpcDataInfoChainHead only if the message contains extra information. If the message does not contain any extra data then the server does not need to store the message anywhere. The server would only need to remember the kind of request carried by the message.

     

    !lpc message searches the ETHREAD.LpcReceiveMessageId field to determine which server thread is processing the message. The way LSASS handles LPC request is that there is a single thread in LSASS that dequeues all messages from the connection port message queue and enqueues them to an LSASS specific internal queue. LSASS has a pool of worker threads that wait for requests in the internal queue. The worker threads dequeue the request, process it and reply back to the client. Hence the single LPC server thread in LSASS has its LpcReceiveMessageId overwritten by a new one every time it dequeues a message and enqueues it to the internal queue. That is the reason !lpc message fails to find a server thread for the LPC message sent to LsaAuthenticationPort.

     

     

     

     

    2: kd> !LPC MESSAGE 01561d96

    Searching message 1561d96 in threads ...

    Client thread fffffade6c9f1390 waiting a reply from 1561d96                         

    Searching thread fffffade6c9f1390 in port rundown queues ...

     

    Server communication port 0x033fc660

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xfffffa800380cc40      Server connection port: 0xfffffa800282ca40

     

    Client communication port 0xfffffa800380cc40

        Handles: 0   References: 1

        The LpcDataInfoChainHead queue is empty

     

    Server connection port fffffa800282ca40  Name: LsaAuthenticationPort

        Handles: 1   References: 83

        Server process  : fffffade6ead0040 (lsass.exe)

        Queue semaphore : fffffade6e9513c0

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

    Done.  

     

     

     

    Then how do you find the server thread that is responsible for replying back to the client and why it isn’t replying? Well there are many ways to do so depending upon the scenario. The way I chose was to look at all threads in LSASS which either have function calls on the stack that relate to the KSecDD request made or other hints. The reason threads only in LSASS were checked is because we knew that the request was sent to LsaAuthenticationPort and will be processed within LSASS. The hint was in the Wait Tick Count of the thread. I found a thread which had exactly the same Wait Tick Count as the client thread that was waiting for a reply. Further investigation of the thread confirmed that it was the one responsible for processing the request from KSecDD.

     

     

     

    2: kd> !thread fffffade6fa01460

    THREAD fffffade6fa01460  Cid 03a4.0100  Teb: 000007fffff8e000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

        fffffade6f665820  Semaphore Limit 0x1

        fffffade6f665840  NotificationEvent

    IRP List:

        fffffade6bb9a830: (0006,03a0) Flags: 00000884  Mdl: 00000000

    Impersonation token:  fffffa80044d11c0 (Level Impersonation)

    Owning Process            fffffade6ead0040       Image:         lsass.exe

    Wait Start TickCount      18281033       Ticks: 26423448 (4:18:41:06.375)

    Context Switch Count      509109            

    UserTime                  00:01:54.500

    KernelTime                00:00:33.765

    Win32 Start Address 0x0000000000c21aa0

    Start Address 0x0000000078d6b630

    Stack Init fffffade5a894e00 Current fffffade5a893e40

    Base fffffade5a895000 Limit fffffade5a88f000 Call 0

    Priority 13 BasePriority 9 PriorityDecrement 0

     

    Child-SP          RetAddr           Call Site

    fffffade`5a893e80 fffff800`01054df3 nt!KiSwapContext+0x85

    fffffade`5a894000 fffff800`0105518f nt!KiSwapThread+0x376

    fffffade`5a894060 fffffade`599a8b28 nt!KeWaitForMultipleObjects+0x66e

    fffffade`5a8940f0 fffffade`599a7808 SomeFSFilter+0xdb28

    fffffade`5a8941a0 fffffade`599a7a02 SomeFSFilter+0xc808

    fffffade`5a8941e0 fffffade`5997d484 SomeFSFilter+0xca02

    fffffade`5a894230 fffffade`599800f4 SomeFSFilter+0x10484

    fffffade`5a894260 fffffade`5998038d SomeFSFilter+0x130f4

    fffffade`5a894290 fffffade`5997ecbc SomeFSFilter+0x1338d

    fffffade`5a8942c0 fffffade`5997637f SomeFSFilter+0x11cbc

    fffffade`5a8942f0 fffffade`5b1d759a SomeFSFilter+0x937f

    fffffade`5a894360 fffffade`5b1f2e14 fltMgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x41a

    fffffade`5a8943d0 fffff800`01295ede fltMgr!FltpCreate+0x3b4

    fffffade`5a894460 fffff800`0128ea00 nt!IopParseDevice+0x108f

    fffffade`5a894610 fffff800`012909a7 nt!ObpLookupObjectName+0x923

    fffffade`5a894720 fffff800`01296a14 nt!ObOpenObjectByName+0x180

    fffffade`5a894910 fffff800`01296bb7 nt!IopCreateFile+0x630

    fffffade`5a894aa0 fffff800`01298479 nt!IoCreateFile+0x12f

    fffffade`5a894b80 fffff800`0104fce2 nt!NtCreateFile+0x69

    fffffade`5a894c00 00000000`78ef17ba nt!KiSystemServiceCopyEnd+0x3

     

     

    Having all the facts makes understanding the problem easier. SomeFSFilter.sys is an anti-virus driver that scans every file opened. During the open of an encrypted file it tries to scan the file before it is opened. For doing so it requests KSecDD to decrypt the file. KSecDD sends a LPC request to LsaAuthenticationPort for key decryption. LSASS as part of processing the request tries to open a file which for some reason is blocked by SomeFSFilter.sys. So we have a classic deadlock situation where SomeFSFilter.sys is waiting on LSASS to reply to the LPC message while LSASS is waiting on SomeFSFilter.sys for opening a file.

     

    In this scenario we could only use minimal help from the LPC debugger extension because of the way LPC requests were handled by LSASS. In such situations other heuristics are required to debug the problem.

     

    Finally, our journey through the world of LPC comes to an end. LPC has been the backbone of local inter-process communication since the NT4.0 days but as everything has a shell life, so did LPC. LPC has been replaced by ALPC (Asynchronous LPC) in Vista and Server 2008. The LPC APIs are now implemented over the new ALPC APIs to maintain compatibility. Although the LPC APIs are not exported, neither is there a requirement to call them directly, nevertheless having an understanding of the API helps in debugging as we have seen in the posts.

    Hope this information helps. Happy debugging!!!

     

  • Ntdebugging Blog

    TalkBackVideo Understanding handle leaks and How to use !htrace to find them

    • 2 Comments

    Written by Jeff Dailey

     

    Hello, my name is Jeff Dailey, I’m an Escalation Engineer for the Global Escalation Services Platforms team.   I’d like to show you how to debug and find leaking handles within your application or other process.  We can do this with the !htrace command in windbg .  Windbg is the Microsoft Windows Debugger most of us use in GES/CPR  for debugging.   

     

    Handles are a value we use in user mode, that when passed to a call that transitions to kernel, are used as an offset in your handle table to reference kernel mode objects.  Kernel mode objects are generally allocated from pool.   If you are having pool consumption problems and seeing errors like 2020 or 2019’s reported there is a good chance you may have a handle leak associated with them.  This is generally due to not doing a CloseHandle() on the handle when you have finished using it.

     

    You can vide the channel9 "how to debug handle leaks" video here 

     

    The following is the sample source for a handle leak that we will be debugging in our demo video.

     

    // leakyhandles.cpp : Defines the entry point for the console application.

    //

     

    #include "stdafx.h"

    #include <windows.h>

     

     

    void fun1(void);

    void fun2(void);

    void fun3(void);

    void fun4(void);

     

    int main(int argc, char* argv[])

    {

          while(1)

          {

                fun1();

                fun2();

                Sleep(100);

          }

          return 0;

    }

     

    void fun1(void)

    {

          fun3();

    }

     

    void fun2(void)

    {

          fun4();

     

    }

    void fun3(void)

    {

          HANDLE hEvent;

     

          hEvent = CreateEvent(NULL,TRUE,TRUE,NULL);

          CloseHandle(hEvent);

    }

    void fun4(void)

    {

        HANDLE hEvent2;

     

          hEvent2 = CreateEvent(NULL,TRUE,TRUE,NULL);

    }

     

    Thank you.

    Jeff Dailey

    Escalation Engineer (Platforms core team)

  • Ntdebugging Blog

    TalkBack Video: Cumulative Time Zone Update for Windows

    • 2 Comments

    Hi, my name is Joel Schaeffer, and I am an Escalation Engineer on the Platforms Global Escalation Services (GES) team.   I created a video to discuss the August 2007 Windows Cumulative Time Zone Update.  It is hosted on Channel9, and can be found here.  It will explain how to install the patch, validate the install, manually update non-mainstream Windows versions, and how to update Windows Mobile clients.

     

    See the following knowledge base articles for more information about how to update time zone information on your Windows clients and servers:

     

    August 2007 cumulative time zone update for Microsoft Windows operating systems

    http://support.microsoft.com/?id=933360

     

    How to configure daylight saving time for Microsoft Windows operating systems

    http://support.microsoft.com/?id=914387

     

    An August 2007 daylight saving time update for Windows Mobile-based devices is available

    http://support.microsoft.com/?id=941898

     

    See http://www.microsoft.com/dst2007 for more details about how to prepare for the upcoming DST changes for all Microsoft products.

  • Ntdebugging Blog

    Talkback video: Desktop Heap

    • 3 Comments

    Hello, Matthew here again.  Starting today, my team will be bringing you content in the form of videos, as well blog posts.  We’ll be hosting these videos on Channel 9, and we’ll link them from the ntdebugging blog.  One way that we’ll be using video is as a means of highlighting topics we’ve already covered, and then answering some of the common question that we receive from you in the form of comments on the blog.  So if you leave a comment with an interesting question, we may answer it in an upcoming video.  We are also thinking about using video as a way to show various debugging techniques rather than just blog about them. 

    For our first video, we’ve chosen desktop heap as a topic.  We’ve had a lot of traffic on the desktop heap posts, and we’ve had a lot of questions too.  So check out the video, and let us know what you think!

    http://channel9.msdn.com/Blogs/jeff_dailey/ntdebugging-blog-talk-about-on-desktop-heap

  • Ntdebugging Blog

    LPC Part 3: Debugging a real world LPC scenario

    • 4 Comments

    Hello my name is Roy; I’m an EE on the Microsoft global escalation services / CPR team.   This blog post continues my previous posts on LPC. 

    Below I will discuss scenarios that are simulated to depict actual issues encountered by Microsoft Customer Support Services. The two following scenarios deal with client threads waiting for LPC reply and hence in some way leading to user experience of a hung system or component.

     

    CASE1 – When everything in LPC looks intact

     

    In this scenario tracing the reason for client threads wait is possible through LPC data structures and extensions.

     

    The issue is a hung SMS (Systems Management Server) service. The OS (operating system) itself is working fine. It is just SMS that has hung.

     

    Taking a user mode hang dump (adplus.vbs hang mode dump) does not help since all we see is that threads in SMS are making RPC calls and are stuck. We need to investigate where these RPC calls are stuck. From the user dump we can see one important thing about the RPC calls is that they are implemented over LPC. LPC is completely implemented in the kernel and hence taking a kernel dump would be the next step in this case.

     

    02e9e698 7c821c94 ntdll!KiFastSystemCallRet

    02e9e69c 77c72700 ntdll!NtRequestWaitReplyPort+0xc              <--- RPC implemented over LPC

    02e9e6f4 77c713ba RPCRT4!LRPC_CCALL::SendReceive+0x230

    02e9e700 77788056 RPCRT4!I_RpcSendReceive+0x24          <--- RPC calls made

    02e9e71c 77788011 ole32!ThreadSendReceive+0xf5

    02e9e738 77787ed7 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x112

    02e9e818 77695349 ole32!CRpcChannelBuffer::SendReceive2+0xc1

    02e9e884 776952e1 ole32!CAptRpcChnl::SendReceive+0xab

    02e9e8d8 77ce127e ole32!CCtxComChnl::SendReceive+0x113

    02e9e8f4 77ce13ca RPCRT4!NdrProxySendReceive+0x43

    02e9ecdc 77ce11bd RPCRT4!NdrClientCall2+0x206

    02e9ecfc 77c53a12 RPCRT4!ObjectStublessClient+0x8b

    02e9ed0c 77688fd2 RPCRT4!ObjectStubless+0xf

    02e9ed70 7768915a ole32!CRpcResolver::CreateInstance+0x14e

    02e9efbc 776a9b68 ole32!CClientContextActivator::CreateInstance+0xfa

    02e9effc 776a9a9a ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7

    02e9f7b0 776b1275 ole32!ICoCreateInstanceEx+0x3cc

    02e9f7d8 776b1244 ole32!CComActivator::DoCreateInstance+0x28

    02e9f7fc 74cf1a51 ole32!CoCreateInstanceEx+0x23

    02e9f830 74cf19fe wbemprox!CDCOMTrans::DoActualCCI+0x3d

    02e9f874 74cf16b1 wbemprox!CDCOMTrans::DoCCI+0x12d

    02e9f934 74cf152e wbemprox!CDCOMTrans::DoActualConnection+0x25f

    02e9f960 74cf22ab wbemprox!CDCOMTrans::DoConnection+0x25

    02e9f9a0 1806eaa9 wbemprox!CLocator::ConnectServer+0x7c

    WARNING: Stack unwind information not available. Following frames may be wrong.

    02e9fa50 186c31b6 ccmcore!CCM::Utility::Wmi::Namespace::Open+0x1a2

    02e9fb44 186c357f PolicyAgent!DllUnregisterServer+0x10d8

    02e9fbb0 18706a63 PolicyAgent!DllUnregisterServer+0x14a1

    02e9fc8c 18708899 PolicyAgentEndpoint!DllGetClassObject+0x73d2

    02e9fd6c 187097d9 PolicyAgentEndpoint!DllGetClassObject+0x9208

    02e9fde0 01068bb8 PolicyAgentEndpoint!DllGetClassObject+0xa148

    02e9fe6c 7c83b332 CcmExec+0x68bb8

    02e9fec8 7c8161d0 ntdll!RtlpWaitOrTimerCallout+0x74

    02e9feec 7c81a3c5 ntdll!RtlpAsyncWaitCallbackCompletion+0x37

    02e9ff44 7c8200fc ntdll!RtlpWorkerCallout+0x71

    02e9ff64 7c81a3fa ntdll!RtlpExecuteWorkerRequest+0x4f

    02e9ff78 7c82017f ntdll!RtlpApcCallout+0x11

    02e9ffb8 77e6608b ntdll!RtlpWorkerThread+0x61

    02e9ffec 00000000 kernel32!BaseThreadStart+0x34

     

    We examine threads belonging to ccmexec.exe from the kernel dump.

     

     

    0: kd> !process 0 0 ccmexec.exe

    PROCESS 895a6430  SessionId: 0  Cid: 01c8    Peb: 7ffde000  ParentCid: 0278

        DirBase: 6f3c7000  ObjectTable: e22d6b00  HandleCount: 832.

        Image: CcmExec.exe

     

    0: kd> .process /p /r 895a6430

    Implicit process is now 895a6430

    Loading User Symbols

    ............................................................................................

     

     

    0: kd> !thread 88e75b10

    THREAD 88e75b10  Cid 01c8.0de0  Teb: 7ffa6000 Win32Thread: e2365ea8 WAIT: (WrLpcReply) UserMode Non-Alertable

        88e75cfc  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 00039236:

    Current LPC port e2327022

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            895a6430       Image:         CcmExec.exe

    Wait Start TickCount      99831          Ticks: 46825 (0:00:12:11.640)

    Context Switch Count      14119                 LargeStack

    UserTime                  00:00:00.125

    KernelTime                00:00:00.046

    Win32 Start Address ntdll!RtlpWorkerThread (0x7c83ad38)

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b9425000 Current b9424c08 Base b9425000 Limit b9421000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

     

    ChildEBP RetAddr  Args to Child             

    b9424c20 8083e6a2 88e75b88 88e75b10 88e75bb8 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b9424c4c 8083f164 88e75cfc 88e75cd0 88e75b10 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b9424c94 8093983f 88e75cfc 00000011 b9424d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b9424d50 80834d3f 00000114 0017db40 0017db40 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b9424d50 7c82ed54 00000114 0017db40 0017db40 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b9424d64)

    02e9e698 7c821c94 77c72700 00000114 0017db40 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    02e9e69c 77c72700 00000114 0017db40 0017db40 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    02e9e6f4 77c713ba 00127fd4 02e9e71c 77788056 RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    02e9e700 77788056 00127fd4 02e9e800 0009f488 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    02e9e71c 77788011 00000000 00000000 00000000 ole32!ThreadSendReceive+0xf5 (FPO: [Non-Fpo])

    02e9e738 77787ed7 02e9e800 02e9e900 0009f488 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x112 (FPO: [Non-Fpo])

    02e9e818 77695349 0009f488 02e9e910 02e9e900 ole32!CRpcChannelBuffer::SendReceive2+0xc1 (FPO: [Non-Fpo])

    02e9e884 776952e1 0009f488 02e9e910 02e9e900 ole32!CAptRpcChnl::SendReceive+0xab (FPO: [Non-Fpo])

    02e9e8d8 77ce127e 0009f488 02e9e910 02e9e900 ole32!CCtxComChnl::SendReceive+0x113 (FPO: [Non-Fpo])

    02e9e8f4 77ce13ca 000a1934 02e9e93c 0600016e RPCRT4!NdrProxySendReceive+0x43 (FPO: [Non-Fpo])

    02e9ecdc 77ce11bd 77676298 7767951e 02e9ed14 RPCRT4!NdrClientCall2+0x206 (FPO: [Non-Fpo])

    02e9ecfc 77c53a12 00000014 00000004 02e9ed70 RPCRT4!ObjectStublessClient+0x8b (FPO: [Non-Fpo])

    02e9ed0c 77688fd2 000a1934 00000000 02e9f1b4 RPCRT4!ObjectStubless+0xf

    02e9ed70 7768915a 02e9f1b4 02e9f704 00000010 ole32!CRpcResolver::CreateInstance+0x14e (FPO: [Non-Fpo])

    02e9efbc 776a9b68 7778fcd0 00000000 00000000 ole32!CClientContextActivator::CreateInstance+0xfa (FPO: [Non-Fpo])

    02e9effc 776a9a9a 02e9f1b4 00000000 02e9f704 ole32!ActivationPropertiesIn::DelegateCreateInstance+0xf7 (FPO: [Non-Fpo])

    02e9f7b0 776b1275 74cf1a8c 00000000 00000014 ole32!ICoCreateInstanceEx+0x3cc (FPO: [Non-Fpo])

    02e9f7d8 776b1244 74cf1a8c 00000000 00000014 ole32!CComActivator::DoCreateInstance+0x28 (FPO: [Non-Fpo])

    02e9f7fc 74cf1a51 74cf1a8c 00000000 00000014 ole32!CoCreateInstanceEx+0x23 (FPO: [Non-Fpo])

    02e9f830 74cf19fe 02e9f898 00000001 00000000 wbemprox!CDCOMTrans::DoActualCCI+0x3d (FPO: [Non-Fpo])

    02e9f874 74cf16b1 02e9f898 00000001 00000000 wbemprox!CDCOMTrans::DoCCI+0x12d (FPO: [Non-Fpo])

    02e9f934 74cf152e 01801974 00000000 00000000 wbemprox!CDCOMTrans::DoActualConnection+0x25f (FPO: [Non-Fpo])

    02e9f960 74cf22ab 01801974 00000000 00000000 wbemprox!CDCOMTrans::DoConnection+0x25 (FPO: [Non-Fpo])

    02e9f9a0 1806eaa9 00b594d0 01801974 00000000 wbemprox!CLocator::ConnectServer+0x7c (FPO: [Non-Fpo])

    WARNING: Stack unwind information not available. Following frames may be wrong.

    02e9fa50 186c31b6 0062bdaa 00000001 00000000 ccmcore!CCM::Utility::Wmi::Namespace::Open+0x1a2

    02e9fb44 186c357f 02e9fb68 00000001 00000000 PolicyAgent!DllUnregisterServer+0x10d8

    02e9fbb0 18706a63 0056a6f0 00000001 00000000 PolicyAgent!DllUnregisterServer+0x14a1

    02e9fc8c 18708899 0063cd50 02e9fd0c 02e9fcec PolicyAgentEndpoint!DllGetClassObject+0x73d2

    02e9fd6c 187097d9 ffffffff 00000864 00000000 PolicyAgentEndpoint!DllGetClassObject+0x9208

    02e9fde0 01068bb8 00566a28 006af398 00000000 PolicyAgentEndpoint!DllGetClassObject+0xa148

    02e9fe6c 7c83b332 000008b6 0011c700 0011c7f0 CcmExec+0x68bb8

    02e9fec8 7c8161d0 01068a29 000008b6 0011c700 ntdll!RtlpWaitOrTimerCallout+0x74 (FPO: [Non-Fpo])

    02e9feec 7c81a3c5 0011c7f0 7c889880 01e89b18 ntdll!RtlpAsyncWaitCallbackCompletion+0x37 (FPO: [Non-Fpo])

    02e9ff44 7c8200fc 7c816199 0011c7f0 00000000 ntdll!RtlpWorkerCallout+0x71 (FPO: [Non-Fpo])

    02e9ff64 7c81a3fa 00000000 0011c7f0 01e89b18 ntdll!RtlpExecuteWorkerRequest+0x4f (FPO: [Non-Fpo])

     

    The !thread output contains some important information.

    1. WAIT: (WrLpcReply): This tells us that the thread is waiting for a LPC reply. This information called as the “Wait Reason” is passed to KeWaitForSingleObject( ) as a parameter. This is retrieved from the _KTHREAD.WaitReason field.
    2. Waiting for reply to LPC MessageId 00039236: This line gives the message ID of the LPC message that was sent to the server thread.
    3. Current LPC port e2327022: The port number listed is the client communication port.
    4. 88e75cfc  Semaphore Limit 0x1 : The semaphore that the thread is waiting for is signaled when a reply message is posted to the client communication ports message queue. NtRequestWaitReplyPort( ) waits on this semaphore.
    5. Ticks: 46825 (0:00:12:11.640): This thread has been waiting for 12mins for a reply from the server.

     

    The next step is to find the server thread that is processing this LPC message. We will use LPC debugger extensions to trace the server thread.

     

    0: kd> !lpc message 00039236

    Searching message 39236 in threads ...

        Server thread 89531020 is working on message 39236                        

    Client thread 88e75b10 waiting a reply from 39236                         

    Searching thread 88e75b10 in port rundown queues ...

     

    Server communication port 0xe2300a38

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe2327020      Server connection port: 0xe13734b0

     

    Client communication port 0xe2327020

        Handles: 1   References: 9

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e2365188 - Busy  Id=0003832f  From: 04a8.13f4  Context=80050003  [e1373530 . e297ab00]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00050242 421e5fa2 9b3b48b4 a695daf5 00000001

            0000 e297ab00 - Busy  Id=00039232  From: 01c8.14e4  Context=8021000d  [e2365188 . e12cbea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8d06b40 b8d06b00 000003f4 000010a0

            0000 e12cbea0 - Busy  Id=00039236  From: 01c8.0de0  Context=8021000d  [e297ab00 . e1297430]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 4195c844 8ff87eb3 2b70d0f6 001e001c

            0000 e1297430 - Busy  Id=0003c96d  From: 0204.0224  Context=8022000e  [e12cbea0 . e1534a48]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8bb1b40 b8bb1b00 000003f4 00000fbc

            0000 e1534a48 - Busy  Id=0003db3b  From: 01c8.10e0  Context=8021000d  [e1297430 . e2515ca0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e2515ca0 - Busy  Id=0003db3f  From: 01c8.0f5c  Context=8021000d  [e1534a48 . e29279e8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e29279e8 - Busy  Id=0003db5a  From: 01c8.1618  Context=8021000d  [e2515ca0 . e1061008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00000000 00000000 00000000 00000000

            0000 e1061008 - Busy  Id=0004617f  From: 01c8.1220  Context=8021000d  [e29279e8 . e295fa80]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

            0000 e295fa80 - Busy  Id=0004a38a  From: 01c8.1708  Context=8021000d  [e1061008 . e2966ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0001a41a 000001c8 cb786dae 60cdf739

            0000 e2966ea0 - Busy  Id=0004d0c2  From: 04a8.04d4  Context=80050003  [e295fa80 . e1261008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00020001 00030242 5415989d 3d8048c1 00070005 00000000

            0000 e1261008 - Busy  Id=000544d7  From: 01c8.0758  Context=8021000d  [e2966ea0 . e1373530]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

        The LpcDataInfoChainHead queue contains 11 messages

        Threads in RunDown queue :     0xe2364fc8    0xe297a940    0xe12cbce0    0xe1297270    0xe1534888    0xe2515ae0    0xe2927828    0xe1060e48    0xe295f8c0    0xe2966ce0    0xe1260e48

    Done.                     

     

    When a client thread sends a message to a server the message ID is stored in the clients ETHREAD.LpcReplyMessageId field. !lpc message searches the message ID in all threads in the system. Once client thread is identified ETHREAD.LpcWaitingOnPort will give the client communication port used for the LPC transaction. This has references to the server communication port and connection port.

     

    !lpc message dumps the message queue for all the port objects. Important thing to note here is that even though the message queue is empty for connection port there are 11 messages chained to LpcDataInfoChainHead. This tells us that 11 messages were dequeued from the server message queue but the server has not replied to any of the messages since once the server responds to a message it is unchained from the LpcDataInfoChainHead list and freed.

     

    Messaging Queue

    Figure [4] LPC Message Queuing stages

     

    The above figure shows messages are first queued to the MsgQueue and then chained to LpcDataInfoChainHead. This kind of chaining only happens when the message contains extra data. This does not happen for a LPC_CONNECTION_REQUEST or if the message does not contain extra data. In our scenario server thread has dequeued and started processing the hence we see it in the LpcDataInfoChainHead.

     

    We now dump the server thread to see why it has not responded to the message for 12mins.

     

    0: kd> .process /p /r 896e1450

    Implicit process is now 896e1450

    Loading User Symbols

    ............................................................................................................................

     

    0: kd> !thread 89531020

    THREAD 89531020  Cid 03f4.10a0  Teb: 7ffac000 Win32Thread: 00000000 WAIT: (WrLpcReply) UserMode Non-Alertable

        8953120c  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 0003923d:

    Current LPC port e22d29d8

    Impersonation token:  e15315d0 (Level Impersonation)

    Owning Process            896e1450       Image:         svchost1.exe

    Wait Start TickCount      144629         Ticks: 2027 (0:00:00:31.671)

    Context Switch Count      219            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address 0x00039236

    LPC Server thread working on message Id 39236

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b8d07000 Current b8d06c08 Base b8d07000 Limit b8d04000 Call 0

    Priority 10 BasePriority 8 PriorityDecrement 0

    ChildEBP RetAddr  Args to Child             

    b8d06c20 8083e6a2 89531098 89531020 895310c8 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b8d06c4c 8083f164 8953120c 895311e0 89531020 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b8d06c94 8093983f 8953120c 00000011 b8d06d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b8d06d50 80834d3f 0000056c 000f3fc0 000f3fc0 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b8d06d50 7c82ed54 0000056c 000f3fc0 000f3fc0 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b8d06d64)

    0097f0a8 7c821c94 77c72700 0000056c 000f3fc0 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0097f0ac 77c72700 0000056c 000f3fc0 000f3fc0 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    0097f118 77c713ba 0097f154 0097f138 77c72c7f RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    0097f124 77c72c7f 0097f154 76d320d8 0097f540 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    0097f138 77ce219b 0097f180 000db5f8 00000000 RPCRT4!NdrSendReceive+0x2b (FPO: [Non-Fpo])

    0097f520 76d41213 76d320d8 76d402c8 0097f540 RPCRT4!NdrClientCall2+0x22e (FPO: [Non-Fpo])

    0097f538 76d3fc12 000b5618 000ce248 000ce268 rpcss!LocalCreateInstance+0x1c (FPO: [Non-Fpo])

    0097f5cc 76d3f033 0097f7d8 0097f654 0097f824 rpcss!CServerListEntry::CallServer+0x127 (FPO: [Non-Fpo])

    0097f604 76d3bc7f 00000001 00000008 00000000 rpcss!CServerTableEntry::CallRunningServer+0x181 (FPO: [Non-Fpo])

    0097f670 76d3bac5 0097f7d8 00000000 000e415c rpcss!Activation+0x337 (FPO: [Non-Fpo])

    0097f6cc 76d3dd08 000e415c 0097f7bc 0097f71c rpcss!ActivateFromProperties+0x213 (FPO: [Non-Fpo])

    0097f6dc 76d3dd5c 76d738d0 00000000 000e415c rpcss!CScmActivator::CreateInstance+0x10 (FPO: [Non-Fpo])

    0097f71c 76d3c162 000e415c 00000000 0097f7bc rpcss!ActivationPropertiesIn::DelegateCreateInstance+0xf7 (FPO: [Non-Fpo])

    0097f764 76d3bdb6 000e415c 0097f7bc 00000000 rpcss!ActivateFromPropertiesPreamble+0x4e3 (FPO: [Non-Fpo])

    0097f7ac 76d3dcdd 00000002 000ce2ac 00000000 rpcss!PerformScmStage+0xbb (FPO: [Non-Fpo])

    0097f8c8 77c70f3b 000cdbb0 000ce248 000ce268 rpcss!SCMActivatorCreateInstance+0xa2 (FPO: [Non-Fpo])

    0097f8f8 77ce23f7 76d3dc3e 0097fae0 00000007 RPCRT4!Invoke+0x30

    0097fcf8 77ce26ed 00000000 00000000 000cdcdc RPCRT4!NdrStubCall2+0x299 (FPO: [Non-Fpo])

    0097fd14 77c709be 000cdcdc 00094ef0 000cdcdc RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])

    0097fd48 77c7093f 76d3156c 000cdcdc 0097fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0097fd9c 77c70865 00000004 00000000 76d73538 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0097fdc0 77c734b1 000cdcdc 00000000 76d73538 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0097fdfc 77c71bb3 000b44d8 0008f0e8 000dea40 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0097fe20 77c75458 0008f120 0097fe38 000b44d8 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0097ff84 77c5778f 0097ffac 77c5f7dd 0008f0e8 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0097ff8c 77c5f7dd 0008f0e8 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0097ffac 77c5de88 00084298 0097ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0097ffb8 77e6608b 000c0070 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

    0097ffec 00000000 77c5de6d 000c0070 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    The server thread is waiting for a reply to another LPC message it issued while servicing the client request. Tracing the LPC message as above -

     

    0: kd> !lpc message 0003923d

    Searching message 3923d in threads ...

    Client thread 89531020 waiting a reply from 3923d                         

        Server thread 898cd998 is working on message 3923d                         

    Searching thread 89531020 in port rundown queues ...

     

    Server communication port 0xe22d58c8

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe22d29d8      Server connection port: 0xe17a95f8

     

    Client communication port 0xe22d29d8

        Handles: 1   References: 15

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e17a95f8  Name: OLE71892D01F993498EA1C3757C71D5

        Handles: 1   References: 37

        Server process  : 89bedd88 (svchost.exe)

        Queue semaphore : 89718210

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e188c3b0 - Busy  Id=00039239  From: 03f4.111c  Context=80030001  [e17a9678 . e28b00f8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28b00f8 - Busy  Id=0003923d  From: 03f4.10a0  Context=80030001  [e188c3b0 . e2846008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2846008 - Busy  Id=0003c96e  From: 03f4.0df0  Context=80030001  [e28b00f8 . e27dcb70]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e27dcb70 - Busy  Id=0003db3c  From: 03f4.1158  Context=80030001  [e2846008 . e28c0788]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28c0788 - Busy  Id=0003db41  From: 03f4.0fbc  Context=80030001  [e27dcb70 . e2983650]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2983650 - Busy  Id=0003db5c  From: 03f4.0f40  Context=80030001  [e28c0788 . e2596ae0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2596ae0 - Busy  Id=00046180  From: 03f4.03d8  Context=80030001  [e2983650 . e28c3ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e28c3ea0 - Busy  Id=0004a38d  From: 03f4.07c4  Context=80030001  [e2596ae0 . e294aea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e294aea0 - Busy  Id=0004d0c6  From: 03f4.1714  Context=80030001  [e28c3ea0 . e2977ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00030342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2977ea0 - Busy  Id=000544d8  From: 03f4.162c  Context=80030001  [e294aea0 . e241f008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e241f008 - Busy  Id=0006077e  From: 03f4.031c  Context=80030001  [e2977ea0 . e2940820]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2940820 - Busy  Id=000638da  From: 03f4.1754  Context=80030001  [e241f008 . e2576270]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e2576270 - Busy  Id=000638f1  From: 03f4.1794  Context=80030001  [e2940820 . e261c268]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

            0000 e261c268 - Busy  Id=000638fe  From: 03f4.17a8  Context=80030001  [e2576270 . e17a9678]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00040342 0000d401 000004a8 2e6c8db6 4131cb31

        The LpcDataInfoChainHead queue contains 14 messages

        Threads in RunDown queue :     0xe188c1f0    0xe28aff38    0xe2845e48    0xe27dc9b0    0xe28c05c8    0xe2983490    0xe2596920    0xe28c3ce0    0xe294ace0    0xe2977ce0    0xe241ee48    0xe2940660    0xe25760b0    0xe261c0a8

    Done.        

     

     

    0: kd> .process /p /r 89bedd88

    Implicit process is now 89bedd88

    Loading User Symbols

    ............................................................................................................................

     

     

    0: kd> !thread 898cd998

    THREAD 898cd998  Cid 04a8.0508  Teb: 7ffd5000 Win32Thread: e23a6008 WAIT: (UserRequest) UserMode Non-Alertable

        89006340  SynchronizationEvent

    IRP List:

        88eb89e0: (0006,0094) Flags: 00000800  Mdl: 00000000

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            89bedd88       Image:         svchost.exe

    Wait Start TickCount      99831          Ticks: 46825 (0:00:12:11.640)

    Context Switch Count      7330                 LargeStack

    UserTime                  00:00:00.906

    KernelTime                00:00:00.531

    Win32 Start Address 0x0003923d

    LPC Server thread working on message Id 3923d

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b9a55000 Current b9a54c60 Base b9a55000 Limit b9a51000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b9a54c78 8083e6a2 898cda10 898cd998 898cda40 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b9a54ca4 8083f164 00000000 00000000 00000000 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b9a54cec 8092db70 89006340 00000006 80834d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b9a54d50 80834d3f 00000d64 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])

    b9a54d50 7c82ed54 00000d64 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b9a54d64)

    0109f248 7c822124 77e6bad8 00000d64 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0109f24c 77e6bad8 00000d64 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    0109f2bc 77e6ba42 00000d64 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    0109f2d0 77719486 00000d64 ffffffff 00000000 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    0109f2e4 7769c09b 00086803 00000000 00086660 ole32!CComApartment::WaitForAccess+0x5c (FPO: [0,0,0])

    0109f2f8 776a417d 0109f344 02163088 02145ac0 ole32!CComApartment::GetPreRegMOID+0x41 (FPO: [Non-Fpo])

    0109f310 776a40d2 0109f344 77790858 00000001 ole32!GetPreRegMOID+0x24 (FPO: [Non-Fpo])

    0109f358 776a48e3 00000001 00000003 0109f3c4 ole32!CIDObject::GetOrCreateStdID+0x11e (FPO: [Non-Fpo])

    0109f37c 776a4419 02163088 00000000 00086860 ole32!ObtainStdIDFromUnk+0x92 (FPO: [Non-Fpo])

    0109f3a4 776a43c1 0109f468 00162690 02b8bd30 ole32!StdMarshalObject+0x49 (FPO: [Non-Fpo])

    0109f400 776a3974 0109f42c 0109f468 00162690 ole32!CDestObjectWrapper::MarshalInterface+0x150 (FPO: [Non-Fpo])

    0109f43c 77686f83 0109f468 00162690 02b8bd30 ole32!CoMarshalInterface+0x93 (FPO: [Non-Fpo])

    0109f490 77686e38 02b8bd30 00162690 00162690 ole32!MarshalHelperMulti+0x48 (FPO: [Non-Fpo])

    0109f4bc 77686d46 58b10040 00000001 00162690 ole32!GetInstanceHelperMulti+0x4d (FPO: [Non-Fpo])

    0109f584 77c70f3b 000a7938 00000000 0218b774 ole32!CObjServer::CreateInstance+0x30d (FPO: [Non-Fpo])

    0109f5a8 77ce23f7 77686b34 0109f790 00000004 RPCRT4!Invoke+0x30

    0109f9a8 77ce1a95 000b4398 000b2074 05cc596c RPCRT4!NdrStubCall2+0x299 (FPO: [Non-Fpo])

    0109fa00 77788ec1 000b4398 05cc596c 000b2074 RPCRT4!CStdStubBuffer_Invoke+0xc6 (FPO: [Non-Fpo])

    0109fa44 77788e74 05cc596c 000b4428 00000000 ole32!SyncStubInvoke+0x33 (FPO: [Non-Fpo])

    0109fa8c 7769b2b7 05cc596c 000b4288 000b4398 ole32!StubInvoke+0xa7 (FPO: [Non-Fpo])

    0109fb64 7769b0b5 000b2074 00000000 000b4398 ole32!CCtxComChnl::ContextInvoke+0xe3 (FPO: [Non-Fpo])

    0109fb80 77789926 05cc596c 00000001 000b4398 ole32!MTAInvoke+0x1a (FPO: [Non-Fpo])

    0109fbb0 77788c33 d0908070 000b2074 000b4398 ole32!AppInvoke+0xa3 (FPO: [Non-Fpo])

    0109fc84 77789936 05cc5918 000b22f8 000b4410 ole32!ComInvokeWithLockAndIPID+0x2c5 (FPO: [Non-Fpo])

    0109fcd0 77c709be 020ccc24 000b4410 020ccc24 ole32!ThreadInvoke+0x1e0 (FPO: [Non-Fpo])

    0109fd04 77c7093f 777887e0 020ccc24 0109fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0109fd58 77c70865 00000000 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0109fd7c 77c657eb 020ccc24 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0109fdbc 77c71e26 020ccc24 020ccbdc 00000000 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0xc0 (FPO: [Non-Fpo])

    0109fdfc 77c71bb3 000c7db0 000b1498 00097c08 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0109fe20 77c75458 000b14d0 0109fe38 000c7db0 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0109ff84 77c5778f 0109ffac 77c5f7dd 000b1498 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0109ff8c 77c5f7dd 000b1498 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0109ffac 77c5de88 0008aa48 0109ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0109ffb8 77e6608b 000b1678 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

     

    The above LPC server thread is waiting for allocation of COM OID (Object Identifier) since we have run out of OIDs. The way OIDs are allocated is by calling CComApartment::CallTheResolver( ). Searching threads in the system with CComApartment::CallTheResolver on the stack we get the below thread within svchost.exe.

     

     

    0: kd> !thread 88d76db0 

    THREAD 88d76db0  Cid 04a8.13f4  Teb: 7ffd6000 Win32Thread: e295c758 WAIT: (Unknown) UserMode Non-Alertable

        88d76f9c  Semaphore Limit 0x1

    Waiting for reply to LPC MessageId 0003832f:

    Current LPC port e15791b8

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            89bedd88       Image:         svchost.exe

    Wait Start TickCount      99073          Ticks: 47583 (0:00:12:23.484)

    Context Switch Count      6947                 LargeStack

    UserTime                  00:00:00.859

    KernelTime                00:00:00.562

    Win32 Start Address 0x0003832e

    LPC Server thread working on message Id 3832e

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b8af6000 Current b8af5c08 Base b8af6000 Limit b8af2000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0

    Kernel stack not resident.

    ChildEBP RetAddr  Args to Child             

    b8af5c20 8083e6a2 88d76e28 88d76db0 88d76e58 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b8af5c4c 8083f164 88d76f9c 88d76f70 88d76db0 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b8af5c94 8093983f 88d76f9c 00000011 b8af5d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b8af5d50 80834d3f 00000094 000dde10 000dde10 nt!NtRequestWaitReplyPort+0x776 (FPO: [Non-Fpo])

    b8af5d50 7c82ed54 00000094 000dde10 000dde10 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b8af5d64)

    0251ed2c 7c821c94 77c72700 00000094 000dde10 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0251ed30 77c72700 00000094 000dde10 000dde10 ntdll!NtRequestWaitReplyPort+0xc (FPO: [3,0,0])

    0251ed88 77c713ba 0251edc4 0251eda8 77c72c7f RPCRT4!LRPC_CCALL::SendReceive+0x230 (FPO: [Non-Fpo])

    0251ed94 77c72c7f 0251edc4 776720f8 0251f1b0 RPCRT4!I_RpcSendReceive+0x24 (FPO: [Non-Fpo])

    0251eda8 77ce219b 0251edf0 05c6b44c 0251f360 RPCRT4!NdrSendReceive+0x2b (FPO: [Non-Fpo])

    0251f190 7769c087 776720f8 77671df0 0251f1b0 RPCRT4!NdrClientCall2+0x22e (FPO: [Non-Fpo])

    0251f1a8 7769c04e 0008b0a8 0008bc30 0251f1f4 ole32!ServerAllocateOIDs+0x1c (FPO: [Non-Fpo])

    0251f1d8 7769c01b 0251f1f4 0251f368 0251f220 ole32!CRpcResolver::ServerAllocOIDs+0x2a (FPO: [Non-Fpo])

    0251f1fc 7769bf93 0251f368 0251f220 00000014 ole32!OXIDEntry::AllocOIDs+0x40 (FPO: [Non-Fpo])

    0251f36c 7769c0b2 00086803 00000000 00086660 ole32!CComApartment::CallTheResolver+0x6b (FPO: [Non-Fpo])

    0251f380 776a417d 0251f3cc 05c67490 05d59ee0 ole32!CComApartment::GetPreRegMOID+0x54 (FPO: [Non-Fpo])

    0251f398 776a40d2 0251f3cc 77790858 00000001 ole32!GetPreRegMOID+0x24 (FPO: [Non-Fpo])

    0251f3e0 776a48e3 00000001 00000003 0251f44c ole32!CIDObject::GetOrCreateStdID+0x11e (FPO: [Non-Fpo])

    0251f404 776a4419 05c67490 00000000 00086860 ole32!ObtainStdIDFromUnk+0x92 (FPO: [Non-Fpo])

    0251f42c 776a43c1 05e39b00 0251f504 02b30234 ole32!StdMarshalObject+0x49 (FPO: [Non-Fpo])

    0251f488 776a3974 0251f4b4 05e39b00 0251f504 ole32!CDestObjectWrapper::MarshalInterface+0x150 (FPO: [Non-Fpo])

    0251f4c4 77c544a3 05e39b00 0251f504 02b30234 ole32!CoMarshalInterface+0x93 (FPO: [Non-Fpo])

    0251f518 77c54e85 0251f5e8 02b30234 74ce6210 RPCRT4!NdrpInterfacePointerMarshall+0x9a (FPO: [Non-Fpo])

    0251f538 77c691ad 0251f5e8 000f2898 02b30234 RPCRT4!NdrpPointerMarshall+0x87 (FPO: [Non-Fpo])

    0251f550 77c690fc 0251f5e8 02b30234 74ce6210 RPCRT4!NdrPointerMarshall+0x30 (FPO: [Non-Fpo])

    0251f570 77c691ad 0251f5e8 00000000 00112e50 RPCRT4!NdrpPointerMarshall+0x116 (FPO: [Non-Fpo])

    0251f588 77c715ac 0251f5e8 00112e50 74ce620c RPCRT4!NdrPointerMarshall+0x30 (FPO: [Non-Fpo])

    0251f5ac 77ce249e 00000000 00000001 020ce8c8 RPCRT4!NdrpServerMarshal+0x82 (FPO: [Non-Fpo])

    0251f9a8 77ce1a95 020ce8c8 0011fa08 000f1454 RPCRT4!NdrStubCall2+0x374 (FPO: [Non-Fpo])

    0251fa00 77788ec1 020ce8c8 000f1454 0011fa08 RPCRT4!CStdStubBuffer_Invoke+0xc6 (FPO: [Non-Fpo])

    0251fa44 77788e74 000f1454 00137868 020debf0 ole32!SyncStubInvoke+0x33 (FPO: [Non-Fpo])

    0251fa8c 7769b2b7 000f1454 0014aaf8 020ce8c8 ole32!StubInvoke+0xa7 (FPO: [Non-Fpo])

    0251fb64 7769b0b5 0011fa08 00000000 020ce8c8 ole32!CCtxComChnl::ContextInvoke+0xe3 (FPO: [Non-Fpo])

    0251fb80 77789926 000f1454 00000001 020ce8c8 ole32!MTAInvoke+0x1a (FPO: [Non-Fpo])

    0251fbb0 77788c33 d0908070 0011fa08 020ce8c8 ole32!AppInvoke+0xa3 (FPO: [Non-Fpo])

    0251fc84 77789936 000f1400 020d3b48 00137850 ole32!ComInvokeWithLockAndIPID+0x2c5 (FPO: [Non-Fpo])

    0251fcd0 77c709be 0216202c 00137850 0216202c ole32!ThreadInvoke+0x1e0 (FPO: [Non-Fpo])

    0251fd04 77c7093f 777887e0 0216202c 0251fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0251fd58 77c70865 00000000 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0251fd7c 77c657eb 0216202c 00000000 7767bfa8 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

     

     

    0: kd> !lpc message 0003832f

    Searching message 3832f in threads ...

        Server thread 8900edb0 is working on message 3832f                        

    Client thread 88d76db0 waiting a reply from 3832f                         

    Searching thread 88d76db0 in port rundown queues ...

     

    Server communication port 0xe188a6e8

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe15791b8      Server connection port: 0xe13734b0

     

    Client communication port 0xe15791b8

        Handles: 1   References: 5

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

        Messages in LpcDataInfoChainHead:

            0000 e2365188 - Busy  Id=0003832f  From: 04a8.13f4  Context=80050003  [e1373530 . e297ab00]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00000001 00050242 421e5fa2 9b3b48b4 a695daf5 00000001

            0000 e297ab00 - Busy  Id=00039232  From: 01c8.14e4  Context=8021000d  [e2365188 . e12cbea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8d06b40 b8d06b00 000003f4 000010a0

            0000 e12cbea0 - Busy  Id=00039236  From: 01c8.0de0  Context=8021000d  [e297ab00 . e1297430]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 4195c844 8ff87eb3 2b70d0f6 001e001c

            0000 e1297430 - Busy  Id=0003c96d  From: 0204.0224  Context=8022000e  [e12cbea0 . e1534a48]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 b8bb1b40 b8bb1b00 000003f4 00000fbc

            0000 e1534a48 - Busy  Id=0003db3b  From: 01c8.10e0  Context=8021000d  [e1297430 . e2515ca0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e2515ca0 - Busy  Id=0003db3f  From: 01c8.0f5c  Context=8021000d  [e1534a48 . e29279e8]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0000dc04 000001c8 7b6ee790 9e8a0590

            0000 e29279e8 - Busy  Id=0003db5a  From: 01c8.1618  Context=8021000d  [e2515ca0 . e1061008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00000000 00000000 00000000 00000000

            0000 e1061008 - Busy  Id=0004617f  From: 01c8.1220  Context=8021000d  [e29279e8 . e295fa80]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

            0000 e295fa80 - Busy  Id=0004a38a  From: 01c8.1708  Context=8021000d  [e1061008 . e2966ea0]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 0001a41a 000001c8 cb786dae 60cdf739

            0000 e2966ea0 - Busy  Id=0004d0c2  From: 04a8.04d4  Context=80050003  [e295fa80 . e1261008]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00020001 00030242 5415989d 3d8048c1 00070005 00000000

            0000 e1261008 - Busy  Id=000544d7  From: 01c8.0758  Context=8021000d  [e2966ea0 . e1373530]

                       Length=0044002c  Type=00380001 (LPC_REQUEST)

                       Data: 00010001 00040242 00010c20 000001c8 37bd6287 35ebbd53

        The LpcDataInfoChainHead queue contains 11 messages

        Threads in RunDown queue :     0xe2364fc8    0xe297a940    0xe12cbce0    0xe1297270    0xe1534888    0xe2515ae0    0xe2927828    0xe1060e48    0xe295f8c0    0xe2966ce0    0xe1260e48

    Done.                                             

     

     

    0: kd> !thread 8900edb0

    THREAD 8900edb0  Cid 03f4.0598  Teb: 7ffd4000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable

        89014cc0  NotificationEvent

    Not impersonating

    DeviceMap                 e10018a8

    Owning Process            896e1450       Image:         svchost1.exe

    Wait Start TickCount      144629         Ticks: 2027 (0:00:00:31.671)

    Context Switch Count      5892            

    UserTime                  00:00:00.765

    KernelTime                00:00:01.203

    Win32 Start Address 0x0003832f

    LPC Server thread working on message Id 3832f

    Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)

    Stack Init b95d1000 Current b95d0c60 Base b95d1000 Limit b95ce000 Call 0

    Priority 13 BasePriority 8 PriorityDecrement 0

    ChildEBP RetAddr  Args to Child             

    b95d0c78 8083e6a2 8900ee28 8900edb0 8900ee58 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b95d0ca4 8083f164 00000000 00000000 00000000 nt!KiSwapThread+0x284 (FPO: [Non-Fpo])

    b95d0cec 8092db70 89014cc0 00000006 80935d01 nt!KeWaitForSingleObject+0x346 (FPO: [Non-Fpo])

    b95d0d50 80834d3f 000004e8 00000000 00000000 nt!NtWaitForSingleObject+0x9a (FPO: [Non-Fpo])

    b95d0d50 7c82ed54 000004e8 00000000 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b95d0d64)

    0082f7a4 7c822124 77e6bad8 000004e8 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0082f7a8 77e6bad8 000004e8 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])

    0082f818 77e6ba42 000004e8 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac (FPO: [Non-Fpo])

    0082f82c 77cab1e0 000004e8 ffffffff 000bc314 kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])

    0082f840 77cab333 0082f89c 000ad626 000bc314 RPCRT4!SWMRLock::LockSharedOrExclusiveOnLastWriter+0x32 (FPO: [Non-Fpo])

    0082f868 77c72e00 0082f89c 00000001 00000002 RPCRT4!SWMRLock::LockSharedOrExclusive+0xa6 (FPO: [Non-Fpo])

    0082f890 77c72f71 00000000 00000000 00000010 RPCRT4!NDRSContextUnmarshall2+0x1b9 (FPO: [Non-Fpo])

    0082f8bc 77c72eb6 4182f938 76d32c22 0082f938 RPCRT4!NdrServerContextNewUnmarshall+0xa6 (FPO: [Non-Fpo])

    0082f8d0 77c70f93 0082f938 0082fae4 76d32c22 RPCRT4!NdrUnmarshallHandle+0x52 (FPO: [Non-Fpo])

    0082f900 77ce2345 0082fae4 00094618 00000000 RPCRT4!NdrpServerUnMarshal+0x13d (FPO: [Non-Fpo])

    0082fcf8 77ce26ed 00000000 00000000 000bbd9c RPCRT4!NdrStubCall2+0x19f (FPO: [Non-Fpo])

    0082fd14 77c709be 000bbd9c 00094618 000bbd9c RPCRT4!NdrServerCall2+0x19 (FPO: [Non-Fpo])

    0082fd48 77c7093f 76d3156c 000bbd9c 0082fdec RPCRT4!DispatchToStubInCNoAvrf+0x38 (FPO: [Non-Fpo])

    0082fd9c 77c70865 00000005 00000000 76d736c0 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117 (FPO: [Non-Fpo])

    0082fdc0 77c734b1 000bbd9c 00000000 76d736c0 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3 (FPO: [Non-Fpo])

    0082fdfc 77c71bb3 00099188 0008f0e8 000adfd8 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c (FPO: [Non-Fpo])

    0082fe20 77c75458 0008f120 0082fe38 00099188 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127 (FPO: [Non-Fpo])

    0082ff84 77c5778f 0082ffac 77c5f7dd 0008f0e8 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430 (FPO: [Non-Fpo])

    0082ff8c 77c5f7dd 0008f0e8 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    0082ffac 77c5de88 00084298 0082ffec 77e6608b RPCRT4!BaseCachedThreadRoutine+0x9d (FPO: [Non-Fpo])

    0082ffb8 77e6608b 000bc298 00000000 00000000 RPCRT4!ThreadStartRoutine+0x1b (FPO: [Non-Fpo])

    0082ffec 00000000 77c5de6d 000bc298 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

     

    The above thread belongs to the RPC service and is waiting for a lock to be acquired.

     

    0: kd> x /v svchost1!*service*name*

    pub global 01004078    0 svchost1!ServiceNames = <no type information>

     

    0: kd> du poi(01004078)

    000839f8  "RpcSs"

     

    To find why this thread is not able to acquire the lock, investigation of the threads within the RPC service is required. The reason the lock is not available is because there is a thread that has acquired the lock and is trying to reach a client that does not exist.

     

    The chain of events is summarized in the below figure. Note that all horizontal links were traversed using LPC data structures and debugger extensions. The vertical links were traversed by investigation beyond the scope of LPC. Thus the issue that started in a component of SMS was finally traced to RPC through the use of LPC data structures and debugger extension. LPC is one of the core inter-process communication mechanisms. Understanding LPC and learning to use its extensions helps a lot to debug issues where client threads are waiting for reply from the server.

     

    Chain of events

  • Ntdebugging Blog

    LPC part 2 Kernel Debugger Extensions

    • 1 Comments

    Hello my name is Roy, I’m an EE on the Microsoft global escalation services / CPR team.   This blog is a follow on to my first LPC blog.  We will be discussing debugger extensions that allow you to look at LPC related issues.

     

    Disclaimer: The purpose of this blog is to illustrate debugging techniques with LPC. Please do not rely on this information as documentation for the APIs when writing your own code. As always, this is subject to change in the future.  

     

    LPC Kernel Debugger Extensions

     

     

    Command

    Description

    !lpc                    

    Display the list and description of all the !lpc commands

    !lpc message [MessageId]

    Display the message with a given ID and all related information by attempting to match the given Message ID to the EHTREAD->LpcReceivedMessageId and to the ETHREAD->LpcReplyMessageId of all threads in the system.

    If the MessageId parameter is omitted then it attempts to display all the outstanding messages in the system by searching for the tag ‘LpcM’ in the pools.

    !lpc port [PortAddress] 

    Displays port information. If a server connection port address is specified then only information about that port is displayed. If either a client or server communication port is specified it prints information about the specified communication port, the peer communication port and the server connection port.

    If the PortAddress parameter is omitted then it attempts to walk the list of all objects of type “Port” and “WaitablePort” and display them. Note that for this feature to work the GFlags option “+otl” i.e. “Maintain a list of objects for each type” must be enabled.

    !lpc scan PortAddress   

    Displays port information. It attempts to walk the list of all objects of type “Port” and “WaitablePort” and display the one matching the specified port address. Note that for this feature to work the GFlags option “+otl” i.e. “Maintain a list of objects for each type” must be enabled.

    !lpc thread [ThreadAddr]

    If  ThreadAddr is specified it walks the list of threads in the ETHREAD-> LpcReplyChain to locate the list head i.e. a “Port” or “WaitablePort” object on which the thread is waiting for a reply.

    If the ThreadAddr parameter is omitted then it attempts to find all LPC server threads by looking for threads with a non-NULL EHTREAD->LpcReceivedMessageId and all client threads by looking for threads with a non-NULL ETHREAD->LpcReplyMessageId and displays them.

    !lpc PoolSearch

    Toggles a setting that controls whether the “lpc message” command will search for LPC message tag (‘LpcM’) in the kernel pools or not.

     

     

     

    LPC Kernel Debugger Extension Usage

     

     

    LPC Connection Port information from call stack

     

    On the call stack of any client or server threads blocked on LPC data transfer or LPC connection there will be a frame containing either one of the functions  NtRequestWaitReplyPort() or NtReplyWaitReceivePortEx() . The first parameter to either one of these functions is the handle to the port they are blocked on.

     

    kd> !thread 810de2a8 

    THREAD 810de2a8  Cid 01dc.01f4  Teb: 7ffde000 Win32Thread: 00000000 WAIT: (WrLpcReceive) UserMode Non-Alertable

        81131188  Semaphore Limit 0x7fffffff

        810de398  NotificationTimer

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810ddda0       Image:         rpclpcs.exe

    Wait Start TickCount      64555          Ticks: 402 (0:00:00:04.025)

    Context Switch Count      2            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.000

    Win32 Start Address 0x77e76bf0

    Start Address 0x7c810856

    Stack Init f8e28000 Current f8e27c4c Base f8e28000 Limit f8e25000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 0

    ChildEBP RetAddr  Args to Child             

    f8e27c64 804dc6a6 810de318 810de2a8 804dc6f2 nt!KiSwapContext+0x2e (FPO: [Uses EBP] [0,0,4])

    f8e27c70 804dc6f2 e1084108 8055a540 e1084108 nt!KiSwapThread+0x46 (FPO: [0,0,0])

    f8e27c98 8056a50a 00000001 00000010 00000001 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])

    f8e27d48 804df06b 000007c4 002bff70 00000000 nt!NtReplyWaitReceivePortEx+0x3dc (FPO: [Non-Fpo])

    f8e27d48 7c90eb94 000007c4 002bff70 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f8e27d64)

    002bff80 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

     

    kd> !handle 7c4 3 810ddda0       

    processor number 0, process 810ddda0

    PROCESS 810ddda0  SessionId: 0  Cid: 01dc    Peb: 7ffd9000  ParentCid: 01b4

        DirBase: 058cd000  ObjectTable: e1a13278  HandleCount:  18.

        Image: rpclpcs.exe

     

    Handle table at e107d000 with 18 Entries in use

    07c4: Object: e1084108  GrantedAccess: 001f0001 Entry: e107df88

    Object: e1084108  Type: (812b5c80) Port

        ObjectHeader: e10840f0 (old version)

            HandleCount: 1  PointerCount: 4

            Directory Object: e14c72c8  Name: rpclpc

     

    kd> !lpc port e1084108

     

    Server connection port e1084108  Name: rpclpc

        Handles: 1   References: 4

        Server process  : 810ddda0 (rpclpcs.exe)

        Queue semaphore : 81131188

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

     

     

    LPC Messages that are waiting to be picked up by the server thread

     

    When a message is queued to the server connection port and the server thread has not yet not been signaled to pull out the message the ‘!lpc port <Port>” displays the following output that includes the message.

     

    kd> !lpc port e10d8388

     

    Client communication port 0xe10d8388

        Handles: 1   References: 2

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe106dc00      Server connection port: 0xe15512e0

     

    Server communication port 0xe106dc00

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e15512e0  Name: rpclpc

        Handles: 1   References: 9

        Server process  : ffbbebe8 (rpclpcs.exe)

        Queue semaphore : 81250848

        Semaphore state 0 (0x0)

            Messages in queue:

            0000 e10513f8 - Busy  Id=000048d8  From: 05f4.0108  Context=80020000  [e15512f0 . e15512f0]

                       Length=0054003c  Type=00000001 (LPC_REQUEST)

                       Data: 00000001 00000241 00000000 00000000 f877bc04 804ec10e

        The message queue contains 1 messages

        The LpcDataInfoChainHead queue is empty

     

     

    LPC Messages being processed by the server

     

    When an LPC message has been queued by the client, dequeued by the sever and being worked upon by the server both the client and the server thread have the message ID associated with them

     

    kd> !lpc message 16fa

    Searching message 16fa in threads ...

        Server thread 810de2a8 is working on message 16fa                        

    Client thread 810dc930 waiting a reply from 16fa                         

    Searching thread 810dc930 in port rundown queues ...

     

    Server connection port e1084108  Name: rpclpc

        Handles: 1   References: 3

        Server process  : 810ddda0 (rpclpcs.exe)

        Queue semaphore : 81131188

        Semaphore state 0 (0x0)

        The message queue is empty

        The LpcDataInfoChainHead queue is empty

    Done.                                             

     

    kd> !thread 810de2a8

    THREAD 810de2a8  Cid 01dc.01f4  Teb: 7ffde000 Win32Thread: 00000000 RUNNING on processor 0

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810ddda0       Image:         rpclpcs.exe

    Wait Start TickCount      65732          Ticks: 10 (0:00:00:00.100)

    Context Switch Count      4            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.010

    Win32 Start Address 0x000016fa

    LPC Server thread working on message Id 16fa

    Start Address kernel32!BaseThreadStartThunk (0x7c810856)

    Stack Init f8e28000 Current f8e27728 Base f8e28000 Limit f8e25000 Call 0

    Priority 9 BasePriority 8 PriorityDecrement 0 DecrementCount 0

    ChildEBP RetAddr  Args to Child             

    f8e27d64 7c90eb94 badb0d00 002bfe24 00000000 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ f8e27d64)

    002bfe18 7c90e399 77e76703 000007c4 002bff70 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    002bfe1c 77e76703 000007c4 002bff70 00000000 ntdll!NtReplyWaitReceivePortEx+0xc (FPO: [5,0,0])

    002bff80 77e76c22 002bffa8 77e76a3b 00084540 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0xf4 (FPO: [Non-Fpo])

    002bff88 77e76a3b 00084540 7c90ee18 0006fb10 RPCRT4!RecvLotsaCallsWrapper+0xd (FPO: [Non-Fpo])

    002bffa8 77e76c0a 00083f20 002bffec 7c80b50b RPCRT4!BaseCachedThreadRoutine+0x79 (FPO: [Non-Fpo])

    002bffb4 7c80b50b 00084718 7c90ee18 0006fb10 RPCRT4!ThreadStartRoutine+0x1a (FPO: [Non-Fpo])

    002bffec 00000000 77e76bf0 00084718 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

     

    kd> !thread 810dc930

    THREAD 810dc930  Cid 01e4.01ec  Teb: 7ffde000 Win32Thread: 00000000 READY

    Waiting for reply to LPC MessageId 000016fa:

    Pending LPC Reply Message:

        e16b8538: [e16b8538,e16b8538]

    Not impersonating

    DeviceMap                 e196c460

    Owning Process            810e9860       Image:         rpclpcc.exe

    Wait Start TickCount      65731          Ticks: 11 (0:00:00:00.110)

    Context Switch Count      26            

    UserTime                  00:00:00.000

    KernelTime                00:00:00.020

    Win32 Start Address rpclpcs!pre_c_init (0x01001a91)

    Start Address kernel32!BaseProcessStartThunk (0x7c810867)

    Stack Init fbe0e000 Current fbe0dc28 Base fbe0e000 Limit fbe0b000 Call 0

    Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 16

    ChildEBP RetAddr  Args to Child             

    fbe0dc40 804ea3a4 00000000 810dc930 e107f388 nt!KiUnlockDispatcherDatabase+0x77 (FPO: [Uses EBP] [0,0,4])

    fbe0dc58 80586255 81131188 00000001 00000001 nt!KeReleaseSemaphore+0x70 (FPO: [Non-Fpo])

    fbe0dd10 80598c58 000847b0 0006f8c0 0006f8c8 nt!NtSecureConnectPort+0x635 (FPO: [Non-Fpo])

    fbe0dd3c 804df06b 000847b0 0006f8c0 0006f8c8 nt!NtConnectPort+0x24 (FPO: [Non-Fpo])

    fbe0dd3c 7c90eb94 000847b0 0006f8c0 0006f8c8 nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ fbe0dd64)

    0006f8d0 7c93040b ffffffff 00000000 7c859e48 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

    0006f9b0 7c8107fd 00000000 00084718 7c90e642 ntdll!DbgPrint+0x1b (FPO: [Non-Fpo])

    0006fe68 77ea11a8 00000022 00084540 00000000 kernel32!CreateRemoteThread+0x284 (FPO: [Non-Fpo])

    000847a8 00000000 00000001 00000000 00000000 RPCRT4!LRPC_ADDRESS::CompleteListen+0x84 (FPO: [Non-Fpo])

     

     

    Finding all threads in the system involved in LPC communication

     

    To get a consolidated list of all client threads in the system waiting on LPC replies and a list of server threads working on received LPC requests use the “!lpc threads” command without any parameters

     

    0: kd> !lpc thread

    Searching message 0 in threads ...

    Client thread 89a4cdb0 waiting a reply from 15f6                         

        Server thread 898c6a50 is working on message 63f09                        

        Server thread 89671020 is working on message 63f75                        

        Server thread 89904a48 is working on message 63f10                        

        Server thread 88d6bdb0 is working on message 3d6b1                        

        Server thread 89973db0 is working on message 63f32                         

        Server thread 896d32b8 is working on message 3454                        

        Server thread 8995d020 is working on message 63f76                        

        Server thread 8960f020 is working on message 63f77                         

        Server thread 898cd350 is working on message 61bd9                        

        Server thread 8900edb0 is working on message 3832f                        

        Server thread 8900fbf0 is working on message 33c2e                        

        Server thread 88d539a8 is working on message 3343f                        

    Client thread 89be4020 waiting a reply from 6077e                         

    Client thread 89012990 waiting a reply from 39239                         

        Server thread 89012990 is working on message 39232                        

    Client thread 89531020 waiting a reply from 3923d                         

        Server thread 89531020 is working on message 39236                        

    Client thread 88d13b40 waiting a reply from 3c96e                            

     

     

     

    Quick Tips

     

    8 No thread in the system is doing a NtListenPort( ) does that mean no LPC connection  port is open for connections?

     

    !stacks 2 NtListenPort does not return any thread because NtListenPort( ) calls NtRequestWaitReplyPort( ) to wait for an incoming LPC_CONNECTION_REQUEST message. Instead of calling NtListenPort( ) for listening for incoming connections most LPC connection threads directly call NtRequestWaitReplyPort( ).

     

    8 !lpc message does not show any message?

     

    !poolfind LpcM 1 should do the trick.

     

     

    0: kd> !lpc message

     

    Scanning large pool allocation table for Tag: LpcM (8974e000 : 8976e000)

     

    Searching Paged pool (e1000000 : f7000000) for Tag: LpcM

     

     

    0: kd> !poolfind LpcM 1

     

    Scanning large pool allocation table for Tag: LpcM (8974e000 : 8976e000)

     

     

    Searching Paged pool (e1000000 : f7000000) for Tag: LpcM

     

    e1061000 size:  168 previous size:    0  (Allocated) LpcM

    e1090758 size:  168 previous size:    8  (Allocated) LpcM

    e10a7a40 size:  168 previous size:    8  (Allocated) LpcM

    e1188468 size:    8 previous size:   20  (Free)      LpcM

    e1261000 size:  168 previous size:    0  (Allocated) LpcM

    e1297428 size:  168 previous size:    8  (Allocated) LpcM

     

    Once you have the pool blocks listed with the “LpcM” tag dump the ones that are “Allocated”. On a 32-bit machine add 8bytes to the pool address listed on the left most column. On a 64-bit machine add 16bytes i.e. 0x10. Dump the message as below to retrieve the message id.

     

    0: kd> dt nt!_LPCP_MESSAGE e1061000+8

       +0x000 Entry            : _LIST_ENTRY [ 0xe295fa80 - 0xe29279e8 ]

       +0x000 FreeEntry        : _SINGLE_LIST_ENTRY

       +0x004 Reserved0        : 0xe29279e8

       +0x008 SenderPort       : 0xe2327020

       +0x00c RepliedToThread  : (null)

       +0x010 PortContext      : 0x8021000d

       +0x018 Request          : _PORT_MESSAGE

     

    0: kd> dt nt!_LPCP_MESSAGE e1061000+8 Request.

       +0x018 Request  :

          +0x000 u1       : __unnamed

          +0x004 u2       : __unnamed

          +0x008 ClientId : _CLIENT_ID

          +0x008 DoNotUseThisField : 9.8460604703041575e-311

          +0x010 MessageId : 0x4617f

          +0x014 ClientViewSize : 0

          +0x014 CallbackId : 0

     

    0: kd> !lpc message 0x4617f

    Searching message 4617f in threads ...

        Server thread 88f91c08 is working on message 4617f                        

    Client thread 88f0f5f8 waiting a reply from 4617f                         

    Searching thread 88f0f5f8 in port rundown queues ...

     

    Server communication port 0xe2300a38

        Handles: 1   References: 1

        The LpcDataInfoChainHead queue is empty

            Connected port: 0xe2327020      Server connection port: 0xe13734b0

     

    Client communication port 0xe2327020

        Handles: 1   References: 9

        The LpcDataInfoChainHead queue is empty

     

    Server connection port e13734b0  Name: epmapper

        Handles: 1   References: 93

        Server process  : 896e1450 (svchost1.exe)

        Queue semaphore : 8995ac10

        Semaphore state 0 (0x0)

        The message queue is empty

     

     

    8   The number of server threads shown by !lpc thread processing requests is much more than the number of client threads waiting for reply.

     

    !lpc thread interprets stale data in ETHREAD structure as valid LPC information. An indication of the stale data is the ETHREAD.LpcExitThreadCalled will be set. This field is only set when the thread is exiting.

     

    0: kd> !lpc thread

    Searching message 0 in threads ...

    Client thread 89a4cdb0 waiting a reply from 15f6                         

        Server thread 898c6a50 is working on message 63f09                        

        Server thread 89671020 is working on message 63f75                         

        Server thread 89904a48 is working on message 63f10                        

        Server thread 88d6bdb0 is working on message 3d6b1                        

        Server thread 89973db0 is working on message 63f32                        

        Server thread 896d32b8 is working on message 3454                        

        Server thread 8995d020 is working on message 63f76                        

        Server thread 8960f020 is working on message 63f77                        

        Server thread 898cd350 is working on message 61bd9                        

        Server thread 8900edb0 is working on message 3832f                        

        Server thread 8900fbf0 is working on message 33c2e                        

        Server thread 88d539a8 is working on message 3343f     

     

    0: kd> !lpc message 3454

    Searching message 3454 in threads ...

        Server thread 896d32b8 is working on message 3454                        

    Done. 

     

     

    0: kd> dt nt!_ETHREAD 896d32b8 Lpc.

       +0x1c0 LpcReplyChain :  [ 0x896d3478 - 0x896d3478 ]

          +0x000 Flink : 0x896d3478 _LIST_ENTRY [ 0x896d3478 - 0x896d3478 ]

          +0x004 Blink : 0x896d3478 _LIST_ENTRY [ 0x896d3478 - 0x896d3478 ]

       +0x1ec LpcReplySemaphore :

          +0x000 Header : _DISPATCHER_HEADER

          +0x010 Limit : 1

       +0x200 LpcReplyMessage :

       +0x200 LpcWaitingOnPort :

       +0x220 LpcReceivedMessageId : 0x3454

       +0x234 LpcReplyMessageId : 0

       +0x248 LpcReceivedMsgIdValid : 0x1 ''

       +0x248 LpcExitThreadCalled : 0x1 ''

     

    8 Finding all client threads in the system waiting for LPC reply.

     

    !lpc thread

Page 22 of 24 (235 items) «2021222324