• Ntdebugging Blog

    Desktop Heap Overview



    Desktop heap is probably not something that you spend a lot of time thinking about, which is a good thing.  However, from time to time you may run into an issue that is caused by desktop heap exhaustion, and then it helps to know about this resource.  Let me state up front that things have changed significantly in Vista around kernel address space, and much of what I’m talking about today does not apply to Vista.


    Laying the groundwork: Session Space

    To understand desktop heap, you first need to understand session space.  Windows 2000, Windows XP, and Windows Server 2003 have a limited, but configurable, area of memory in kernel mode known as session space.  A session represents a single user’s logon environment.  Every process belongs to a session.  On a Windows 2000 machine without Terminal Services installed, there is only a single session, and session space does not exist.  On Windows XP and Windows Server 2003, session space always exists.  The range of addresses known as session space is a virtual address range.  This address range is mapped to the pages assigned to the current session.  In this manner, all processes within a given session map session space to the same pages, but processes in another session map session space to a different set of pages. 

    Session space is divided into four areas: session image space, session structure, session view space, and session paged pool.  Session image space loads a session-private copy of Win32k.sys modified data, a single global copy of win32k.sys code and unmodified data, and maps various other session drivers like video drivers, TS remote protocol driver, etc.  The session structure holds various memory management (MM) control structures including the session working set list (WSL) information for the session.  Session paged pool allows session specific paged pool allocations.  Windows XP uses regular paged pool, since the number of remote desktop connections is limited.  On the other hand, Windows Server 2003 makes allocations from session paged pool instead of regular paged pool if Terminal Services (application server mode) is installed.  Session view space contains mapped views for the session, including desktop heap. 

    Session Space layout:

    Session Image Space: win32k.sys, session drivers

    Session Structure: MM structures and session WSL

    Session View Space: session mapped views, including desktop heap

    Session Paged Pool


    Sessions, Window Stations, and Desktops

    You’ve probably already guessed that desktop heap has something to do with desktops.  Let’s take a minute to discuss desktops and how they relate to sessions and window stations.  All Win32 processes require a desktop object under which to run.  A desktop has a logical display surface and contains windows, menus, and hooks.  Every desktop belongs to a window station.  A window station is an object that contains a clipboard, a set of global atoms and a group of desktop objects.  Only one window station per session is permitted to interact with the user. This window station is named "WinSta0".  Every window station belongs to a session.  Session 0 is the session where services run and typically represents the console (pre-Vista).  Any other sessions (Session 1, Session 2, etc.) are typically remote desktops / terminal server sessions, or sessions attached to the console via Fast User Switching.  So to summarize, sessions contain one or more window stations, and window stations contain one or more desktops.

    You can picture the relationship described above as a tree.  Below is an example of this desktop tree on a typical system:

    - Session 0
    |   |
    |   ---- WinSta0           (interactive window station)
    |   |      |
    |   |      ---- Default    (desktop)
    |   |      |
    |   |      ---- Disconnect (desktop)
    |   |      |
    |   |      ---- Winlogon   (desktop)
    |   |
    |   ---- Service-0x0-3e7$  (non-interactive window station)
    |   |      |
    |   |      ---- Default    (desktop)
    |   |
    |   ---- Service-0x0-3e4$  (non-interactive window station)
    |   |      |
    |   |      ---- Default    (desktop)
    |   |
    |   ---- SAWinSta          (non-interactive window station)
    |   |      |
    |   |      ---- SADesktop  (desktop)
    |   |
    - Session 1
    |   |
    |   ---- WinSta0           (interactive window station)
    |   |      |
    |   |      ---- Default    (desktop)
    |   |      |
    |   |      ---- Disconnect (desktop)
    |   |      |
    |   |      ---- Winlogon   (desktop)
    |   |
    - Session 2
        |
        ---- WinSta0           (interactive window station)
               |
               ---- Default    (desktop)
               |
               ---- Disconnect (desktop)
               |
               ---- Winlogon   (desktop)


    In the above tree, the full path to the SADesktop (as an example) can be represented as “Session 0\SAWinSta\SADesktop”.


    Desktop Heap – what is it, what is it used for?

    Every desktop object has a single desktop heap associated with it.  The desktop heap stores certain user interface objects, such as windows, menus, and hooks.  When an application requires a user interface object, functions within user32.dll are called to allocate those objects.  If an application does not depend on user32.dll, it does not consume desktop heap.  Let’s walk through a simple example of how an application can use desktop heap. 

    1.     An application needs to create a window, so it calls CreateWindowEx in user32.dll.

    2.     User32.dll makes a system call into kernel mode and ends up in win32k.sys.

    3.     Win32k.sys allocates the window object from desktop heap.

    4.     A handle to the window (an HWND) is returned to the caller.

    5.     The application and other processes in the same session can refer to the window object by its HWND value.


    Where things go wrong

    Normally this “just works”, and neither the user nor the application developer needs to worry about desktop heap usage.  However, there are two primary scenarios in which failures related to desktop heap can occur:

    1. Session view space for a given session can become fully utilized, so it is impossible for a new desktop heap to be created.
    2. An existing desktop heap allocation can become fully utilized, so it is impossible for threads that use that desktop to use more desktop heap.


    So how do you know if you are running into these problems?  Processes failing to start with a STATUS_DLL_INIT_FAILED (0xC0000142) error in user32.dll is a common symptom.  Since user32.dll needs desktop heap to function, failure to initialize user32.dll upon process startup can be an indication of desktop heap exhaustion.  Another symptom you may observe is a failure to create new windows.  Depending on the application, any such failure may be handled in different ways.  Note that if you are experiencing problem number one above, the symptoms would usually only exist in one session.  If you are seeing problem two, then the symptoms would be limited to processes that use the particular desktop heap that is exhausted.


    Diagnosing the problem

    So how can you know for sure that desktop heap exhaustion is your problem?  This can be approached in a variety of ways, but I’m going to discuss the simplest method for now.  Dheapmon is a command line tool that will dump out the desktop heap usage for all the desktops in a given session.  See our first blog post for a list of tool download locations.  Once you have dheapmon installed, be sure to run it from the session where you think you are running out of desktop heap.  For instance, if you have problems with services failing to start, then you’ll need to run dheapmon from session 0, not a terminal server session.

    Dheapmon output looks something like this:

    Desktop Heap Information Monitor Tool (Version 7.0.2727.0)
    Copyright (c) 2003-2004 Microsoft Corp.

      Session ID:    0 Total Desktop: (  5824 KB -    8 desktops)

      WinStation\Desktop            Heap Size(KB)    Used Rate(%)

      WinSta0\Default                    3072              5.7
      WinSta0\Disconnect                   64              4.0
      WinSta0\Winlogon                    128              8.7
      Service-0x0-3e7$\Default            512             15.1
      Service-0x0-3e4$\Default            512              5.1
      Service-0x0-3e5$\Default            512              1.1
      SAWinSta\SADesktop                  512              0.4
      __X78B95_89_IW\__A8D9S1_42_ID       512              0.4



    As you can see in the example above, each desktop heap size is specified, as is the percentage of usage.  If any one of the desktop heaps becomes too full, allocations within that desktop will fail.  If the cumulative heap size of all the desktops approaches the total size of session view space, then new desktops cannot be created within that session.  Both of the failure scenarios described above depend on two factors: the total size of session view space, and the size of each desktop heap allocation.  Both of these sizes are configurable. 
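
    To make the reading of this output concrete, here is a minimal Python sketch that parses the desktop rows from the sample above and totals the heap sizes.  The helper names (parse_dheapmon, summarize) and the 80% warning threshold are assumptions made for this illustration; they are not part of dheapmon.

```python
import re

# Desktop rows copied from the sample dheapmon output above.
SAMPLE = r"""
WinSta0\Default                    3072              5.7
WinSta0\Disconnect                   64              4.0
WinSta0\Winlogon                    128              8.7
Service-0x0-3e7$\Default            512             15.1
Service-0x0-3e4$\Default            512              5.1
Service-0x0-3e5$\Default            512              1.1
SAWinSta\SADesktop                  512              0.4
__X78B95_89_IW\__A8D9S1_42_ID       512              0.4
"""

# WinStation\Desktop, heap size in KB, used rate in percent.
ROW = re.compile(r"^\s*(\S+\\\S+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_dheapmon(text):
    """Return a list of (desktop, heap_kb, used_pct) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match.group(1), int(match.group(2)), float(match.group(3))))
    return rows


def summarize(rows, warn_pct=80.0):
    """Total heap size in KB, plus desktops at or above warn_pct (assumed threshold)."""
    total_kb = sum(heap_kb for _, heap_kb, _ in rows)
    nearly_full = [name for name, _, used_pct in rows if used_pct >= warn_pct]
    return total_kb, nearly_full


rows = parse_dheapmon(SAMPLE)
total_kb, nearly_full = summarize(rows)
print(len(rows), "desktops,", total_kb, "KB total, nearly full:", nearly_full)
```

    For the sample, the per-desktop sizes add up to the 5824 KB that dheapmon reports on its "Total Desktop" line, and no desktop is close to full.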


    Configuring the size of Session View Space

    Session view space size is configurable using the SessionViewSize registry value.  This is a REG_DWORD and the size is specified in megabytes.  Note that the values listed below are specific to 32-bit x86 systems not booted with /3GB.  A reboot is required for this change to take effect.  The value should be specified under:

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management


    Operating system       Size if no registry value configured    Default registry value
    Windows 2000 *         20 MB                                   -
    Windows XP             20 MB                                   48 MB
    Windows Server 2003    20 MB                                   48 MB

    * Settings for Windows 2000 are with Terminal Services enabled and hotfix 318942 installed.  Without Terminal Services installed, session space does not exist, and desktop heap allocations are made from a fixed 48 MB region for system mapped views.  Without hotfix 318942 installed, the size of session view space is fixed at 20 MB.

    The sum of the sizes of session view space and session paged pool has a theoretical maximum of slightly under 500 MB for 32-bit operating systems.  The maximum varies based on RAM and various other registry values.  In practice the maximum value is around 450 MB for most configurations.  Increasing the above values reduces the virtual address space available to some combination of nonpaged pool, system PTEs, system cache, and paged pool.


    Configuring the size of individual desktop heaps

    Configuring the size of the individual desktop heaps is a bit more complex.  Speaking in terms of desktop heap size, there are three possibilities:

    ·         The desktop belongs to an interactive window station and is a “Disconnect” or “Winlogon” desktop, so its heap size is fixed at 64 KB or 128 KB, respectively (for 32-bit x86)

    ·         The desktop heap belongs to an interactive window station, and is not one of the above desktops.  This desktop’s heap size is configurable.

    ·         The desktop heap belongs to a non-interactive window station.  This desktop’s heap size is also configurable.


    The size of each desktop heap allocation is controlled by the following registry value:

                HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows


     The default data for this registry value will look something like the following (all on one line):

                   %SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows
                   SharedSection=1024,3072,512 Windows=On SubSystemType=Windows
                   ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3
                   ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off




    The numeric values following "SharedSection=" control how desktop heap is allocated. These SharedSection values are specified in kilobytes.

    The first SharedSection value (1024) is the shared heap size common to all desktops. This memory is not a desktop heap allocation, and the value should not be modified to address desktop heap problems.

    The second SharedSection value (3072) is the size of the desktop heap for each desktop that is associated with an interactive window station, with the exception of the “Disconnect” and “Winlogon” desktops.

    The third SharedSection value (512) is the size of the desktop heap for each desktop that is associated with a "non-interactive" window station. If this value is not present, the size of the desktop heap for non-interactive window stations will be the same as the size specified for interactive window stations (the second SharedSection value).

    Consider the two desktop heap exhaustion scenarios described above.  If the first scenario is encountered (session view space is exhausted), and most of the desktop heaps are non-interactive, then the third SharedSection can be decreased in an effort to allow more (smaller) non-interactive desktop heaps to be created.  Of course, this may not be an option if the processes using the non-interactive heaps require a full 512 KB.  If the second scenario is encountered (a single desktop heap allocation is full), then the second or third SharedSection value can be increased to allow each desktop heap to be larger than 3072 or 512 KB.  A potential problem with this is that fewer total desktop heaps can be created.
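
    The trade-off between heap size and heap count can be sketched with a little arithmetic.  The Python below pulls the three SharedSection values out of the default registry data and estimates how many non-interactive desktop heaps would fit in session view space.  The function names are made up for this example, and the estimate is only an upper bound: it assumes session view space holds nothing but desktop heaps, which is not true in practice, since other mapped views live there as well.

```python
import re

# Default registry data from above, joined onto one line.
DEFAULT_DATA = (
    r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
    r"SharedSection=1024,3072,512 Windows=On SubSystemType=Windows "
    r"ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 "
    r"ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off"
)


def parse_shared_section(data):
    """Return (shared_kb, interactive_kb, non_interactive_kb)."""
    match = re.search(r"SharedSection=(\d+),(\d+)(?:,(\d+))?", data)
    if match is None:
        raise ValueError("no SharedSection entry found")
    shared_kb = int(match.group(1))
    interactive_kb = int(match.group(2))
    # A missing third value falls back to the interactive size, as described above.
    non_interactive_kb = int(match.group(3)) if match.group(3) else interactive_kb
    return shared_kb, interactive_kb, non_interactive_kb


def max_non_interactive_heaps(view_space_mb, already_used_kb, heap_kb):
    """Upper bound only: assumes view space contains nothing but desktop heaps."""
    return (view_space_mb * 1024 - already_used_kb) // heap_kb


shared_kb, interactive_kb, non_interactive_kb = parse_shared_section(DEFAULT_DATA)
winsta0_kb = interactive_kb + 64 + 128  # Default + Disconnect + Winlogon
print(max_non_interactive_heaps(48, winsta0_kb, non_interactive_kb))
print(max_non_interactive_heaps(48, winsta0_kb, 256))
```

    With a 48 MB session view space and the three WinSta0 desktops already allocated, halving the third value from 512 KB to 256 KB roughly doubles the ceiling on non-interactive desktop heaps, at the cost of giving each one half the room.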


    What are all these window stations and desktops in Session 0 anyway?

    Now that we know how to tweak the sizes of session view space and the various desktops, it is worth talking about why you have so many window stations and desktops, particularly in session 0.  First off, you’ll find that every WinSta0 (interactive window station) has at least 3 desktops, and each of these desktops uses various amounts of desktop heap.  I’ve alluded to this previously, but to recap, the three desktops for each interactive window station are:

    ·         Default desktop - desktop heap size is configurable as described above

    ·         Disconnect desktop - desktop heap size is 64 KB on 32-bit systems

    ·         Winlogon desktop - desktop heap size is 128 KB on 32-bit systems


    Note that there can potentially be more desktops in WinSta0 as well, since any process can call CreateDesktop and create new desktops.

    Let’s move on to the desktops associated with non-interactive window stations: these are usually related to a service.  The system creates a window station in which service processes that run under the LocalSystem account are started. This window station is named Service-0x0-3e7$. It is named for the LUID for the LocalSystem account, and contains a single desktop that is named Default. However, service processes that run as LocalSystem interactive start in WinSta0 so that they can interact with the user in Session 0 (but still run in the LocalSystem context).

    Any service process that starts under an explicit user or service account has a window station and desktop created for it by service control manager, unless a window station for its LUID already exists. These window stations are non-interactive window stations.  The window station name is based on the LUID, which is unique for every logon.  If an entity (other than System) logs on multiple times, a new window station is created for each logon.  An example window station name is “Service-0x0-22e1$”.
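
    As a small illustration of the naming pattern, the sketch below builds a window station name from the two halves of a LUID.  The format string is inferred from the names shown in this post (Service-0x0-3e7$, Service-0x0-22e1$) rather than taken from any documented API, and the function name is hypothetical.

```python
def service_window_station_name(luid_high, luid_low):
    """Build a name in the pattern seen above: Service-0x<high>-<low>$ (hex, lowercase)."""
    return "Service-0x{:x}-{:x}$".format(luid_high, luid_low)


# 0x3e7 is the low part of the LocalSystem LUID mentioned above.
print(service_window_station_name(0x0, 0x3E7))
print(service_window_station_name(0x0, 0x22E1))
```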

    A common desktop heap issue occurs on systems that have a very large number of services.  This can be a large number of unique services, or one (poorly designed, IMHO) service that installs itself multiple times.  If the services all run under the LocalSystem account, then the desktop heap for Session 0\Service-0x0-3e7$\Default may become exhausted.  If the services all run under another user account which logs on multiple times, each time acquiring a new LUID, there will be a new desktop heap created for every instance of the service, and session view space will eventually become exhausted.

    Given what you now know about how service processes use window stations and desktops, you can use this knowledge to avoid desktop heap issues.  For instance, if you are running out of desktop heap for the Session 0\Service-0x0-3e7$\Default desktop, you may be able to move some of the services to a new window station and desktop by changing the user account that the service runs under.


    Wrapping up

    I hope you found this post interesting and useful for solving those desktop heap issues!  If you have questions or comments, please let us know.


    - Matthew Justice


    [Update: 7/5/2007 - Desktop Heap, part 2 has been posted]

    [Update: 9/13/2007 - Talkback video: Desktop Heap has been posted]

    [Update: 3/20/2008 - The default interactive desktop heap size has been increased on 32-bit Vista SP1]



    Debug Fundamentals Exercise 2: Some reverse engineering for Thanksgiving



    Continuing our series on “Fundamentals Exercises”, we have some more reverse engineering for you!  Again, these exercises are designed more as learning experiences rather than simply puzzlers.  We hope you find them interesting and educational!  Feel free to post your responses here, but we won’t put them on the site until after we post the “official” responses, to avoid spoilers.


    Examine the following code, registers, and stack values to determine the following:


    1.       What is the return value from DemoFunction2?

    2.       What is the purpose of DemoFunction2?

    3.       Bonus: Both the last exercise and this week’s exercise involved accessing data at ebp+8.  Why ebp+8?



    Hints:

    1.       You probably don’t want to manually walk through every instruction that executes in the loop.  Instead, walk through a few iterations to determine the intent of the code.

    2.       The bracket notation [] in the assembly means to treat the value in brackets as a memory address, and access the value at that address.

    3.       32-bit integer return values are stored in eax



    0:000> uf 010024d0

    010024d0 55              push    ebp
    010024d1 8bec            mov     ebp,esp
    010024d3 8b5508          mov     edx,dword ptr [ebp+8]
    010024d6 33c0            xor     eax,eax
    010024d8 b920000000      mov     ecx,20h
    010024dd d1ea            shr     edx,1
    010024df 7301            jnc     asmdemo2!DemoFunction2+0x12 (010024e2)
    010024e1 40              inc     eax
    010024e2 e2f9            loop    asmdemo2!DemoFunction2+0xd (010024dd)
    010024e4 5d              pop     ebp
    010024e5 c3              ret


    0:000> r
    eax=80002418 ebx=7ffd7000 ecx=00682295 edx=00000000 esi=80002418 edi=00000002
    eip=010024d0 esp=0006fe98 ebp=0006fea8 iopl=0         nv up ei pl zr na pe nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
    010024d0 55              push    ebp

    0:000> dps esp
    0006fe98  0100251c asmdemo2!main+0x20
    0006fe9c  80002418
    0006fea0  00000002
    0006fea4  00000000
    0006fea8  0006ff88
    0006feac  01002969 asmdemo2!_mainCRTStartup+0x12c
    0006feb0  00000002
    0006feb4  00682270
    0006feb8  006822b8
    0006febc  f395c17d
    0006fec0  00000000
    0006fec4  00000000
    0006fec8  7ffd7000
    0006fecc  00000000
    0006fed0  00000000
    0006fed4  00000000
    0006fed8  00000094
    0006fedc  00000006
    0006fee0  00000000
    0006fee4  00001771
    0006fee8  00000002
    0006feec  76726553
    0006fef0  20656369
    0006fef4  6b636150
    0006fef8  00003120
    0006fefc  00000000
    0006ff00  00000000
    0006ff04  00000000
    0006ff08  00000000
    0006ff0c  00000000
    0006ff10  00000000
    0006ff14  00000000


    [Update: our answer. Posted 12/04/2008]

    We had a great response to this exercise!  It was good to see so many of you going through this.  There were some readers that found this a good exercise for beginners, and others were looking for a return to Puzzlers.  As an FYI, we may do more Puzzlers in the future, but for now we are going to continue on the “Fundamentals Exercise” track to help all our readers build up a solid foundation for debugging.


    It was interesting to read how several of you not only gave the answers, but made suggestions for how the code could be optimized!  I want to point out that the code we post for these exercises isn’t intended to be the optimal solution; it is written as a learning tool.   That said, keep that in-depth feedback coming; I think everyone will benefit from a discussion of optimization.


    Answers to exercise 2:


    1. DemoFunction2 returns 5, which is the number of bits set in 80002418, the value at ebp+8.
    2. DemoFunction2 finds the hamming weight of the 32-bit value passed to the function (it returns the count of bits that are equal to 1).
    3. ebp points to the base of the stack frame (the value stored there points to the previous frame's ebp), ebp+4 points to the return address, and ebp+8 points to the first parameter passed to the function.  Note that the value of ebp changes in the function prolog, at instruction 010024d1.  At this point ebp is set to 0006fe94, so at instruction 010024d3, ebp+8 is 0006fe9c, and [ebp+8] = 80002418.
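
    If you would like to check these answers without single-stepping, the loop is easy to mirror in a few lines of Python.  This is only a sketch of the disassembly above, not the original source of DemoFunction2; the Python names are ours, and each line is annotated with the instruction it stands in for.

```python
def demo_function2(value):
    """Mirror of the loop above: count the set bits in a 32-bit value."""
    edx = value & 0xFFFFFFFF      # mov edx,dword ptr [ebp+8]
    eax = 0                       # xor eax,eax
    ecx = 0x20                    # mov ecx,20h
    while True:
        carry = edx & 1           # shr edx,1 shifts bit 0 out into the carry flag
        edx >>= 1
        if carry:                 # jnc skips the inc when the carry flag is clear
            eax += 1              # inc eax
        ecx -= 1                  # loop decrements ecx and jumps while it is nonzero
        if ecx == 0:
            break
    return eax                    # 32-bit return values come back in eax


# Stack frame arithmetic from answer 3.
esp_at_entry = 0x0006FE98         # esp from the register dump
ebp_in_function = esp_at_entry - 4  # push ebp, then mov ebp,esp
first_param_address = ebp_in_function + 8

print(demo_function2(0x80002418))
print(hex(first_param_address))
```

    The first parameter address works out to 0006fe9c, which is the slot holding 80002418 in the dps output.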

    Understanding Pool Consumption and Event ID: 2020 or 2019



    Hi!  My name is Tate.  I’m an Escalation Engineer on the Microsoft Critical Problem Resolution Platforms Team.  I wanted to share one of the most common errors we troubleshoot here on the CPR team, its root cause being pool consumption, and the methods by which we can remedy it quickly!


    This issue is commonly misdiagnosed; however, 90% of the time it is actually quite possible to determine the resolution quickly without any serious effort at all!



    First, what do these events really mean?


    Event ID 2020
    Event Type: Error
    Event Source: Srv
    Event Category: None
    Event ID: 2020
    The server was unable to allocate from the system paged pool because the pool was empty.


    Event ID 2019
    Event Type: Error
    Event Source: Srv
    Event Category: None
    Event ID: 2019
    The server was unable to allocate from the system NonPaged pool because the pool was empty.



    This is our friend the Server Service reporting that when it was trying to satisfy a request, it was not able to find enough free memory of the respective type of pool.  2020 indicates Paged Pool and 2019, NonPaged Pool.  This doesn’t mean that the Server Service (srv.sys) is broken or the root cause of the problem; more often it is simply the first component to see the resource problem and report it to the Event Log.  Thus, there could be (and usually are) a few more symptoms of pool exhaustion on the system such as hangs, or out of resource errors reported by drivers or applications, or all of the above!



    What is Pool?


    First, Pool is not the amount of RAM on the system.  It is a segment of the virtual address space that Windows reserves on boot.  These pools are finite because address space itself is finite.  A 32-bit (x86) machine can address 2^32 bytes (4 GB), and by default Windows uses 2 GB for applications and 2 GB for the kernel.  Other things must also fit in the kernel’s 2 GB, such as Page Table Entries (PTEs), so the maximum amount of Paged Pool on 32-bit (x86) is ~460 MB, which puts our realistic limits per processor architecture in perspective.  As this implies, 64-bit (x64 and ia64) machines have less of a problem here due to their larger address space, but there are still limits and thus no free lunch.


    *For more about determining current pool limits see the common question post “Why am I out of Paged Pool at ~200MB…” at the end of this post.


    *For more info about pools:  About Memory Management > Memory Pools

    *This has changed a bit for Vista, see Dynamic Kernel Address space



    What are these pools used for?


    These pools are used by either the kernel directly, indirectly by its support of various structures due to application requests on the system (CreateFile for example), or drivers installed on the system for their memory allocations made via the kernel pool allocation functions.


    Literally, NonPaged means that this memory, when allocated, will not be paged to disk and is thus resident at all times, which is an important feature for drivers.  Paged, conversely, can be, well… paged out to disk.  In the end though, all this memory is allocated through a common set of functions, the most common being ExAllocatePoolWithTag.



    Ok, so what is using it/abusing it? (our goal right!?)


    Now that we know that the culprit is Windows or a component shipping with Windows, a driver, or an application requesting lots of things that the kernel has to create on its behalf, how can we find out which?


    There are really four basic methods that are typically used (listed in order of increasing difficulty):


    1.)    Find By Handle Count


    Handle Count?  Yes, considering that we know that an application can request something of the OS that it must then in turn create and provide a reference to…this is typically represented by a handle, and thus charged to the process’ total handle count!


    The quickest way by far if the machine is not completely hung is to check this via Task Manager.  Ctrl+Shift+Esc…Processes Tab…View…Select Columns…Handle Count.  Sort on Handles column now and check to see if there is a significantly large one there (this information is also obtainable via Perfmon.exe, Process Explorer, Handle.exe, etc.).


    What’s large?  Well, typically we should raise an eyebrow at anything over 5,000 or so.  Now that’s not to say that over this amount is inherently bad, just know that there is no free lunch and that a handle to something usually means that on the other end there is a corresponding object stored in NonPaged or Paged Pool which takes up memory.


    So for example let’s say we have a process that has 100,000 handles, mybadapp.exe.  What do we do next?


    Well, if it’s a service we could stop it (which releases the handles), or if it’s an application running interactively, try to shut it down, and then look to see how much total Kernel Memory (Paged or NonPaged, depending on which one we are short of) we get back.  If we were at 400MB of Paged Pool (look at Performance Tab…Kernel Memory…Paged) and after stopping mybadapp.exe with its 100,000 handles we are now at a reasonable 100MB, well, there’s our bad guy.  The next step would be following up with the owner, or further investigating what type of handles are being consumed (with Process Explorer from Sysinternals or the Windows debugger, for example).



    For essential yet legacy applications, for which there is no hope of replacement or support, we may consider setting up a performance monitor alert on the handle count when it hits a couple thousand or so (Performance Object: Process, Counter: Handle Count) and taking action to restart the bad service.  This is a less than elegant solution for sure, but it could keep the one rotten apple from spoiling the bunch by hanging/crashing the machine!
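
    The triage logic in this section boils down to a simple filter.  Here is a minimal Python sketch of it, assuming we have already collected per-process handle counts by some means (Task Manager, Perfmon, Handle.exe).  The snapshot is hypothetical sample data: only mybadapp.exe and its 100,000 handles come from the example above, and the other entries are invented for illustration.

```python
HANDLE_THRESHOLD = 5000  # the "raise an eyebrow" level mentioned above


def suspicious_processes(handle_counts, threshold=HANDLE_THRESHOLD):
    """Return (process, handles) pairs above the threshold, largest first."""
    flagged = [(name, count) for name, count in handle_counts.items() if count > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical snapshot; every value except mybadapp.exe is invented.
snapshot = {"mybadapp.exe": 100000, "lsass.exe": 1200, "explorer.exe": 650}

for name, count in suspicious_processes(snapshot):
    print(name, count)
```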


    2.)    By Pooltag (as read by poolmon.exe)


    Okay, so no handle count gone wild? No problem.


    For Windows Server 2003 and later machines, a feature is enabled by default that allows tracking of the pool consumer via something called a pooltag.  For previous OS’s we will need to use a utility such as gflags.exe to Enable Pool Tagging (which requires a reboot unfortunately).  A pooltag is usually just a 3-4 character string, or more technically “a character literal of up to four characters delimited by single quotation marks”, that the caller of the kernel API to allocate the pool provides as its 3rd parameter (see ExAllocatePoolWithTag).


    The tool that we use to find out which pooltag is using the most is poolmon.exe.  Launch this from a cmd prompt, hit B to sort by bytes descending and P to cycle the list by pool type (Paged, NonPaged, or Both), and we have a live view into what’s going on in the system.  Look specifically at the Tag Name and its respective Byte Total column for the guilty party!  Get Poolmon.exe Here  or More info about poolmon.exe usage.


    The cool thing is that we have most of the OS utilized pooltags already documented so we have an idea if there is a match for one of the Windows components in pooltag.txt.  So if we see MmSt as the top tag for instance consuming far and away the largest amount, we can look at pooltag.txt and know that it’s the memory manager and also using that tag in a search engine query we might get the more popular KB304101 which may resolve the issue!


    We will find pooltag.txt in the ...\Debugging Tools for Windows\triage folder when the debugging tools are installed.


    Oh no, what if it’s not in the list? No problem…


    We might be able to find its owner by using one of the following techniques:


    • For 32-bit versions of Windows, use poolmon /c to create a local tag file that lists each tag value assigned by drivers on the local machine (%SystemRoot%\System32\Drivers\*.sys). The default name of this file is Localtag.txt.


    • For Windows 2000 and Windows NT 4.0 (really, for all versions), use Search to find files that contain a specific pool tag, as described in KB298102, How to Find Pool Tags That Are Used By Third-Party Drivers.

    From:  http://www.microsoft.com/whdc/driver/tips/PoolMem.mspx




    3.)    Using Driver Verifier


    Using driver verifier is a more advanced approach to this problem.  Driver Verifier provides a whole suite of options targeted mainly at the driver developer to run what amounts to quality control checks before shipping their driver.


    However, should pooltag identification be a problem, there is a facility here in Pool Tracking that does the heavy lifting in that it will do the matching of Pool consumer directly to driver!


    Be careful, however: the only option we will likely want to check is Pool Tracking.  The other settings are potentially costly enough that, if the set of drivers installed on the machine is not perfect, we could get into an un-bootable situation, with constant bluescreens notifying us that xyz driver is doing abc bad thing, along with some follow-up suggestions.


    In summary, Driver Verifier is a powerful tool at our disposal, but use it with care, and only after the easier methods do not resolve our pool problems.


    4.)    Via Debug (live and postmortem)


    As mentioned earlier, the API being used here to allocate this pool memory is usually ExAllocatePoolWithTag.  If we have a kernel debugger set up we can set a breakpoint here to brute force debug who our caller is… but that’s not usually how we do it; can you say, “extended downtime?”  There are other creative live debug methods, which are a bit more advanced, that we may post later…


    Usually, debugging this problem involves a post mortem memory.dmp taken from a machine that has logged Event ID 2020 or Event ID 2019, is hung or no longer responsive to client requests, or often both.  We can gather this dump via the Ctrl+Scroll Lock method (see KB244139), even while the machine is “hung” and seemingly unresponsive to the keyboard or Ctrl+Alt+Del!


    When loading the memory.dmp via windbg.exe or kd.exe we can quickly get a feel for the state of the machine with the following commands.


    Debugger output Example 1.1  (the !vm command)


    2: kd> !vm 
    *** Virtual Memory Usage ***
      Physical Memory:   262012   ( 1048048 Kb)
      Page File: \??\C:\pagefile.sys
         Current:   1054720Kb Free Space:    706752Kb
         Minimum:   1054720Kb Maximum:      1054720Kb
      Page File: \??\E:\pagefile.sys
         Current:   2490368Kb Free Space:   2137172Kb
         Minimum:   2490368Kb Maximum:      2560000Kb
      Available Pages:    63440   (  253760 Kb)
      ResAvail Pages:    194301   (  777204 Kb)
      Modified Pages:       761   (    3044 Kb)
      NonPaged Pool Usage: 52461   (  209844 Kb)<<NOTE!  Value is near NonPaged Max
      NonPaged Pool Max:   54278   (  217112 Kb)
      ********** Excessive NonPaged Pool Usage *****


    Note how the NonPaged Pool Usage value is near the NonPaged Pool Max value.  This tells us that we are basically out of NonPaged Pool.
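
    The page counts in the !vm output become kilobytes when multiplied by the page size, which is how 52461 pages shows up as 209844 Kb.  The C sketch below simply redoes that arithmetic and works out how full NonPaged Pool is.  It assumes a 4 KB page (as on the x86 machine in this example), and the function names are made up for illustration; they are not part of any debugger API.

```c
#include <stdint.h>

/* Page size assumed to be 4 KB, as on the x86 system in Example 1.1. */
#define PAGE_SIZE_KB 4u

/* Convert a page count reported by !vm into kilobytes. */
uint64_t pages_to_kb(uint64_t pages)
{
    return pages * PAGE_SIZE_KB;
}

/* Percentage of the pool maximum currently in use, rounded down.
   Returns 0 when no maximum is known. */
unsigned pool_percent_used(uint64_t used_pages, uint64_t max_pages)
{
    if (max_pages == 0)
        return 0;
    return (unsigned)((used_pages * 100u) / max_pages);
}
```

    With the values from Example 1.1, pool_percent_used(52461, 54278) comes out at 96, which matches the debugger's "Excessive NonPaged Pool Usage" warning.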


    Here we can use the !poolused command to get the same information that poolmon.exe would have given us, but from the dump…


    Debugger output Example 1.2  (!poolused 2)


    Note that the value 2 passed to !poolused sorts pool consumers by NonPaged Pool usage.


    2: kd> !poolused 2
       Sorting by NonPaged Pool Consumed
      Pool Used:
                NonPaged            Paged
    Tag    Allocs     Used    Allocs     Used
    Thre   120145 76892800         0        0
    File   187113 29946176         0        0
    AfdE    89683 25828704         0        0
    TCPT    41888 18765824         0        0
    AfdC    90964 17465088         0        0 


    We now see the “Thre” tag at the top of the list, the largest consumer of NonPaged Pool, let’s go look it up in pooltag.txt….


    Thre - nt!ps        - Thread objects


    Note, the nt before the ! means that this is NT or the kernel’s tag for Thread objects.
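
    The rows in Example 1.2 can also be checked by hand: dividing Used by Allocs gives the average size of each allocation for a tag.  The C sketch below does this for a few rows copied from the output above and picks out the largest consumer.  The struct and function names are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of !poolused output (NonPaged columns only); names are illustrative. */
struct pool_row {
    const char *tag;
    uint64_t allocs;   /* number of outstanding allocations */
    uint64_t used;     /* bytes consumed by those allocations */
};

/* Average bytes per allocation for a tag (0 if there are no allocations). */
uint64_t avg_alloc_size(const struct pool_row *row)
{
    return row->allocs ? row->used / row->allocs : 0;
}

/* Index of the row with the largest byte usage; n must be at least 1. */
size_t largest_consumer(const struct pool_row *rows, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (rows[i].used > rows[best].used)
            best = i;
    }
    return best;
}
```

    For the Thre row, 76892800 bytes over 120145 allocations works out to 640 bytes per allocation, so the total is driven by the sheer number of thread objects rather than by a few large ones.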

    So, from our earlier discussion, if we have a bunch of thread objects we probably have an application on the system with a ton of handles and/or a ton of threads, so it should be easy to find!


    Via the debugger we can find this out easily via the !process 0 0 command which will show the TableSize (Handle Count) of over 90,000!


    Debugger output Example 1.3  (the !process command)


    Note the two zeros after !process separated by a space gives a list of all running processes on the system.



    PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
    DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472
    Image: mybadapp.exe


    We can dig further here into looking at the threads…


    Debugger output Example 1.4  (the !process command continued)


    0: kd> !PROCESS 884e6520 4
    PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
    DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472.
    Image: mybadapp.exe
            THREAD 884d8560  Cid 1a0.19c  Teb: 7ffde000  Win32Thread: a208f648 WAIT
            THREAD 88447560  Cid 1a0.1b0  Teb: 7ffdd000  Win32Thread: 00000000 WAIT
            THREAD 88396560  Cid 1a0.1b4  Teb: 7ffdc000  Win32Thread: 00000000 WAIT
            THREAD 88361560  Cid 1a0.1bc  Teb: 7ffda000  Win32Thread: 00000000 WAIT
            THREAD 88335560  Cid 1a0.1c0  Teb: 7ffd9000  Win32Thread: 00000000 WAIT
            THREAD 88340560  Cid 1a0.1c4  Teb: 7ffd8000  Win32Thread: 00000000 WAIT
     And the list goes on…


    We can examine the thread via !thread 88340560 from here and so on…


    So in this rudimentary example the offender is clearly mybadapp.exe, with its abundance of threads.  One could dig further to determine what type of thread or which functions are being executed, and follow up with the owner of this executable for more detail, or take a look at the code if the application is yours!




    Common Question:


    Why am I out of Paged Pool at ~200MB when we say that the limit is around 460MB?


    This is because the memory manager decided at boot, given the amount of RAM on the system and other memory manager settings such as /3GB, that our maximum is some amount X below the absolute maximum.  There are two ways to see the maximums on a system.


    1.)   Process Explorer using its Task Management.  View…System Information…Kernel Memory section.


    Note that we have to specify a valid path to dbghelp.dll and Symbols path via Options…Configure Symbols.


    For example:


          Dbghelp.dll path:

    c:\<path to debugging tools for windows>\dbghelp.dll


    Symbols path:



    2.)   The debugger (live or via a memory.dmp by doing a !vm)


    *NonPaged pool size is not configurable other than the /3GB boot.ini switch which lowers NonPaged Pool’s maximum.

    128MB with the /3GB switch, 256MB without


    Conversely, Paged Pool size can often be raised manually to around its maximum via the PagedPoolSize registry setting, which is described, for example, in KB304101.



    So what is this Pool Paged Bytes counter I see in Perfmon for the Process Object?


    This counter reflects allocations that are charged to a process via ExAllocatePoolWithQuotaTag.  Typically, we will see ExAllocatePoolWithTag used instead, and thus this counter is less effective… but hey, don’t pass up free information in Perfmon, so be on the lookout for this easy win.



    Additional Resources:


     “Who's Using the Pool?” from Driver Fundamentals > Tips: What Every Driver Writer Needs to Know



    Poolmon Remarks:  http://technet2.microsoft.com/WindowsServer/en/library/85b0ba3b-936e-49f0-b1f2-8c8cb4637b0f1033.mspx





     I hope you have enjoyed this post and hopefully it will get you going in the right direction next time you see one of these events or hit a pool consumption issue!




  • Ntdebugging Blog

    Debug Fundamentals Exercise 1: Reverse engineer a function



    Hello ntdebuggers!  We’ve seen a lot of interest in our Puzzlers, and we’ve also seen requests and interest in topics covering debugging fundamentals.  So we’ve decided to combine the two topics and post a series of “Fundamentals Exercises”.  These exercises will be designed more as learning experiences rather than simply puzzlers.  We hope you find them interesting and educational!


    Feel free to post your responses here, but we won’t put them on the site until after we post the “official” responses, so as to avoid spoilers.



    Examine the following code, registers, and stack values to determine the following:


    1.       When the function “DoTheWork” returns, what is the return value from that function?

    2.       Bonus: what is the mathematical operation that “DoTheWork” performs?



    Hints:


    1.       The bracket notation [] in the assembly means to treat the value in brackets as a memory address, and access the value at that address.

    2.       32-bit integer return values are stored in eax



    // Code

    0:000> uf eip


    0040101c 55              push    ebp

    0040101d 8bec            mov     ebp,esp

    0040101f 8b4d08          mov     ecx,dword ptr [ebp+8]

    00401022 8bc1            mov     eax,ecx

    00401024 49              dec     ecx

    00401025 0fafc1          imul    eax,ecx

    00401028 83f902          cmp     ecx,2

    0040102b 7ff7            jg      demo2!DoTheWork+0x8 (00401024)

    0040102d 5d              pop     ebp

    0040102e c3              ret


    // Current register state

    0:000> r

    eax=00000007 ebx=7ffd9000 ecx=ffffffff edx=00000007 esi=00001771 edi=00000000

    eip=0040101c esp=0012fe9c ebp=0012feac iopl=0         nv up ei pl nz na po nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202


    0040101c 55              push    ebp


    // Current stack values for this thread

    0:000> dps esp

    0012fe9c  00406717 demo2!main+0x27

    0012fea0  00000007

    0012fea4  82059a87

    0012fea8  00000007

    0012feac  0012ff88

    0012feb0  004012b2 demo2!mainCRTStartup+0x170

    0012feb4  00000002

    0012feb8  00980e48

    0012febc  00980e80

    0012fec0  00000094

    0012fec4  00000006

    0012fec8  00000000

    0012fecc  00001771

    0012fed0  00000002

    0012fed4  76726553

    0012fed8  20656369

    0012fedc  6b636150

    0012fee0  00003120

    0012fee4  00000000

    0012fee8  00000000

    0012feec  00000000

    0012fef0  00000000

    0012fef4  00000000

    0012fef8  00000000

    0012fefc  00000000

    0012ff00  00000000

    0012ff04  00000000

    0012ff08  00000000

    0012ff0c  00000000

    0012ff10  00000000

    0012ff14  00000000

    0012ff18  00000000


    [Update: our answer. Posted 11/19/2008]

    Wow - what a great response from our readers on this exercise!  It is great to see the various approaches to reverse engineering this code.  As for the answer, the numerical result (stored in eax) is 5040, and the corresponding mathematical operation is a factorial.  So 7! is the result calculated, given that 7 was passed to the function.  Congratulations to all of you that got it right! 


    Many of you posted some C code to give an idea of what the original source of DoTheWork() might have looked like.  The original source was actually written in assembly!  However, it was written to be called from C, and it uses ebp in the same way that compiled C code might.  This function wasn’t written with optimal performance in mind, but rather for learning about reverse engineering.
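
    For readers who prefer C, the loop can be transcribed roughly as follows.  This is only a sketch that mirrors the disassembly instruction by instruction; it is not the original source (which, as noted, was assembly), and the name do_the_work and the use of int32_t are choices made here for illustration.

```c
#include <stdint.h>

/* C transcription of the DoTheWork disassembly.
   eax and ecx are named after the registers they stand in for. */
int32_t do_the_work(int32_t n)
{
    int32_t ecx = n;      /* mov ecx,[ebp+8]  : load the argument   */
    int32_t eax = ecx;    /* mov eax,ecx      : running product     */

    do {
        ecx--;            /* dec ecx                                */
        eax *= ecx;       /* imul eax,ecx                           */
    } while (ecx > 2);    /* cmp ecx,2 / jg   : loop while ecx > 2  */

    return eax;           /* result is returned in eax              */
}
```

    Because the loop body always runs at least once, this matches a factorial only for inputs of 2 or more: do_the_work(7) gives 5040, but an input of 1 returns 0, just as the assembly would.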



  • Ntdebugging Blog

    NTDebugging Puzzler 0x00000003 (Matrix Edition) Some assembly required.


    Hello NTdebuggers, I'm very impressed with the depth of the answers we are seeing from our readers.  As I stated in last week's response, this week's puzzler is going to be harder.  With that said, let's take it up a notch.  One of the things that is really cool about being an Escalation Engineer in GES/CPR is how far we go in the pursuit of solving complex problems.  If we're debugging some Microsoft code in a kernel dump or user mode, and our quest takes us into a binary that we don't have code or symbols for, we don't stop, we forge on!  Over the years there are members of our team that have had to port to or support Alpha, PowerPC, MIPS, IA64 and x64, myself included.  As a result most of us have books for just about every mainstream processor under the sun.  It's a good idea, if you're going to be debugging on these platforms, to have a general working knowledge of the CPUs you will encounter.  The most common CPUs we deal with are x86, followed by x64 and IA64.  Microsoft doesn't support PPC, MIPS or Alpha anymore unless you're dealing with Xbox consoles, and those are PPC.  That said, this week's challenge is to tell us what the following assembly does.  You can tell us in C, or break it down and comment on the various sections. 

    Some people like crossword puzzles; most of us in GES/CPR love to esreveR reenignE assembler.  Have FUN!


    “I don’t even see the code anymore”  Cypher...
    0:000> uf myfun
    puzzler3!myfun [c:\source\puzzler\puzzler3\puzzler3\puzzler3.cpp @ 20]:
       20 00cc1480 55              push    ebp
       20 00cc1481 8bec            mov     ebp,esp
       20 00cc1483 81ecf0000000    sub     esp,0F0h
       20 00cc1489 53              push    ebx
       20 00cc148a 56              push    esi
       20 00cc148b 57              push    edi
       20 00cc148c 8dbd10ffffff    lea     edi,[ebp-0F0h]
       20 00cc1492 b93c000000      mov     ecx,3Ch
       20 00cc1497 b8cccccccc      mov     eax,0CCCCCCCCh
       20 00cc149c f3ab            rep stos dword ptr es:[edi]
       26 00cc149e 8b4508          mov     eax,dword ptr [ebp+8]
       26 00cc14a1 50              push    eax
       26 00cc14a2 e803fcffff      call    puzzler3!ILT+165(_strlen) (00cc10aa)
       26 00cc14a7 83c404          add     esp,4
       26 00cc14aa 8945e0          mov     dword ptr [ebp-20h],eax
       28 00cc14ad 8b45e0          mov     eax,dword ptr [ebp-20h]
       28 00cc14b0 8945f8          mov     dword ptr [ebp-8],eax
       28 00cc14b3 eb09            jmp     puzzler3!myfun+0x3e (00cc14be)
       28 00cc14b5 8b45f8          mov     eax,dword ptr [ebp-8]
       28 00cc14b8 83e801          sub     eax,1
       28 00cc14bb 8945f8          mov     dword ptr [ebp-8],eax
       28 00cc14be 837df800        cmp     dword ptr [ebp-8],0
       28 00cc14c2 7e60            jle     puzzler3!myfun+0xa4 (00cc1524)
       30 00cc14c4 c745ec00000000  mov     dword ptr [ebp-14h],0
       30 00cc14cb eb09            jmp     puzzler3!myfun+0x56 (00cc14d6)
       30 00cc14cd 8b45ec          mov     eax,dword ptr [ebp-14h]
       30 00cc14d0 83c001          add     eax,1
       30 00cc14d3 8945ec          mov     dword ptr [ebp-14h],eax
       30 00cc14d6 8b45f8          mov     eax,dword ptr [ebp-8]
       30 00cc14d9 83e801          sub     eax,1
       30 00cc14dc 3945ec          cmp     dword ptr [ebp-14h],eax
       30 00cc14df 7d41            jge     puzzler3!myfun+0xa2 (00cc1522)
       32 00cc14e1 8b4508          mov     eax,dword ptr [ebp+8]
       32 00cc14e4 0345ec          add     eax,dword ptr [ebp-14h]
       32 00cc14e7 0fbe08          movsx   ecx,byte ptr [eax]
       32 00cc14ea 8b5508          mov     edx,dword ptr [ebp+8]
       32 00cc14ed 0355ec          add     edx,dword ptr [ebp-14h]
       32 00cc14f0 0fbe4201        movsx   eax,byte ptr [edx+1]
       32 00cc14f4 3bc8            cmp     ecx,eax
       32 00cc14f6 7e28            jle     puzzler3!myfun+0xa0 (00cc1520)
       34 00cc14f8 8b4508          mov     eax,dword ptr [ebp+8]
       34 00cc14fb 0345ec          add     eax,dword ptr [ebp-14h]
       34 00cc14fe 8a08            mov     cl,byte ptr [eax]
       34 00cc1500 884dd7          mov     byte ptr [ebp-29h],cl
       35 00cc1503 8b4508          mov     eax,dword ptr [ebp+8]
       35 00cc1506 0345ec          add     eax,dword ptr [ebp-14h]
       35 00cc1509 8b4d08          mov     ecx,dword ptr [ebp+8]
       35 00cc150c 034dec          add     ecx,dword ptr [ebp-14h]
       35 00cc150f 8a5101          mov     dl,byte ptr [ecx+1]
       35 00cc1512 8810            mov     byte ptr [eax],dl
       36 00cc1514 8b4508          mov     eax,dword ptr [ebp+8]
       36 00cc1517 0345ec          add     eax,dword ptr [ebp-14h]
       36 00cc151a 8a4dd7          mov     cl,byte ptr [ebp-29h]
       36 00cc151d 884801          mov     byte ptr [eax+1],cl
       38 00cc1520 ebab            jmp     puzzler3!myfun+0x4d (00cc14cd)
       40 00cc1522 eb91            jmp     puzzler3!myfun+0x35 (00cc14b5)
       41 00cc1524 5f              pop     edi
       41 00cc1525 5e              pop     esi
       41 00cc1526 5b              pop     ebx
       41 00cc1527 81c4f0000000    add     esp,0F0h
       41 00cc152d 3bec            cmp     ebp,esp
       41 00cc152f e820fcffff      call    puzzler3!ILT+335(__RTC_CheckEsp) (00cc1154)
       41 00cc1534 8be5            mov     esp,ebp
       41 00cc1536 5d              pop     ebp
       41 00cc1537 c3              ret

    Good luck, and happy debugging.

    Jeff Dailey-

    In response:  Wow, you folks did it again.  I was worried that not many of our readers would respond.  Our entire team was very impressed with the number and quality of the responses we saw.  Congratulations go out to all those assembler gurus out there who figured out this was a simple bubble sort.  We enjoyed seeing how various people went about solving this.  Some people compiled their code as they worked on reversing the function to verify the assembler.  This is a good approach.  Others just seemed to work it out end to end.  This is the approach I usually end up using because I’m typically in the middle of a debug and don’t actually need the source.


    Great work!


    Here is the answer….



    void myfun(char *val)


    00321480  push        ebp 

    00321481  mov         ebp,esp

    00321483  sub         esp,0F0h

    00321489  push        ebx 

    0032148A  push        esi 

    0032148B  push        edi 

    0032148C  lea         edi,[ebp-0F0h]

    00321492  mov         ecx,3Ch

    00321497  mov         eax,0CCCCCCCCh

    0032149C  rep stos    dword ptr es:[edi]

           int i;

           int j;

           int len;

           char t;



    0032149E  mov         eax,dword ptr [val]

    003214A1  push        eax 

    003214A2  call        @ILT+165(_strlen) (3210AAh)

    003214A7  add         esp,4

    003214AA  mov         dword ptr [len],eax


           for (i=len;i>0;i--)

    003214AD  mov         eax,dword ptr [len]

    003214B0  mov         dword ptr [i],eax

    003214B3  jmp         myfun+3Eh (3214BEh)

    003214B5  mov         eax,dword ptr [i]

    003214B8  sub         eax,1

    003214BB  mov         dword ptr [i],eax

    003214BE  cmp         dword ptr [i],0

    003214C2  jle         myfun+0A4h (321524h)



    003214C4  mov         dword ptr [j],0

    003214CB  jmp         myfun+56h (3214D6h)

    003214CD  mov         eax,dword ptr [j]

    003214D0  add         eax,1

    003214D3  mov         dword ptr [j],eax

    003214D6  mov         eax,dword ptr [i]

    003214D9  sub         eax,1

    003214DC  cmp         dword ptr [j],eax

    003214DF  jge         myfun+0A2h (321522h)


                         if (val[j]>val[j+1])

    003214E1  mov         eax,dword ptr [val]

    003214E4  add         eax,dword ptr [j]

    003214E7  movsx       ecx,byte ptr [eax]

    003214EA  mov         edx,dword ptr [val]

    003214ED  add         edx,dword ptr [j]

    003214F0  movsx       eax,byte ptr [edx+1]

    003214F4  cmp         ecx,eax

    003214F6  jle         myfun+0A0h (321520h)



    003214F8  mov         eax,dword ptr [val]

    003214FB  add         eax,dword ptr [j]

    003214FE  mov         cl,byte ptr [eax]

    00321500  mov         byte ptr [t],cl


    00321503  mov         eax,dword ptr [val]

    00321506  add         eax,dword ptr [j]

    00321509  mov         ecx,dword ptr [val]

    0032150C  add         ecx,dword ptr [j]

    0032150F  mov         dl,byte ptr [ecx+1]

    00321512  mov         byte ptr [eax],dl


    00321514  mov         eax,dword ptr [val]

    00321517  add         eax,dword ptr [j]

    0032151A  mov         cl,byte ptr [t]

    0032151D  mov         byte ptr [eax+1],cl



    00321520  jmp         myfun+4Dh (3214CDh)



    00321522  jmp         myfun+35h (3214B5h)


    00321524  pop         edi 

    00321525  pop         esi 

    00321526  pop         ebx 

    00321527  add         esp,0F0h

    0032152D  cmp         ebp,esp

    0032152F  call        @ILT+335(__RTC_CheckEsp) (321154h)

    00321534  mov         esp,ebp

    00321536  pop         ebp 

    00321537  ret         
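
    Here are the source lines from the listing above, gathered into one compilable function.  The inner for loop and the three swap statements do not appear as source lines in the listing, so they are reconstructed here from the disassembly (the j counter, the i-1 bound, and the byte moves through t).  Treat them as one reading of the assembly rather than the author's exact text.

```c
#include <string.h>

/* Bubble sort of the characters in a NUL-terminated string, in place. */
void myfun(char *val)
{
    int i;
    int j;
    int len;
    char t;

    len = (int)strlen(val);

    for (i = len; i > 0; i--) {
        /* Reconstructed: j starts at 0 and the loop exits when j >= i-1. */
        for (j = 0; j < i - 1; j++) {
            if (val[j] > val[j + 1]) {
                /* Reconstructed swap through the temporary t. */
                t = val[j];
                val[j] = val[j + 1];
                val[j + 1] = t;
            }
        }
    }
}
```

    Calling myfun on a writable buffer holding "puzzler" leaves it holding "elpruzz".  The buffer must be writable; passing a string literal directly would be undefined behavior.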


    Thank You

    Jeff Dailey-

  • Ntdebugging Blog

    Too Much Cache?


    Cache is used to reduce the performance impact when accessing data that resides on slower storage media.  Without it your PC would crawl along and become nearly unusable.  If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page.  If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page.  Access to physical RAM is about 1 million times faster than to a hard drive.  It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous.  Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source). 


    Since we are limited in how much RAM we can stick in a box, we have to make the most of it.  We have to share this crucial physical resource with all running processes, the kernel and the file system cache.  You can read more about how this works in "The Memory Shell Game" post.



    The file system cache resides in kernel address space.  It is used to buffer access to the much slower hard drive.  The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand.  The file system cache operates like a process working set.  You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter.  This value will only show you the system cache's current working set.  Once a page is removed from the cache's working set it is placed on the standby list.  You should consider the standby pages from the cache manager as a part of your file cache.  You can also consider these standby pages to be available pages.  This is what the pre-Vista Task Manager does.  Most of what you see as available pages is probably standby pages for the system cache.  Once again, you can read more about this in "The Memory Shell Game" post.


    Too Much Cache is a Bad Thing

    The memory manager works on a demand based algorithm.  Physical pages are given to where the current demand is.  If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them and send them to help meet the growing demand.  Just like any process, the system file cache can consume physical memory if there is sufficient demand. 

    Having a lot of cache is generally not a bad thing, but if it is at the expense of other processes it can be detrimental to system performance.  There are two different ways this can occur - read and write I/O.


    Excessive Cached Write I/O

    Applications and services can dump lots of write I/O to files through the system file cache.  The system cache's working set will grow as it buffers this write I/O.  System threads will start flushing these dirty pages to disk.  Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache.  At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager.  It does this to prevent applications from overtaking physical RAM with write I/O.  There are however, some isolated scenarios where this throttle doesn't work as well as we would expect.  This could be due to bad applications or drivers or not having enough memory.  Fortunately, we can tune the amount of dirty pages allowed before the system starts throttling cached write I/O.  This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739


    Excessive Cached Read I/O

    While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache.  If an application or driver opens many files and actively reads from them continuously through the cache manager, then the memory manager will move more physical pages to the cache manager.  If this demand continues to grow, the cache manager can grow to consume physical memory and other processes (with less memory demand) will get paged out to disk.  This read I/O demand may be legitimate or may be due to poor application scalability.  The memory manager doesn't know if the demand is due to bad behavior or not, so pages are moved simply because there is demand for them.  On a 32 bit system, the file system cache working set is essentially limited to 1 GB.  This is the maximum size that we blocked off in the kernel for the system cache working set.  Since most systems have more than 1 GB of physical RAM today, having the system cache working set consume physical RAM with read I/O is less likely. 

    This scenario, however, is more prevalent on 64 bit systems.  With the increase in pointer length, the kernel's address space is greatly expanded.  The system cache's working set limit can and typically does exceed how much memory is installed in the system.  It is much easier for applications and drivers to load up the system cache with read I/O.  If the demand is sustained, the system cache's working set can grow to consume physical memory.  This will push other processes and kernel resources out to the page file and can be very detrimental to system performance.

    Fortunately we can also tune the server for this scenario.  We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize().  We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically.  I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs.  The source code can be compiled using the Windows DDK, or you can use the included binaries.  The 32 bit version is limited to setting the cache working set to a maximum of 4 GB.  The 64 bit version does not have this limitation.  The sample code and included binaries are completely unsupported.  It is just a quick and dirty implementation with little error handling.

  • Ntdebugging Blog

    Desktop Heap, part 2



    Matthew here again – I want to provide some follow-up information on desktop heap.   In the first post I didn’t discuss the size of desktop heap related memory ranges on 64-bit Windows, 3GB, or Vista.   So without further ado, here are the relevant sizes on various platforms...



    Windows XP (32-bit)


    ·         48 MB = SessionViewSize (default registry value, set for XP Professional, x86)

    ·         20 MB = SessionViewSize (if no registry value is defined) 

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size
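
    The "SharedSection 2nd value" and "3rd value" in these lists refer to the comma-separated numbers that follow SharedSection= in the Windows subsystem startup string stored in the registry.  The C sketch below pulls those numbers out of such a string.  The function and struct names are hypothetical, and the sample string used with it is only an illustrative fragment carrying the 32-bit defaults listed here, not something read from a live registry.

```c
#include <stdio.h>
#include <string.h>

/* The three SharedSection numbers, all in KB. */
struct shared_section {
    unsigned first_kb;           /* 1st value (not one of the sizes listed here) */
    unsigned interactive_kb;     /* 2nd value: interactive desktop heap size     */
    unsigned noninteractive_kb;  /* 3rd value: non-interactive desktop heap size */
};

/* Find "SharedSection=a,b,c" inside a startup string and parse it.
   Returns 1 on success, 0 if the field is missing or incomplete. */
int parse_shared_section(const char *cmdline, struct shared_section *out)
{
    const char *p = strstr(cmdline, "SharedSection=");
    if (p == NULL)
        return 0;
    return sscanf(p, "SharedSection=%u,%u,%u",
                  &out->first_kb,
                  &out->interactive_kb,
                  &out->noninteractive_kb) == 3;
}
```

    Given a fragment such as "SharedSection=1024,3072,512", the parsed 2nd and 3rd values are 3072 and 512, matching the interactive and non-interactive sizes in the list above.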




    Windows Server 2003 (32-bit)


    ·         48 MB = SessionViewSize (default registry value)

    ·         20 MB = SessionViewSize (if no registry value is defined; this is the default for Terminal Servers)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size




    Windows Server 2003 booted with 3GB (32-bit)


    ·         20 MB = SessionViewSize (registry value has no effect)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size


    You may also see reduced heap sizes when running 3GB.  During the initialization of the window manager, an attempt is made to reserve enough session view space to accommodate the expected number of desktop heaps for a given session.  If the heap sizes specified in the SharedSection registry value have been increased, the attempt to reserve session view space may fail.  When this happens, the window manager falls back to a pair of “safe” sizes for desktop heaps (512KB for interactive, 128KB for non-interactive) and tries to reserve session space again, using these smaller numbers.  This ensures that even if the registry values are too large for the 20MB session view space, the system will still be able to boot.




    Windows Server 2003 (64-bit)


    ·         104 MB = SessionViewSize (if no registry value is defined; which is the default) 

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size




    Windows Vista RTM (32-bit)


    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size




    Windows Vista (64-bit) and Windows Server 2008 (64-bit)


    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size



    Windows Vista SP1 (32-bit) and Windows Server 2008 (32-bit)


    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         12288 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3rd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size 



    Windows Vista introduced a new public API function: CreateDesktopEx, which allows the caller to specify the size of desktop heap.

    Additionally, GetUserObjectInformation now includes a new flag for retrieving the desktop heap size (UOI_HEAPSIZE).



    [Update: 3/20/2008 - Added Vista SP1 and Server 2008 info.  More information can be found here.]

  • Ntdebugging Blog

    Interpreting Event 153 Errors


    Hello, my name is Bob Golding, and I would like to share with you a new event that you may see in the system event log.  Event ID 153 is an error associated with the storage subsystem.  This event was new in Windows 8 and Windows Server 2012 and was added to Windows 7 and Windows Server 2008 R2 starting with hot fix KB2819485.


    An event 153 is similar to an event 129.  An event 129 is logged when the storport driver times out a request to the disk; I described event 129 messages in a previous article.  The difference between a 153 and a 129 is that a 129 is logged when storport times out a request, while a 153 is logged when the storport miniport driver times out a request.  The miniport driver may also be referred to as an adapter driver or HBA driver; this driver is typically written by the hardware vendor.


    Because the miniport driver has a better knowledge of the request execution environment, some miniport drivers time the request themselves instead of letting storport handle request timing.  This is because the miniport driver can abort the individual request and return an error rather than storport resetting the drive after a timeout.  Resetting the drive is disruptive to the I/O subsystem and may not be necessary if only one request has timed out.  The error returned from the miniport driver is bubbled up to the class driver, which can log an event 153 and retry the request.


    Below is an example event 153:


    Event 153 Example


    This error means that a request failed and was retried by the class driver.  In the past no message would be logged in this situation because storport did not timeout the request.  The lack of messages resulted in confusion when troubleshooting disk errors because timeouts would occur but there would be no evidence of the error.


    The details section of the event log record presents what error caused the retry and whether the request was a read or write.  Below is the details output:


    Event 153 Details


    In the example above at byte offset 29 is the SCSI status, at offset 30 is the SRB status that caused the retry, and at offset 31 is the SCSI command that is being retried.  In this case the SCSI status was 00 (SCSISTAT_GOOD), the SRB status was 09 (SRB_STATUS_TIMEOUT), and the command was 28 (SCSIOP_READ). 
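
    The three bytes described above can be picked out of the event's binary data with a few lines of C.  This is a sketch only: it assumes the offsets 29, 30 and 31 are decimal byte offsets into the raw data buffer, and the struct and function names are invented for this illustration rather than taken from the WDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets within the event's binary data, as described in the text
   (assumed here to be decimal). */
enum {
    OFF_SCSI_STATUS = 29,
    OFF_SRB_STATUS  = 30,
    OFF_SCSI_OPCODE = 31
};

struct event153_info {
    uint8_t scsi_status;   /* e.g. 0x00 = SCSISTAT_GOOD      */
    uint8_t srb_status;    /* e.g. 0x09 = SRB_STATUS_TIMEOUT */
    uint8_t scsi_opcode;   /* e.g. 0x28 = SCSIOP_READ        */
};

/* Extract the three fields; returns 1 on success, 0 if the buffer is too short. */
int decode_event153(const uint8_t *data, size_t len, struct event153_info *out)
{
    if (len <= (size_t)OFF_SCSI_OPCODE)
        return 0;
    out->scsi_status = data[OFF_SCSI_STATUS];
    out->srb_status  = data[OFF_SRB_STATUS];
    out->scsi_opcode = data[OFF_SCSI_OPCODE];
    return 1;
}
```

    For the event shown above, the decoded values would be 0x00, 0x09 and 0x28, which can then be looked up in scsi.h and srb.h.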


    The most common SCSI commands are:

    SCSIOP_READ - 0x28



    The most common SRB statuses are below:





    A complete list of SCSI operations and statuses can be found in scsi.h in the WDK.  A list of SRB statuses can be found in srb.h.


    The timeout errors (SRB_STATUS_TIMEOUT and SRB_STATUS_COMMAND_TIMEOUT) indicate that a request timed out in the adapter.  In other words, a request was sent to the drive and there was no response within the timeout period.  The bus reset error (SRB_STATUS_BUS_RESET) indicates that the device was reset.  Because a reset aborts all outstanding requests on the drive, the request is being retried.


    A system administrator who encounters event 153 errors should investigate the health of the computer’s disk subsystem.  Although an occasional timeout may be part of the normal operation of a system, the frequent need to retry requests indicates a performance issue with the storage that should be corrected.

  • Ntdebugging Blog

    NTDebugging Puzzler 0x00000007: Interlocked functions



    Today, we will have some fun with interlocked functions.


    The following section of code is reentrant.  A “well meaning” developer used interlocked functions to avoid serializing on a global table lock.


    Initial smoke testing shows that the code works fine.  Sometimes things are not as they appear when doing initial code review.  After several hours of heavy stress testing, the developer finds the machine has bugchecked.  Analysis of the dump showed that the caller of this function had steamrolled through nonpaged pool writing stacks on top of everyone’s pool memory.


    The goal of today’s puzzler is to find the bug and describe how it could be fixed with minimal timing impact.


    Here are a few details before you begin.


    1.       The variable gIndex is an unsigned long global.

    2.       The gLogArray memory was allocated from nonpaged pool and the size of this allocation is correct.




    00:   PLOG_ENTRY GetNextLogEntry ()
          {
    01:         ULONG IterationCount = MAX_RECORDS;
    02:         PLOG_ENTRY pEntry;
    03:         do
                {
    04:               if (InterlockedCompareExchange(&gIndex, 0, MAX_RECORDS) == MAX_RECORDS)
    05:                     pEntry = &gLogArray[0];
    06:               else
    07:                     pEntry = &gLogArray[InterlockedIncrement(&gIndex)];
    08:               --IterationCount;
    09:         } while(InterlockedCompareExchange(&pEntry->Active, 1, 0) != 0 && (IterationCount > 0));
    0a:         return (IterationCount > 0) ? pEntry : NULL;
          }





    Happy hunting,


    Dennis Middleton “The NTFS Doctor”


      [Update: our answer. Posted 6/10/2008]


    Thanks for all the great posts!  I saw many interesting answers, and a few unique solutions that I hadn’t considered.


    Bug Description


    A slight time gap exists between the InterlockedCompareExchange and the InterlockedIncrement.  For this reason, several threads could pass the check for MAX_RECORDS prior to doing the increment.


    Assume that N is the number of threads that pass the check for MAX_RECORDS while gIndex is at a particular value.

    If N or more threads are able to pass the check while gIndex is equal to MAX_RECORDS-(N-1), then gIndex would be incremented beyond MAX_RECORDS.


    For example, let’s assume that 3 threads passed the check while gIndex was at MAX_RECORDS-2.  Then after the three increments occur, gIndex would be equal to MAX_RECORDS+1.  From that point, invalid pointers would be passed out to the caller.
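
    The interleaving in this example can be replayed deterministically.  The sketch below is single-threaded on purpose: it does not use the real interlocked functions, it only models the order of operations, with every thread finishing the check from line 04 before any thread performs the increment from line 07.  MAX_RECORDS is given a hypothetical value of 8, and the function names are made up for illustration.

```c
#define MAX_RECORDS 8u  /* hypothetical value, chosen only for the replay */

/* Models the check on line 04: if the index equals MAX_RECORDS it is reset
   to 0 and the caller takes the "use slot 0" branch (returns 1). */
static int check_step(unsigned long *index)
{
    if (*index == MAX_RECORDS) {
        *index = 0;
        return 1;
    }
    return 0;
}

/* Models the increment on line 07 and returns the new index. */
static unsigned long increment_step(unsigned long *index)
{
    return ++*index;
}

/* Replays the worst-case ordering: all `threads` threads run the check
   first, then every thread that passed it runs the increment.
   Returns the final value of the shared index. */
unsigned long replay_race(unsigned long start, unsigned threads)
{
    unsigned long index = start;
    unsigned passed = 0;

    for (unsigned i = 0; i < threads; ++i)
        if (!check_step(&index))
            ++passed;

    for (unsigned i = 0; i < passed; ++i)
        increment_step(&index);

    return index;
}
```

    Calling replay_race(MAX_RECORDS - 2, 3) returns MAX_RECORDS + 1, matching the three-thread example above.  After that point the equality check on line 04 can no longer catch the index.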


    Possible Solutions


    There are several ways to solve this problem.  Some are more efficient than others.  I would avoid doing checks for MAX_RECORDS-1, MAX_RECORDS, or MAX_RECORDS+1 (interlocked or not) since there could potentially be more than two threads involved in the race condition.  Such a solution would only reduce the likelihood of an overflow.


    There were a few posts suggesting a lock or critical section for the section of code between the compare and increment.  That would be one solution, but it would also do away with the benefits of using interlocked functions.


    In the spirit of keeping the code fast and simple, here’s a solution that gives a very good result with minimal impact.


    1.       I removed the if () {} else {} and simply allowed gIndex to increment unchecked.  With this change, gIndex can approach its maximum unsigned long value and possibly wrap, so we need to keep the array index in check.

    2.       The modulus operator (added to line 4 below) will divide the incremented gIndex by MAX_RECORDS and use the remainder as the array index.  When dividing, the resultant remainder is always smaller than the divisor (MAX_RECORDS).  For this reason, the array index is guaranteed to be smaller than MAX_RECORDS.  As even multiples of MAX_RECORDS are reached, the array index resets back to zero mathemagically and no interlocked compare is even necessary.



    00:   PLOG_ENTRY GetNextLogEntry ()
          {
    01:         ULONG IterationCount = MAX_RECORDS;
    02:         PLOG_ENTRY pEntry;
    03:         do
                {
    04:               pEntry = &gLogArray[InterlockedIncrement(&gIndex) % MAX_RECORDS];
    05:               --IterationCount;
    06:         } while(InterlockedCompareExchange(&pEntry->Active, 1, 0) != 0 && (IterationCount > 0));
    07:         return (IterationCount > 0) ? pEntry : NULL;
          }




    With the fix in place, the code is smaller, faster, easier to read, and most of all - the bug is gone.  When developing code, always try to think small.             
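
    For readers who want to experiment outside the kernel, here is a rough user-mode sketch of the same idea using C11 atomics instead of the Windows interlocked functions.  The type and variable names, and the MAX_RECORDS value of 8, are assumptions made for illustration.  Two details differ from the listing above.  First, the counter is unsigned, so the remainder can never be negative; InterlockedIncrement returns a signed LONG, so depending on how MAX_RECORDS is defined, kernel code may want to cast the result to ULONG before applying the modulus.  Second, the loop returns as soon as a slot is claimed, so a slot claimed on the final attempt is still handed to the caller.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_RECORDS 8u  /* hypothetical table size */

/* Hypothetical log entry: active is 0 when free, 1 when claimed. */
typedef struct {
    atomic_int active;
} LogEntry;

/* Static storage is zero-initialized, so every slot starts out free. */
LogEntry g_log[MAX_RECORDS];
atomic_uint g_index;

/* Claims the next free slot, or returns NULL after MAX_RECORDS attempts. */
LogEntry *get_next_log_entry(void)
{
    for (unsigned tries = 0; tries < MAX_RECORDS; ++tries) {
        /* atomic_fetch_add returns the old value; adding 1 mirrors
           InterlockedIncrement, which returns the new value.  The
           unsigned counter may wrap, but the remainder stays in range. */
        unsigned slot = (atomic_fetch_add(&g_index, 1u) + 1u) % MAX_RECORDS;

        /* Claim the slot only if it is currently free (0 -> 1). */
        int expected = 0;
        if (atomic_compare_exchange_strong(&g_log[slot].active, &expected, 1))
            return &g_log[slot];
    }
    return NULL;
}
```

    Whatever value the counter reaches, the computed slot is always below MAX_RECORDS, which is the property the fix relies on.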


    Best Regards,

    NTFS Doctor




  • Ntdebugging Blog

    Debug Fundamentals Exercise 3: Calling conventions



    Today’s exercise will focus on x86 function calling conventions.  The calling convention of a function describes the following:


    ·         The order in which parameters are passed

    ·         Where parameters are placed (pushed on the stack or placed in registers)

    ·         Whether the caller or the callee is responsible for unwinding the stack on return


    While debugging, an understanding of calling conventions is helpful when you need to determine why certain values are placed in registers or on the stack before a function call.


    Standard x86 calling conventions on Windows:


    Calling convention       Parameters                                                                       Unwinds stack

    Win32 (Stdcall)          pushed onto stack from right to left                                             Callee

    Native C++ (Thiscall)    pushed onto stack from right to left, "this" pointer in ecx                      Callee

    COM (Stdcall for C++)    pushed onto stack from right to left, then "this" is pushed                      Callee

    Fastcall                 arg1 in ecx, arg2 in edx, remaining args pushed onto stack from right to left    Callee

    Cdecl                    pushed onto stack from right to left                                             Caller





    Below are calls to 5 functions.  Each function takes two DWORD parameters.  Based on the code that calls each function, identify the calling convention used.


    // Call to Function1

    01002ffe 8b08            mov     ecx,dword ptr [eax]

    01003000 53              push    ebx

    01003001 687c2c0001      push    offset 01002c7c

    01003006 50              push    eax

    01003007 ff11            call    dword ptr [ecx]


    // Call to Function2

    01002490 50              push    eax

    01002491 688c110001      push    offset 0100118c

    01002496 e82a020000      call    dbgex4!Function2 (010026c5)

    0100249b 59              pop     ecx

    0100249c 59              pop     ecx


    // Call to Function3

    0100248e 8bd0            mov     edx,eax

    01002490 8bcf            mov     ecx,edi

    01002492 e8aeffffff      call    dbgex4!Function3 (01002445)


    // Call to Function4

    00413586 8b450c          mov     eax,dword ptr [ebp+0Ch]

    00413589 50              push    eax

    0041358a 8b4d08          mov     ecx,dword ptr [ebp+8]

    0041358d 51              push    ecx

    0041358e 8b4dec          mov     ecx,dword ptr [ebp-14h]

    00413591 e86fdfffff      call    dbgex4!Function4 (00411505)


    // Call to Function5

    01003540 56              push    esi

    01003541 8d85d4f9ffff    lea     eax,[ebp-62Ch]

    01003547 50              push    eax

    01003548 ff1558100001    call    dword ptr [dbgex4!Function5 (01001058)]



    Bonus: describe the calling convention used for x64.



    [Update: our answer. Posted 12/18/2008]


    Function1 - COM (Stdcall for C++)


    Function2 - cdecl


    Function3 - fastcall


    Function4 - Native C++ (Thiscall)


    Function5 - Win32 (Stdcall)
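
    The reasoning behind these answers can be written down as a small decision procedure.  This is only a sketch for the two-DWORD-parameter functions in this exercise, not a general classifier: the CallSite struct and its fields are hypothetical, and each field records something you can read directly off the disassembly of the call site.

```c
#include <stdbool.h>

/* Hypothetical summary of what is visible at a call site. */
typedef struct {
    bool caller_cleans;    /* caller pops or adjusts esp after the call */
    bool args_in_ecx_edx;  /* arguments are loaded into ecx and edx     */
    bool this_in_ecx;      /* object pointer is loaded into ecx         */
    bool this_pushed_last; /* object pointer is the last value pushed   */
} CallSite;

/* Maps the observed traits to one of the conventions in the exercise. */
const char *classify(const CallSite *cs)
{
    if (cs->caller_cleans)    return "cdecl";
    if (cs->args_in_ecx_edx)  return "fastcall";
    if (cs->this_pushed_last) return "COM (stdcall for C++)";
    if (cs->this_in_ecx)      return "thiscall";
    return "stdcall";
}
```

    Function2 is the only call followed by stack cleanup (the two pop ecx instructions), Function3 passes both arguments in registers, Function1 pushes the object pointer in eax last and calls through a table read from that object, Function4 loads ecx immediately before the call, and Function5 shows none of these traits.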



    Bonus: describe the calling convention used for x64:  


