Hi again! Today I want to bring to your attention an upcoming series of posts on troubleshooting hangs and this post as a primer for understanding hangs and how we scope these scenarios.
Scoping is a practice we use in troubleshooting that helps us to quickly narrow down the domain or scope of a problem from the entire operating system or enterprise to a specific computer and component. This allows the elimination of millions of other possible problems or interactions.
Hangs are a common and can be a sometimes lengthy support request because of the mere nature of the problem, and just describing it can be difficult. “Okay, what do you mean, it’s hung?” By nature I mean that some internal architecture knowledge is necessary to discover what component of the application or OS is not doing what is should and thus leaving us with either an unresponsive user interface or service or both. So how do we isolate what is going on here?
We will cover the main buckets or symptoms and I will list these in increasing depth or dependency into the OS below, in other words, moving from the Application Layer down into the OS. But let’s scope first…the most important step!
Scoping the Hang
We can determine which bucket or symptom we are running into by testing increasing layers of the operating system (OSI stack). Meaning, what layer of the system is working and which ones are not. The heart of this is to determine “What IS working properly and what IS NOT?”
The following table outlines the layers and tools we usually use to determine their responsiveness.
Functional Layer to Test
Tools To Test
Basic hardware + Network driver + Bottom of the network stack
Does Ping work? Num Lock light on keyboard?
SMB over Tcp/ip + Kernel as Server Service runs in the system process)
Does Net view work?
Rpc over Tcp/ip
RPC? (Event Viewer, Remote Management, Event Log, or rpcping.exe)
For example, if a machine is reported “hung” and we can ping it, and net view does not work (when it normally would) we should conclude that the server side of that request failure in most likely in the Server Service or one of its sub components. This being the case it would not make sense to troubleshoot why myapplication.exe is hung on the same server if lower level things like the server service itself do not work which may be a direct dependency!
Tip: This is a scoping method we use in isolating all problems. Look at the interaction of applications, services, the OS, drivers, etc. in light of their dependencies. “Okay, A is failing not because of A but because B failed, because C failed, and aha here is root cause in D’s failure”. Testing dependencies can yield considerable time savings vs. debugging “through” the application. Another example, if an RPC dependent application stops working, testing RPC by using another RPC app might be the first thing to do vs. debugging the first app which could be very time consuming and require specific knowledge about that app.
Here are some common scoping questions to help think about the context of the issue which could also isolate the problem quickly.
Answering these simple questions may have obvious yet extremely helpful results.
For example, if the machine is reported as hung and the observation was just made through a Remote Desktop (RDP) session, is it responsive at the console? Let’s say it is responsive at the console, we must then conclude that only the Terminal Server Service layer or one of its unique dependencies (lower in the stack) is the problem vs. the entire server. Jump to Terminal Services specific troubleshooting, etc.
Common Hang Buckets or Symptoms
Using the above scoping usually leads to these main classes of hangs which we will cover in future posts:
1.) Specific Application Menu/Button/Function Hang
The application looks “OK” in that it will repaint if we drag another window over it; however, if we click on a menu item or send a key stroke whatever functionality associated with it does not…function.
2.) Application Window Hang “I’m not dead yet…just Not Responding”
The application stops responding entirely at the UI layer, meaning it no longer refreshes and dragging another window over the top does not repaint thus displays artifacts of other windows result.
3.) The Start Menu, Desktop, or the “Shell” is hung
So here we know that the Microsoft process responsible for these windows, explorer.exe, is hung.
4.) All Windows are Hung…but Task Manager comes up eventually
Here the mouse still moves, and if we hit Ctrl+Shift+Esc we can invoke task manager, or via Ctrl+Alt+Del. This may not be a true hang, but slow or unresponsive enough to qualify or be reported as a hang!
5.) There are No Windows!
I can move the Mouse and Keyboard but they don’t “do” anything and there’s just a blank desktop, no windows, it’s hung.
In this case it may be that the server appears hung interactively while specialized services like file sharing, mail server, etc. still actually function…impending doom?
6.) No Windows + No Mouse/Keyboard + but the machine is still running, well, sort of…
Obviously the most drastic of the symptoms leaving little recourse but a debug of the machine…which might be easier than it sounds!
The server may or may not be responsive remotely via services, etc.
In each of the upcoming posts expect to see for each symptom:
Scoping Steps (what works vs. what doesn’t in each scenario)
Specific Debug Steps
Please look forward to these installments in the New Year!
Desktop heap is probably not something that you spend a lot of time thinking about, which is a good thing. However, from time to time you may run into an issue that is caused by desktop heap exhaustion, and then it helps to know about this resource. Let me state up front that things have changed significantly in Vista around kernel address space, and much of what I’m talking about today does not apply to Vista.
Laying the groundwork: Session Space
To understand desktop heap, you first need to understand session space. Windows 2000, Windows XP, and Windows Server 2003 have a limited, but configurable, area of memory in kernel mode known as session space. A session represents a single user’s logon environment. Every process belongs to a session. On a Windows 2000 machine without Terminal Services installed, there is only a single session, and session space does not exist. On Windows XP and Windows Server 2003, session space always exists. The range of addresses known as session space is a virtual address range. This address range is mapped to the pages assigned to the current session. In this manner, all processes within a given session map session space to the same pages, but processes in another session map session space to a different set of pages.
Session space is divided into four areas: session image space, session structure, session view space, and session paged pool. Session image space loads a session-private copy of Win32k.sys modified data, a single global copy of win32k.sys code and unmodified data, and maps various other session drivers like video drivers, TS remote protocol driver, etc. The session structure holds various memory management (MM) control structures including the session working set list (WSL) information for the session. Session paged pool allows session specific paged pool allocations. Windows XP uses regular paged pool, since the number of remote desktop connections is limited. On the other hand, Windows Server 2003 makes allocations from session paged pool instead of regular paged pool if Terminal Services (application server mode) is installed. Session view space contains mapped views for the session, including desktop heap.
Session Space layout:
Session Image Space: win32k.sys, session drivers
Session Structure: MM structures and session WSL
Session View Space: session mapped views, including desktop heap
Session Paged Pool
Sessions, Window Stations, and Desktops
You’ve probably already guessed that desktop heap has something to do with desktops. Let’s take a minute to discuss desktops and how they relate to sessions and window stations. All Win32 processes require a desktop object under which to run. A desktop has a logical display surface and contains windows, menus, and hooks. Every desktop belongs to a window station. A window station is an object that contains a clipboard, a set of global atoms and a group of desktop objects. Only one window station per session is permitted to interact with the user. This window station is named "Winsta0." Every window station belongs to a session. Session 0 is the session where services run and typically represents the console (pre-Vista). Any other sessions (Session 1, Session 2, etc) are typically remote desktops / terminal server sessions, or sessions attached to the console via Fast User Switching. So to summarize, sessions contain one or more window stations, and window stations contain one or more desktops.
You can picture the relationship described above as a tree. Below is an example of this desktop tree on a typical system:
- Session 0
| ---- WinSta0 (interactive window station)
| | |
| | ---- Default (desktop)
| | ---- Disconnect (desktop)
| | ---- Winlogon (desktop)
| ---- Service-0x0-3e7$ (non-interactive window station)
| | ---- Default (desktop)
| ---- Service-0x0-3e4$ (non-interactive window station)
| ---- SAWinSta (non-interactive window station)
| | ---- SADesktop (desktop)
- Session 1
- Session 2
---- WinSta0 (interactive window station)
---- Default (desktop)
---- Disconnect (desktop)
---- Winlogon (desktop)
In the above tree, the full path to the SADesktop (as an example) can be represented as “Session 0\SAWinSta\SADesktop”.
Desktop Heap – what is it, what is it used for?
Every desktop object has a single desktop heap associated with it. The desktop heap stores certain user interface objects, such as windows, menus, and hooks. When an application requires a user interface object, functions within user32.dll are called to allocate those objects. If an application does not depend on user32.dll, it does not consume desktop heap. Let’s walk through a simple example of how an application can use desktop heap.
1. An application needs to create a window, so it calls CreateWindowEx in user32.dll.
2. User32.dll makes a system call into kernel mode and ends up in win32k.sys.
3. Win32k.sys allocates the window object from desktop heap
4. A handle to the window (an HWND) is returned to caller
5. The application and other processes in the same session can refer to the window object by its HWND value
Where things go wrong
Normally this “just works”, and neither the user nor the application developer need to worry about desktop heap usage. However, there are two primary scenarios in which failures related to desktop heap can occur:
So how do you know if you are running into these problems? Processes failing to start with a STATUS_DLL_INIT_FAILED (0xC0000142) error in user32.dll is a common symptom. Since user32.dll needs desktop heap to function, failure to initialize user32.dll upon process startup can be an indication of desktop heap exhaustion. Another symptom you may observe is a failure to create new windows. Depending on the application, any such failure may be handled in different ways. Note that if you are experiencing problem number one above, the symptoms would usually only exist in one session. If you are seeing problem two, then the symptoms would be limited to processes that use the particular desktop heap that is exhausted.
Diagnosing the problem
So how can you know for sure that desktop heap exhaustion is your problem? This can be approached in a variety of ways, but I’m going to discuss the simplest method for now. Dheapmon is a command line tool that will dump out the desktop heap usage for all the desktops in a given session. See our first blog post for a list of tool download locations. Once you have dheapmon installed, be sure to run it from the session where you think you are running out of desktop heap. For instance, if you have problems with services failing to start, then you’ll need to run dheapmon from session 0, not a terminal server session.
Dheapmon output looks something like this:
Desktop Heap Information Monitor Tool (Version 7.0.2727.0)
Copyright (c) 2003-2004 Microsoft Corp.
Session ID: 0 Total Desktop: ( 5824 KB - 8 desktops)
WinStation\Desktop Heap Size(KB) Used Rate(%)
WinSta0\Default 3072 5.7
WinSta0\Disconnect 64 4.0
WinSta0\Winlogon 128 8.7
Service-0x0-3e7$\Default 512 15.1
Service-0x0-3e4$\Default 512 5.1
Service-0x0-3e5$\Default 512 1.1
SAWinSta\SADesktop 512 0.4
__X78B95_89_IW\__A8D9S1_42_ID 512 0.4
As you can see in the example above, each desktop heap size is specified, as is the percentage of usage. If any one of the desktop heaps becomes too full, allocations within that desktop will fail. If the cumulative heap size of all the desktops approaches the total size of session view space, then new desktops cannot be created within that session. Both of the failure scenarios described above depend on two factors: the total size of session view space, and the size of each desktop heap allocation. Both of these sizes are configurable.
Configuring the size of Session View Space
Session view space size is configurable using the SessionViewSize registry value. This is a REG_DWORD and the size is specified in megabytes. Note that the values listed below are specific to 32-bit x86 systems not booted with /3GB. A reboot is required for this change to take effect. The value should be specified under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
Size if no registry value configured
Default registry value
Windows 2000 *
Windows Server 2003
* Settings for Windows 2000 are with Terminal Services enabled and hotfix 318942 installed. Without the Terminal Services installed, session space does not exist, and desktop heap allocations are made from a fixed 48 MB region for system mapped views. Without hotfix 318942 installed, the size of session view space is fixed at 20 MB.
The sum of the sizes of session view space and session paged pool has a theoretical maximum of slightly under 500 MB for 32-bit operating systems. The maximum varies based on RAM and various other registry values. In practice the maximum value is around 450 MB for most configurations. When the above values are increased, it will result in the virtual address space reduction of any combination of nonpaged pool, system PTEs, system cache, or paged pool.
Configuring the size of individual desktop heaps
Configuring the size of the individual desktop heaps is bit more complex. Speaking in terms of desktop heap size, there are three possibilities:
· The desktop belongs to an interactive window station and is a “Disconnect” or “Winlogon” desktop, so its heap size is fixed at 64KB or 128 KB, respectively (for 32-bit x86)
· The desktop heap belongs to an interactive window station, and is not one of the above desktops. This desktop’s heap size is configurable.
· The desktop heap belongs to a non-interactive window station. This desktop’s heap size is also configurable.
The size of each desktop heap allocation is controlled by the following registry value:
The default data for this registry value will look something like the following (all on one line):
SharedSection=1024,3072,512 Windows=On SubSystemType=Windows
The numeric values following "SharedSection=" control how desktop heap is allocated. These SharedSection values are specified in kilobytes.
The first SharedSection value (1024) is the shared heap size common to all desktops. This memory is not a desktop heap allocation, and the value should not be modified to address desktop heap problems.
The second SharedSection value (3072) is the size of the desktop heap for each desktop that is associated with an interactive window station, with the exception of the “Disconnect” and “Winlogon” desktops.
The third SharedSection value (512) is the size of the desktop heap for each desktop that is associated with a "non-interactive" window station. If this value is not present, the size of the desktop heap for non-interactive window stations will be same as the size specified for interactive window stations (the second SharedSection value).
Consider the two desktop heap exhaustion scenarios described above. If the first scenario is encountered (session view space is exhausted), and most of the desktop heaps are non-interactive, then the third SharedSection can be decreased in an effort to allow more (smaller) non-interactive desktop heaps to be created. Of course, this may not be an option if the processes using the non-interactive heaps require a full 512 KB. If the second scenario is encountered (a single desktop heap allocation is full), then the second or third SharedSection value can be increased to allow each desktop heap to be larger than 3072 or 512 KB. A potential problem with this is that fewer total desktop heaps can be created.
What are all these window stations and desktops in Session 0 anyway?
Now that we know how to tweak the sizes of session view space and the various desktops, it is worth talking about why you have so many window stations and desktops, particularly in session 0. First off, you’ll find that every WinSta0 (interactive window station) has at least 3 desktops, and each of these desktops uses various amounts of desktop heap. I’ve alluded to this previously, but to recap, the three desktops for each interactive window stations are:
· Default desktop - desktop heap size is configurable as described below
· Disconnect desktop - desktop heap size is 64k on 32-bit systems
· Winlogon desktop - desktop heap size is 128k on 32-bit systems
Note that there can potentially be more desktops in WinSta0 as well, since any process can call CreateDesktop and create new desktops.
Let’s move on to the desktops associated with non-interactive window stations: these are usually related to a service. The system creates a window station in which service processes that run under the LocalSystem account are started. This window station is named service-0x0-3e7$. It is named for the LUID for the LocalSystem account, and contains a single desktop that is named Default. However, service processes that run as LocalSystem interactive start in Winsta0 so that they can interact with the user in Session 0 (but still run in the LocalSystem context).
Any service process that starts under an explicit user or service account has a window station and desktop created for it by service control manager, unless a window station for its LUID already exists. These window stations are non-interactive window stations. The window station name is based on the LUID, which is unique for every logon. If an entity (other than System) logs on multiple times, a new window station is created for each logon. An example window station name is “service-0x0-22e1$”.
A common desktop heap issue occurs on systems that have a very large number of services. This can be a large number of unique services, or one (poorly designed, IMHO) service that installs itself multiple times. If the services all run under the LocalSystem account, then the desktop heap for Session 0\Service-0x0-3e7$\Default may become exhausted. If the services all run under another user account which logs on multiples times, each time acquiring a new LUID, there will be a new desktop heap created for every instance of the service, and session view space will eventually become exhausted.
Given what you now know about how service processes use window stations and desktops, you can use this knowledge to avoid desktop heap issues. For instance, if you are running out of desktop heap for the Session 0\Service-0x0-3e7$\Default desktop, you may be able to move some of the services to a new window station and desktop by changing the user account that the service runs under.
I hope you found this post interesting and useful for solving those desktop heap issues! If you have questions are comments, please let us know.
- Matthew Justice
[Update: 7/5/2007 - Desktop Heap, part 2 has been posted]
[Update: 9/13/2007 - Talkback video: Desktop Heap has been posted]
[Update: 3/20/2008 - The default interactive desktop heap size has been increased on 32-bit Vista SP1]