• Ntdebugging Blog

    Desktop Heap Overview

    • 99 Comments

     

    Desktop heap is probably not something that you spend a lot of time thinking about, which is a good thing.  However, from time to time you may run into an issue that is caused by desktop heap exhaustion, and then it helps to know about this resource.  Let me state up front that things have changed significantly in Vista around kernel address space, and much of what I’m talking about today does not apply to Vista.

     

    Laying the groundwork: Session Space

    To understand desktop heap, you first need to understand session space.  Windows 2000, Windows XP, and Windows Server 2003 have a limited, but configurable, area of memory in kernel mode known as session space.  A session represents a single user’s logon environment.  Every process belongs to a session.  On a Windows 2000 machine without Terminal Services installed, there is only a single session, and session space does not exist.  On Windows XP and Windows Server 2003, session space always exists.  The range of addresses known as session space is a virtual address range.  This address range is mapped to the pages assigned to the current session.  In this manner, all processes within a given session map session space to the same pages, but processes in another session map session space to a different set of pages. 

    Session space is divided into four areas: session image space, session structure, session view space, and session paged pool.  Session image space loads a session-private copy of Win32k.sys modified data, a single global copy of win32k.sys code and unmodified data, and maps various other session drivers like video drivers, TS remote protocol driver, etc.  The session structure holds various memory management (MM) control structures including the session working set list (WSL) information for the session.  Session paged pool allows session specific paged pool allocations.  Windows XP uses regular paged pool, since the number of remote desktop connections is limited.  On the other hand, Windows Server 2003 makes allocations from session paged pool instead of regular paged pool if Terminal Services (application server mode) is installed.  Session view space contains mapped views for the session, including desktop heap. 

    Session Space layout:

    Session Image Space: win32k.sys, session drivers

    Session Structure: MM structures and session WSL

    Session View Space: session mapped views, including desktop heap

    Session Paged Pool

     

    Sessions, Window Stations, and Desktops

    You’ve probably already guessed that desktop heap has something to do with desktops.  Let’s take a minute to discuss desktops and how they relate to sessions and window stations.  All Win32 processes require a desktop object under which to run.  A desktop has a logical display surface and contains windows, menus, and hooks.  Every desktop belongs to a window station.  A window station is an object that contains a clipboard, a set of global atoms and a group of desktop objects.  Only one window station per session is permitted to interact with the user. This window station is named "Winsta0."  Every window station belongs to a session.  Session 0 is the session where services run and typically represents the console (pre-Vista).  Any other sessions (Session 1, Session 2, etc) are typically remote desktops / terminal server sessions, or sessions attached to the console via Fast User Switching.  So to summarize, sessions contain one or more window stations, and window stations contain one or more desktops.

    You can picture the relationship described above as a tree.  Below is an example of this desktop tree on a typical system:

    - Session 0

    |   |

    |   ---- WinSta0           (interactive window station)

    |   |      |

    |   |      ---- Default    (desktop)

    |   |      |

    |   |      ---- Disconnect (desktop)

    |   |      |

    |   |      ---- Winlogon   (desktop)

    |   |

    |   ---- Service-0x0-3e7$  (non-interactive window station)

    |   |      |

    |   |      ---- Default    (desktop)

    |   |

    |   ---- Service-0x0-3e4$  (non-interactive window station)

    |   |      |

    |   |      ---- Default    (desktop)

    |   |

    |   ---- SAWinSta          (non-interactive window station)

    |   |      |

    |   |      ---- SADesktop  (desktop)

    |   |

    - Session 1

    |   |

    |   ---- WinSta0           (interactive window station)

    |   |      |

    |   |      ---- Default    (desktop)

    |   |      |

    |   |      ---- Disconnect (desktop)

    |   |      |

    |   |      ---- Winlogon   (desktop)

    |   |

    - Session 2

        |

        ---- WinSta0           (interactive window station)

               |

               ---- Default    (desktop)

               |

               ---- Disconnect (desktop)

               |

               ---- Winlogon   (desktop)

     

    In the above tree, the full path to the SADesktop (as an example) can be represented as “Session 0\SAWinSta\SADesktop”.

     

    Desktop Heap – what is it, what is it used for?

    Every desktop object has a single desktop heap associated with it.  The desktop heap stores certain user interface objects, such as windows, menus, and hooks.  When an application requires a user interface object, functions within user32.dll are called to allocate those objects.  If an application does not depend on user32.dll, it does not consume desktop heap.  Let’s walk through a simple example of how an application can use desktop heap. 

    1.     An application needs to create a window, so it calls CreateWindowEx in user32.dll.

    2.     User32.dll makes a system call into kernel mode and ends up in win32k.sys.

    3.     Win32k.sys allocates the window object from desktop heap

    4.     A handle to the window (an HWND) is returned to caller

    5.     The application and other processes in the same session can refer to the window object by its HWND value

     

    Where things go wrong

    Normally this “just works”, and neither the user nor the application developer need to worry about desktop heap usage.  However, there are two primary scenarios in which failures related to desktop heap can occur:

    1. Session view space for a given session can become fully utilized, so it is impossible for a new desktop heap to be created.
    2. An existing desktop heap allocation can become fully utilized, so it is impossible for threads that use that desktop to use more desktop heap.

     

    So how do you know if you are running into these problems?  Processes failing to start with a STATUS_DLL_INIT_FAILED (0xC0000142) error in user32.dll is a common symptom.  Since user32.dll needs desktop heap to function, failure to initialize user32.dll upon process startup can be an indication of desktop heap exhaustion.  Another symptom you may observe is a failure to create new windows.  Depending on the application, any such failure may be handled in different ways.  Note that if you are experiencing problem number one above, the symptoms would usually only exist in one session.  If you are seeing problem two, then the symptoms would be limited to processes that use the particular desktop heap that is exhausted.

     

    Diagnosing the problem

    So how can you know for sure that desktop heap exhaustion is your problem?  This can be approached in a variety of ways, but I’m going to discuss the simplest method for now.  Dheapmon is a command line tool that will dump out the desktop heap usage for all the desktops in a given session.  See our first blog post for a list of tool download locations.  Once you have dheapmon installed, be sure to run it from the session where you think you are running out of desktop heap.  For instance, if you have problems with services failing to start, then you’ll need to run dheapmon from session 0, not a terminal server session.

    Dheapmon output looks something like this:

    Desktop Heap Information Monitor Tool (Version 7.0.2727.0)

    Copyright (c) 2003-2004 Microsoft Corp.

    -------------------------------------------------------------

      Session ID:    0 Total Desktop: (  5824 KB -    8 desktops)

     

      WinStation\Desktop            Heap Size(KB)    Used Rate(%)

    -------------------------------------------------------------

      WinSta0\Default                    3072              5.7

      WinSta0\Disconnect                   64              4.0

      WinSta0\Winlogon                    128              8.7

      Service-0x0-3e7$\Default            512             15.1

      Service-0x0-3e4$\Default            512              5.1

      Service-0x0-3e5$\Default            512              1.1

      SAWinSta\SADesktop                  512              0.4

      __X78B95_89_IW\__A8D9S1_42_ID       512              0.4

    -------------------------------------------------------------

     

    As you can see in the example above, each desktop heap size is specified, as is the percentage of usage.  If any one of the desktop heaps becomes too full, allocations within that desktop will fail.  If the cumulative heap size of all the desktops approaches the total size of session view space, then new desktops cannot be created within that session.  Both of the failure scenarios described above depend on two factors: the total size of session view space, and the size of each desktop heap allocation.  Both of these sizes are configurable. 

     

    Configuring the size of Session View Space

    Session view space size is configurable using the SessionViewSize registry value.  This is a REG_DWORD and the size is specified in megabytes.  Note that the values listed below are specific to 32-bit x86 systems not booted with /3GB.  A reboot is required for this change to take effect.  The value should be specified under:

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management

    OS

    Size if no registry value configured

    Default registry value

    Windows 2000 *

    20 MB

    none

    Windows XP

    20 MB

    48 MB

    Windows Server 2003

    20 MB

    48 MB

    * Settings for Windows 2000 are with Terminal Services enabled and hotfix 318942 installed.  Without the Terminal Services installed, session space does not exist, and desktop heap allocations are made from a fixed 48 MB region for system mapped views.  Without hotfix 318942 installed, the size of session view space is fixed at 20 MB.

    The sum of the sizes of session view space and session paged pool has a theoretical maximum of slightly under 500 MB for 32-bit operating systems.  The maximum varies based on RAM and various other registry values.  In practice the maximum value is around 450 MB for most configurations.  When the above values are increased, it will result in the virtual address space reduction of any combination of nonpaged pool, system PTEs, system cache, or paged pool.

     

    Configuring the size of individual desktop heaps

    Configuring the size of the individual desktop heaps is bit more complex.  Speaking in terms of desktop heap size, there are three possibilities:

    ·         The desktop belongs to an interactive window station and is a “Disconnect” or “Winlogon” desktop, so its heap size is fixed at 64KB or 128 KB, respectively (for 32-bit x86)

    ·         The desktop heap belongs to an interactive window station, and is not one of the above desktops.  This desktop’s heap size is configurable.

    ·         The desktop heap belongs to a non-interactive window station.  This desktop’s heap size is also configurable.

     

    The size of each desktop heap allocation is controlled by the following registry value:

                HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows

     

     The default data for this registry value will look something like the following (all on one line):

                   %SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows

                   SharedSection=1024,3072,512 Windows=On SubSystemType=Windows

                   ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3

                   ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off

                   MaxRequestThreads=16

                                                               

     

    The numeric values following "SharedSection=" control how desktop heap is allocated. These SharedSection values are specified in kilobytes.

    The first SharedSection value (1024) is the shared heap size common to all desktops. This memory is not a desktop heap allocation, and the value should not be modified to address desktop heap problems.

    The second SharedSection value (3072) is the size of the desktop heap for each desktop that is associated with an interactive window station, with the exception of the “Disconnect” and “Winlogon” desktops.

    The third SharedSection value (512) is the size of the desktop heap for each desktop that is associated with a "non-interactive" window station. If this value is not present, the size of the desktop heap for non-interactive window stations will be same as the size specified for interactive window stations (the second SharedSection value). 

    Consider the two desktop heap exhaustion scenarios described above.  If the first scenario is encountered (session view space is exhausted), and most of the desktop heaps are non-interactive, then the third SharedSection can be decreased in an effort to allow more (smaller) non-interactive desktop heaps to be created.  Of course, this may not be an option if the processes using the non-interactive heaps require a full 512 KB.  If the second scenario is encountered (a single desktop heap allocation is full), then the second or third SharedSection value can be increased to allow each desktop heap to be larger than 3072 or 512 KB.  A potential problem with this is that fewer total desktop heaps can be created.

     

    What are all these window stations and desktops in Session 0 anyway?

    Now that we know how to tweak the sizes of session view space and the various desktops, it is worth talking about why you have so many window stations and desktops, particularly in session 0.  First off, you’ll find that every WinSta0 (interactive window station) has at least 3 desktops, and each of these desktops uses various amounts of desktop heap.  I’ve alluded to this previously, but to recap, the three desktops for each interactive window stations are:

    ·         Default desktop - desktop heap size is configurable as described below

    ·         Disconnect desktop - desktop heap size is 64k on 32-bit systems

    ·         Winlogon desktop - desktop heap size is 128k on 32-bit systems

     

    Note that there can potentially be more desktops in WinSta0 as well, since any process can call CreateDesktop and create new desktops.

    Let’s move on to the desktops associated with non-interactive window stations: these are usually related to a service.  The system creates a window station in which service processes that run under the LocalSystem account are started. This window station is named service-0x0-3e7$. It is named for the LUID for the LocalSystem account, and contains a single desktop that is named Default. However, service processes that run as LocalSystem interactive start in Winsta0 so that they can interact with the user in Session 0 (but still run in the LocalSystem context).

    Any service process that starts under an explicit user or service account has a window station and desktop created for it by service control manager, unless a window station for its LUID already exists. These window stations are non-interactive window stations.  The window station name is based on the LUID, which is unique for every logon.  If an entity (other than System) logs on multiple times, a new window station is created for each logon.  An example window station name is “service-0x0-22e1$”.

    A common desktop heap issue occurs on systems that have a very large number of services.  This can be a large number of unique services, or one (poorly designed, IMHO) service that installs itself multiple times.  If the services all run under the LocalSystem account, then the desktop heap for Session 0\Service-0x0-3e7$\Default may become exhausted.  If the services all run under another user account which logs on multiples times, each time acquiring a new LUID, there will be a new desktop heap created for every instance of the service, and session view space will eventually become exhausted.

    Given what you now know about how service processes use window stations and desktops, you can use this knowledge to avoid desktop heap issues.  For instance, if you are running out of desktop heap for the Session 0\Service-0x0-3e7$\Default desktop, you may be able to move some of the services to a new window station and desktop by changing the user account that the service runs under.

     

    Wrapping up

    I hope you found this post interesting and useful for solving those desktop heap issues!  If you have questions are comments, please let us know.

     

    - Matthew Justice

     

    [Update: 7/5/2007 - Desktop Heap, part 2 has been posted]

    [Update: 9/13/2007 - Talkback video: Desktop Heap has been posted]

    [Update: 3/20/2008 - The default interactive desktop heap size has been increased on 32-bit Vista SP1]

     

  • Ntdebugging Blog

    Debug Fundamentals Exercise 2: Some reverse engineering for Thanksgiving

    • 42 Comments

     

    Continuing our series on “Fundamentals Exercises”, we have some more reverse engineering for you!  Again, these exercises are designed more as learning experiences rather than simply puzzlers.  We hope you find them interesting and educational!  Feel free to post your responses here, but we won’t put them on the site until after we post the “official” responses, to avoid spoilers.

     

    Examine the following code, registers, and stack values to determine the following:

     

    1.       What is the return value from DemoFunction2?

    2.       What is the purpose of DemoFunction2?

    3.       Bonus: Both the last exercise and this week’s exercise involved accessing data at ebp+8.  Why ebp+8?

     

    Hints:

    1.       You probably don’t want to manually walk through every instruction that executes in the loop.  Instead, walk through a few iterations to determine the intent of the code.

    2.       The bracket notation [] in the assembly means to treat the value in brackets as a memory address, and access the value at that address.

    3.       32-bit integer return values are stored in eax

     

     

    0:000> uf 010024d0

    asmdemo2!DemoFunction2:

    010024d0 55              push    ebp

    010024d1 8bec            mov     ebp,esp

    010024d3 8b5508          mov     edx,dword ptr [ebp+8]

    010024d6 33c0            xor     eax,eax

    010024d8 b920000000      mov     ecx,20h

    010024dd d1ea            shr     edx,1

    010024df 7301            jnc     asmdemo2!DemoFunction2+0x12 (010024e2)

    010024e1 40              inc     eax

    010024e2 e2f9            loop    asmdemo2!DemoFunction2+0xd (010024dd)

    010024e4 5d              pop     ebp

    010024e5 c3              ret

     

    0:000> r

    eax=80002418 ebx=7ffd7000 ecx=00682295 edx=00000000 esi=80002418 edi=00000002

    eip=010024d0 esp=0006fe98 ebp=0006fea8 iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    asmdemo2!DemoFunction2:

    010024d0 55              push    ebp

     

    0:000> dps esp

    0006fe98  0100251c asmdemo2!main+0x20

    0006fe9c  80002418

    0006fea0  00000002

    0006fea4  00000000

    0006fea8  0006ff88

    0006feac  01002969 asmdemo2!_mainCRTStartup+0x12c

    0006feb0  00000002

    0006feb4  00682270

    0006feb8  006822b8

    0006febc  f395c17d

    0006fec0  00000000

    0006fec4  00000000

    0006fec8  7ffd7000

    0006fecc  00000000

    0006fed0  00000000

    0006fed4  00000000

    0006fed8  00000094

    0006fedc  00000006

    0006fee0  00000000

    0006fee4  00001771

    0006fee8  00000002

    0006feec  76726553

    0006fef0  20656369

    0006fef4  6b636150

    0006fef8  00003120

    0006fefc  00000000

    0006ff00  00000000

    0006ff04  00000000

    0006ff08  00000000

    0006ff0c  00000000

    0006ff10  00000000

    0006ff14  00000000

     


    [Update: our answer. Posted 12/04/2008]

    We had a great response to this exercise!  It was good to see so many of you going through this.  There were some readers that found this a good exercise for beginners, and others were looking for a return to Puzzlers.  As an FYI, we may do more Puzzlers in the future, but for now we are going to continue on the “Fundamentals Exercise” track to help all our readers build up a solid foundation for debugging.

     

    It was interesting to read how several of you not only gave the answers, but made suggestions for how the code could be optimized!  I want to point out that the code we post for these exercises isn’t intended to be the optimal solution; it is written as a learning tool.   That said, keep that in-depth feedback coming; I think everyone will benefit from a discussion of optimization.

     

    Answers to exercise 2:

     

    1. DemoFunction2 returns 5, which is the number of bits set in 80002418, the value at ebp+8.
    2. DemoFunction2 finds the hamming weight of the 32-bit value passed to the function (it returns the count of bits that are equal to 1).
    3. ebp points to the base of the stack frame (the value stored there points to the previous frame's ebp), ebp+4 points to the return address, and ebp+8 points to the first parameter passed to the function.  Note that the value of ebp changes in the function prolog, at instruction 010024d1.  At this point ebp is set to 0006fe94, so at instruction 010024d3, ebp+8 is 0006fe9c, and [ebp+8] = 80002418.
  • Ntdebugging Blog

    Understanding Pool Consumption and Event ID: 2020 or 2019

    • 41 Comments

     

    Hi!  My name is Tate.  I’m an Escalation Engineer on the Microsoft Critical Problem Resolution Platforms Team.  I wanted to share one of the most common errors we troubleshoot here on the CPR team, its root cause being pool consumption, and the methods by which we can remedy it quickly!

     

    This issue is commonly misdiagnosed, however, 90% of the time it is actually quite possible to determine the resolution quickly without any serious effort at all!

     

     

    First, what do these events really mean?

     

    Event ID 2020
    Event Type: Error
    Event Source: Srv
    Event Category: None
    Event ID: 2020
    Description:
    The server was unable to allocate from the system paged pool because the pool was empty.

     

    Event ID 2019
    Event Type: Error
    Event Source: Srv
    Event Category: None
    Event ID: 2019
    Description:
    The server was unable to allocate from the system NonPaged pool because the pool was empty.

     

     

    This is our friend the Server Service reporting that when it was trying to satisfy a request, it was not able to find enough free memory of the respective type of pool.  2020 indicates Paged Pool and 2019, NonPaged Pool.  This doesn’t mean that the Server Service (srv.sys) is broken or the root cause of the problem, more often rather it is the first component to see the resource problem and report it to the Event Log.  Thus, there could be (and usually are) a few more symptoms of pool exhaustion on the system such as hangs, or out of resource errors reported by drivers or applications, or all of the above!

     

     

    What is Pool?

     

    First, Pool is not the amount of RAM on the system, it is however a segment of the virtual memory or address space that Windows reserves on boot.  These pools are finite considering address space itself is finite.  So, because 32bit(x86) machines can address 2^32==4Gigs, Windows uses (by default) 2GB for applications and 2GB for kernel.  Of the 2GB for kernel there are other things we must fit in our 2GB such as Page Table Entries (PTEs) and as such the maximum amount of Paged Pool for 32bit(x86) of ~460MB puts this in perspective in terms of our realistic limits per processor architecture.  As this implies, 64bit(x64&ia64) machines have less of a problem here due to their larger address space but there are still limits and thus no free lunch.

     

    *For more about determining current pool limits see the common question post “Why am I out of Paged Pool at ~200MB…” at the end of this post.

     

    *For more info about pools:  About Memory Management > Memory Pools

    *This has changed a bit for Vista, see Dynamic Kernel Address space

     

     

    What are these pools used for?

     

    These pools are used by either the kernel directly, indirectly by its support of various structures due to application requests on the system (CreateFile for example), or drivers installed on the system for their memory allocations made via the kernel pool allocation functions.

     

    Literally, NonPaged means that this memory when allocated will not be paged to disk and thus resident at all times, which is an important feature for drivers.  Paged conversely, can be, well… paged out to disk.  In the end though, all this memory is allocated through a common set of functions, most common is ExAllocatePoolWithTag.

     

     

    Ok, so what is using it/abusing it? (our goal right!?)

     

    Now that we know that the culprit is Windows or a component shipping with Windows, a driver, or an application requesting lots of things that the kernel has to create on its behalf, how can we find out which?

     

    There are really four basic methods that are typically used (listing in order of increasing difficulty)

     

    1.)    Find By Handle Count

     

    Handle Count?  Yes, considering that we know that an application can request something of the OS that it must then in turn create and provide a reference to…this is typically represented by a handle, and thus charged to the process’ total handle count!

     

    The quickest way by far if the machine is not completely hung is to check this via Task Manager.  Ctrl+Shift+Esc…Processes Tab…View…Select Columns…Handle Count.  Sort on Handles column now and check to see if there is a significantly large one there (this information is also obtainable via Perfmon.exe, Process Explorer, Handle.exe, etc.).

     

    What’s large?  Well, typically we should raise an eyebrow at anything over 5,000 or so.  Now that’s not to say that over this amount is inherently bad, just know that there is no free lunch and that a handle to something usually means that on the other end there is a corresponding object stored in NonPaged or Paged Pool which takes up memory.

     

    So for example let’s say we have a process that has 100,000 handles, mybadapp.exe.  What do we do next?

     

    Well, if it’s a service we could stop it (which releases the handles) or if an application running interactively, try to shut it down and look to see how much total Kernel Memory (Paged or NonPaged depending on which one we are short of) we get back.  If we were at 400MB of Paged Pool (Look at Performance Tab…Kernel Memory…Paged) and after stopping mybadapp.exe with its 100,000 handles are now at a reasonable 100MB, well there’s our bad guy and following up with the owner or further investigating (Process Explorer from sysinternals or the Windows debugger for example) what type of handles are being consumed would be the next step.

     

    Tip: 

    For essential yet legacy applications, which there is no hope of replacing or obtaining support, we may consider setting up a performance monitor alert on the handle count when it hits a couple thousand or so (Performance Object: Process, Counter: Handle Count) and taking action to restart the bad service.  This is a less than elegant solution for sure but it could keep the one rotten apple from spoiling the bunch by hanging/crashing the machine!

     

    2.)    By Pooltag (as read by poolmon.exe)

     

    Okay, so no handle count gone wild? No problem.

     

    For Windows 2003 and later machines, a feature is enabled by default that allows tracking of the pool consumer via something called a pooltag.  For previous OS’s we will need to use a utility such as gflags.exe to Enable Pool Tagging (which requires a reboot unfortunately).  This is usually just a 3-4 character string or more technically “a character literal of up to four characters delimited by single quotation marks” that the caller of the kernel api to allocate the pool will provide as its 3rd parameter.  (see ExAllocatePoolWithTag)

     

    The tool that we use to get the information about what pooltag is using the most is poolmon.exe.  Launch this from a cmd prompt, hit B to sort by bytes descending and P to sort the list by the type (Paged, NonPaged, or Both) and we have a live view into what’s going on in the system.  Look specifically at the Tag Name and its respective Byte Total column for the guilty party!  Get Poolmon.exe Here  or More info about poolmon.exe usage. 

     

    The cool thing is that we have most of the OS utilized pooltags already documented so we have an idea if there is a match for one of the Windows components in pooltag.txt.  So if we see MmSt as the top tag for instance consuming far and away the largest amount, we can look at pooltag.txt and know that it’s the memory manager and also using that tag in a search engine query we might get the more popular KB304101 which may resolve the issue!

     

    We will find pooltag.txt in the ...\Debugging Tools for Windows\triage folder when the debugging tools are installed.

     

    Oh no, what if it’s not in the list? No problem…

     

    We might be able to find its owner by using one of the following techniques:

     

    • For 32-bit versions of Windows, use poolmon /c to create a local tag file that lists each tag value assigned by drivers on the local machine (%SystemRoot%\System32\Drivers\*.sys). The default name of this file is Localtag.txt.

     

    Really all versions---->• For Windows 2000 and Windows NT 4.0, use Search to find files that contain a specific pool tag, as described in KB298102, How to Find Pool Tags That Are Used By Third-Party Drivers.

    From:  http://www.microsoft.com/whdc/driver/tips/PoolMem.mspx

     

     

     

    3.)    Using Driver Verifier

     

    Using driver verifier is a more advanced approach to this problem.  Driver Verifier provides a whole suite of options targeted mainly at the driver developer to run what amounts to quality control checks before shipping their driver.

     

    However, should pooltag identification be a problem, there is a facility here in Pool Tracking that does the heavy lifting in that it will do the matching of Pool consumer directly to driver!

     

    Be careful however, the only option we will likely want to check is Pool Tracking as the other settings are potentially costly enough that if our installed driver set is not perfect on the machine we could get into an un-bootable situation with constant bluescreens notifying that xyz driver is doing abc bad thing and some follow up suggestions.

     

    In summary, Driver Verifier is a powerful tool at our disposal but use with care only after the easier methods do not resolve our pool problems.

     

    4.)    Via Debug (live and postmortem)

     

    As mentioned earlier the api being used here to allocate this pool memory is usually ExAllocatePoolWithTag.  If we have a kernel debugger setup we can set a break point here to brute force debug who our caller is….but that’s not usually how we do it, can you say, “extended downtime?”  There are other creative live debug methods with are a bit more advanced that we may post later…

     

    Usually, debugging this problem involves a post mortem memory.dmp taken from a hung server or a machine that has experienced Event ID:  2020 or Event ID 2019 or is no longer responsive to client requests, hung, or often both.  We can gather this dump via the Ctrl+Scroll Lock method see KB244139 , even while the machine is “hung” and seemingly unresponsive to the keyboard or Ctrl+Alt+Del !

     

    When loading the memory.dmp via windbg.exe or kd.exe we can quickly get a feel for the state of the machine with the following commands.

     

    Debugger output Example 1.1  (the !vm command)

     

    2: kd> !vm 
    *** Virtual Memory Usage ***
      Physical Memory:   262012   ( 1048048 Kb)
      Page File: \??\C:\pagefile.sys
         Current:   1054720Kb Free Space:    706752Kb
         Minimum:   1054720Kb Maximum:      1054720Kb
      Page File: \??\E:\pagefile.sys
         Current:   2490368Kb Free Space:   2137172Kb
         Minimum:   2490368Kb Maximum:      2560000Kb
      Available Pages:    63440   (  253760 Kb)
      ResAvail Pages:    194301   (  777204 Kb)
      Modified Pages:       761   (    3044 Kb)
      NonPaged Pool Usage: 52461   (  209844 Kb)<<NOTE!  Value is near NonPaged Max
      NonPaged Pool Max:   54278   (  217112 Kb)
      ********** Excessive NonPaged Pool Usage *****

     

    Note how the NonPaged Pool Usage value is near the NonPaged Pool Max value.  This tells us that we are basically out of NonPaged Pool.

     

    Here we can use the !poolused command to give the same information that poolmon.exe would have but in the dump….

     

    Debugger output Example 1.2  (!poolused 2)

     

    Note the 2 value passed to !poolused orders pool consumers by NonPaged

     

    2: kd> !poolused 2
       Sorting by NonPaged Pool Consumed
      Pool Used:
                NonPaged            Paged
    Tag    Allocs     Used    Allocs     Used
    Thre   120145 76892800         0        0
    File   187113 29946176         0        0
    AfdE    89683 25828704         0        0
    TCPT    41888 18765824         0        0
    AfdC    90964 17465088         0        0 

     

    We now see the “Thre” tag at the top of the list, the largest consumer of NonPaged Pool, let’s go look it up in pooltag.txt….

     

    Thre - nt!ps        - Thread objects

     

    Note, the nt before the ! means that this is NT or the kernel’s tag for Thread objects.

    So from our earlier discussion if we have a bunch of thread objects, I probably have an application on the system with a ton of handles and or a ton of Threads so it should be easy to find!

     

    Via the debugger we can find this out easily via the !process 0 0 command which will show the TableSize (Handle Count) of over 90,000!

     

    Debugger output Example 1.3  (the !process command continued)

     

    Note the two zeros after !process separated by a space gives a list of all running processes on the system.

     

     

    PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
    DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472
    Image: mybadapp.exe

     

    We can dig further here into looking at the threads…

     

    Debugger output Example 1.4  (the !process command continued)

     

    0: kd> !PROCESS 884e6520 4
    PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
    DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472.
    Image: mybadapp.exe
            THREAD 884d8560  Cid 1a0.19c  Teb: 7ffde000  Win32Thread: a208f648 WAIT
            THREAD 88447560  Cid 1a0.1b0  Teb: 7ffdd000  Win32Thread: 00000000 WAIT
            THREAD 88396560  Cid 1a0.1b4  Teb: 7ffdc000  Win32Thread: 00000000 WAIT
            THREAD 88361560  Cid 1a0.1bc  Teb: 7ffda000  Win32Thread: 00000000 WAIT
            THREAD 88335560  Cid 1a0.1c0  Teb: 7ffd9000  Win32Thread: 00000000 WAIT
            THREAD 88340560  Cid 1a0.1c4  Teb: 7ffd8000  Win32Thread: 00000000 WAIT
     And the list goes on…

     

    We can examine the thread via !thread 88340560 from here and so on…

     

    So in this rudimentary example the offender is clear in mybadapp.exe in its abundance of threads and one could dig further to determine what type of thread or functions are being executed and follow up with the owner of this executable for more detail, or take a look at the code if the application is yours!

     

     

     

    Common Question:

     

    Why am I out of Paged Pool at ~200MB when we say that the limit is around 460MB?

     

    This is because the memory manager at boot decided that given the current amount of RAM on the system and other memory manager settings such as /3GB, etc. that our max is X amount vs. the maximum.  There are two ways to see the maximum’s on a system.

     

    1.)   Process Explorer using its Task Management.  View…System Information…Kernel Memory section.

     

    Note that we have to specify a valid path to dbghelp.dll and Symbols path via Options…Configure Symbols.

     

    For example:

     

          Dbghelp.dll path:

    c:\<path to debugging tools for windows>\dbghelp.dll

     

    Symbols path:

    SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols

     

    2.)The debugger (live or via a memory.dmp by doing a !vm)

     

    *NonPaged pool size is not configurable other than the /3GB boot.ini switch which lowers NonPaged Pool’s maximum.

    128MB with the /3GB switch, 256MB without

     

    Conversely, Paged Pool size is often able to be raised to around its maximum manually via the PagedPoolSize registry setting which we can find for example in KB304101.

     

     

    So what is this Pool Paged Bytes counter I see in Perfmon for the Process Object?

     

    This is when the allocation is charged to a process via ExAllocatePoolWithQuotaTag.  Typically, we will see ExAlloatePoolWithTag used and thus this counter is less effective…but hey…don’t pass up free information in Perfmon so be on the lookout for this easy win.

     

     

    Additional Resources:

     

     “Who's Using the Pool?” from Driver Fundamentals > Tips: What Every Driver Writer Needs to Know

    http://www.microsoft.com/whdc/driver/tips/PoolMem.mspx

     

    Poolmon Remarks:  http://technet2.microsoft.com/WindowsServer/en/library/85b0ba3b-936e-49f0-b1f2-8c8cb4637b0f1033.mspx

     

     

     

     

     I hope you have enjoyed this post and hopefully it will get you going in the right direction next time you see one of these events or hit a pool consumption issue!

     

    -Tate

     

  • Ntdebugging Blog

    Debug Fundamentals Exercise 1: Reverse engineer a function

    • 38 Comments

     

    Hello ntdebuggers!  We’ve seen a lot of interest in our Puzzlers, and we’ve also seen requests and interest in topics covering debugging fundamentals.  So we’ve decided to combine the two topics and post a series of “Fundamentals Exercises”.  These exercises will be designed more as learning experiences rather than simply puzzlers.  We hope you find them interesting and educational!

     

    Feel free to post your responses here, but we won’t put them on the site until after we post the “official” responses, so as to avoid spoilers.

     

     

    Examine the following code, registers, and stack values to determine the following:

     

    1.       When the function “DoTheWork” returns, what is the return value from that function?

    2.       Bonus: what is the mathematical operation that “DoTheWork” performs?

     

    Hints:

    1.       The bracket notation [] in the assembly means to treat the value in brackets as a memory address, and access the value at that address.

    2.       32-bit integer return values are stored in eax

     

     

    // Code

    0:000> uf eip

    demo2!DoTheWork:

    0040101c 55              push    ebp

    0040101d 8bec            mov     ebp,esp

    0040101f 8b4d08          mov     ecx,dword ptr [ebp+8]

    00401022 8bc1            mov     eax,ecx

    00401024 49              dec     ecx

    00401025 0fafc1          imul    eax,ecx

    00401028 83f902          cmp     ecx,2

    0040102b 7ff7            jg      demo2!DoTheWork+0x8 (00401024)

    0040102d 5d              pop     ebp

    0040102e c3              ret

     

    // Current register state

    0:000> r

    eax=00000007 ebx=7ffd9000 ecx=ffffffff edx=00000007 esi=00001771 edi=00000000

    eip=0040101c esp=0012fe9c ebp=0012feac iopl=0         nv up ei pl nz na po nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202

    demo2!DoTheWork:

    0040101c 55              push    ebp

     

    // Current stack values for this thread

    0:000> dps esp

    0012fe9c  00406717 demo2!main+0x27

    0012fea0  00000007

    0012fea4  82059a87

    0012fea8  00000007

    0012feac  0012ff88

    0012feb0  004012b2 demo2!mainCRTStartup+0x170

    0012feb4  00000002

    0012feb8  00980e48

    0012febc  00980e80

    0012fec0  00000094

    0012fec4  00000006

    0012fec8  00000000

    0012fecc  00001771

    0012fed0  00000002

    0012fed4  76726553

    0012fed8  20656369

    0012fedc  6b636150

    0012fee0  00003120

    0012fee4  00000000

    0012fee8  00000000

    0012feec  00000000

    0012fef0  00000000

    0012fef4  00000000

    0012fef8  00000000

    0012fefc  00000000

    0012ff00  00000000

    0012ff04  00000000

    0012ff08  00000000

    0012ff0c  00000000

    0012ff10  00000000

    0012ff14  00000000

    0012ff18  00000000

     


    [Update: our answer. Posted 11/19/2008]

    Wow - what a great response from our readers on this exercise!  It is great to see the various approaches to reverse engineering this code.  As for the answer, the numerical result (stored in eax) is 5040, and the corresponding mathematical operation is a factorial.  So 7! is the result calculated, given that 7 was passed to the function.  Congratulations to all of you that got it right! 

     

    Many of you posted some C code to give an idea of what the original source of DoTheWork() might have looked like.  The original source was actually written in assembly!  However, it was written to be called from C, and it uses ebp in the same way that compiled C code might.  This function wasn’t written with optimal performance in mind, but rather for learning about reverse engineering.

     

     

  • Ntdebugging Blog

    NTDebugging Puzzler 0x00000003 (Matrix Edition) Some assembly required.

    • 37 Comments


    Hello NTdebuggers, I'm very impressed with the depth of the answers we are seeing from our readers.  As I stated in last week's response, this week's puzzler is going to be harder.  With that said let's take it up a notch.  One of the things that is really cool about be an Escalation Engineer in GES/CPR is how far we go in the pursuit of solving complex problems.  If we're debugging some Microsoft code in a kernel dump or user mode, and our quest takes us into a binary that we don't have code or symbols for, we don't stop, we forge on!  Over the years there are members of our team that have had to port to or support Alpha, PowerPC, MIPs, IA64 and x64, myself included.  As a result most of us have books for just about every mainstream processor under the sun.  It's a good idea if you're going to be debugging on these platforms to have general working knowledge the CPUs you will encounter.  The most common CPU's we deal with are x86, followed by x64 and IA64.  Microsoft doesn't support PPC, MIPS or Alpha anymore unless you're dealing with Xbox consoles, and those are PPC.  That said, this week's challenge is to tell us what the following assembly does.  You can tell us in C, or break it down and comment on the various sections. 


    Some people like cross word puzzles, Most of us in GES/CPR love to esreveR reenignE assembler. Have FUN!

     

    “I don’t even see the code anymore”  Cypher...
    
    0:000> uf myfun
    puzzler3!myfun [c:\source\puzzler\puzzler3\puzzler3\puzzler3.cpp @ 20]:
       20 00cc1480 55              push    ebp
       20 00cc1481 8bec            mov     ebp,esp
       20 00cc1483 81ecf0000000    sub     esp,0F0h
       20 00cc1489 53              push    ebx
       20 00cc148a 56              push    esi
       20 00cc148b 57              push    edi
       20 00cc148c 8dbd10ffffff    lea     edi,[ebp-0F0h]
       20 00cc1492 b93c000000      mov     ecx,3Ch
       20 00cc1497 b8cccccccc      mov     eax,0CCCCCCCCh
       20 00cc149c f3ab            rep stos dword ptr es:[edi]
       26 00cc149e 8b4508          mov     eax,dword ptr [ebp+8]
       26 00cc14a1 50              push    eax
       26 00cc14a2 e803fcffff      call    puzzler3!ILT+165(_strlen) (00cc10aa)
       26 00cc14a7 83c404          add     esp,4
       26 00cc14aa 8945e0          mov     dword ptr [ebp-20h],eax
       28 00cc14ad 8b45e0          mov     eax,dword ptr [ebp-20h]
       28 00cc14b0 8945f8          mov     dword ptr [ebp-8],eax
       28 00cc14b3 eb09            jmp     puzzler3!myfun+0x3e (00cc14be)
       28 00cc14b5 8b45f8          mov     eax,dword ptr [ebp-8]
       28 00cc14b8 83e801          sub     eax,1
       28 00cc14bb 8945f8          mov     dword ptr [ebp-8],eax
       28 00cc14be 837df800        cmp     dword ptr [ebp-8],0
       28 00cc14c2 7e60            jle     puzzler3!myfun+0xa4 (00cc1524)
       30 00cc14c4 c745ec00000000  mov     dword ptr [ebp-14h],0
       30 00cc14cb eb09            jmp     puzzler3!myfun+0x56 (00cc14d6)
       30 00cc14cd 8b45ec          mov     eax,dword ptr [ebp-14h]
       30 00cc14d0 83c001          add     eax,1
       30 00cc14d3 8945ec          mov     dword ptr [ebp-14h],eax
       30 00cc14d6 8b45f8          mov     eax,dword ptr [ebp-8]
       30 00cc14d9 83e801          sub     eax,1
       30 00cc14dc 3945ec          cmp     dword ptr [ebp-14h],eax
       30 00cc14df 7d41            jge     puzzler3!myfun+0xa2 (00cc1522)
       32 00cc14e1 8b4508          mov     eax,dword ptr [ebp+8]
       32 00cc14e4 0345ec          add     eax,dword ptr [ebp-14h]
       32 00cc14e7 0fbe08          movsx   ecx,byte ptr [eax]
       32 00cc14ea 8b5508          mov     edx,dword ptr [ebp+8]
       32 00cc14ed 0355ec          add     edx,dword ptr [ebp-14h]
       32 00cc14f0 0fbe4201        movsx   eax,byte ptr [edx+1]
       32 00cc14f4 3bc8            cmp     ecx,eax
       32 00cc14f6 7e28            jle     puzzler3!myfun+0xa0 (00cc1520)
       34 00cc14f8 8b4508          mov     eax,dword ptr [ebp+8]
       34 00cc14fb 0345ec          add     eax,dword ptr [ebp-14h]
       34 00cc14fe 8a08            mov     cl,byte ptr [eax]
       34 00cc1500 884dd7          mov     byte ptr [ebp-29h],cl
       35 00cc1503 8b4508          mov     eax,dword ptr [ebp+8]
       35 00cc1506 0345ec          add     eax,dword ptr [ebp-14h]
       35 00cc1509 8b4d08          mov     ecx,dword ptr [ebp+8]
       35 00cc150c 034dec          add     ecx,dword ptr [ebp-14h]
       35 00cc150f 8a5101          mov     dl,byte ptr [ecx+1]
       35 00cc1512 8810            mov     byte ptr [eax],dl
       36 00cc1514 8b4508          mov     eax,dword ptr [ebp+8]
       36 00cc1517 0345ec          add     eax,dword ptr [ebp-14h]
       36 00cc151a 8a4dd7          mov     cl,byte ptr [ebp-29h]
       36 00cc151d 884801          mov     byte ptr [eax+1],cl
       38 00cc1520 ebab            jmp     puzzler3!myfun+0x4d (00cc14cd)
       40 00cc1522 eb91            jmp     puzzler3!myfun+0x35 (00cc14b5)
       41 00cc1524 5f              pop     edi
       41 00cc1525 5e              pop     esi
       41 00cc1526 5b              pop     ebx
       41 00cc1527 81c4f0000000    add     esp,0F0h
       41 00cc152d 3bec            cmp     ebp,esp
       41 00cc152f e820fcffff      call    puzzler3!ILT+335(__RTC_CheckEsp) (00cc1154)
       41 00cc1534 8be5            mov     esp,ebp
       41 00cc1536 5d              pop     ebp
       41 00cc1537 c3              ret
    
    

    Good luck, and happy debugging.

    Jeff Dailey-




    In response:  Wow, you folks did it again. I was worried that not many of our readers would respond.  Our entire team was very impressed with the number and quality of the responses we saw.  Congratulations goes out to all those assembler gurus out there that figured out this was a simple bubble sort.  We enjoyed seeing how various people went about solving this.  Some people compiled their code as they worked on reversing the function to verify the assembler.  This is a good approach.  Others just seemed to work it out end to end.  This is the approach I usually end up using because I’m typically in the middle of a debug and don’t actually need the source.

     

    Great work!

     

    Here is the answer….

     

     

    void myfun(char *val)

    {

    00321480  push        ebp 

    00321481  mov         ebp,esp

    00321483  sub         esp,0F0h

    00321489  push        ebx 

    0032148A  push        esi 

    0032148B  push        edi 

    0032148C  lea         edi,[ebp-0F0h]

    00321492  mov         ecx,3Ch

    00321497  mov         eax,0CCCCCCCCh

    0032149C  rep stos    dword ptr es:[edi]

           int i;

           int j;

           int len;

           char t;

     

           len=strlen(val);

    0032149E  mov         eax,dword ptr [val]

    003214A1  push        eax 

    003214A2  call        @ILT+165(_strlen) (3210AAh)

    003214A7  add         esp,4

    003214AA  mov         dword ptr [len],eax

     

           for (i=len;i>0;i--)

    003214AD  mov         eax,dword ptr [len]

    003214B0  mov         dword ptr [i],eax

    003214B3  jmp         myfun+3Eh (3214BEh)

    003214B5  mov         eax,dword ptr [i]

    003214B8  sub         eax,1

    003214BB  mov         dword ptr [i],eax

    003214BE  cmp         dword ptr [i],0

    003214C2  jle         myfun+0A4h (321524h)

           {

                  for(j=0;j<i-1;j++)

    003214C4  mov         dword ptr [j],0

    003214CB  jmp         myfun+56h (3214D6h)

    003214CD  mov         eax,dword ptr [j]

    003214D0  add         eax,1

    003214D3  mov         dword ptr [j],eax

    003214D6  mov         eax,dword ptr [i]

    003214D9  sub         eax,1

    003214DC  cmp         dword ptr [j],eax

    003214DF  jge         myfun+0A2h (321522h)

                  {

                         if (val[j]>val[j+1])

    003214E1  mov         eax,dword ptr [val]

    003214E4  add         eax,dword ptr [j]

    003214E7  movsx       ecx,byte ptr [eax]

    003214EA  mov         edx,dword ptr [val]

    003214ED  add         edx,dword ptr [j]

    003214F0  movsx       eax,byte ptr [edx+1]

    003214F4  cmp         ecx,eax

    003214F6  jle         myfun+0A0h (321520h)

                         {

                               t=val[j];

    003214F8  mov         eax,dword ptr [val]

    003214FB  add         eax,dword ptr [j]

    003214FE  mov         cl,byte ptr [eax]

    00321500  mov         byte ptr [t],cl

                               val[j]=val[j+1];

    00321503  mov         eax,dword ptr [val]

    00321506  add         eax,dword ptr [j]

    00321509  mov         ecx,dword ptr [val]

    0032150C  add         ecx,dword ptr [j]

    0032150F  mov         dl,byte ptr [ecx+1]

    00321512  mov         byte ptr [eax],dl

                               val[j+1]=t;

    00321514  mov         eax,dword ptr [val]

    00321517  add         eax,dword ptr [j]

    0032151A  mov         cl,byte ptr [t]

    0032151D  mov         byte ptr [eax+1],cl

                         }

                  }

    00321520  jmp         myfun+4Dh (3214CDh)

          

           }

    00321522  jmp         myfun+35h (3214B5h)

    }

    00321524  pop         edi 

    00321525  pop         esi 

    00321526  pop         ebx 

    00321527  add         esp,0F0h

    0032152D  cmp         ebp,esp

    0032152F  call        @ILT+335(__RTC_CheckEsp) (321154h)

    00321534  mov         esp,ebp

    00321536  pop         ebp 

    00321537  ret         

     

    Thank You

    Jeff Dailey-

  • Ntdebugging Blog

    Too Much Cache?

    • 27 Comments

    Cache is used to reduce the performance impact when accessing data that resides on slower storage media.  Without it your PC would crawl along and become nearly unusable.  If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page.  If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page.  Access to physical RAM is about 1 million times faster than to a hard drive.  It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous.  Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source). 

     

    Since we are limited with how much RAM we can stick in a box, we have to make the most of it.  We have to share this crucial physical resource with all running processes, the kernel and the file system cache.  You can read more about how this works here:

    http://blogs.msdn.com/ntdebugging/archive/2007/10/10/the-memory-shell-game.aspx

     

    The file system cache resides in kernel address space.  It is used to buffer access to the much slower hard drive.  The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand.  The file system cache operates like a process working set.  You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter.  This value will only show you the system cache's current working set.  Once a page is removed from the cache's working set it is placed on the standby list.  You should consider the standby pages from the cache manager as a part of your file cache.  You can also consider these standby pages to be available pages.  This is what the pre-Vista Task Manager does.  Most of what you see as available pages is probably standby pages for the system cache.  Once again, you can read more about this in "The Memory Shell Game" post.

     

    Too Much Cache is a Bad Thing

    The memory manager works on a demand based algorithm.  Physical pages are given to where the current demand is.  If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them and send them to help meet the growing demand.  Just like any process, the system file cache can consume physical memory if there is sufficient demand. 

    Having a lot of cache is generally not a bad thing, but if it is at the expense of other processes it can be detrimental to system performance.  There are two different ways this can occur - read and write I/O.

     

    Excessive Cached Write I/O

    Applications and services can dump lots of write I/O to files through the system file cache.  The system cache's working set will grow as it buffers this write I/O.  System threads will start flushing these dirty pages to disk.  Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache.  At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager.  It does this to prevent applications from overtaking physical RAM with write I/O.  There are however, some isolated scenarios where this throttle doesn't work as well as we would expect.  This could be due to bad applications or drivers or not having enough memory.  Fortunately, we can tune the amount of dirty pages allowed before the system starts throttling cached write I/O.  This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739

     

    Excessive Cached Read I/O

    While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache.  If an application or driver opens many files and actively reads from them continuously through the cache manager, then the memory manger will move more physical pages to the cache manager.  If this demand continues to grow, the cache manager can grow to consume physical memory and other process (with less memory demand) will get paged out to disk.  This read I/O demand may be legitimate or may be due to poor application scalability.  The memory manager doesn't know if the demand is due to bad behavior or not, so pages are moved simply because there is demand for it.  On a 32 bit system, the file system cache working set is essentially limited to 1 GB.  This is the maximum size that we blocked off in the kernel for the system cache working set.  Since most systems have more than 1 GB of physical RAM today, having the system cache working set consume physical RAM with read I/O is less likely. 

    This scenario; however, is more prevalent on 64 bit systems.  With the increase in pointer length, the kernel's address space is greatly expanded.  The system cache's working set limit can and typically does exceed how much memory is installed in the system.  It is much easier for applications and drivers to load up the system cache with read I/O.  If the demand is sustained, the system cache's working set can grow to consume physical memory.  This will push out other process and kernel resources out to the page file and can be very detrimental to system performance.

    Fortunately we can also tune the server for this scenario.  We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize().  We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically.  I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs.  The source code can be compiled using the Windows DDK, or you can use the included binaries.  The 32 bit version is limited to setting the cache working set to a maximum of 4 GB.  The 64 bit version does not have this limitation.  The sample code and included binaries are completely unsupported.  It is just a quick and dirty implementation with little error handling.

  • Ntdebugging Blog

    Desktop Heap, part 2

    • 21 Comments

     

    Matthew here again – I want to provide some follow-up information on desktop heap.   In the first post I didn’t discuss the size of desktop heap related memory ranges on 64-bit Windows, 3GB, or Vista.   So without further ado, here are the relevant sizes on various platforms...

     

     

    Windows XP (32-bit)

     

    ·         48 MB = SessionViewSize (default registry value, set for XP Professional, x86)

    ·         20 MB = SessionViewSize (if no registry value is defined) 

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Server 2003 (32-bit)

     

    ·         48 MB = SessionViewSize (default registry value)

    ·         20 MB = SessionViewSize (if no registry value is defined; this is the default for Terminal Servers)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Server 2003 booted with 3GB (32-bit)

     

    ·         20 MB = SessionViewSize (registry value has no effect)

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

    You may also see reduced heap sizes when running 3GB.  During the initialization of the window manager, an attempt is made to reserve enough session view space to accommodate the expected number of desktops heaps for a given session.  If the heap sizes specified in the SharedSection registry value have been increased, the attempt to reserve session view space may fail.  When this happens, the window manager falls back to a pair of “safe” sizes for desktop heaps (512KB for interactive, 128KB for non-interactive) and tries to reserve session space again, using these smaller numbers.  This ensures that even if the registry values are too large for the 20MB session view space, the system will still be able to boot.

     

     

     

    Windows Server 2003 (64-bit)

     

    ·         104 MB = SessionViewSize (if no registry value is defined; which is the default) 

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size

     

     

     

    Windows Vista RTM (32-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         3072 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size

     

     

     

    Windows Vista (64-bit) and Windows Server 2008 (64-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         20 MB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         768 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         192 KB = Winlogon desktop heap size

    ·         96 KB = Disconnect desktop heap size

     

     

    Windows Vista SP1 (32-bit) and Windows Server 2008 (32-bit)

     

    ·         Session View space is now a dynamic kernel address range.  The SessionViewSize registry value is no longer used.

    ·         12288 KB = Interactive desktop heap size (defined in the registry, SharedSection 2nd value)

    ·         512 KB = Non-interactive desktop heap size (defined in the registry, SharedSection 3nd value)

    ·         128 KB = Winlogon desktop heap size

    ·         64 KB = Disconnect desktop heap size 

     

     

    Windows Vista introduced a new public API function: CreateDesktopEx, which allows the caller to specify the size of desktop heap.

    Additionally, GetUserObjectInformation now includes a new flag for retrieving the desktop heap size (UOI_HEAPSIZE).

     

     

    [Update: 3/20/2008 - Added Vista SP1 and Server 2008 info.  More information can be found here.]

  • Ntdebugging Blog

    NTDebugging Puzzler 0x00000007: Interlocked functions

    • 19 Comments

     

    Today, we will have some fun with interlocked functions.

     

    The following section of code is reentrant.  A “well meaning” developer used interlocked functions to avoid serializing on a global table lock.

     

    Initial smoke testing shows that the code works fine.  Sometimes things are not as they appear when doing initial code review.  After several hours of heavy stress testing, the developer finds the machine has bugchecked.  Analysis of the dump showed that the caller of this function had steamrolled through nonpaged pool writing stacks on top of everyone’s pool memory.

     

    The goal of today’s puzzler is to find the bug and describe how it could be fixed with minimal timing impact.

     

    Here are a few details before you begin.

     

    1.       The variable gIndex is an unsigned long global.

    2.       The gLogArray memory was allocated from nonpaged pool and the size of this allocation is correct.

     

    Ready?

     

    00:   PLOG_ENTRY GetNextLogEntry ()

          {

    01:         ULONG IterationCount = MAX_RECORDS;

     

    02:         PLOG_ENTRY pEntry;

     

    03:         do

                {

    04:               if (InterlockedCompareExchange(&gIndex, 0, MAX_RECORDS) == MAX_RECORDS)

     

    05:                     pEntry = &gLogArray[0];

     

    06:               else

     

    07:                     pEntry = &gLogArray[InterlockedIncrement(&gIndex)];

     

    08:               --IterationCount;

     

    09:         } while(InterlockedCompareExchange(&pEntry->Active, 1, 0) != 0 && (IterationCount > 0));

     

    0a:         return (IterationCount > 0) ? pEntry : NULL;

          }

     

     

     

    Happy hunting,

     

    Dennis Middleton “The NTFS Doctor”

     


      [Update: our answer. Posted 6/10/2008]

     

    Thanks for all the great posts!  I saw many interesting answers, and a few unique solutions that I hadn’t considered.

     

    Bug Description

     

    A slight time gap exists between the InterlockedCompareExchange and the InterlockedIncrement.  For this reason, several threads could pass the check for MAX_RECORDS prior to doing the increment.

     

    Assume that N is the number of threads that pass the check for MAX_RECORDS while gIndex is at a particular value.

    If N or more threads are able to pass the check while gIndex is equal to MAX_RECORDS-(N-1), then gIndex would be incremented beyond MAX_RECORDS.

     

    For example, let’s assume that 3 threads passed the check while gIndex was at MAX_RECORDS-2.  Then after the three increments occur, gIndex would be equal to MAX_RECORDS+1.  From that point, invalid pointers would be passed out to the caller.

     

    Possible Solutions

     

    There are several ways to solve this problem.  Some are more efficient than others.  I would avoid doing checks for MAX_RECORDS-1, MAX_RECORDS, or MAX_RECORDS+1 (interlocked or not) since there could potentially be more than two threads involved in the race condition.  Such a solution would only reduce the likelihood of an overflow.

     

    There were a few posts suggesting a lock or critical section for the section of code between the compare and increment.  That would be one solution, but it would also do away with the benefits of using interlocked functions.

     

    In keeping with the philosophy of keeping the code fast and simple, here’s a solution that gives a very good result with minimal impact.

     

    1.       I removed the if () {} else {} and simply allowed gIndex to increment unchecked.  With this change, gIndex can approach its max unsigned long value and possibly wrap - we need to keep the array index in check.

    2.       The modulus operator (added to line 4 below) will divide the incremented gIndex by MAX_RECORDS and use the remainder as the array index.  When dividing, the resultant remainder is always smaller than the divisor (MAX_RECORDS).  For this reason, the array index is guaranteed to be smaller than MAX_RECORDS.  As even multiples of MAX_RECORDS are reached, the array index resets back to zero mathemagically and no interlocked compare is even necessary.

     

     

    00:   PLOG_ENTRY GetNextLogEntry ()

          {

    01:         ULONG IterationCount = MAX_RECORDS;

     

    02:         PLOG_ENTRY pEntry;

     

    03:         do

                {

     

    04:               pEntry = &gLogArray[InterlockedIncrement(&gIndex) % MAX_RECORDS];

     

    05:               --IterationCount;

     

    06:         } while(InterlockedCompareExchange(&pEntry->Active, 1, 0) != 0 && (IterationCount > 0));

     

    07:         return (IterationCount > 0) ? pEntry : NULL;

          }

     

     

    With the fix in place, the code is smaller, faster, easier to read, and most of all - the bug is gone.  When developing code, always try to think small.             

     

    Best Regards,

    NTFS Doctor

     

     

     





  • Ntdebugging Blog

    Debug Fundamentals Exercise 3: Calling conventions

    • 19 Comments

     

    Today’s exercise will focus on x86 function calling conventions.  The calling convention of a function describes the following:

     

    ·         The order in which parameters are passed

    ·         Where parameters are placed (pushed on the stack or placed in registers)

    ·         Whether the caller or the callee is responsible for unwinding the stack on return

     

    While debugging, an understanding of calling conventions is helpful when you need to determine why certain values are placed in registers or on the stack before a function call.

     

    Standard x86 calling convention on Windows:

    Name

    Arguments

    Unwinds stack

    Win32 (Stdcall)

    pushed onto stack from right to left

    callee

    Native C++ (Thiscall)

    pushed onto stack from right to left, "this" pointer in ecx

    callee

    COM (Stdcall for C++)

    pushed onto stack from right to left, then "this" is pushed

    callee

    Fastcall

    arg1 in ecx, arg2 in edx, remaining args pushed onto stack from right to left

    callee

    Cdecl

    pushed onto stack from right to left

    caller

     

     

    Question:

    Below are calls to 5 functions.  Each function takes two DWORD parameters.  Based on the code that calls each function, identify the calling convention used.

     

    // Call to Function1

    01002ffe 8b08            mov     ecx,dword ptr [eax]

    01003000 53              push    ebx

    01003001 687c2c0001      push    offset 01002c7c

    01003006 50              push    eax

    01003007 ff11            call    dword ptr [ecx]

     

    // Call to Function2

    01002490 50              push    eax

    01002491 688c110001      push    offset 0100118c

    01002496 e82a020000      call    dbgex4!Function2 (010026c5)

    0100249b 59              pop     ecx

    0100249c 59              pop     ecx

     

    // Call to Function3

    0100248e 8bd0            mov     edx,eax

    01002490 8bcf            mov     ecx,edi

    01002492 e8aeffffff      call    dbgex4!Function3 (01002445)

     

    // Call to Function4

    00413586 8b450c          mov     eax,dword ptr [ebp+0Ch]

    00413589 50              push    eax

    0041358a 8b4d08          mov     ecx,dword ptr [ebp+8]

    0041358d 51              push    ecx

    0041358e 8b4dec          mov     ecx,dword ptr [ebp-14h]

    00413591 e86fdfffff      call    dbgex4!Function4 (00411505)

     

    // Call to Function5

    01003540 56              push    esi

    01003541 8d85d4f9ffff    lea     eax,[ebp-62Ch]

    01003547 50              push    eax

    01003548 ff1558100001    call    dbgex4!Function5 (01001058)]

     

     

    Bonus: describe the calling convention used for x64.

     

     


    [Update: our answer. Posted 12/18/2008]

     

    Function1 - COM (Stdcall for C++)

     

    Function2 - cdecl

     

    Function3 - fastcall

     

    Function4 - Native C++ (Thiscall)

     

    Function5 - Win32 (Stdcall)

     

     

    Bonus: describe the calling convention used for x64:  

    http://msdn.microsoft.com/en-us/library/ms794533.aspx

     

  • Ntdebugging Blog

    Debug puzzler 0x00000002 “Attack of the crazy stack”

    • 17 Comments

    Hi NTDebuggers, I have another puzzler for you.  We started crash2.exe under windbg and it crashed.  Go figure!  Sometimes we have a very limited amount of data available to figure out what went wrong.  That being said, this week’s puzzler only gives you a few clues.    Given this week’s debugger output, what do you suspect the problem is?  What would you do to further isolate the issue or prove your theory?

     

    If there is more data you need to solve it, post a comment / request and I will provide the data for you.  We will post all comments during the week and provide our answer on Friday.  We look forward to your comments.

     

    CommandLine: crash2.exe

    Symbol search path is: srv*C:\symbols*\\symbols\symbols

    Executable search path is:

    ModLoad: 00400000 00438000   crash2.exe

    ModLoad: 779b0000 77b00000   ntdll.dll

    ModLoad: 76180000 76290000   C:\Windows\syswow64\kernel32.dll

    (15d0.1688): Break instruction exception - code 80000003 (first chance)

    eax=00000000 ebx=00000000 ecx=cd7b0000 edx=00000000 esi=fffffffe edi=77a90094

    eip=779c0004 esp=0017faf8 ebp=0017fb28 iopl=0         nv up ei pl zr na pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    ntdll!DbgBreakPoint:

    779c0004 cc              int     3

     

    0:000> g

     

    (15d0.1688): Access violation - code c0000005 (first chance)

    First chance exceptions are reported before any exception handling.

    This exception may be expected and handled.

    eax=0017feec ebx=7619140f ecx=0042ecc8 edx=00000000 esi=00000002 edi=00001770

    eip=65732074 esp=0017ff00 ebp=6f207473 iopl=0         nv up ei pl nz ac pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010216

    65732074 ??              ???

    0:000> k 123

     

    ChildEBP RetAddr 

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    0017fefc 66692065 0x65732074

    0017ff3c 0041b9a3 0x66692065

    *** WARNING: Unable to verify checksum for crash2.exe

    0017ffa0 762019f1 crash2!_onexit+0x35

    0017ffac 77a2d109 kernel32!BaseThreadInitThunk+0xe

    0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23

     

    0:000> lm

    start    end        module name

    00400000 00438000   crash2   C (private pdb symbols)  D:\CPRRAMP\source\crash2\debug\crash2.pdb

    76180000 76290000   kernel32   (private pdb symbols)  C:\symbols\wkernel32.pdb\20F7BB5ED22344A2910B27CA7252AE792\wkernel32.pdb

    779b0000 77b00000   ntdll      (private pdb symbols)  C:\symbols\wntdll.pdb\7099E4B6A6984FD08CBC90A4EDD40FD12\wntdll.pdb

     

    0:000> db 66692065

    66692065  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692075  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692085  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692095  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920a5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920b5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920c5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920d5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    0:000> db 0x65732074

    65732074  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    65732084  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    65732094  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    657320a4  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    657320b4  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    657320c4  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    657320d4  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    657320e4  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    0:000> db 0041b9a3

    0041b9a3  c3 e8 42 57 ff ff c3 cc-cc cc cc cc cc cc cc cc  ..BW............

    0041b9b3  cc cc cc cc cc cc ff 74-24 04 e8 42 5c ff ff f7  .......t$..B\...

    0041b9c3  d8 1b c0 f7 d8 59 48 c3-cc cc cc cc 8b 44 24 04  .....YH......D$.

    0041b9d3  a3 78 3c 43 00 a3 7c 3c-43 00 a3 80 3c 43 00 a3  .x<C..|<C...<C..

    0041b9e3  84 3c 43 00 c3 cc cc cc-cc cc cc 6a 10 68 58 16  .<C........j.hX.

    0041b9f3  43 00 e8 cc 5d ff ff 33-ff 57 e8 0f 59 ff ff 59  C...]..3.W..Y..Y

    0041ba03  89 7d fc 39 7d 08 75 1c-be 78 3c 43 00 ff 35 78  .}.9}.u..x<C..5x

    0041ba13  3c 43 00 e8 19 5e ff ff-89 45 e4 c7 45 e0 02 00  <C...^...E..E...

    0:000> db 0x66692065

    66692065  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692075  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692085  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    66692095  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920a5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920b5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920c5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

    666920d5  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????

     

    0:000> dd esp

     

    0017ff00  66692065 72756f20 61747320 69206b63

    0017ff10  616c2073 72656772 6f6e6520 54686775

    0017ff20  20736968 6d207369 2065726f 61746164

    0017ff30  726f6620 00737520 6f76e0ae fffffffe

    0017ff40  0041b9a3 0041b9c2 00411181 004146d9

    0017ff50  0017ffa0 00412ac4 00000001 002620a0

    0017ff60  002620d8 6f220936 00000000 00000000

    0017ff70  7efde000 0017ff9c 00000000 00000006

     

    0:000> dds esp

     

    0017ff00  66692065

    0017ff04  72756f20

    0017ff08  61747320

    0017ff0c  69206b63

    0017ff10  616c2073

    0017ff14  72656772

    0017ff18  6f6e6520

    0017ff1c  54686775

    0017ff20  20736968

    0017ff24  6d207369

    0017ff28  2065726f

    0017ff2c  61746164

    0017ff30  726f6620

    0017ff34  00737520

    0017ff38  6f76e0ae

    0017ff3c  fffffffe

    0017ff40  0041b9a3 crash2!_onexit+0x35

    0017ff44  0041b9c2 crash2!atexit+0x9

    0017ff48  00411181 crash2!ILT+380(__RTC_Terminate)

    0017ff4c  004146d9 crash2!_cinit+0x49

    0017ff50  0017ffa0

    0017ff54  00412ac4 crash2!__tmainCRTStartup+0x15e

    0017ff58  00000001

    0017ff5c  002620a0

    0017ff60  002620d8

    0017ff64  6f220936

    0017ff68  00000000

    0017ff6c  00000000

    0017ff70  7efde000

    0017ff74  0017ff9c

    0017ff78  00000000

    0017ff7c  00000006

     

     

    Good luck and happy debugging!

     

    Jeff-


    [Update: more debugger output, per reader request. Posted 4/17/2008]

    Note this cashes without windbg also.

     

    CommandLine: crash2.exe

    Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols

    Executable search path is:

    ModLoad: 00400000 00438000   crash2.exe

    ModLoad: 777b0000 77900000   ntdll.dll

    ModLoad: 75a10000 75b20000   C:\Windows\syswow64\kernel32.dll

    (1324.16d0): Break instruction exception - code 80000003 (first chance)

    eax=00000000 ebx=00000000 ecx=73fe0000 edx=00000000 esi=fffffffe edi=77890094

    eip=777c0004 esp=0017faf8 ebp=0017fb28 iopl=0         nv up ei pl zr na pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246

    ntdll!DbgBreakPoint:

    777c0004 cc              int     3

    0:000> g

    (1324.16d0): Access violation - code c0000005 (first chance)

    First chance exceptions are reported before any exception handling.

    This exception may be expected and handled.

    eax=0017feec ebx=75a2140f ecx=0042ecc8 edx=00000000 esi=00000002 edi=00001770

    eip=65732074 esp=0017ff00 ebp=6f207473 iopl=0         nv up ei pl nz ac pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010216

    65732074 ??              ???

    0:000> db esp

    0017ff00  65 20 69 66 20 6f 75 72-20 73 74 61 63 6b 20 69  e if our stack i

    0017ff10  73 20 6c 61 72 67 65 72-20 65 6e 6f 75 67 68 54  s larger enoughT

    0017ff20  68 69 73 20 69 73 20 6d-6f 72 65 20 64 61 74 61  his is more data

    0017ff30  20 66 6f 72 20 75 73 00-9b 7d 6b d3 fe ff ff ff   for us..}k.....

    0017ff40  a3 b9 41 00 c2 b9 41 00-81 11 41 00 d9 46 41 00  ..A...A...A..FA.

    0017ff50  a0 ff 17 00 c4 2a 41 00-01 00 00 00 98 21 03 00  .....*A......!..

    0017ff60  d0 21 03 00 03 94 3f d3-00 00 00 00 00 00 00 00  .!....?.........

    0017ff70  00 e0 fd 7e 9c ff 17 00-00 00 00 00 06 00 00 00  ...~............

    0:000> ln 0042ecc8

    (0042ecac)   crash2!`string'+0x1c   |  (0042eccc)   crash2!`string'

    0:000> .formats 0017ff044

    Evaluate expression:

      Hex:     017ff044

      Decimal: 25161796

      Octal:   00137770104

      Binary:  00000001 01111111 11110000 01000100

      Chars:   ..D

      Time:    Mon Oct 19 01:23:16 1970

      Float:   low 4.70085e-038 high 0

      Double:  1.24316e-316

    0:000> dd 0017ff044

    017ff044  ???????? ???????? ???????? ????????

    017ff054  ???????? ???????? ???????? ????????

    017ff064  ???????? ???????? ???????? ????????

    017ff074  ???????? ???????? ???????? ????????

    017ff084  ???????? ???????? ???????? ????????

    017ff094  ???????? ???????? ???????? ????????

    017ff0a4  ???????? ???????? ???????? ????????

    017ff0b4  ???????? ???????? ???????? ????????

    0:000> dds crash2!__onexitbegin

    00434354  59c96d56

    00434358  00000001

    0043435c  00000000

    00434360  00000000

    00434364  00000000

    00434368  00000000

    0043436c  00000000

    00434370  00033760

    00434374  00000000

    00434378  00000000

    0043437c  00000000

    00434380  00000000

    00434384  00000000

    00434388  00000000

    0043438c  00000000

    00434390  00000000

    00434394  00000000

    00434398  00000000

    0043439c  00000000

    004343a0  00000000

    004343a4  00000000

    004343a8  00000000

    004343ac  00000000

    004343b0  00000000

    004343b4  00000000

    004343b8  00000000

    004343bc  00000000

    004343c0  00000000

    004343c4  00000000

    004343c8  00000000

    004343cc  00000000

    004343d0  00000000

    0:000> dds crash2!__onexitend

    00434350  59c94d56

    00434354  59c96d56

    00434358  00000001

    0043435c  00000000

    00434360  00000000

    00434364  00000000

    00434368  00000000

    0043436c  00000000

    00434370  00033760

    00434374  00000000

    00434378  00000000

    0043437c  00000000

    00434380  00000000

    00434384  00000000

    00434388  00000000

    0043438c  00000000

    00434390  00000000

    00434394  00000000

    00434398  00000000

    0043439c  00000000

    004343a0  00000000

    004343a4  00000000

    004343a8  00000000

    004343ac  00000000

    004343b0  00000000

    004343b4  00000000

    004343b8  00000000

    004343bc  00000000

    004343c0  00000000

    004343c4  00000000

    004343c8  00000000

    004343cc  00000000

     


    [Update: our answer. Posted 4/18/2008]

    This week we had a lot of people that realized this was a buffer overrun.   You guys are so good I’m going to make next week’s puzzler a little harder!

    Good work all!   They are not all listed but some of the responses I liked the best were:

     

    moltov

    Matthieu

    Doug

    Tal Rosen

     

    This week’s official response

     

     

    Let’s start off with wmain.  You can see here that we push a pointer to a string onto the stack and call fun1 004117a3

     

    0:000> uf crash2!wmain

    crash2!wmain [d:\cprramp\source\crash2\crash2\crash2.cpp @ 11]:

       11 00412520 55              push    ebp

       11 00412521 8bec            mov     ebp,esp

       11 00412523 83ec40          sub     esp,40h

       11 00412526 53              push    ebx

       11 00412527 56              push    esi

       11 00412528 57              push    edi

       12 00412529 686cec4200      push    offset crash2!`string' (0042ec6c) << Pushing param onto the stack

       12 0041252e e870f2ffff      call    crash2!ILT+1950(?fun1YAXPADZ) (004117a3)  << Making call to fun1

       12 00412533 83c404          add     esp,4

       13 00412536 33c0            xor     eax,eax

       14 00412538 5f              pop     edi

       14 00412539 5e              pop     esi

       14 0041253a 5b              pop     ebx

       14 0041253b 8be5            mov     esp,ebp

       14 0041253d 5d              pop     ebp

       14 0041253e c3              ret

     

    Lets dump out the value we are passing.

     

    0:000> da 0042ec6c

    0042ec6c  "This is a test ot see if our sta"

    0042ec8c  "ck is larger enough"

     

    0:000> uf 004117a3

    crash2!fun1 [d:\cprramp\source\crash2\crash2\crash2.cpp @ 17]:

       17 00412550 55              push    ebp

       17 00412551 8bec            mov     ebp,esp

       17 00412553 83ec4c          sub     esp,4Ch

       17 00412556 53              push    ebx

       17 00412557 56              push    esi

       17 00412558 57              push    edi

       19 00412559 8b4508          mov     eax,dword ptr [ebp+8]  << Here we are moving EBP+8 into eax. This is basically just loading the address of parameter one into eax

       19 0041255c 50              push    eax  << We push it onto the stack to make our call to  strcpy

       19 0041255d 8d4df4          lea     ecx,[ebp-0Ch]  << Now we are loading the address of a local on the stack. 

    Note the local is ebp-c   That means that we can only write 0xC bytes to

    this location before we end up overwriting things on the stack like our base pointer and return address.

       19 00412560 51              push    ecx  << Now we push our local variable address onto the stack for our call to strcpy

       19 00412566 83c408          add     esp,8

       20 00412569 8d45f4          lea     eax,[ebp-0Ch]

       20 0041256c 50              push    eax

     

       19 00412561 e8c3eeffff      call    crash2!ILT+1060(_strcpy) (00411429)  << This is where things go WRONG, within the call to strcpy we have overwritten our return address with string data.

     

    Let’s look at the before and after.

     

    0:000> bp 00412560

    0:000> g

    Breakpoint 2 hit

    eax=0042ec6c ebx=75a2140f ecx=0017feec edx=00000000 esi=00000002 edi=00001770

    eip=00412560 esp=0017fe9c ebp=0017fef8 iopl=0         nv up ei pl nz ac pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000216

    crash2!fun1+0x10:

    00412560 51              push    ecx

    0:000> k 

    ChildEBP RetAddr 

    0017fef8 00412533 crash2!fun1+0x10 [d:\cprramp\source\crash2\crash2\crash2.cpp @ 19] << The return address is ok here!

    0017ff50 00412ac4 crash2!wmain+0x13 [d:\cprramp\source\crash2\crash2\crash2.cpp @ 12]

    0017ffa0 75a919f1 crash2!__tmainCRTStartup+0x15e [f:\sp\vctools\crt_bld\self_x86\crt\src\crt0.c @ 327]

    0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe

    0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23

     

    0:000> p  << Lets step over the call to strcpy and look again.

    eax=0017feec ebx=75a2140f ecx=0042eca0 edx=00686775 esi=00000002 edi=00001770

    eip=00412569 esp=0017fea0 ebp=0017fef8 iopl=0         nv up ei pl nz ac pe nc

    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000216

    crash2!fun1+0x19:

    00412569 8d45f4          lea     eax,[ebp-0Ch]

     

    0:000> k

    ChildEBP RetAddr 

    0017fef8 65732074 crash2!fun1+0x19 [d:\cprramp\source\crash2\crash2\crash2.cpp @ 20]

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    0017ff3c 0041b9a3 0x65732074  << This is not going to be pretty when we ret out of fun1.  We will basically return to nowhere.

    0017ffa0 75a919f1 crash2!_onexit+0x35 [f:\sp\vctools\crt_bld\self_x86\crt\src\onexit.c @ 98]

    0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe

    0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23

     

    0:000> da 0017fef8   if we do a da on the location of the stack frame we can see what is at that location.

    0017fef8  "st ot see if our stack is larger"  << It’s our string!

    0017ff18  " enough"

     

    Finally we have the C code.

     

    #include <windows.h>

     

    void fun1(char * szData);

    void fun2(char * szData);

     

    int _tmain(int argc, _TCHAR* argv[])

    {

                    fun1("This is a test ot see if our stack is larger enough");

                    return 0;

    }

     

    void fun1(char * szdata)

    {

                    char szData1[10];

                    strcpy(szData1,szdata);

                    fun2(szData1);

    }

    void fun2(char *szData)

    {

                    printf("Hello from fun2");

                    strcat(szData,"This is more data for us");

    }

     

     

    Have a great weekend, Good luck, and happy debugging!

     

    Jeff-





Page 1 of 23 (230 items) 12345»