August, 2004

  • The Old New Thing

    The oft-misunderstood /3GB switch

    • 32 Comments

    It's simple to explain what it does, but people often misunderstand it.

    The /3GB switch changes the way the 4GB virtual address space is split up. Instead of splitting it as 2GB of user mode virtual address space and 2GB of kernel mode virtual address space, the split is 3GB of user mode virtual address space and 1GB of kernel mode virtual address space.

    That's all.

    And yet people think it does more than that.

    I think the problem is that people think that "virtual address space" means something other than just "virtual address space".

    The term "address space" refers to how a numerical value (known as an "address") is interpreted when it is used to access some type of resource. There is a physical address space; each address in the physical address space refers to a byte in a memory chip somewhere. (Note for pedants: Yes, it's actually spread out over several memory chips, but that's not important here.) There is an I/O address space; each address in the I/O address space allows the CPU to communicate with a hardware device.

    And then there is the virtual address space. When people say "address space", they usually mean "virtual address space".

    The virtual address space is the set of possible pointer values (addresses) that can be used at a single moment by the processor. In other words, if you have an address like 0x12345678, the virtual address space determines what you get if you try to access that memory. The contents of the virtual address space change over time, for example, as you allocate and free memory. They also vary based on context: each process has its own virtual address space.

    Saying that 2GB (or 3GB) of virtual address space is available to user mode means that at any given moment in time, out of the 4 billion virtual addresses available in a 32-bit value, 2 billion (or 3 billion) of them are potentially usable by user-mode code.
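
    To make the split concrete, here is a minimal sketch that models a 32-bit virtual address as a plain integer and asks which side of the boundary it falls on. The constant and function names are hypothetical, chosen for illustration, and the sketch ignores the small reserved regions a real system keeps at the edges of user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration: the first kernel-mode address
   under each way of splitting a 32-bit virtual address space. */
#define USER_LIMIT_2GB 0x80000000u  /* default: 2GB user / 2GB kernel */
#define USER_LIMIT_3GB 0xC0000000u  /* /3GB:    3GB user / 1GB kernel */

/* Returns true if addr lies in the user-mode part of the address space
   for the given split. Nothing here says whether memory is actually
   mapped at addr; it only says the address is potentially usable. */
bool is_user_mode_address(uint32_t addr, uint32_t user_limit)
{
    assert(user_limit == USER_LIMIT_2GB || user_limit == USER_LIMIT_3GB);
    return addr < user_limit;
}
```

    Under the default split, 0x90000000 belongs to the kernel; under /3GB the same value becomes a potentially usable user-mode address. Either way, the total is still the same 4GB of addresses.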

    Over the next few entries, I'll talk about the various consequences and misinterpretations of the /3GB switch.

  • The Old New Thing

    Summary of the recent spate of /3GB articles

    • 36 Comments

    A table of contents now that the whole thing is over. I hope.

    I'm not sure how successful this series has been, though, for it appears that even people who have read the articles continue to confuse virtual address space with physical address space. (Or maybe this person is merely mocking a faulty argument? I can't tell for sure.)

  • The Old New Thing

    How to set focus in a dialog box

    • 22 Comments

    Setting focus in a dialog box is more than just calling SetFocus.

    A dialog box maintains the concept of a "default button" (which is always a pushbutton). The default button is typically drawn with a distinctive look (a heavy outline or a different color) and indicates what action the dialog box will take when you hit Enter. Note that this is not the same as the control that has the focus.

    For example, open the Run dialog from the Start menu. Observe that the OK button is the default button; it has a different look from the other buttons. But focus is on the edit control. Your typing goes to the edit control, until you hit Enter; the Enter activates the default button, which is OK.

    As you tab through the dialog, observe what happens to the default button. When the dialog box moves focus to a pushbutton, that pushbutton becomes the new default button. But when the dialog box moves focus to something that isn't a pushbutton at all, the OK button resumes its position as the default button.

    The dialog manager remembers which control was the default button when the dialog was initially created, and when it moves focus to something that isn't a button, it restores that original button as the default button.

    You can ask a dialog box what the default button is by sending it the DM_GETDEFID message; similarly, you can change it with the DM_SETDEFID message.

    (Notice that the return value of the DM_GETDEFID message packs the control ID in the low word and flags in the high word. Another place where expanding dialog control IDs to 32-bit values doesn't buy you anything.)
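
    As a rough illustration of that packing, the sketch below splits a DM_GETDEFID-style result into its two words. It is written as portable C so it can be read without the Windows headers: the constant mirrors the value of DC_HASDEFID from winuser.h, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors DC_HASDEFID from winuser.h; redefined under a different
   name so this sketch compiles without the Windows headers. */
#define SKETCH_DC_HASDEFID 0x534Bu

/* Splits a DM_GETDEFID-style result into its two 16-bit words.
   Returns true and stores the control ID if the high word reports
   that a default ID exists; returns false otherwise. */
bool unpack_defid(uint32_t result, uint16_t *control_id)
{
    assert(control_id != NULL);
    uint16_t flags = (uint16_t)(result >> 16);   /* high word */
    if (flags != SKETCH_DC_HASDEFID) {
        return false;                            /* no default button */
    }
    *control_id = (uint16_t)(result & 0xFFFFu);  /* low word */
    return true;
}
```

    Since only 16 bits are left over for the ID, a control ID that needs more than 16 bits cannot survive the round trip.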

    As the remarks to the DM_SETDEFID message note, carelessly changing the default ID directly can lead to odd cases like a dialog box with two default buttons. Fortunately, you rarely need to change the default ID for a dialog.

    A bigger problem is using SetFocus to shove focus around a dialog. If you do this, you are going directly to the window manager, bypassing the dialog manager. This means that you can create "impossible" situations like having focus on a pushbutton without that button being the default!

    To avoid this problem, don't use SetFocus to change focus on a dialog. Instead, use the WM_NEXTDLGCTL message.

    void SetDialogFocus(HWND hdlg, HWND hwndControl)
    {
     SendMessage(hdlg, WM_NEXTDLGCTL, (WPARAM)hwndControl, TRUE);
    }
    

    As the remarks for the WM_NEXTDLGCTL message observe, the DefDlgProc function handles the WM_NEXTDLGCTL message by updating all the internal dialog manager bookkeeping, deciding which button should be default, all that good stuff.

    Now you can update dialog boxes like the professionals, avoiding oddities like having no default button, or worse, multiple default buttons!

  • The Old New Thing

    Myth: Without /3GB a single program can't allocate more than 2GB of virtual memory

    • 40 Comments

    Virtual memory is not virtual address space (part 2).

    This myth is being perpetuated even as I write this series of articles.

    The user-mode virtual address space is normally 2GB, but that doesn't limit you to 2GB of virtual memory. You can allocate memory without it being mapped into your virtual address space. (Those who grew up with Expanded Memory or other forms of bank-switched memory are well-familiar with this technique.)

    HANDLE h = CreateFileMapping(INVALID_HANDLE_VALUE, 0,
                                 PAGE_READWRITE, 1, 0, NULL);
    

    Provided you have enough physical memory and/or swap file space, that 4GB memory allocation will succeed.

    Of course, you can't map it all into memory at once on a 32-bit machine, but you can do it in pieces. Let's read a byte from this memory.

    BYTE ReadByte(HANDLE h, DWORD offset)
    {
     SYSTEM_INFO si;
     GetSystemInfo(&si);
     DWORD chunkOffset = offset % si.dwAllocationGranularity;
     DWORD chunkStart = offset - chunkOffset;
     LPBYTE pb = (LPBYTE)MapViewOfFile(h, FILE_MAP_READ, 0,
          chunkStart, chunkOffset + sizeof(BYTE));
     BYTE b = pb[chunkOffset];
     UnmapViewOfFile(pb);
     return b;
    }
    

    Of course, in a real program, you would have error checking and probably a caching layer in order to avoid spending all your time mapping and unmapping instead of actually doing work.
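
    The offset arithmetic in ReadByte is the part that is easy to get wrong, so here it is pulled out into a portable sketch. The struct and function names are hypothetical; the granularity is passed in as a parameter rather than queried from the system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: describes the view that must be mapped
   in order to reach one byte of a large mapping. */
typedef struct {
    uint32_t view_start;   /* where the view begins; a multiple of the granularity */
    uint32_t view_offset;  /* position of the wanted byte inside the view */
    uint32_t view_length;  /* bytes that must be mapped to include that byte */
} view_window;

/* Rounds offset down to the allocation granularity, the same way
   ReadByte computes chunkStart and chunkOffset. */
view_window compute_view_window(uint32_t offset, uint32_t granularity)
{
    assert(granularity != 0);
    view_window w;
    w.view_offset = offset % granularity;
    w.view_start = offset - w.view_offset;
    w.view_length = w.view_offset + 1;  /* one byte past the offset */
    return w;
}
```

    Assuming a granularity of 64KB, reading the byte at offset 0x12345678 means mapping a view that starts at 0x12340000 and reading position 0x5678 within it. A caching layer would typically keep such a view around and reuse it for nearby offsets.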

    The point is that virtual address space is not virtual memory. As we have seen earlier, you can map the same memory to multiple addresses, so the one-to-one mapping between virtual memory and virtual address space has already been violated. Here we showed that just because you allocated memory doesn't mean that it has to occupy any space in your virtual address space at all.

    [Updated: 10:37am, fix minor typos reported in comments.]

  • The Old New Thing

    Myth: PAE increases the virtual address space beyond 4GB

    • 13 Comments

    This is another non sequitur. PAE increases the amount of physical memory that can be addressed by the processor, but that is unrelated to virtual address space. (Remember that PAE stands for Physical Address Extension.)

    PAE increases the physical address space (the address space that the CPU can use to access the memory chips on your computer) from 32 bits to 36 on a Pentium 2, for a theoretical maximum physical memory capacity of 64GB. However, the size of a pointer variable hasn't changed. It's still 32 bits (for a 32-bit processor), which means that the virtual address space is still 4GB.

    With PAE enabled, the page table and page directory entries double in size (to accommodate the additional bits in the page frame), which significantly increases the amount of memory required for page tables and page directories (since each page table describes only half as much memory as it used to).
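
    The "half as much memory" arithmetic can be sketched as follows, assuming the usual x86 layout in which a page is 4KB and each page table occupies exactly one page. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed: standard 4KB x86 page */

/* Bytes of address space described by one page table that fills one
   page with entries of the given size (4 bytes without PAE, 8 with). */
uint32_t bytes_mapped_per_page_table(uint32_t entry_size)
{
    assert(entry_size == 4 || entry_size == 8);
    uint32_t entries = PAGE_SIZE_BYTES / entry_size;  /* 1024 or 512 */
    return entries * PAGE_SIZE_BYTES;                 /* one page per entry */
}

/* Size in bytes of an address space with the given number of address bits. */
uint64_t address_space_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}
```

    With 4-byte entries a page table covers 4MB; with 8-byte entries it covers 2MB, so twice as many page tables are needed to describe the same amount of memory. Meanwhile 36 physical address bits give 64GB, while 32 virtual address bits still give 4GB.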

    Notice that this has as a consequence that PAE and /3GB conflict with each other to some degree. If you turn on both PAE and /3GB, then the kernel will limit itself to 16GB of physical memory. That's because there isn't enough address space in the kernel to fit all the necessary memory bookkeeping into the 1GB of memory you told the kernel to squeeze itself into.

    (On an AMD processor, the physical address space expands to 40 bits, for a theoretical maximum of 1TB. However, the memory manager uses only 37 of those bits, for an actual maximum of 128GB. Why? For the same reason that the kernel limits itself to 16GB in /3GB mode: Not enough address space. It's time to move to 64-bit processors...)

  • The Old New Thing

    Myth: The /3GB switch expands the user-mode address space of all programs

    • 46 Comments

    Only programs marked as /LARGEADDRESSAWARE are affected.

    For compatibility reasons, only programs that explicitly indicate that they are prepared to handle a virtual address space larger than 2GB will get the larger virtual address space. Unmarked programs get the normal 2GB virtual address space, and the address space between 2GB and 3GB goes unused.

    Why?

    Because far too many programs assume that the high bit of user-mode virtual addresses is always clear, often unwittingly. MSDN has a page listing some of the ways programs make this assumption. One such assumption you may be making is taking the midpoint between two pointers by using the formula (a+b)/2. As I noted in a previous exercise, this is subject to integer overflow and consequently can result in an erroneous pointer computation. Consequently, you can't just take an existing program that you didn't write, mark it /LARGEADDRESSAWARE, and declare your job done. You have to check with the authors of that program that they verified that their code does not make any 2GB assumptions. (And the fact that the authors didn't mark their program as 3GB-compatible strongly suggests that no such verification has occurred. If it had, they would have marked the program /LARGEADDRESSAWARE!)
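
    Here is a small sketch of the midpoint problem. Addresses are modeled as 32-bit unsigned integers so the wraparound shows up no matter what machine compiles it; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Naive midpoint: the sum a + b wraps modulo 2^32 as soon as it no
   longer fits in 32 bits, which cannot happen while both addresses
   stay below 2GB but can once the high bit is in play. */
uint32_t midpoint_naive(uint32_t a, uint32_t b)
{
    return (a + b) / 2;
}

/* Overflow-free midpoint: compute the distance first, then halve it. */
uint32_t midpoint_safe(uint32_t a, uint32_t b)
{
    assert(a <= b);
    return a + (b - a) / 2;
}
```

    For 0x10000000 and 0x20000000 both versions agree on 0x18000000. For 0x90000000 and 0xA0000000 the naive version also produces 0x18000000, which is nowhere near either input, while the safe version produces 0x98000000.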

    Marking your program /LARGEADDRESSAWARE indicates to the operating system, "Go ahead and give this program access to that extra gigabyte of user-mode address space," and as a result, addresses in the third gigabyte become possible return values from memory allocation functions. If you set the "Top down" flag in the memory manager allocation preferences mask (search for "top down"), you can instruct the memory manager to allocate high-address memory first, thereby forcing your program to deal with those addresses sooner than it normally would. This is very handy when testing your program in a /3GB configuration since it forces the troublesome memory addresses to be used sooner than normal.

    Exercise: Find the bug in the following function. Hint: What's today's topic?

    #define BUFFER_SIZE 32768
    BOOL IsPointerInsideBuffer(const BYTE *p, const BYTE *buffer)
    {
      return p >= buffer && p - buffer < BUFFER_SIZE;
    }
    
  • The Old New Thing

    Kernel address space consequences of the /3GB switch

    • 22 Comments

    One of the adverse consequences of the /3GB switch is that it forces the kernel to operate inside a much smaller space.

    One of the biggest casualties of the limited address space is the video driver. To manage the memory on the video card, the driver needs to be able to address it, and the apertures required are typically quite large. When the video driver requests a 256MB aperture, the call is likely to fail since there simply isn't that much address space available to spare.

    All of the kernel's bookkeeping needs to fit inside that one gigabyte. Page tables, page directories, bitmaps, video driver apertures. It's a very tight squeeze, but if you're willing to cut back (for example by not requiring such a large video aperture), you can barely squeak it through. (A later entry will discuss another casualty of the reduced address space.)

    It's like trying to change your clothes inside a small closet. You can do it, but it's a real struggle, you're going to have to make sacrifices, and the results aren't always very pretty.

  • The Old New Thing

    Why is the virtual address space 4GB anyway?

    • 58 Comments

    The size of the address space is capped by the number of unique pointer values. For a 32-bit processor, a 32-bit value can represent 2^32 distinct values. If you allow each such value to address a different byte of memory, you get 2^32 bytes, which equals four gigabytes.

    If you were willing to forgo the flat memory model and deal with selectors, then you could combine a 16-bit selector value with a 32-bit offset for a combined 48-bit pointer value. This creates a theoretical maximum of 2^48 distinct pointer values, which, if you allowed each such value to address a different byte of memory, yields 256TB of memory.

    This theoretical maximum cannot be achieved on the Pentium class of processors, however. One reason is that the lower bits of the selector value encode information about the type of selector. As a result, of the 65536 possible selector values, only 8191 of them are usable to access user-mode data. This drops you to 32TB.
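
    The figures above are straightforward to check. The sketch below multiplies a selector count by the 4GB each selector could describe if it had a full 32-bit segment all to itself; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GIB ((uint64_t)1 << 30)
#define TIB ((uint64_t)1 << 40)

/* Bytes addressable if each of the given number of selectors could
   describe its own independent 32-bit (4GB) segment. */
uint64_t selector_model_bytes(uint32_t selectors)
{
    assert(selectors <= 65536u);
    return (uint64_t)selectors * (4 * GIB);
}
```

    All 65536 selector values would give exactly 256TB. The 8191 usable ones give 4GB short of 32TB, which is where the rounded 32TB figure comes from.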

    The real limitation on the address space using the selector:offset model is that each selector merely describes a subset of a flat 32-bit address space. So even if you could get to use all 8191 selectors, they would all just be views on the same underlying 32-bit address space.

    (Besides, I seriously doubt people would be willing to return to the exciting days of segmented programming.)

    In 64-bit Windows, the 2GB limit is gone; the user-mode virtual address space is now a stunning 8 terabytes. Even if you allocated a megabyte of address space per second, it would take you three months to run out. (Notice however that you can set /LARGEADDRESSAWARE:NO on your 64-bit program to tell the operating system to force the program to live below the 2GB boundary. It's unclear why you would ever want to do this, though, since you're missing out on the 64-bit address space while still paying for it in pointer size. It's like paying extra for cable television and then not watching.)

    Armed with what you have learned so far, maybe you can respond to this request that came in from a customer:

    One of our boot.ini files has a /7GB switch. Our consultant told us that we should set it to 1GB less than the system memory. Since we have 8GB, 8GB - 1GB = 7GB. The consultant said that setting this value allows an application to allocate more than 2GB of memory. We would like Microsoft to comment on this analysis.

  • The Old New Thing

    Why can't you treat a FILETIME as an __int64?

    • 27 Comments

    The FILETIME structure represents a 64-bit value in two parts:

    typedef struct _FILETIME {
      DWORD dwLowDateTime;
      DWORD dwHighDateTime;
    } FILETIME, *PFILETIME;
    

    You may be tempted to take the entire FILETIME structure and access it directly as if it were an __int64. After all, its memory layout exactly matches that of a 64-bit (little-endian) integer. Some people have written sample code that does exactly this:

    pi = (__int64*)&ft; // WRONG
    (*pi) += (__int64)num*datepart; // WRONG
    

    Why is this wrong?

    Alignment.

    Since a FILETIME is a structure containing two DWORDs, it requires only 4-byte alignment, since that is sufficient to put each DWORD on a valid DWORD boundary. There is no need for the first DWORD to reside on an 8-byte boundary. And in fact, you've probably already used a structure where it doesn't: The WIN32_FIND_DATA structure.

    typedef struct _WIN32_FIND_DATA {
        DWORD dwFileAttributes;
        FILETIME ftCreationTime;
        FILETIME ftLastAccessTime;
        FILETIME ftLastWriteTime;
        DWORD nFileSizeHigh;
        DWORD nFileSizeLow;
        DWORD dwReserved0;
        DWORD dwReserved1;
        TCHAR  cFileName[ MAX_PATH ];
        TCHAR  cAlternateFileName[ 14 ];
    } WIN32_FIND_DATA, *PWIN32_FIND_DATA, *LPWIN32_FIND_DATA;
    

    Observe that the three FILETIME structures appear at offsets 4, 12, and 20 from the beginning of the structure. They have been thrown off 8-byte alignment by the dwFileAttributes member.

    Casting a FILETIME to an __int64 therefore can (and in the WIN32_FIND_DATA case, will) create a misaligned pointer. Accessing a misaligned pointer will raise a STATUS_DATATYPE_MISALIGNMENT exception on architectures which require alignment.
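
    One way to sidestep the alignment question is to build the 64-bit value from the two halves instead of reinterpreting the pointer. The sketch below uses portable stand-ins for FILETIME and for the start of WIN32_FIND_DATA so that it compiles without the Windows headers; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for FILETIME: two 32-bit halves, so the structure
   needs only 4-byte alignment. */
typedef struct {
    uint32_t dwLowDateTime;
    uint32_t dwHighDateTime;
} sketch_filetime;

/* Stand-in for the first few members of WIN32_FIND_DATA, enough to
   show how the leading 32-bit member shifts the FILETIMEs. */
typedef struct {
    uint32_t dwFileAttributes;
    sketch_filetime ftCreationTime;
    sketch_filetime ftLastAccessTime;
    sketch_filetime ftLastWriteTime;
} sketch_find_data_prefix;

/* Combines the halves by value; no 64-bit load from a 4-byte-aligned address. */
uint64_t filetime_to_u64(const sketch_filetime *ft)
{
    assert(ft != NULL);
    return ((uint64_t)ft->dwHighDateTime << 32) | ft->dwLowDateTime;
}

/* Splits a 64-bit value back into the two halves. */
void u64_to_filetime(uint64_t value, sketch_filetime *ft)
{
    assert(ft != NULL);
    ft->dwLowDateTime = (uint32_t)(value & 0xFFFFFFFFu);
    ft->dwHighDateTime = (uint32_t)(value >> 32);
}
```

    To do arithmetic on a timestamp, convert it, adjust the 64-bit value, and convert it back. Copying the structure into a properly aligned 64-bit variable with memcpy would work as well.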

    Even if you are on a forgiving platform that performs automatic alignment fixups, you can still run into trouble. More on this and other consequences of alignment in the next few entries.

    Exercise: Why are the LARGE_INTEGER and ULARGE_INTEGER structures not affected?

  • The Old New Thing

    Myth: You need /3GB if you have more than 2GB of physical memory

    • 38 Comments

    Physical memory is not virtual address space.

    In my opinion, this is another non sequitur. I'm not sure what logical process led to this myth. It can't be a misapprehension of a 1-1 mapping between physical memory and virtual memory, because that mapping is blatantly not one-to-one. You typically have far more virtual memory than physical memory. Free physical memory doesn't have any manifestation in any virtual address space. And shared memory has manifestations in multiple virtual address spaces yet corresponds to the same physical page.

    Though this brings up a historical note.

    In Windows/386, the kernel just so happened to map all physical memory into the kernel-mode virtual address space. There was a function _MapPhysToLinear. You gave it a physical memory range and it returned the base of a range of linear addresses that could be used to access that physical memory. Some driver developers discovered that the kernel mapped all of physical memory and just handed out pointers into that single mapping. As a result, they called _MapPhysToLinear(0, 0x1000) and whenever they wanted to access physical memory in the future, they just added the address to the return value from that single call. In other words, they assumed that

     _MapPhysToLinear(p, x) = _MapPhysToLinear(0, x) + p 

    In Windows 95, the memory manager was completely rewritten and the above coincidence was no longer true. To conserve kernel-mode virtual address space, physical memory was now mapped linearly only as necessary.

    Of course, the drivers that relied on the old behavior were now broken because the undocumented behavior they relied upon was no longer present.

    As a result, when it starts up, Windows 95 looks around to see if any drivers known to rely on this undocumented behavior are loaded. (Windows 3.1 didn't support dynamically-loaded kernel drivers so looking at boot time was sufficient.) If so, then it goes ahead and maps all of physical memory into the kernel-mode virtual address space to keep those drivers happy. This wastes virtual address space but keeps your machine running.

    I can already hear people saying, "Microsoft shouldn't have made those buggy drivers work. They should have just let the computer crash in order to put pressure on the authors of those drivers to fix their bugs." This assumes, of course, that the cause of the crash could be traced back to the buggy driver in the first place. A very common manifestation of a stray pointer in kernel mode is memory corruption, which means that the component that crashes is rarely the one that caused the problem in the first place.

    For example, nearly all Windows 95 bluescreen crashes in VMM(01) are caused by memory corruption. VMM(01) is the non-swappable part of the Windows 95 kernel which is where the memory manager lives. If a driver corrupts the kernel-mode heap, a bluescreen in the memory manager is typically how the corruption manifests itself.
