• The Old New Thing

    The kooky STRRET structure

    • 10 Comments

    If you've messed with the shell namespace, you've no doubt run across the kooky STRRET structure, which is used by IShellFolder::GetDisplayNameOf to return names of shell items. As you can see from its documentation, a STRRET is sometimes an ANSI string buffer, sometimes a pointer to a UNICODE string, sometimes (and this is the kookiest bit) an offset into a pidl. What is going on here?

    The STRRET structure burst onto the scene during the Windows 95 era. Computers during this time were still comparatively slow and memory-constrained. (Windows 95's minimum hardware requirements were for 4MB of memory and a 386DX processor - which ran at a whopping 25MHz.) It was much faster to allocate memory off the stack (a simple "sub" instruction) than to allocate it from the heap (which might take thousands of instructions!), so the STRRET structure was designed so the common (for Windows 95) scenarios could be satisfied without needing a heap allocation.

    The STRRET_OFFSET flag took this to an even greater extreme. Often, you kept the name inside the pidl, and copying it into the STRRET structure would take, gosh, 200 clocks (!). To avoid this wasteful memory copying, STRRET_OFFSET allowed you to return just an offset into the pidl, which the caller could then copy out of directly.

    Woo-hoo, you saved a string copy.

    Of course, as time passed and computers got faster and memory became more readily available, these micro-optimizations have turned into annoyances. Saving 200 clock cycles on a string copy operation is hardly worth it any more. On a 1GHz processor, a single soft page fault costs you over a million cycles; a hard page fault costs you tens of millions.

    You can copy a lot of strings in twenty million cycles.

    What's more, the scenarios that were common in Windows 95 aren't quite so common any more, so the original scenario that the optimization was tailored for hardly occurs any more. It's an optimization that has outlived its usefulness.

    Fortunately, you don't have to think about the STRRET structure any more. There are several helper functions (StrRetToStr, StrRetToBuf, and StrRetToBSTR) that take the STRRET structure and turn it into something much easier to manipulate.
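
    To get a feel for what such a helper has to do, here is a rough sketch in Python. Everything in it is a toy: ToyStrRet and toy_strret_to_str are hypothetical names, the tag values are merely modelled on the shell's STRRET_* constants, and the real helpers are C functions that also deal with code pages and memory ownership, which this sketch ignores.

```python
from dataclasses import dataclass

# Tag values modelled on the shell's STRRET_* constants; treat them as
# illustrative rather than as a substitute for the real headers.
STRRET_WSTR = 0    # the name lives in a separately allocated Unicode string
STRRET_OFFSET = 1  # the name lives inside the pidl, at a byte offset
STRRET_CSTR = 2    # the name lives in an ANSI buffer inside the structure


@dataclass
class ToyStrRet:
    """Hypothetical stand-in for STRRET: a tag plus one of three payloads."""
    kind: int
    wide: str = ""      # used when kind == STRRET_WSTR
    offset: int = 0     # used when kind == STRRET_OFFSET
    ansi: bytes = b""   # used when kind == STRRET_CSTR


def toy_strret_to_str(sr: ToyStrRet, pidl: bytes) -> str:
    """Collapse the three representations into one ordinary string."""
    if sr.kind == STRRET_WSTR:
        return sr.wide
    if sr.kind == STRRET_CSTR:
        # The buffer is null-terminated; ignore anything after the terminator.
        return sr.ansi.split(b"\0", 1)[0].decode("ascii")
    if sr.kind == STRRET_OFFSET:
        # The name is a null-terminated string stored inside the pidl itself.
        end = pidl.index(b"\0", sr.offset)
        return pidl[sr.offset:end].decode("ascii")
    raise ValueError(f"unknown STRRET kind: {sr.kind}")
```

    The caller hands over the pidl along with the structure because, in the offset case, the structure alone does not contain the name.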

    The kookiness of the STRRET structure has now been encapsulated away. Thank goodness.

  • The Old New Thing

    Finished competing in your event? Let the games begin!

    • 10 Comments

    Ten thousand human beings in peak physical condition. All in one dormitory complex. With 130,000 free condoms. Let the games begin!

    And you have to tip your hat (tam?) to The Scotsman for finding an athlete named "Randy Jones" for this article.

  • The Old New Thing

    Summary of the recent spate of /3GB articles

    • 36 Comments

    A table of contents now that the whole thing is over. I hope.

    I'm not sure how successful this series has been, though, for it appears that even people who have read the articles continue to confuse virtual address space with physical address space. (Or maybe this person is merely mocking a faulty argument? I can't tell for sure.)

  • The Old New Thing

    The curious interaction between PAE and NX

    • 5 Comments

    Carmen Crincoli covered the interaction between PAE and NX on his own blog, so I'll merely incorporate his remarks by reference.

    (And notice again the concession to backwards compatibility. Without the backwards compatibility work, XP SP2 would have shipped with NX support and an asterisk, "* and those of you who have device drivers that are not PAE-ready will not be able to take advantage of these new security enhancements. We could've done something to make your systems secure, but we decided not to do it in order to teach you a lesson.")

  • The Old New Thing

    Writing your own menu-like window

    • 32 Comments

    Hereby incorporating by reference the "FakeMenu" sample in the Platform SDK. It's in the winui\shell\fakemenu directory.

    For those who don't have the Platform SDK, what are you doing writing Win32 programs without the Platform SDK? Download it if it didn't come with your development tools.

    If for some reason you don't want the Platform SDK yet you want to write Win32 programs (I bet you're the sort of person who throws away the manual as soon as you buy something), you can look at the version that Chris Becke has stashed away on this page.

  • The Old New Thing

    Myth: In order to use AWE, you must enable PAE

    • 15 Comments

    Address Windowing Extensions (AWE) does not require PAE. I don't know why some people claim that it does, since it is so easy to demonstrate otherwise.

    Take a program that uses AWE. If you don't have one handy, you can use the one that comes in MSDN as a sample program that demonstrates how to use AWE. Grant yourself "Lock Pages in Memory" privileges and run the program. Observe that it works.

    Now remove the /PAE switch from your boot.ini, reboot, and run the program again. Observe that it still works.

    Myth disproved by direct experimentation and observation.

  • The Old New Thing

    Myth: PAE increases the virtual address space beyond 4GB

    • 13 Comments

    This is another non sequitur. PAE increases the amount of physical memory that can be addressed by the processor, but that is unrelated to virtual address space. (Remember that PAE stands for Physical Address Extension.)

    PAE increases the physical address space (the address space that the CPU can use to access the memory chips on your computer) from 32 bits to 36 on a Pentium 2, for a theoretical maximum physical memory capacity of 64GB. However, the size of a pointer variable hasn't changed. It's still 32 bits (for a 32-bit processor), which means that the virtual address space is still 4GB.
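
    The arithmetic is short enough to check by hand, but here it is as a small sketch anyway. The helper name addressable_bytes is made up for illustration; the bit counts are the ones quoted above.

```python
GB = 2**30


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 2**address_bits


# Pointers stay 32 bits wide, so the virtual address space does not change.
virtual_space = addressable_bytes(32)

# PAE widens only the physical address, from 32 bits to 36.
physical_space_pae = addressable_bytes(36)

print(virtual_space // GB, physical_space_pae // GB)  # prints: 4 64
```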

    With PAE enabled, the page table and page directory entries double in size (to accommodate the additional bits in the page frame), which significantly increases the amount of memory required for page tables and page directories (since each page table describes only half as much memory as it used to).
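
    To see where the "half as much" comes from, here is a quick sketch. It assumes the usual x86 layout, which is not spelled out above: 4KB pages, a page table that itself occupies exactly one page, and entries of 4 bytes without PAE versus 8 bytes with it.

```python
PAGE_SIZE = 4096  # assumption: standard 4KB x86 pages
MB = 2**20


def bytes_mapped_per_page_table(entry_size: int) -> int:
    """Memory described by one page table that fills exactly one page."""
    entries = PAGE_SIZE // entry_size  # how many entries fit in the table
    return entries * PAGE_SIZE         # each entry maps one page


without_pae = bytes_mapped_per_page_table(4)  # 1024 entries per table
with_pae = bytes_mapped_per_page_table(8)     # 512 entries per table

print(without_pae // MB, with_pae // MB)  # prints: 4 2
```

    Mapping the same amount of memory therefore takes twice as many page tables once PAE is on.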

    Notice that this has as a consequence that PAE and /3GB conflict with each other to some degree. If you turn on both PAE and /3GB, then the kernel will limit itself to 16GB of physical memory. That's because there isn't enough address space in the kernel to fit all the necessary memory bookkeeping into the 1GB of memory you told the kernel to squeeze itself into.

    (On an AMD processor, the physical address space expands to 40 bits, for a theoretical maximum of 1TB. However, the memory manager uses only 37 of those bits, for an actual maximum of 128GB. Why? For the same reason that the kernel limits itself to 16GB in /3GB mode: Not enough address space. It's time to move to 64-bit processors...)

  • The Old New Thing

    Why all these articles about PAE and /3GB?

    • 22 Comments

    Apparently there is some unrest in comment-land with people who are sick of this whole /3GB series.

    Why have I been spending over two weeks exploring the consequences of the /3GB switch and exploding various common myths about it?

    Because too many people don't understand what /3GB means but talk as if they do. As you saw yesterday, there are still lots of people out there that don't understand the differences between physical memory, virtual memory, and virtual address space, and end up misconfiguring their computers or treating the switch as magic fairy dust. I've gotten tired of explaining it to misguided person after misguided person over the years, so I figured if I wrote up the explanation and debunkings once and for all, I won't have to visit the topic ever again.

    You think you're sick of /3GB? You've only had to deal with it for two weeks. Imagine having to explain the /3GB switch for six years!

    At any rate, the 3GB series will draw to a close at the end of the week, assuming everything goes according to schedule. Then there will be other topics for you to be sick of.

  • The Old New Thing

    Why is the virtual address space 4GB anyway?

    • 58 Comments

    The size of the address space is capped by the number of unique pointer values. For a 32-bit processor, a 32-bit value can represent 2^32 distinct values. If you allow each such value to address a different byte of memory, you get 2^32 bytes, which equals four gigabytes.

    If you were willing to forgo the flat memory model and deal with selectors, then you could combine a 16-bit selector value with a 32-bit offset for a combined 48-bit pointer value. This creates a theoretical maximum of 2^48 distinct pointer values, which, if you allowed each one to address a different byte of memory, yields 256TB of memory.

    This theoretical maximum cannot be achieved on the Pentium class of processors, however. One reason is that the lower bits of the segment value encode information about the type of selector. As a result, of the 65536 possible selector values, only 8191 of them are usable to access user-mode data. This drops you to 32TB.
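
    For anyone who wants to check the numbers so far, here is a small sketch. The variable names are invented for illustration; the bit widths and the 8191 figure are the ones given above.

```python
GB = 2**30
TB = 2**40

# Flat model: one 32-bit pointer, one byte per pointer value.
flat_bytes = 2**32

# Selector:offset model: a 16-bit selector combined with a 32-bit offset.
selector_offset_bytes = 2**(16 + 32)

# Only 8191 selectors are usable for user-mode data, each spanning 2^32 bytes.
usable_selector_bytes = 8191 * 2**32

print(flat_bytes // GB)                      # prints: 4
print(selector_offset_bytes // TB)           # prints: 256
print(round(usable_selector_bytes / TB, 3))  # prints: 31.996
```

    The last figure is a hair under 32TB because 8191 is one short of 8192.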

    The real limitation on the address space using the selector:offset model is that each selector merely describes a subset of a flat 32-bit address space. So even if you could get to use all 8191 selectors, they would all just be views on the same underlying 32-bit address space.

    (Besides, I seriously doubt people would be willing to return to the exciting days of segmented programming.)

    In 64-bit Windows, the 2GB limit is gone; the user-mode virtual address space is now a stunning 8 terabytes. Even if you allocated a megabyte of address space per second, it would take you three months to run out. (Notice however that you can set /LARGEADDRESSAWARE:NO on your 64-bit program to tell the operating system to force the program to live below the 2GB boundary. It's unclear why you would ever want to do this, though, since you're missing out on the 64-bit address space while still paying for it in pointer size. It's like paying extra for cable television and then not watching.)
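
    The "three months" figure is easy to sanity-check. A minimal sketch, with days_to_exhaust being a made-up helper name:

```python
MB = 2**20
TB = 2**40
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_exhaust(space_bytes: int, bytes_per_second: int) -> float:
    """Days needed to consume an address space at a steady allocation rate."""
    return space_bytes / bytes_per_second / SECONDS_PER_DAY


# 8TB of user-mode address space, consumed at one megabyte per second.
days = days_to_exhaust(8 * TB, 1 * MB)
print(round(days, 1))  # prints: 97.1
```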

    Armed with what you have learned so far, maybe you can respond to this request that came in from a customer:

    One of our boot.ini files has a /7GB switch. Our consultant told us that we should set it to 1GB less than the system memory. Since we have 8GB, 8GB - 1GB = 7GB. The consultant said that setting this value allows an application to allocate more than 2GB of memory. We would like Microsoft to comment on this analysis.

  • The Old New Thing

    Myth: The /3GB switch lets me map one giant 3GB block of memory

    • 20 Comments

    Just because the virtual address space is 3GB doesn't mean that you can map one giant 3GB block of memory. The standard holes in the virtual address space are still there: 64K at the bottom, and 64K near the 2GB boundary.

    Moreover, the system DLLs continue to load at their preferred virtual addresses which lie just below the 2GB boundary. The process heap and other typical process bookkeeping also take their bites out of your virtual address space.

    As a result, even though the user-mode virtual address space is nearly 3GB, it is not the case that all of the available space is contiguous. The holes near the 2GB boundary prevent you from getting even 2GB of contiguous address space.
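
    Here is a rough sketch of that reasoning. The function largest_free_span is hypothetical, and the exact hole addresses are assumptions chosen to match the description (64K at the very bottom, 64K ending exactly at the 2GB boundary, user space ending at exactly 3GB). It counts only the two standard holes; the system DLLs, heaps, and other bookkeeping shrink the result further.

```python
GB = 2**30
KB = 2**10


def largest_free_span(space_start, space_end, holes):
    """Return (start, size) of the largest gap not covered by any hole.

    holes is a list of (start, end) pairs, with end exclusive.
    """
    best = (space_start, 0)
    cursor = space_start  # first address not yet known to be inside a hole
    for hole_start, hole_end in sorted(holes):
        if hole_start - cursor > best[1]:
            best = (cursor, hole_start - cursor)
        cursor = max(cursor, hole_end)
    if space_end - cursor > best[1]:
        best = (cursor, space_end - cursor)
    return best


# Assumed positions of the two standard holes in a 3GB user-mode space.
holes = [
    (0, 64 * KB),                # 64K at the bottom
    (2 * GB - 64 * KB, 2 * GB),  # 64K just below the 2GB boundary
]
start, size = largest_free_span(0, 3 * GB, holes)
print(hex(start), size < 2 * GB)  # prints: 0x10000 True
```

    Under these assumptions, the biggest block runs from 64K up to the hole near 2GB, which comes to 2GB minus 128K. The region above the boundary contributes only a separate 1GB.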

    Some people may try to relocate the system DLLs to alternate addresses in order to create more room, but that won't work for multiple reasons. First, of course, is that it doesn't get rid of the 64K gap near the 2GB boundary. Second, the system allocates other items such as thread information blocks and the process environment variables before your program gets a chance to start running, so by the time your program gets around to allocating memory, the space it wanted may already have been claimed.

    Third, the system really needs certain key DLLs to be loaded at the same address in all processes. For example, the syscall trap must reside at a fixed location so that the kernel-mode trap handler will recognize it as a valid syscall trap and not as an illegal instruction. The debugger requires this as well, so that it can use CreateRemoteThread to inject a breakpoint into the process when you tell it to break into the process you are debugging.
