History

  • The Old New Thing

    My pants are fancy!

    • 13 Comments

    During the development of Windows, the User Research team tried out an early build of some proposed changes on volunteers from the general community. During one of the tests, they invited the volunteer to just play around with a particular component, to explore it the way they would at home.

    The usability subject scrolled around a bit, admired the visuals, selected a few things, and then had an idea to try to customize the component. He fiddled around a bit and quickly discovered the customization feaure.

    To celebrate his success, he proudly announced in a sing-song sort of way, "My pants are fancy!"

    That clip of a happy usability study participant gleefully announcing "My pants are fancy!" tickled the team's funny bone, and the phrase "My pants are fancy" became a catch phrase.

  • The Old New Thing

    The Softsel Hot List for the week of December 22, 1986

    • 21 Comments

    Back in the days before Internet-based software distribution, heck back even before the Internet existed in a form resembling what it is today, one of the most important ways of keeping track of the consumer computing industry was to subscribe to the Softsel Hot List, a weekly poster of the top sellers in various categories. Here is the Softsel Hot List for the week of December 22, 1986, or at least an HTML reproduction of it. The title at the top was inspired by a space agey font popular at the time, but I am too lazy to figure out how to do that in HTML so you'll have to use your imagination. (If your imagination fails you, you can use the photo from this page.)

    SOFTSEL® HOT LIST®
    HARDWARE
    This
    Week
    Last
    Week
    Weeks
    on
    Chart
      PRINTERS
     1   1   43    120D Dot Matrix • Citizen America
     2   3   30  P7 Pinwriter • NEC Information Systems
     3   2   100    MSP-10 Dot Matrix • Citizen America
     4   4   19    P6 Pinwriter • NEC Information Systems
     5   5   34    Premiere 35 Daisywheel • Citizen America
     6   6   37    MSP-20 Dot Matrix • Citizen America
     7   7   86    MSP-15 Dot Matrix • Citizen America
     8   8   46    3550 Spinwwriter • NEC Information Systems
     9   9   33    P5 Pinwriter • NEC Information Systems
     10   –   1  KX-P1080i Dot Matrix • Panasonic
     
      MONITORS
     1   1   40    JC 1401 Multisync • NEC Home Electronics
     2   2   98    Video 310A Hi-Res Amber TTL • Amdek
     3   3   79    Color 600 Hi-Res RGB • Amdek
     4   4   33    JB 1285 Amber TTL • NEC Home Electronics
     5   6   37    Color 722 CGA/EGA • Amdek
     6   5   82    121 Hi-Res Green TTL • Taxan
     7   –   1  318 Hi-Res Color • AT&T
     8   7   50    122 Amber TTL • Taxan
     9   –   1  Super Vision 720 Hi-Res • Taxan
     10   –   1  313 Mono • AT&T
     
      DISK DRIVES & STORAGE DEVICES
     1   1   27    Laser FD100 Apple Drives • Video Technology • AP
     2   4   17  Bernoulli Box Dual 20MB • Iomega • IBM, MAC
     3   2   26    FileCard 20MB Hard Disk/Card • Western Digital • IBM
     4   3   36    QIC-60H External Tape Backup • Tecmar • IBM
     5   6   23    Bernoulli Box Dual 10MB • Iomega • IBM, MAC
     6   5   26    QIC-60AT Internal Tape Backup • Tecmar • IBM
     7   8   3    Teac AT 360k Drive • Maynard • IBM
     8   –   23    Maynstream 20MB Portable Backup • Maynard • IBM
     
      BOARDS, MODEMS & INTERFACES
     1   1   165    Hercules Graphics Card Plus • Hercules • IBM
     2   2   155    SixPakPlus • AST Research • IBM
     3   3   113    Hercules Color Card • Hercules • IBM
     4   4   170    Smartmodem 1200B • Hayes • IBM
     5   7   14    Above Board/AT • Intel • IBM
     6   5   188    Smartmodem 1200 • Hayes • AP
     7   8   38    Advantage AT! • AST Research • IBM
     8   6   68    Smartmodem 2400 • Hayes • IBM
     9   9   36    Gamecard III • CH Products • IBm
     10   11   35    Smartmodem 2400B • Hayes • IBM
     11   13   113    Grappler • Orange Micro • AP
     12   16   37  Practical Modem 1200 • Practical Peripherals • IBM
     13   12   37    Above Board/PC • Intel • IBM
     14   10   31    QuadEGA+ • Quadram • IBM
     15   15   7    SixPakPremium • AST Research • IBM
     16   20   20    Rampage! AT • AST Research • IBM
     17   –   19    Hotlink • Orange Micro • AP
     18   18   17    Autoswitch EGA • Paradise Systems • IBM
     19   19   39    Expanding Quadboard • Quadram • IBM
     20   17   2    Advantage Premium • AST Research • IBM
     
      ACCESSORIES
     1   1   116    Mach III • CH Products • AP, IBM
     2   2   98    Microsoft Mouse • Microsoft • IBM
     3   3   178    Joystick • Kraft Systems • AP, IBM
     4   4   35    Mach II • CH Products • AP, IBM
     5   5   2  Tac 10 • Suncom • AP, IBM
     6   7   34    Safe Strip • Curtis Manufacturing
     7   8   222    System Saver • Kensington • AP, MAC
     8   9   10    Intel 80287 Coprocessor • Intel • IBM
     9   10   99    MasterPiece • Kensington • IBM
     10   –   21    Intel 8087 Coprocessor • Intel • IBM
    SOFTWARE
    This
    Week
    Last
    Week
    Weeks
    on
    Chart
      BUSINESS
     1   1   138    WordPerfect • WordPerfect Corp • AP, IBM
     2   2   202    1-2-3 • Lotus • IBM
     3   5   22  Javelin • Javelin • IBM
     4   3   161    Microsoft Word • Microsoft • IBM, MAC
     5   4   7    Quicken • Intuit • AP, IBM
     6   6   15    PFS:First Choice • Software Publishing • IBM
     7   8   31    SQZ! • Turner Hall • IBM
     8   7   49    dBase III Plus • Ashton-tate • IBM
     9   11   3  Lotus HAL • Lotus • IBM
     10   9   57    Q & A • Symantec • IBM
     11   12   113    Sidekick • Borland Int'l. • IBM
     12   10   56    Paradox • Ansa Software • IBM
     13   27   2  NewsMaster • Unison (Brown-Wagh) • IBM
     14   20   27  ProDesign II • American Small Bus. Comp. • IBM
     15   15   27    DAC Easy Accounting • DAC • IBM
     16   16   9    Microsoft Works • Microsoft • MAC
     17   13   57    VP Planner • Paperback Software • IBM
     18   18   11    PFS:Professional Write • Software Publishing • IBM
     19   14   18    MacDraft • IDD • MAC
     20   21   177    Multimate • Ashton-Tate • IBM
     21   22   55    Reflex • Borland Int'l. • IBM, MAC
     22   19   37    Multimate Advantage • Ashton-Tate • IBM
     23   24   25    Note-It • Turner Hall • IBM
     24   25   8    Clipper • Nantucket • IBM
     25   23   97    Wordstar 2000 • MicroPro Int'l. • IBM
     26   28   52    Microsoft Windows • Microsoft • IBM
     27   17   62    Microsoft Excel • Microsoft • MAC
     28   –   9    MORE • Living Videotext • MAC
     29   –   1  PFS:Professional File • Software Publishing • IBM
     30   –   7    R:Base System V • Microrim • IBM
     
      SYSTEMS & UTILITIES
     1   1   167    Crosstalk XVI • DCA/Crosstalk Communications • AP, IBM
     2   3   116    Norton Utilities • Norton Computing • IBM
     3   4   140    Sideways • Funk Software • IBM
     4   2   39    Fastback • Fifth Generation • IBM
     5   9   109  Turbo Pascal • Borland Int'l • AP, IBM, MAC
     6   8   17    Carbon Copy • Meridian Technology • IBM
     7   6   25    Dan Bricklin's Demo Program • Software Garden • IBM
     8   5   99    Smartcom II • Hayes • IBM, MAC
     9   7   27    XTREE • Executive Systems • IBM
     10   10   5    Disk Optimizer • SoftLogic Solutions • IBM
     
      HOME & EDUCATION
     1   1   128    Print Shop • Broderbund • AP, IBM, MAC, COM
     2   2   163    Math Blaster! • Davidson & Assoc. • AP, IBM, MAC, COM, AT
     3   6   5  Microsoft Learning DOS • Microsoft • IBM
     4   3   122    Typing Tutor III • Simon & Shuster • AP, IBM, MAC, COM
     5   4   21    Certificate Maker • Springboard • AP, IBM, COM
     6   –   90  Managing Your Money • MECA • AP, IBM
     7   5   95    The Newsroom • Springboard • AP, IBM, COM
     8   8   194    Bank Street Writer • Broderbund • AP, IBM, COM
     9   7   211    Mastertype • Mindscape • AP, IBM
     10   –   118    E.G. for Young Children • Springboard • AP, IBM, MAC
     
      RECREATION
     1   1   203    Microsoft Flight Simulator • Microsoft • IBM, MAC
     2   2   156    Sargon III • Hayden Software • AP, IBM, MAC, AT
     3   4   3    King's Quest III • Sierra On-Line • IBM, ST
     4   3   71    Jet • SubLogic • AP, IBM, COM
     5   5   52    Winter Games • Epyx • AP, MAC, COM, ST
     6   7   91    F-15 Strike Eagle • Microprose • AP, IBM
     7   6   204    Flight Simulator II • SubLogic • AP, COM, AT, AG
     8   8   43    Silent Service • Microprose • AP, IBM
     9   10   48    Where is Carmen San Diego • Broderbund • AP, IBM, COM
     10   –   1  Bop'N Wrestle • Mindscape • AP, IBM, COM
    BEST PERFORMERS THIS WEEK COMING UP FAST WATCH CLOSELY
    Week of December 22, 1986
    The HOT LIST is compiled from Softsel sales to over 15,000 dealers in 50 states and 45 countries. Sales may vary regionally. The names of the products and companies appearing above may be trademarks or registered trademarks.
    For an annual HOT LIST subscription, send your check for to: Softsel Computer Products, Inc., Attn: Hot List Subscriptions, 546 North Oak Street, P.O. Box 6080, Inglewood California, 90312-6080. For more details, please call Softsel's Marketing Department at (213) 412-8290.
    ©1986 Softsel® Computer Products, Inc.
  • The Old New Thing

    How did protected-mode 16-bit Windows fix up jumps to functions that got discarded?

    • 16 Comments

    Commenter Neil presumes that Windows 286 and later simply fixed up the movable entry table with jmp selector:offset instructions once and for all.

    It could have, but it went one step further.

    Recall that the point of the movable entry table is to provide a fixed location that always refers to a specific function, no matter where that function happens to be. This was necessary because real mode has no memory manager.

    But protected mode does have a memory manager. Why not let the memory manager do the work? That is, after all, its job.

    In protected-mode 16-bit Windows, the movable entry table was ignored. When one piece of code needed to reference another piece of code, it simply jumped to or called it by its selector:offset.

        push    ax
        call    0987:6543
    

    (Exercise: Why didn't I use call 1234:5678 as the sample address?)

    The selector was patched directly into the code as part of fixups. (We saw this several years ago in another context.)

    When a segment is relocated in memory, there is no stack walking to patch up return addresses to point to thunks, and no editing of the movable entry points to point to the new location. All that happens is that the base address in the descriptor table entry for the selector is updated to point to the new linear address of the segment. And when a segment is discarded, the descriptor table entry is marked not present, so that any future reference to it will raise a selector not present exception, which the kernel handles by reloading the selector.

    Things are a lot easier when you have a memory manager around. A lot of the head-exploding engineering in real-mode windows was in all the work of simulating a memory manager on a CPU that didn't have one!

  • The Old New Thing

    If 16-bit Windows had a single input queue, how did you debug applications on it?

    • 32 Comments

    After learning about the bad things that happened if you synchronized your application's input queue with its debugger, commenter kme wonders how debugging worked in 16-bit Windows, since 16-bit Windows didn't have asynchronous input? In 16-bit Windows, all applications shared the same input queue, which means you were permanently in the situation described in the original article, where the application and its debugger (and everything else) shared an input queue and therefore would constantly deadlock.

    The solution to UI deadlocks is to make sure the debugger doesn't have any UI.

    At the most basic level, the debugger communicated with the developer through the serial port. You connected a dumb terminal to the other end of the serial port. Mine was a Wyse 50 serial console terminal. All your debugging happened on the terminal. You could disassemble code, inspect and modify registers and memory, and even patch new code on the fly. If you wanted to consult source code, you needed to have a copy of it available somewhere else (like on your other computer). It was similar to using the cdb debugger, where the only commands available were r, db, eb, u, and a. Oh, and bp to set breakpoints.

    Now, if you were clever, you could use a terminal emulator program so you didn't need a dedicated physical terminal to do your debugging. You could connect the target computer to your development machine and view the disassembly and the source code on the same screen. But you weren't completely out of the woods, because what did you use to debug your development machine if it crashed? The dumb terminal, of course.¹

    Target machine
    Debugger
    Development machine
    Debugger
    Wyse 50
    dumb terminal

    I did pretty much all my Windows 95 debugging this way.

    If you didn't have two computers, another solution was to use a debugger like CodeView. CodeView avoided the UI deadlock problem by not using the GUI to present its UI. When you hit a breakpoint or otherwise halted execution of your application, CodeView talked directly to the video driver to save the first 4KB of video memory, then switched into text mode to tell you what happened. When you resumed execution, it restored the video memory, then switched the video card back into graphics mode, restored all the pixels it captured, then resumed execution as if nothing had happened. (If you were debugging a graphics problem, you could hit F3 to switch temporarily to graphics mode, so you could see what was on the screen.)

    If you were really fancy, you could spring for a monochrome adapter, either the original IBM one or the Hercules version, and tell CodeView to use that adapter for its debugging UI. That way, when you broke into the debugger, you could still see what was on the screen! We had multiple monitors before it was cool.

    ¹ Some people were crazy and cross-connected their target and development machines.

    Target machine
    Debugger
    Development machine
    Debugger

    This allowed them to use their target machine to debug their development machine and vice versa. But if your development machine crashed while it was debugging the target machine, then you were screwed.

  • The Old New Thing

    Why is the FAT driver called FASTFAT? Why would anybody ever write SLOWFAT?

    • 33 Comments

    Anon is interested in why the FAT driver is called FASTFAT.SYS. "Was there an earlier slower FAT driver? What could you possibly get so wrong with a FAT implementation that it needed to be chucked out?"

    The old FAT driver probably had a boring name like, um, FAT.SYS. At some point, somebody decided to write a newer, faster one, so they called it FASTFAT. And the name stuck.

    As for what you could possibly get so wrong with a FAT implementation that it needed to be improved: Remember that circumstances change over time. A design that works well under one set of conditions may start to buckle when placed under alternate conditions. It's not that the old implementation was wrong; it's just that conditions have changed, and the new implementation is better for the new conditions.

    For example, back in the old days, there were three versions of FAT: FAT8, FAT12, and FAT16. For such small disks, simple algorithms work just fine. In fact, they're preferable because a simple algorithm is easy to get right and is easier to debug. It also typically takes up a lot less space, and memory was at a premium in the old days. An O(n) algorithm is not a big deal if n never gets very large and the constants are small. Since FAT16 capped out at 65535 clusters per drive, there was a built-in limit on how big n could get. If a typical directory has only a few dozen files in it, then a linear scan is just fine.

    It's natural to choose algorithms that map directly to the on-disk data structures. (Remember, data structures determine algorithms.) FAT directories are just unsorted arrays of file names, so a simple directory searching function would just read through the directory one entry at a time until it found the matching file. Finding a free cluster is just a memory scan looking for a 0 in the allocation table. Memory management was simple: Don't try. Let the disk cache do it.

    These simple algorithms worked fine until FAT32 showed up and bumped n sky high. But fortunately, by the time that happened, computers were also faster and had more memory available, so you had more room to be ambitious.

    The big gains in FASTFAT came from algorithmic changes. For example, the on-disk data structures are transformed into more efficient in-memory data structures and cached. The first time you look in a directory, you need to do a linear search to collect all the file names, but if you cache them in a faster data structure (say, a hash table), subsequent accesses to the directory become much faster. And since computers now have more memory available, you can afford to keep a cache of directory entries around, as opposed to the old days where memory was tighter and large caches were a big no-no.

    (I wonder if any non-Microsoft FAT drivers do this sort of optimization, or whether they just do the obvious thing and use the disk data structures as memory data structures.)

    The original FAT driver was very good at solving the problem it was given, while staying within the limitations it was forced to operate under. It's just that over time, the problem changed, and the old solutions didn't hold up well any more. I guess it's a matter of interpretation whether this means that the old driver was "so wrong." If your child outgrows his toddler bed, does that mean the toddler bed was a horrible mistake?

  • The Old New Thing

    Why is there a 64KB no-man's land near the end of the user-mode address space?

    • 29 Comments

    We learned some time ago that there is a 64KB no-man's land near the 2GB boundary to accommodate a quirk of the Alpha AXP processor architecture. But that's not the only reason why it's there.

    The no-man's land near the 2GB boundary is useful even on x86 processors because it simplifies parameter validation at the boundary between user mode and kernel mode by taking out a special case. If the 64KB zone did not exist, then somebody could pass a buffer that straddles the 2GB boundary, and the kernel mode validation layer would have to detect that unusual condition and reject the buffer.

    By having a guaranteed invalid region, the kernel mode buffer validation code can simply validate that the starting address is below the 2GB boundary, then walk through the buffer checking each page. If somebody tries to straddle the boundary, the validation code will hit the permanently-invalid region and fail.

    Yes, this sounds like a micro-optimization, but I suspect this was not so much for optimization purposes as it was to remove weird boundary conditions, because weird boundary conditions are where the bugs tend to be.

    (Obviously, the no-man's land moves if you set the /3GB switch.)

  • The Old New Thing

    Why is 0x00400000 the default base address for an executable?

    • 31 Comments

    The default base address for a DLL is 0x10000000, but the default base address for an EXE is 0x00400000. Why that particular value for EXEs? What's so special about 4 megabytes

    It has to do with the amount of address space mapped by a single page directory entry on an x86 and a design decision made in 1987.

    The only technical requirement for the base address of an EXE is that it be a multiple of 64KB. But some choices for base address are better than others.

    The goal in choosing a base address is to minimize the likelihood that modules will have to be relocated. This means not colliding with things already in the address space (which will force you to relocate) as well as not colliding with things that may arrive in the address space later (forcing them to relocate). For an executable, the not colliding with things that may arrive later part means avoiding the region of the address space that tends to fill with DLLs. Since the operating system itself puts DLLs at high addresses and the default base address for non-operating system DLLs is 0x10000000, this means that the base address for the executable should be somewhere below 0x10000000, and the lower you go, the more room you have before you start colliding with DLLs. But how low can you go?

    The first part means that you also want to avoid the things that are already there. Windows NT didn't have a lot of stuff at low addresses. The only thing that was already there was a PAGE_NOACCESS page mapped at zero in order to catch null pointer accesses. Therefore, on Windows NT, you could base your executable at 0x00010000, and many applications did just that.

    But on Windows 95, there was a lot of stuff already there. The Windows 95 virtual machine manager permanently maps the first 64KB of physical memory to the first 64KB of virtual memory in order to avoid a CPU erratum. (Windows 95 had to work around a lot of CPU bugs and firmware bugs.) Furthermore, the entire first megabyte of virtual address space is mapped to the logical address space of the active virtual machine. (Nitpickers: actually a little more than a megabyte.) This mapping behavior is required by the x86 processor's virtual-8086 mode.

    Windows 95, like its predecessor Windows 3.1, runs Windows in a special virtual machine (known as the System VM), and for compatibility it still routes all sorts of things through 16-bit code just to make sure the decoy quacks the right way. Therefore, even when the CPU is running a Windows application (as opposed to an MS-DOS-based application), it still keeps the virtual machine mapping active so it doesn't have to do page remapping (and the expensive TLB flush that comes with it) every time it needs to go to the MS-DOS compatibility layer.

    Okay, so the first megabyte of address space is already off the table. What about the other three megabytes?

    Now we come back to that little hint at the top of the article.

    In order to make context switching fast, the Windows 3.1 virtual machine manager "rounds up" the per-VM context to 4MB. It does this so that a memory context switch can be performed by simply updating a single 32-bit value in the page directory. (Nitpickers: You also have to mark instance data pages, but that's just flipping a dozen or so bits.) This rounding causes us to lose three megabytes of address space, but given that there was four gigabytes of address space, a loss of less than one tenth of one percent was deemed a fair trade-off for the significant performance improvement. (Especially since no applications at the time came anywhere near beginning to scratch the surface of this limit. Your entire computer had only 2MB of RAM in the first place!)

    This memory map was carried forward into Windows 95, with some tweaks to handle separate address spaces for 32-bit Windows applications. Therefore, the lowest address an executable could be loaded on Windows 95 was at 4MB, which is 0x00400000.

    Geek trivia: To prevent Win32 applications from accessing the MS-DOS compatibility area, the flat data selector was actually an expand-down selector which stopped at the 4MB boundary. (Similarly, a null pointer in a 16-bit Windows application would result in an access violation because the null selector is invalid. It would not have accessed the interrupt vector table.)

    The linker chooses a default base address for executables of 0x0400000 so that the resulting binary can load without relocation on both Windows NT and Windows 95. Nobody really cares much about targeting Windows 95 any more, so in principle, the linker folks could choose a different default base address now. But there's no real incentive for doing it aside from making diagrams look prettier, especially since ASLR makes the whole issue moot anyway. And besides, if they changed it, then people would be asking, "How come some executables have a base address of 0x04000000 and some executables have a base address of 0x00010000?"

    TL;DR: To make context switching fast.

  • The Old New Thing

    What is the story of the mysterious DS_RECURSE dialog style?

    • 12 Comments

    There are a few references to the DS_RECURSE dialog style scattered throughout MSDN, and they are all of the form "Don't use it." But if you look in your copy of winuser.h, there is no sign of DS_RECURSE anywhere. This obviously makes it trivial to avoid using it because you couldn't use it even if you wanted it, seeing as it doesn't exist.

    "Do not push the red button on the control panel!"

    There is no red button on the control panel.

    "Well, that makes it easy not to push it."

    As with many of these types of stories, the answer is rather mundane.

    When nested dialogs were added to Windows 95, the flag to indicate that a dialog is a control host was DS_RECURSE. The name was intended to indicate that anybody who is walking a dialog looking for controls should recurse into this window, since it has more controls inside.

    The window manager folks later decided to change the name, and they changed it to DS_CONTROL. All documentation that was written before the renaming had to be revised to change all occurrences of DS_RECURSE to DS_CONTROL.

    It looks like they didn't quite catch them all: There are two straggling references in the Windows Embedded documentation. My guess is that the Windows Embedded team took a snapshot of the main Windows documentation, and they took their snapshot before the renaming was complete.

    Unfortunately, I don't have any contacts in the Windows Embedded documentation team, so I don't know whom to contact to get them to remove the references to flags that don't exist.

  • The Old New Thing

    What did Windows 3.1 do when you hit Ctrl+Alt+Del?

    • 29 Comments

    This is the end of Ctrl+Alt+Del week, a week that sort of happened around me and I had to catch up with.

    The Windows 3.1 virtual machine manager had a clever solution for avoiding deadlocks: There was only one synchronization object in the entire kernel. It was called "the critical section", with the definite article because there was only one. The nice thing about a system where the only available synchronization object is a single critical section is that deadlocks are impossible: The thread with the critical section will always be able to make progress because the only thing that could cause it to stop would be blocking on a synchronization object. But there is only one synchronization object (the critical section), and it already owns that.

    When you hit Ctrl+Alt+Del in Windows 3.1, a bunch of crazy stuff happened. All this work was in a separate driver, known as the virtual reboot device. By convention, all drivers in Windows 3.1 were called the virtual something device because their main job was to virtualize some hardware or other functionality. That's where the funny name VxD came from. It was short for virtual x device.

    First, the virtual reboot device driver checked which virtual machine had focus. If you were using an MS-DOS program, then it told all the device drivers to clean up whatever they were doing for that virtual machine, and then it terminated the virtual machine. This was the easy case.

    Otherwise, the focus was on a Windows application. Now things got messy.

    When the 16-bit Windows kernel started up, it gave the virtual reboot device the addresses of a few magic things. One of those magic things was a special byte that was set to 1 every time the 16-bit Windows scheduler regained control. When you hit Ctrl+Alt+Del, the virtual reboot device set the byte to 0, and it also registered a callback with the virtual machine manager to say "Call me back once the critical section becomes available." The callback didn't do anything aside from remember the fact that it was called at all. And then the code waited for ¾ seconds. (Why ¾ seconds? I have no idea.)

    After ¾ seconds, the virtual reboot device looked to see what the state of the machine was.

    If the "call me back once the critical section becomes available" callback was never called, then the problem is that a device driver is stuck in the critical section. Maybe the device driver put an Abort, Retry, Ignore message on the screen that the user needs to respond to. The user saw this message:


     Procomm 


    This background non-Windows application is not responding.

    *  Press any key to activate the non-Windows application.
    *  Press CTRL+ALT+DEL again to restart your computer. You will
       lose any unsaved information.


      Press any key to continue _

    After the user presses a key, focus was placed on the virtual machine that holds the critical section so the user can address the problem. A user who is still stuck can hit Ctrl+Alt+Del again to restart the whole process, and this time, execution will go into the "If you were using an MS-DOS program" paragraph, and the code will shut down the stuck virtual machine.

    If the critical section was not the problem, then the virtual reboot device checked if the 16-bit kernel scheduler had set the byte to 1 in the meantime. If so, then it means that no applications were hung, and you got the message


     Windows 


    Although you can use CTRL+ALT+DEL to quit an application that has stopped responding to the system, there is no application in this state.

    To quit an application, use the application's quit or exit command, or choose the Close command from the Control menu.

    *  Press any key to return to Windows.
    *  Press CTRL+ALT+DEL again to restart your computer. You will
       lose any unsaved information in all applications.


      Press any key to continue _

    (Anachronism alert: The System menu was called the Control menu back then.)

    Otherwise, the special byte was still 0, which means that the 16-bit scheduler never got control, which meant that a 16-bit Windows application was not releasing control back to the kernel. The virtual reboot device then waited for the virtual machine to finish processing any pending virtual interrupts. (This allowed any pending MS-DOS emulation or 16-bit MS-DOS device drivers to finish up their work.) If things did not return to this sane state within 3¼ seconds, then you got this screen:


     Windows 


    The system is either busy or has become unstable. You can wait and see if the system becomes available again and continue working or you can restart your computer.

    *  Press any key to return to Windows and wait.
    *  Press CTRL+ALT+DEL again to restart your computer. You will
       lose any unsaved information in all applications.


      Press any key to continue _

    Otherwise, we are in the case where the system returned to a state where there are no active virtual interrupts. The kernel single-stepped the processor if necessary until the instruction pointer was no longer in the kernel, or until it had single-stepped for 5000 instructions and the instruction pointer was not in the heap manager. (The heap manager was allowed to run for more than 5000 instructions.)

    At this point, you got the screen that Steve Ballmer wrote.



    Contoso Deluxe Music Composer


      This Windows application has stopped responding to the system.

      *  Press ESC to cancel and return to Windows.
      *  Press ENTER to close this application that is not responding.
         You will lose any unsaved information in this application.
      *  Press CTRL+ALT+DEL again to restart your computer. You will
         lose any unsaved information in all applications.


    If you hit Enter, then the 16-bit kernel terminated the application by doing mov ax, 4c00h followed by int 21h, which was the system call that applications used to exit normally. This time, the kernel is making the exit call on behalf of the stuck application. Everything looks like the application simply decided to exit normally.

    The stuck application exits, the kernel regains control, and hopefully, things return to normal.

    I should point out that I didn't write any of this code. "It was like that when I got here."

    Bonus chatter: There were various configuration settings to tweak all of the above behavior. For example, you could say that Ctrl+Alt+Del always restarted the computer rather than terminating the current application. Or you could skip the check whether the 16-bit kernel scheduler had set the byte to 1 so that you could use Ctrl+Alt+Del to terminate an application even if it wasn't hung.¹ There was also a setting to restart the computer upon receipt of an NMI, the intention being that the signal would be triggered either by a dedicated add-on switch or by poking a ball-point pen in just the right spot. (This is safer than just pushing the reset button because the restart would flush disk caches and shut down devices in an orderly manner.)

    ¹ This setting was intended for developers to assist in debugging their programs because if you went for this option, the program that got terminated is whichever one happened to have control of the CPU at the time you hit Ctrl+Alt+Del. This was, in theory, random, but in practice it often guessed right. That's because the problem was usually that a program got wedged into an infinite message loop, so most of the CPU was being run in the stuck application anyway.

  • The Old New Thing

    The history of Win32 critical sections so far

    • 19 Comments

    The CRITICAL_SECTION structure has gone through a lot of changes since its introduction back oh so many decades ago. The amazing thing is that as long as you stick to the documented API, your code is completely unaffected.

    Initially, the critical section object had an owner field to keep track of which thread entered the critical section, if any. It also had a lock count to keep track of how many times the owner thread entered the critical section, so that the critical section would be released when the matching number of Leave­Critical­Section calls was made. And there was an auto-reset event used to manage contention. We'll look more at that event later. (It's actually more complicated than this, but the details aren't important.)

    If you've ever looked at the innards of a critical section (for entertainment purposes only), you may have noticed that the lock count was off by one: The lock count was the number of active calls to Enter­Critical­Section minus one. The bias was needed because the original version of the interlocked increment and decrement operations returned only the sign of the result, not the revised value. Biasing the result by 1 means that all three states could be detected: Unlocked (negative), locked exactly once (zero), reentrant lock (positive). (It's actually more complicated than this, but the details aren't important.)

    If a thread tries to enter a critical section but can't because the critical section is owned by another thread, then it sits and waits on the contention event. When the owning thread releases all its claims on the critical section, it signals the event to say, "Okay, the door is unlocked. The next guy can come in."

    The contention event is allocated only when contention occurs. (This is what older versions of MSDN meant when they said that the event is "allocated on demand.") Which leads to a nasty problem: What if contention occurs, but the attempt to create the contention event fails? Originally, the answer was "The kernel raises an out-of-memory exception."

    Now you'd think that a clever program could catch this exception and try to recover from it, say, by unwinding everything that led up to the exception. Unfortunately, the weakest link in the chain is the critical section object itself: In the original version of the code, the out-of-memory exception was raised while the critical section was in an unstable state. Even if you managed to catch the exception and unwind everything you could, the critical section was itself irretrievably corrupted.

    Another problem with the original design became apparent on multiprocessor systems: If a critical section was typically held for a very brief time, then by the time you called into kernel to wait on the contention event, the critical section was already freed!

    void SetGuid(REFGUID guid)
    {
     EnterCriticalSection(&g_csGuidUpdate);
     g_theGuid = guid;
     LeaveCriticalSection(&g_csGuidUpdate);
    }
    
    void GetGuid(GUID *pguid)
    {
     EnterCriticalSection(&g_csGuidUpdate);
     *pguid = g_theGuid;
     LeaveCriticalSection(&g_csGuidUpdate);
    }
    

    This imaginary code uses a critical section to protect accesses to a GUID. The actual protected region is just nine instructions long. Setting up to wait on a kernel object is way, way more than nine instructions. If the second thread immediately waited on the critical section contention event, it would find that by the time the kernel got around to entering the wait state, the event would say, "Dude, what took you so long? I was signaleded, like, a bazillion cycles ago!"

    Windows 2000 added the Initialize­Critical­Section­And­Spin­Count function, which lets you avoid the problem where waiting for a critical section costs more than the code the critical section was protecting. If you initialize with a spin count, then when a thread tries to enter the critical section and can't, it goes into a loop trying to enter it over and over again, in the hopes that it will be released.

    "Are we there yet? How about now? How about now? How about now? How about now? How about now? How about now? How about now? How about now? How about now? How about now? How about now?"

    If the critical section is not released after the requested number of iterations, then the old slow wait code is executed.

    Note that spinning on a critical section is helpful only on multiprocessor systems, and only in the case where you know that all the protected code segments are very short in duration. If the critical section is held for a long time, then spinning is wasteful since the odds that the critical section will become free during the spin cycle are very low, and you wasted a bunch of CPU.

    Another feature added in Windows 2000 is the ability to preallocate the contention event. This avoids the dreaded "out of memory" exception in Enter­Critical­Section, but at a cost of a kernel event for every critical section, whether actual contention occurs or not.

    Windows XP solved the problem of the dreaded "out of memory" exception by using a fallback algorithm that could be used if the contention event could not be allocated. The fallback algorithm is not as efficient, but at least it avoids the "out of memory" situation. (Which is a good thing, because nobody really expects Enter­Critical­Section to fail.) This also means that requests for the contention event to be preallocated are now ignored, since the reason for preallocating (avoiding the "out of memory" exception) no longer exists.

    (And while they were there, the kernel folks also fixed Initialize­Critical­Section so that a failed initialization left the critical section object in a stable state so you could safely try again later.)

    In Windows Vista, the internals of the critical section object were rejiggered once again, this time to add convoy resistance. The internal bookkeeping completely changed; the lock count is no longer a 1-biased count of the number of Enter­Critical­Section calls which are pending. As a special concession to backward compatibility with people who violated the API contract and looked directly at the internal data structures, the new algorithm goes to some extra effort to ensure that if a program breaks the rules and looks at a specific offset inside the critical section object, the value stored there is −1 if and only if the critical section is unlocked.

    Often, people will remark that "your compatibility problems would go away if you just open-sourced the operating system." I think there is some confusion over what "go away" means. If you release the source code to the operating system, it makes it even easier for people to take undocumented dependencies on it, because they no longer have the barrier of "Well, I can't find any documentation, so maybe it's not documented." They can just read the source code and say, "Oh, I see that if the critical section is unlocked, the Lock­Count variable has the value −1." Boom, instant undocumented dependency. Compatibility is screwed. (Unless what people are saying "your compatibility problems would go away if you just open-sourced all applications, so that these problems can be identified and fixed as soon as they are discovered.")

    Exercise: Why isn't it important that the fallback algorithm be highly efficient?

Page 1 of 50 (498 items) 12345»