• The Old New Thing

    Oh, that's probably why I'm in the Quake credits


    Back in 2012, I couldn't remember why I was in the Quake credits. But then the comment from Kristaps pinpointed the most likely reason: The Sys_Page­In function.

    void Sys_PageIn (void *ptr, int size)
    {
        byte    *x;
        int     j, m, n;

    // touch all the memory to make sure it's there. The 16-page skip is to
    // keep Win 95 from thinking we're trying to page ourselves in (we are
    // doing that, of course, but there's no reason we shouldn't)
        x = (byte *)ptr;

        for (n=0 ; n<4 ; n++)
        {
            for (m=0 ; m<(size - 16 * 0x1000) ; m += 4)
            {
                sys_checksum += *(int *)&x[m];
                sys_checksum += *(int *)&x[m + 16 * 0x1000];
            }
        }
    }

    What this code does is access the memory block specified by the ptr and size parameters in an unusual pattern: It reads the int at offset zero, then the int at an offset of 16 pages, then the int at offset four, then the int at an offset of 16 pages plus four, and so on, alternating between a value and its counterpart 16 pages ahead.

    This specific access pattern defeated Windows 95's "sequential memory scan" detection algorithm.

    Recall that computers in the Windows 95 era had 4MB of RAM. Suppose you were working in a document for a long time. Finally, you're done, and you close the window or minimize it. Boom, now your desktop is visible and the wallpaper bitmap needs to be paged in. If your screen is 1024 × 768 at 16 bits per pixel, that comes out to 1.5MB of memory. Paging in 1.5MB of memory for the bitmap means kicking out 1.5MB of memory being used for other stuff, and that's a lot of memory for a machine that has only 4MB to work with (especially since a lot of that 4MB belongs to stuff that isn't eligible for being paged out). The phenomenon we saw was that repainting your desktop would flush out most of your memory.

    And then the next thing you do is probably launch a new application, which will cover the wallpaper, so the wallpaper memory isn't going to be needed any more. So we basically purged all the memory in your system in order to handle a huge block of memory that got accessed only once.

    The trick that Windows 95 used was to watch your pattern of page faults, and if it saw that you were doing sequential memory access, it started marking the memory 16 pages behind the current access as not recently accessed. In the case of a straight sequential scan, this means that the entire buffer cycles through a 64KB window of memory, regardless of the buffer size. With this trick, a 4MB buffer ends up consuming only 64KB of memory, as opposed to using all the memory in your system.
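
    Here's a rough sketch of how such a detector could work. The actual Windows 95 implementation isn't public, so the names and structure below are invented for illustration; only the 16-page/64KB arithmetic comes from the description above.

    // Hypothetical page-fault hook illustrating the heuristic described
    // above; all names are invented. On each fault, if the faulting page
    // immediately follows the previous one, assume a sequential scan and
    // age the page 16 pages back.
    #define PAGE_SHIFT 12    // 4KB pages
    #define SKIP_PAGES 16    // 16 pages x 4KB = 64KB window

    void MarkNotRecentlyAccessed(unsigned page);  // assumed VMM helper

    static unsigned g_lastFaultPage = (unsigned)-1;

    void OnPageFault(unsigned faultAddress)
    {
        unsigned page = faultAddress >> PAGE_SHIFT;
        if (page == g_lastFaultPage + 1 && page >= SKIP_PAGES) {
            // Sequential scan detected: the page 16 pages back becomes
            // a prime eviction candidate, so a straight scan cycles
            // through a 64KB window instead of flushing all of memory.
            MarkNotRecentlyAccessed(page - SKIP_PAGES);
        }
        g_lastFaultPage = page;
    }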

    The Sys_Page­In function specifically defeats the sequential-scan detector by intentionally going back 16 pages and accessing the page again. This causes the page to be marked recently used, counteracting the not-recently-used marking that the sequential-scan detector had applied. Result: The memory pages are all marked recently used and are no longer prime candidates for being paged out.

  • The Old New Thing

    How do you get network connectivity from the worst PC in the world?


    Some time ago, I wrote about the two worst PCs ever. The worst PC of all time, according to PC World magazine, was the Packard Bell PC. As installed at the factory, the computer came with every single expansion slot filled. Need to add a peripheral to your computer? Ha-ha, you can't!

    Now, this was in the days before motherboards had integrated network adapters, and a network adapter was not one of the expansion cards that came preinstalled. But I needed to get network access in order to install the latest builds, and I teased at the end of the article that I had to resort to other devious means of obtaining network connectivity.

    Nobody ever asked me to follow up on that teaser, but I'm going to answer it anyway. "So, Raymond, how did you get network connectivity on a computer that had no network adapter and nowhere to plug in a network adapter card?"

    Of course, the cheat would be to unplug one of the existing expansion cards (the dial-up modem was a good candidate), but that would remove a piece of hardware that the senior executive's identical home PC was using.

    My solution was to use a little-known feature of Windows 95 known as Direct Cable Connection, or DCC. DCC allowed you to use your parallel port as a network adapter. (I am not making this up.) I obtained a special cable from the manager of the DCC project and hooked up one end to the Packard Bell PC and the other end to my main development machine, which acted as a bridge between the Packard Bell PC and the rest of the corporate network.

    I installed new builds of Windows 95 this way, which was a great source of amusement (and by amusement, I mean frustration) to the Windows 95 setup team, who found themselves dealing with failures that occurred on a network configuration most of them had never heard of. (But which, realistically, was one of the flakiest network configurations in the world.)

    I also ran nightly stress tests this way, which offered a similar degree of amusement/frustration to whatever developer had to investigate the failures turned up by the accursed machine. For example, one of the things that made debugging difficult was that if you broke into the kernel debugger for too long, the DCC network connection dropped, and you lost network connectivity.

    I think it's kind of fitting that the worst PC in the world also offered the worst debugging experience in the world.

  • The Old New Thing

    The Windows 95 I/O system assumed that if it wrote a byte, then it could read it back


    In Windows 95, compressed data was read off the disk in three steps.

    1. The raw compressed data was read into a temporary buffer.
    2. The compressed data was uncompressed into a second temporary buffer.
    3. The uncompressed data was copied to the application-provided I/O buffer.

    But you could save a step if the I/O buffer was a full cluster:

    1. The raw compressed data was read into a temporary buffer.
    2. The compressed data was uncompressed directly into the application-provided I/O buffer.

    A common characteristic of dictionary-based compression is that a compressed stream can contain a code that says "Generate a copy of bytes X through Y from the existing uncompressed data."

    As a simplified example, suppose the cluster consisted of two copies of the same 512-byte block. The compressed data might say "Take these 512 bytes and copy them to the output. Then take bytes 0 through 511 of the uncompressed output and copy them to the output."
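
    In code, the copy step of such a decompressor might look like this (a generic LZ77-style sketch for illustration, not Windows 95's actual decompression code):

    // Generic sketch of the "copy from earlier output" step of a
    // dictionary-based decompressor. Note that it reads back bytes it
    // previously wrote to the output buffer, which assumes the buffer
    // behaves like ordinary RAM.
    #include <stddef.h>

    void CopyFromOutput(unsigned char *out, size_t *outPos,
                        size_t backOffset, size_t length)
    {
        size_t src = *outPos - backOffset;  // earlier uncompressed bytes
        for (size_t i = 0; i < length; i++) {
            // If "out" is a VGA frame buffer, this read does not return
            // the byte that was written, and the copy produces garbage.
            out[*outPos + i] = out[src + i];
        }
        *outPos += length;
    }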

    So far, so good.

    Well, except that if the application wrote to the I/O buffer while the read was in progress, then the read would get corrupted because it would copy the wrong bytes to the second half of the cluster.

    Fortunately, writing to the I/O buffer is forbidden during the read, so any application that pulled this sort of trick was breaking the rules, and if it got corrupted data, well, that's its own fault. (You can construct a similar scenario where writing to the buffer during a write can result in corrupted data being written to disk.)

    Things got even weirder if you passed a memory-mapped device as your I/O buffer. There was a bug report that said, "The splash screen for this MS-DOS game is all corrupted if you run it from a compressed volume."

    The reason was that the game issued an I/O directly into the video frame buffer. The EGA and VGA video frame buffers used planar memory and latching. When you read or write a byte in video memory, the resulting behavior is a complicated combination of the byte you wrote, the values in the latches, other configuration settings, and the values already in memory. The details aren't important; the important thing is that video memory does not act like system RAM. Write a byte to video memory, then read it back, and not only will you not get the same value back, but you probably modified video memory in a strange way.

    The game in question loaded its splash screen by issuing I/O directly into video memory, knowing that MS-DOS copies the result into the output buffer byte by byte. It set up the control registers and the latches in such a way that the bytes written into memory went exactly where they should. (It issued four reads into the same buffer, with different control registers each time, so that each read ended up being issued to a different plane.)

    This worked great, unless the disk was compressed.

    The optimization above relied on the property that writing a byte followed by reading the byte produces the byte originally written. But this doesn't work for video memory because of the weird way video memory works. The result was that when the decompression engine tried to read what it thought was the uncompressed data, it was actually asking the video controller to do some strange operations. The result was corrupted decompressed data, and corrupted video data.

    The fix was to force double-buffering in non-device RAM if the I/O buffer was into device-mapped memory.

  • The Old New Thing

    Windows started picking up the really big pieces of TerminateThread garbage on the sidewalk, but it's still garbage on the sidewalk


    Ah, Terminate­Thread. There are still people who think that there are valid scenarios for calling Terminate­Thread.

    Can you explain how Exit­Thread works?

    We are interested because we have a class called Thread­Class. We call the Start() method, and then the Stop() method, and then the Wait­Until­Stopped() method, and then the process hangs with this call stack:


    Can you help us figure out what's going on?

    From the stack trace, it is clear that the thread is shutting down, and the loader (Ldr) is waiting on a critical section. The critical section the loader is most famous for needing is the so-called loader lock, which is used for various things, most notably to make sure that all DLL thread notifications are serialized.

    I guessed that the call to Wait­Until­Stopped() was happening inside Dll­Main, which created a deadlock because the thread cannot exit until it delivers its Dll­Main notifications, but it can't do that until the calling thread exits Dll­Main.
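
    In other words, I suspected something like this (a reconstruction; the customer's actual code was not shown, and the names are invented):

    // Reconstruction of the suspected deadlock. DllMain runs while the
    // loader lock is held; the worker thread cannot finish exiting until
    // it delivers its DLL_THREAD_DETACH notification, which also needs
    // the loader lock, so the wait below never completes.
    #include <windows.h>

    void SignalWorkerToStop(void);  // hypothetical helper
    HANDLE g_worker;                // created earlier with CreateThread

    BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID reserved)
    {
        if (reason == DLL_PROCESS_DETACH && g_worker != NULL) {
            SignalWorkerToStop();
            WaitForSingleObject(g_worker, INFINITE);  // deadlocks here
            CloseHandle(g_worker);
        }
        return TRUE;
    }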

    The customer did some more debugging:

    The debugger reports the critical section as

    CritSec ntdll!LdrpLoaderLock+0 at 77724300
    WaiterWoken        No
    LockCount          3
    RecursionCount     1
    OwningThread       a80
    EntryCount         0
    ContentionCount    3
    *** Locked

    The critical section claims that it is owned by thread 0xa80, but there is no such active thread in the process. In the kernel debugger, a search for that thread says

    Looking for thread Cid = a80 ...
    THREAD 8579e1c0  Cid 0b58.0a80  Teb: 00000000 Win32Thread: 00000000 TERMINATED
    Not impersonating
    DeviceMap                 862f8a98
    Owning Process            0       Image:         <Unknown>
    Attached Process          84386d90       Image:         Contoso.exe
    Wait Start TickCount      12938474       Ticks: 114780 (0:00:29:50.579)
    Context Switch Count      8             
    UserTime                  00:00:00.000
    KernelTime                00:00:00.000
    Win32 Start Address 0x011167c0
    Stack Init 0 Current bae35be0 Base bae36000 Limit bae33000 Call 0
    Priority 10 BasePriority 8 PriorityDecrement 2 IoPriority 2 PagePriority 5

    Contoso.exe is our process.

    Okay, we're getting somewhere now. The thread 0xa80 terminated while it held the loader lock. When you run the program under a debugger, do you see any exceptions that might suggest that the thread terminated abnormally?

    We found the cause of the problem. We use Terminate­Thread in the other place. That causes the thread to continue to hold the loader lock after it has terminated.

    It's not clear what the customer meant by "the other place", but no matter. The cause of the problem was found: They were using Terminate­Thread.

    At this point, Larry Osterman was inspired to write a poem.

    How many times does
    it have to be said: Never
    call TerminateThread.

    In the ensuing discussion, somebody suggested,

    One case where it is okay to use Terminate­Thread is if the thread was created suspended and has never been resumed. I believe it is perfectly legal to terminate it, at least in Windows Vista and later.

    No, it is not "perfectly legal," for certain values of "perfectly legal."

    What happened is that Windows Vista added some code to try to limit the impact of a bad idea. Specifically, it added code to free the thread's stack when the thread was terminated, so that each terminated thread didn't leak a megabyte of memory. In the parlance of an earlier discussion, I referred to this as stop throwing garbage on the sidewalk.

    In this case, it's like saying, "It's okay to run this red light because the city added a delayed green to the cross traffic." The city added a delayed green to the cross traffic because people were running the light and the city didn't want people to die. That doesn't mean that it's okay to run the light now.

    Unfortunately, the guidance that says "Sometimes it's okay to call Terminate­Thread" has seeped into our own Best Practices documents. The Dynamic-Link Library Best Practices under Best Practices for Synchronization describes a synchronization model which actually involves calling Terminate­Thread.

    Do not do this.

    It's particularly sad because the downloadable version of the document references both Larry and me telling people to stop doing crazy things in Dll­Main, and terminating threads is definitely a crazy thing.

    (The solution to the problem described in the whitepaper is not to use Terminate­Thread. It's to use the Free­Library­And­Exit­Thread pattern.)
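
    For reference, the pattern looks something like this (a minimal sketch, assuming the worker thread lives in the DLL that needs to be unloaded):

    // Minimal sketch of the FreeLibraryAndExitThread pattern: the worker
    // thread pins its own DLL, does its work, then atomically releases
    // the pin and exits, so nobody is tempted to TerminateThread it.
    #include <windows.h>

    DWORD WINAPI WorkerThread(LPVOID parameter)
    {
        HMODULE self = NULL;
        // Resolve the module containing this function and bump its
        // reference count so the DLL stays loaded while the thread runs.
        GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
                           (LPCWSTR)(void *)WorkerThread, &self);

        // ... do the actual work, checking a "please stop" flag ...

        // Release the reference and end the thread in one step; simply
        // returning after FreeLibrary would execute unloaded code.
        FreeLibraryAndExitThread(self, 0);
    }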

    Now the history.

    Originally, there was no Terminate­Thread function. The original designers felt strongly that no such function should exist because there was no safe way to terminate a thread, and there's no point having a function that cannot be called safely. But people screamed that they needed the Terminate­Thread function, even though it wasn't safe, so the operating system designers caved and added the function because people demanded it. Of course, those people who insisted that they needed Terminate­Thread now regret having been given it.

    It's one of those "Be careful what you wish for" things.

  • The Old New Thing

    The changing fortunes of being the last program on the Start menu


    One of the things I didn't mention during my discussion lo these many years ago of how Windows XP chooses what goes on the front page of the Start menu is that the last-place position is actually kind of special.

    In Windows Vista, the last-place position was used to hold the program you ran most recently, regardless of how often you ran it. In a sense, the last slot is a one-entry MRU. (This was the general idea, although it could be overridden by other principles. For example, if the program you last ran is pinned, then it shows up in its natural pinned location rather than being shoved to the bottom of the Start menu.)

    In Windows 7, the magical last-place position goes to the application you just installed, if one exists, to save you the trouble of hunting for it. (As we saw some time ago, you can set the System.App­User­Model.Exclude­From­Show­In­New­Install property to remove your shortcut from consideration as a newly-installed program.)
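
    An installer authoring a shortcut could set that property along these lines (a sketch with error handling trimmed; assumes the .lnk is being built with IShellLink):

    // Sketch: mark a shortcut as excluded from the "newly installed"
    // highlight by setting the property on the shortcut's property store.
    #include <windows.h>
    #include <shobjidl.h>
    #include <propkey.h>
    #include <propvarutil.h>

    HRESULT ExcludeFromNewInstallHighlight(IShellLink *link)
    {
        IPropertyStore *store = NULL;
        HRESULT hr = link->QueryInterface(IID_PPV_ARGS(&store));
        if (SUCCEEDED(hr)) {
            PROPVARIANT value;
            hr = InitPropVariantFromBoolean(TRUE, &value);
            if (SUCCEEDED(hr)) {
                // System.AppUserModel.ExcludeFromShowInNewInstall = TRUE
                hr = store->SetValue(
                    PKEY_AppUserModel_ExcludeFromShowInNewInstall, value);
                if (SUCCEEDED(hr)) hr = store->Commit();
                PropVariantClear(&value);
            }
            store->Release();
        }
        return hr;
    }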

  • The Old New Thing

    What did the Ignore button do in Windows 3.1 when an application encountered a general protection fault?


    In Windows 3.0, when an application encountered a general protection fault, you got an error message that looked like this:

    Application error
    CONTOSO caused a General Protection Fault in
    module CONTOSO.EXE at 0002:2403

    In Windows 3.1, under the right conditions, you would get a second option:

    An error has occurred in your application.
    If you choose Ignore, you should save your work in a new file.
    If you choose Close, your application will terminate.

    Okay, we know what Close does. But what does Ignore do? And under what conditions will it appear?

    Roughly speaking, the Ignore option became available if

    • The fault is a general protection fault,
    • The faulting instruction is not in the kernel or the window manager,
    • The faulting instruction is one of the following, possibly with one or more prefix bytes:
      • Memory operations: op r, m; op m, r; or op m.
      • String memory operations: movs, stos, etc.
      • Selector load: lds, les, pop ds, pop es.

    If those conditions were met and you chose Ignore, the kernel did the following:

    • If the faulting instruction is a selector load instruction, the destination selector register is set to zero.
    • If the faulting instruction is a pop instruction, the stack pointer is incremented by two.
    • The instruction pointer is advanced over the faulting instruction.
    • Execution is resumed.

    In other words, the kernel did the assembly language equivalent of ON ERROR RESUME NEXT.
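
    A reconstruction of that recovery logic in C might look like this (purely for illustration; the real kernel was 16-bit assembly, and all names here are invented):

    #include <stdint.h>

    struct FaultFrame {
        uint16_t ip;             // faulting instruction pointer
        uint16_t sp;             // stack pointer at the time of the fault
    };

    struct DecodedInstr {
        uint8_t   length;        // instruction length, prefixes included
        int       selectorLoad;  // lds, les, pop ds, pop es
        int       isPop;         // pop ds / pop es
        uint16_t *destSelector;  // register image to zero on selector load
    };

    // Assumed helper: decodes the faulting instruction and reports
    // whether it is one of the "ignorable" forms listed above.
    int DecodeFaultingInstr(struct FaultFrame *frame,
                            struct DecodedInstr *instr);

    int TryIgnoreGpFault(struct FaultFrame *frame)
    {
        struct DecodedInstr instr;
        if (!DecodeFaultingInstr(frame, &instr))
            return 0;                 // not ignorable: offer Close only

        if (instr.selectorLoad)
            *instr.destSelector = 0;  // load the null selector instead
        if (instr.isPop)
            frame->sp += 2;           // the pop consumed two stack bytes

        frame->ip += instr.length;    // skip the faulting instruction
        return 1;                     // resume execution
    }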

    Now, your reaction to this might be, "How could this possibly work? You are just randomly ignoring instructions!" But the strange thing is, this idea was so crazy it actually worked, or at least worked a lot of the time. You might have to hit Ignore a dozen times, but there's a good chance that eventually the bad values in the registers will get overwritten by good values (and it probably won't take long because the 8086 has so few registers), and the program will continue seemingly normally.

    Totally crazy.

    Exercise: Why didn't the code have to know how to ignore jump instructions and conditional jump instructions?

    Bonus trivia: The developer who implemented this crazy feature was Don Corbitt, the same developer who wrote Dr. Watson.

  • The Old New Thing

    Hazy memories of the Windows 95 ship party


    One of the moments from the Windows 95 ship party (20 years ago today) was when one of the team members drove his motorcycle through the halls, leaving burns in the carpet.

    The funny part of that story (beyond the fact that it happened) is that nobody can agree on who it was! I seem to recall that it was Todd, but another of my colleagues remembers that it was Dave, and yet another remembers that it was Ed. We all remember the carpet burns, but we all blame it on different people.

    As one of my colleagues noted, "I'm glad all of this happened before YouTube."

    Brad Silverberg, the vice president of the Personal Systems Division (as it was then known), recalled that "I had a lot of apologizing to do to Facilities [about all the shenanigans that took place that day], but it was worth it."

  • The Old New Thing

    Why does the BackupWrite function take a pointer to a modifiable buffer when it shouldn't be modifying the buffer?


    The Backup­Write function takes a non-const pointer to the buffer to be written to the file being restored. Will it actually modify the buffer? Assuming it doesn't, why wasn't it declared const? It would be much more convenient if it took a const pointer to the buffer, so that people with const buffers didn't have to const_cast every time they called the function. Would changing the parameter from non-const to const create any compatibility problems?
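
    To illustrate the inconvenience (an invented caller, not production code):

    // A caller holding a const buffer must const_cast today, because the
    // second parameter of BackupWrite is LPBYTE rather than const BYTE *.
    #include <windows.h>

    BOOL WriteRestoredChunk(HANDLE hFile, const BYTE *data, DWORD size,
                            LPVOID *context)
    {
        DWORD written = 0;
        return BackupWrite(hFile, const_cast<BYTE *>(data), size,
                           &written, FALSE, FALSE, context);
    }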

    Okay, let's take the questions in order.

    Will it actually modify the buffer? No.

    Why wasn't it declared const? My colleague Aaron Margosis explained that the function dates back to Windows NT 3.1, when const-correctness was rarely considered. A lot of functions from that era (particularly in the kernel) suffer from the same problem. For example, the computer name passed to the Reg­Connect­Registry function is a non-const pointer even though the function never writes to it.

    Last question: Can the parameter be changed from non-const to const without breaking compatibility?

    It would not cause problems from a binary compatibility standpoint, because a const pointer and a non-const pointer take the same physical form in Win32. However, it breaks source code compatibility. Consider the following code fragment:

    BOOL WINAPI TestModeBackupWrite(
      HANDLE hFile,
      LPBYTE lpBuffer,
      DWORD nNumberOfBytesToWrite,
      LPDWORD lpNumberOfBytesWritten,
      BOOL bAbort,
      BOOL bProcessSecurity,
      LPVOID *lpContext)
    {
     ... simulate a BackupWrite ...
     return TRUE;
    }

    typedef BOOL (WINAPI *BACKUPWRITEPROC)(HANDLE, LPBYTE, DWORD,
                    LPDWORD, BOOL, BOOL, LPVOID *);

    BACKUPWRITEPROC TestableBackupWrite;

    void SetTestMode(bool testing)
    {
     if (testing) {
      TestableBackupWrite = TestModeBackupWrite;
     } else {
      TestableBackupWrite = BackupWrite;
     }
    }
    The idea here is that the program can be run in test mode, say to do a simulated restore. (You see this sort of thing a lot with DVD-burning software.) The program uses Testable­Backup­Write whenever it wants to write to a file being restored from backup. In test mode, Testable­Backup­Write points to the Test­Mode­Backup­Write function; in normal mode, it points to the Backup­Write function.

    If the second parameter were changed from LPBYTE to const BYTE *, then the above code would hit a compiler error.

    Mind you, maybe it's worth breaking some source code in order to get better const-correctness, but for now, the cost/benefit tradeoff biases toward leaving things alone.

  • The Old New Thing

    Where is the full version of the music that plays when you start Windows 98 for the first time?


    A customer was presumably exercising the unlimited support part of their support contract when they asked, "Where is the full song for the music that plays when you start Windows 98 for the first time? The Welcome application plays only the first 30 seconds. Can you send us the rest of the song?"

    I guess the IT department really likes that music and wants the extended dance remix for their late-night raves. (If you stick it out, there's a special appearance in the linked video of the screen I wrote.)

    The music was commissioned to be 30 seconds long. There is no extended version of the song, or at least not one that Microsoft acquired the rights to. The original composer could have written a five-minute version and sent Microsoft only the first 30 seconds, or maybe the piece is only 30 seconds long to begin with. All we know is that we have a 30-second music clip.

  • The Old New Thing

    Why does the class name for Explorer change depending on whether you open it with /e?


    I noted some time ago that Explorer's original name was Cabinet, and that the name lingers in the programmatic class name: Cabinet­WClass. A commenter with a rude name points out that Explorer uses the class name Explorer­WClass if you open it with the /e command line switch, adding, "This is rather strange since you can toggle the folder pane on/off in the UI either way."

    In Windows 95, the window class names for Explorer were Cabinet­WClass for plain Explorer windows and Explorer­WClass for windows opened in Explore mode with the folder tree view thingie on the left hand side. This was not strange at the time because there were two different types of Explorer windows, and there was no way to change between them. The UI to toggle the folder pane on/off did not exist.

    Internally, the two types of Explorer windows were handled by different frame window classes, and naturally the two different classes got different names. The plain Explorer window frame hosted a view window, an address bar, and a status bar, whereas the fancy Explorer window frame hosted those components plus a folder tree. It wasn't until some time later that the ability to toggle the folder pane on and off was added. To do this, the two window classes were merged into a single implementation that dynamically added in or removed the folder tree.

    Great, we can get rid of Explorer­WClass and just use Cabinet­WClass for everything.

    And then the application compatibility bug reports came in.

    Because even though it wasn't documented, applications relied on the implementation detail that plain Explorer windows could be found by doing a Find­Window for Cabinet­WClass, and that fancy Explorer windows could be found by doing a Find­Window for Explorer­WClass. They would do things like launch explorer.exe /e C:\some\folder, wait a few seconds, and then do a Find­Window("Explorer­WClass", ...) and expect to find a window. (Just do a Web search for Cabinet­WClass and Explorer­WClass if you don't believe me.)
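
    The pattern in question looked roughly like this (reconstructed for illustration):

    // Reconstruction of the kind of code applications shipped: locate
    // the Explorer window they just launched by its undocumented
    // window class name.
    #include <windows.h>

    HWND FindLaunchedExplorerWindow(BOOL folderPaneVisible)
    {
        // Plain windows were "CabinetWClass"; Explore-mode windows
        // were "ExplorerWClass".
        return FindWindowW(folderPaneVisible ? L"ExplorerWClass"
                                             : L"CabinetWClass", NULL);
    }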

    For compatibility, therefore, Explorer windows still use the old class names from Windows 95. If you open the window with the folder pane hidden, the class name is Cabinet­WClass, and if you open it with the folder pane visible, the class name is Explorer­WClass. The two classes are functionally identical, but people who rely on undocumented behavior expect to see the same names from 1995.
