History

  • The Old New Thing

    Why do BackupRead and BackupWrite require synchronous file handles?

    • 24 Comments

    The BackupRead and BackupWrite functions require that the handle you provide be synchronous. (In other words, it must not be opened with FILE_FLAG_OVERLAPPED.)

    A customer submitted the following question:

    We have been using asynchronous file handles with the BackupRead function. Every so often, the call to BackupRead will fail, but we discovered that as a workaround, we can just retry the operation, and it will succeed the second time. This solution has been working for years.

    Lately, we've been seeing crashes when trying to back up files, and the stack traces in the crash dumps appear to be corrupted. The issue appears to happen only on certain networks, and the problem goes away if we switch to a synchronous handle.

    Do you have any insight into this issue? Why were the Backup­Read and Backup­Write functions designed to require synchronous handles?

    The Backup­Read and Backup­Write functions have historically issued I/O against the handles provided on the assumption that they are synchronous. As we saw a while ago, doing so against an asynchronous handle means that you're playing a risky game: If the I/O completes synchronously, then nobody gets hurt, but if the I/O goes asynchronous, then the temporary OVERLAPPED structure on the stack will be updated by the kernel when the I/O completes, which could very well be after the function that created it has already returned. The result: A stack smash. (Related: Looking at the world through kernel-colored glasses.)
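    The hazard can be sketched with a stand-in model. Everything below (the FakeOverlapped type, the fake "kernel") is invented for illustration and is not the real Win32 API, but it shows why the OVERLAPPED structure must stay alive until the I/O completes:

```c
#include <assert.h>

/* Stand-ins, invented for illustration: FakeOverlapped plays the role of
   OVERLAPPED, and g_pending plays the role of the kernel remembering the
   pointer it was given for a pending I/O. */
typedef struct { unsigned long Internal; } FakeOverlapped;
static FakeOverlapped *g_pending;

static void fake_start_io(FakeOverlapped *o) { g_pending = o; }

/* When the I/O completes, the "kernel" writes through the saved pointer. */
static void fake_complete_io(void) { g_pending->Internal = 42; }

/* Safe pattern: the temporary structure stays alive until completion,
   which is what issuing synchronous I/O guarantees automatically. */
unsigned long safe_read(void)
{
    FakeOverlapped temp = { 0 };
    fake_start_io(&temp);
    fake_complete_io();   /* stands in for waiting until the I/O finishes */
    return temp.Internal; /* the completion write landed in live memory */
}

/* The buggy pattern would return from safe_read() before completion: the
   kernel's later write through &temp would then land in whatever stack
   frame occupies that memory next. That's the stack smash. */
```

    With a synchronous handle, the wait happens inside the call, which is why a temporary OVERLAPPED on the stack is harmless there.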

    This oversight in the code (blindly assuming that the handle is a synchronous handle) was not detected until 10 years after the API was originally designed and implemented. During that time, backup applications managed to develop very tight dependencies on the undocumented behavior of the backup functions. The backup folks tried fixing the bug but found that it ended up introducing massive compatibility issues. On top of that, there was no real business case for extending the Backup­Read and Backup­Write functions to accept asynchronous handles.

    As a result, there was no practical reason for changing the function's behavior. Instead, the requirement that the handle be synchronous was added to the documentation, along with additional text explaining that if you pass an asynchronous handle, you will get "subtle errors that are very difficult to debug."

    In other words, the requirement that the handles be synchronous exists for backward compatibility.

  • The Old New Thing

    Why was Pinball removed from Windows Vista?

    • 115 Comments

    Windows XP was the last client version of Windows to include the Pinball game that had been part of Windows since Windows 95. There is apparently speculation that this was done for legal reasons.

    No, that's not why.

    One of the things I did in Windows XP was port several million lines of code from 32-bit to 64-bit Windows so that we could ship Windows XP 64-bit Edition. But one of the programs that ran into trouble was Pinball. The 64-bit version of Pinball had a pretty nasty bug where the ball would simply pass through other objects like a ghost. In particular, when you started the game, the ball would be delivered to the launcher, and then it would slowly fall towards the bottom of the screen, through the plunger, and out the bottom of the table.

    Games tended to be really short.

    Two of us tried to debug the program to figure out what was going on, but given that this was code written several years earlier by an outside company, and that nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!

    We had several million lines of code still to port, so we couldn't afford to spend days studying the code trying to figure out what obscure floating point rounding error was causing collision detection to fail. We just made the executive decision right there to drop Pinball from the product.

    If it makes you feel better, I am saddened by this as much as you are. I really enjoyed playing that game. It was the location of the one Windows XP feature I am most proud of.

    Update: Hey everybody asking that the source code be released: The source code was licensed from another company. If you want the source code, you have to go ask them.

  • The Old New Thing

    The QuickCD PowerToy, a brief look back

    • 27 Comments

    One of the original Windows 95 PowerToys was a tool called QuickCD. Though that wasn't its original name.

    The original name of the QuickCD PowerToy was FlexiCD. You'd think that it was short for "Flexible CD Player", but you'd be wrong. FlexiCD was actually named after its author, whose name is Felix, but who uses the "Flexi" anagram as a whimsical nickname. We still called him Felix, but he would occasionally use the Flexi nickname to sign off an email message, or use it whenever he had to create a userid for a Web site (if Web sites which required user registration existed in 1994).

    You can still see remnants of FlexiCD in the documentation. The last sample INF file on this page was taken from the QuickCD installer.

  • The Old New Thing

    Have you found any TheDailyWTF-worthy code during the development of Windows 95?

    • 25 Comments

    Mott555 is interested in sloppy or ugly code, strange workarounds, or code comments from the development of Windows 95: "anything TheDailyWTF-worthy."

    I discovered that opening a particular program churned the hard drive a lot. I decided to hook up the debugger to see what the problem was. What I discovered was code that went roughly like this, in pseudo-code:

    int TryToCallFunctionX(a, b, c)
    {
      for each file in (SystemDirectory,
                        WindowsDirectory,
                        ProgramFilesDirectory(RecursiveSearch),
                        KitchenSink,
                        Uncle.GetKitchenSink)
      {
        hInstance = LoadLibrary(file);
        if (hInstance == nullptr) continue;
        fn = GetProcAddress(hInstance, "FunctionX");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "__imp__FunctionX");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "FunctionX@12");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "__imp__FunctionX@12");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        FreeLibrary(hInstance);
      }
      return 0;
    }
    

    The code enumerated every file in the system directory, Windows directory, Program Files directory, and possibly also the kitchen sink and their uncle's kitchen sink. It tried to load each one as a library and checked whether it had an export called FunctionX. For good measure, it also tried __imp__FunctionX, FunctionX@12, and __imp__FunctionX@12. If it found any match, it called the function.

    As it happens, every single call to Get­Proc­Address failed. The function they were trying to call was an internal function in the window manager that wasn't exported. I guess they figured, "Hm, I can't find it in user32. Maybe it moved to some other DLL," and went through every DLL they could think of.
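    The probing amounts to looping over four decorated spellings of one name: the plain name, the import-thunk alias, and the __stdcall-decorated forms ("@12" means 12 bytes of arguments, i.e. three 4-byte parameters). Here is a portable sketch of that loop, with an invented one-entry stub table standing in for GetProcAddress:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*FUNCX)(int, int, int);

/* Invented stub standing in for GetProcAddress: an export table that
   only knows the stdcall-decorated name. */
static int stub_function_x(int a, int b, int c) { return a + b + c; }

static FUNCX find_export(const char *name)
{
    return strcmp(name, "FunctionX@12") == 0 ? stub_function_x : NULL;
}

/* The four spellings probed by the pseudo-code above. */
static const char *candidates[] = {
    "FunctionX", "__imp__FunctionX", "FunctionX@12", "__imp__FunctionX@12",
};

int try_to_call_function_x(int a, int b, int c)
{
    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        FUNCX fn = find_export(candidates[i]);
        if (fn != NULL)
            return fn(a, b, c);
    }
    return 0; /* what the original code fell back to when nothing matched */
}
```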

    I called out this rather dubious programming technique, and word got back to the development team for that program. They came back and admitted, "Yeah, we were hoping to call that function, but couldn't find it, and the code you found is stuff we added during debugging. We have no intention of actually shipping that code."

    Well, yeah, but still, what possesses you to try such a crazy technique, even if only for debugging?

  • The Old New Thing

    Why are there both FIND and FINDSTR programs, with unrelated feature sets?

    • 35 Comments

    Jonathan wonders why we have both find and findstr, and furthermore, why the two programs have unrelated features. The find program supports UTF-16, which findstr doesn't; on the other hand, the findstr program supports regular expressions, which find does not.

    The reason why their feature sets are unrelated is that the two programs are unrelated.

    The find program came first. As I noted in the article, the find program dates back to 1982. When it was ported to Windows NT, Unicode support was added. But nobody bothered to add any features to it. It was intended to be a straight port of the old MS-DOS program.

    Meanwhile, one of my colleagues over on the MS-DOS team missed having a grep program, so he wrote his own. Developers often write these little tools to make their lives easier. This was purely a side project, not an official part of any version of MS-DOS or Windows. When he moved to the Windows 95 team, he brought his little box of tools with him, and he ported some of them to Win32 in his spare time because, well, that's what programmers do. (This was back in the days when programmers loved to program anything in their spare time.)

    And that's where things stood for a long time. The official find program just searched for fixed strings, but could do so in Unicode. Meanwhile, my colleague's little side project supported regular expressions but not Unicode.

    And then one day, the Windows 2000 Resource Kit team said, "Hey, that's a pretty cool program you've got there. Mind if we include it in the Resource Kit?"

    "Sure, why not," my colleague replied. "It's useful to me, maybe it'll be useful to somebody else."

    So in it went, under the name qgrep.

    Next, the Windows Resource Kit folks said, "You know, it's kind of annoying that you have to go install the Resource Kit just to get these useful tools. Wouldn't it be great if we put the most useful ones in the core Windows product?" I don't know what sort of cajoling was necessary, but they convinced the Windows team to add a handful of Resource Kit programs to Windows. Along the way, qgrep somehow changed its name to findstr. (Other Resource Kit programs kept their names, like where and diskraid.)

    So there you have it. You can think of the find and findstr programs as examples of parallel evolution.

  • The Old New Thing

    What does the COINIT_SPEED_OVER_MEMORY flag to CoInitializeEx do?

    • 20 Comments

    One of the flags you can pass to Co­Initialize­Ex is COINIT_SPEED_OVER_MEMORY, which is documented as

    COINIT_SPEED_OVER_MEMORY: Trade memory for speed.

    This documentation is already vague, since it doesn't say which direction the trade is being made. Are you giving up memory to gain speed, or giving up speed to save memory? It's the former: If you pass this flag, then you are instructing COM to consume more memory in an attempt to reduce CPU usage, under the assumption that you run faster by executing fewer cycles.¹

    The request is a per-process one-way transition. Once anybody anywhere in the process puts COM into speed-over-memory mode, the flag stays set and remains set until the process exits.

    When should you enable this mode? It doesn't matter, because as far as I can tell, there is no code anywhere in COM that changes its behavior based on whether the process has been placed into this mode! It looks like the flag was added when DCOM was introduced, but it never got hooked up to anything. (Or whatever code that had been hooked up to it never shipped.)

    ¹ As you know, consuming more memory is not a guarantee that you will actually run faster, because higher memory usage increases the chances that what you need will suffer an L1 cache miss or a page fault, which will cost you dearly in wait time (though not in CPU usage).

  • The Old New Thing

    Why does ShellExecute return SE_ERR_ACCESSDENIED for nearly everything?

    • 7 Comments

    We saw a while ago that the Shell­Execute function returns SE_ERR_ACCESS­DENIED at the slightest provocation. Why can't it return something more meaningful?

    The short answer is that the return value from ShellExecute doubles as both a success code and an error code, and you check whether the value is greater than 32 to see which half you're in: values less than or equal to 32 are errors. That means the error codes are limited to values less than or equal to 32, and all of those values are already accounted for, so there's nowhere to stick "an error not on the original list of 32 possible error codes." Therefore, any error that wasn't on the original list gets turned into SE_ERR_ACCESSDENIED, in the same way that MS-DOS turned any error that didn't map to one of its original errors into 5 (access denied).
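    The overloaded convention can be captured in a couple of lines. (A sketch: the helper name is invented, but the constant values are the documented ones.)

```c
#include <assert.h>
#include <stdbool.h>

/* ShellExecute's return value doubles as a success code and an error code:
   values greater than 32 mean success (historically, an HINSTANCE), and
   values less than or equal to 32 are one of the fixed error codes. */
#define SE_ERR_ACCESSDENIED 5   /* same numeric value as MS-DOS error 5 */
#define SE_ERR_DLLNOTFOUND  32  /* an error even though it equals the cut-off */

bool shell_execute_succeeded(long result)
{
    return result > 32;
}
```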

    Okay, but why was 32 chosen as the cut-off?

    The ShellExecute function didn't come up with that number. It came from the kernel folks, who decided that the WinExec function would return the handle of the application that was executed on success, or an error code less than 32 on failure. And back in the old days, ShellExecute was just a function that called FindExecutable and then passed the result to WinExec, so following the WinExec pattern made sense.

    (You may have noticed a tiny discrepancy there. The shell folks decided to add a new error code SE_ERR_DLL­NOT­FOUND with a numeric value of 32, thereby making the return value from Shell­Execute behave subtly differently from that of Win­Exec. The people who made this decision probably regretted it once it became clear that lots of applications were checking the return value incorrectly, but it's too late to fix it now.)

    Okay, so let's peel back another layer: Why did the Win­Exec function overload the return value? Well, overloaded return values were all the rage back then. A lot of functions to create something return the created object on success, or null on failure. The kernel folks said, "Well, we can do even better than that. Not only can we tell you that we failed to create the application, we can even tell you why! You see, MS-DOS has a maximum of 31 error codes, so we can just return the error code directly if we can ensure that no values less than 32 are valid segments. And we can make that guarantee because the 8086 processor reserves the first 1024 bytes of memory (the first 64 segments) for its interrupt vector table, so no application could possibly be loaded there. Hooray! We're such over-achievers!"

    This weird way of reporting errors from ShellExecute has been preserved for compatibility. New applications would probably be better served by switching to the ShellExecuteEx function instead, since it reports errors by calling SetLastError with the real error code before returning. (In other words, you can call GetLastError to get the real error code.)

    Bonus chatter: Wait a second, if Get­Last­Error gets you the real error code, how come the original report was that the Shell­Execute­Ex function also returns SE_ERR_ACCESS­DENIED?

    Because it depends on what you mean by "returns". Technically speaking, the Shell­Execute­Ex function returns FALSE for all errors, since it is prototyped as returning a BOOL. When somebody says that it returns an error code, you first have to ask where they got that error code from.

    If they got it from Get­Last­Error, then they'll get a meaningful error code, or at least something more meaningful than SE_ERR_ACCESS­DENIED.

    But if instead they look at the hInstApp member of the SHELL­EXECUTE­INFO structure, then they'll get that useless SE_ERR_ACCESS­DENIED value again. Because the hInstApp is where the legacy return value is recorded. If you look there, you're going to see the old lame error code. So don't look there.

  • The Old New Thing

    Irony patrol: Recycling bins

    • 40 Comments

    Microsoft has a large corporate recycling effort. Every office, every mail room, every kitchenette, every conference room has a recycling bin. The dining facilities earned Green Restaurant Certification, and there is a goal of making the cafeterias a zero-landfill facility by 2012. (Hey, that's this year!)

    A few years ago, I found one room in my building that didn't have a recycling bin, and you'd think it'd be one of the rooms near the top of the list for needing one.

    The room without a recycling bin was the copy machine room.

    As a result, people were throwing their unwanted cover sheets and other paper waste into the regular garbage.

    I decided to be somebody. I took a recycling bin from an unused office and moved it into the copy room.

    Bonus recycling bin irony: For many years, each office had three recycling bins, each labeled for its intended contents: white paper, mixed paper, and aluminum cans. Improvements in automated sorting technology removed the need to separate these recyclables manually, and in 2008, all three recycling bins were replaced with a single recycle bin, which was labeled with the simple three-arrow recycling logo.

    The irony is that Microsoft was going to toss all the old recycling bins into a landfill because they couldn't find anybody who wanted them.

    Alert Microsoft employee Tom Roth found the right people in Building Facilities and got them to stop the trucks as they were about 100 feet from dumping 40,000 perfectly good plastic bins into a landfill. Tom's son Justin works in the recycling industry, and he used his contacts to get the word out, and soon requests for recycling bins were coming in from all over the state of Washington. It took three months, but they eventually found homes for all of the recycle bins.

    Ironic disaster averted.

  • The Old New Thing

    The cries of "Oh no!" emerge from each office as the realization slowly dawns

    • 35 Comments

    Today is the (approximate) 15th anniversary of the Bedlam Incident. To commemorate that event, here's a story of another email incident gone horribly awry.

    Some time ago, an email message was sent to a large mailing list. It came from somebody in the IT department and said roughly, "This is a mail sent on behalf of Person X to check if your XYZ server has migrated to the new datacenter. Please visit http://blahblah and confirm that your server name is of the form XXYYNNZZ. If not, please Reply to All."

    Uh-oh.

    The seasoned Microsoft employees (and the new employees who paid attention during new employee orientation) recognized the monster that was about to be unleashed, and the cries of "Oh no!" could be heard emerging from each office as the realization dawned.

    And then it started. All the replies from people saying "I'm still on the old datacenter." And then the replies from people saying, "Stop replying!"

    What's frustrating is that you can't do anything about the catastrophe that is unfolding. Any attempt to reply to the message telling people to stop replying only makes the problem worse. All you can do is stand back and wait for the fire to burn itself out.

    Ten minutes later, Person X sent a message to the mailing list. "Please DO NOT Reply all to this email thread. I am working with the IT department to see if there is another way to get this information."

    It took another ten minutes for the messages to finally stop, but that seems to have shut things down. Now it's time for blame and speculation!

    We were never told whose brilliant idea it was to try to gather information by sending mail to 7000 people telling them to reply all. One theory was that Person X went to the IT department saying "Hi, I'd like to collect information XYZ from this large list of people. Can you help?" And some new hire in the IT department said, "Sure, I can get that information for you. I'll just send everybody some email!"

    After the dust settled, somebody made a tally of the damage.

    Number of people on the mailing list: around 7000.
    Number of replies: 70.
    Of those, number of replies saying "stop replying": 17.

    To commemorate this event, a colleague of mine who maintains a popular internal tool pushed out an upgraded version. The new version had a checkbox on the main page:

    I have not been migrated to the new datacenter.

    Bonus chatter: It so happens that the message was sent at the beginning of the summer, and most of the "I'm still on the old datacenter" replies came from summer interns. Maybe it was a test.

  • The Old New Thing

    What's the difference between F5 and F8 at the boot screen?

    • 34 Comments

    Ian B wondered what the difference is between pressing F5 and F8 while Windows is booting.

    I have no idea either. My strategy was to just mash on the function keys, space bar, DEL key, anything else I can think of. Keep pressing them all through the boot process, and maybe a boot menu will show up.

    The F5 hotkey was introduced in Windows 95, where the boot sequence hotkeys were as follows:

    • ESC - Boot in text mode.
    • F5 - Boot in Safe Mode.
    • Shift+F5 - Boot to Safe Mode MS-DOS.
    • Ctrl+F5 - Boot to Safe Mode MS-DOS with drive compression disabled.
    • Alt+F5 - Boot with LOADTOP=0 for Japanese systems.
    • F6 - Boot in Safe Mode with networking.
    • F4 - Boot to previous version of MS-DOS.
    • Ctrl+F4 - Boot to previous version of MS-DOS with drive compression disabled.
    • F8 - Boot to menu.
    • Shift+F8 - Boot with step-by-step confirmation.
    • Ctrl+F8 - Boot with step-by-step confirmation with drive compression disabled.

    Man, that's an insane number of boot options all buried behind obscure function keys. Boy am I glad we got rid of them. This frees up room in my brain for things like Beanie Baby trivia.

    Bonus chatter: The next generation of computers boots so fast that there's no time to hit any of these hotkeys!
