November, 2006

  • The Old New Thing

    It took two of us to keep up with one Bob


    One of my friends (let's call him "Bob") retired from Microsoft many years ago. Bob is an amazing programmer whose skills I remain in awe of.

    I remember visiting his office one evening with a mutual friend ("Fred") to catch up on things. When we turned up, he showed us a problem that he was working on. He was doing some sort of fancy graphics effect, but since I'm not a graphics person, I sat down in his guest chair and flipped through some magazines while he talked about the problem with Fred. (Fred being a graphics guru.) They discussed the problem and settled on an algorithm. Or at least I assume that's what they did, because I wasn't paying much attention.

    And then Bob started digging into the algorithm's implementation. Since this particular effect was in the program's inner loop, it had to be fast, and at the time this story took place, that meant assembly language. Not just your everyday assembly language, but insane assembly language, pulling crazy tricks like using the stack pointer as a general purpose register (since the x86 has so few registers) and performing multiple operations in parallel with one register. I put down my magazine, and Bob and I sweated out the details. Meanwhile, Fred flipped through a book on Bob's shelf.

    That's how great Bob was. It took two people (Fred and me) to keep up with one Bob.

    Anyway, that's a pretty long introduction to Bob just to get to the real story, which will come tomorrow.

  • The Old New Thing

    A fork is an easy-to-find nonstandard USB device


    Remember the Ten Immutable Laws of Security. Today, we're going to talk about number three: If a bad guy has unrestricted physical access to your computer, it's not your computer any more.

    There was a bug which floated past my field of vision many months ago that went something like this: "I found a critical security bug in the USB stack. If somebody plugs in a USB device which emits a specific type of malformed packet during a specific step in the protocol, then the USB driver crashes. This is a denial of service that should be accorded critical security status."

    Now, it's indeed the case that the driver should not crash when handed a malformed USB packet, and the bug should certainly be fixed. (That said, I'm sure some people will manage to interpret this article as advocating not fixing the bug.) But let's look at the prerequisites for this bug to manifest itself: The attacker needs to build a USB device that is intentionally out of specification in one particular way and plug that device into a vulnerable machine. While that's certainly possible, it's a lot of work for your typical hacker to burn a custom EEPROM with USB firmware that manages to hit the precise conditions necessary to trigger the driver bug.

    It's much easier just to grab a fork.

    You see, since this attack requires physical access to a USB port, you may as well attack the machine in a much more direct manner that doesn't require you to spend hours with a soldering gun and a circuit board: Just grab a fork and jam it into the USB port. I haven't tried it, but I suspect that will crash the machine pretty effectively, too. If you can't get the fork to work, pouring a glass of water into the USB port will probably seal the deal.

    Doron tells me that some companies address this problem by removing physical access: They fill the USB ports on all their machines with epoxy.

    Update: Randy Aull tells me that the USB 2.0 specification anticipated the fork attack and requires that all transceivers be able to withstand short circuits "of D+ and/or D- to VBUS, GND, other data lines, or the cable shield at the connector, for a minimum of 24 hours." (Though I'm not sure if that also covers shorting VBUS to GND.) I wonder if they also have a paragraph specifying that USB devices must also withstand water immersion... Of course, you could still use that fork to push the power button or jam it into an outlet on the same circuit as the computer you want to take down in order to blow a fuse.

  • The Old New Thing

    What went wrong in Windows 95 if you use a system color brush as your background brush?


    If you want to register a window class and use a system color as its background color, you set the hbrBackground member to the desired color, plus one, cast to an HBRUSH:

    wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1);

    Windows 95 introduced "system color brushes", which are a magic type of brush which always paint in the corresponding system color, even if the system color changes.

    HBRUSH hbrWindow = GetSysColorBrush(COLOR_WINDOW);

    The hbrWindow brush will always paint in the color corresponding to GetSysColor(COLOR_WINDOW), even if the user changes the color scheme later.

    Now, you might be tempted to use a system color brush as your class background brush. After all, if you want the background to be the system window color, why not use a brush that is always the system window color?

    Well, because the documentation for GetSysColorBrush explicitly tells you not to do that. If you tried this on Windows 95, it would have seemed to work for a while, and then bad things would start happening.

    The system color brushes were added as a convenience to save people the trouble of having to create solid color brushes just to draw a system color. Profiling in Windows 95 revealed that a lot of time was spent by applications creating solid brushes, using them briefly, and then destroying them, so anything that could be done to reduce the rate at which applications needed to do this was a good thing.

    The system color brushes were implemented in Windows 95 by creating them and then telling GDI, "Hey, if somebody tries to DeleteObject this brush, don't let them." This prevented the system color brushes from being accidentally destroyed.

    Except when it didn't.

    When you registered a window class with a background brush (and by that, I mean an honest-to-goodness brush and not a pseudo-brush you get from that (HBRUSH)(COLOR_xxx + 1) stuff) the window manager did the same thing to the brush as it did to the system color brushes: It told GDI, "Hey, if somebody tries to DeleteObject this brush, don't let them." This prevented people from destroying brushes while the window manager was still using them to draw the background of a window.

    When you unregistered the window class, the window manager told GDI, "Okay, delete this brush, and yes, I told you not to let anyone DeleteObject it, but I'm overriding it because I was the one who protected it in the first place. I'm just cancelling my previous instructions to protect this brush." The window manager takes responsibility for destroying the brush when you register the class; therefore, when you unregister the class, the window manager is obligated to clean up and destroy the brush. (Actually, it's a little more complicated than that, because it is legal to use one brush as the background brush for multiple window classes, but let's ignore that case for now.)

    Do you see the problem yet?

    What if you registered a window class with a system color brush as the background brush and then unregistered it? (Don't forget that classes are automatically unregistered when the process exits.) When you registered the class, the brush got protected, and when you unregistered the class, the Windows 95 window manager told GDI to override the protection and destroy the brush anyway.

    Oops, we just destroyed a system color brush. Even though those brushes were protected, the protection didn't work here because you went through the code path where the window manager said, "Override the safety interlocks!"
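
    The sequence of events is easier to follow in a toy model. The sketch below is not the Windows 95 implementation; the Brush structure and every function name are made up for illustration. It assumes only what the story says: applications cannot delete a protected brush, registering a class protects the background brush, and unregistering the class overrides the protection and destroys the brush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a GDI brush. The fields are illustrative only and are
   not the real Windows 95 data structures. */
typedef struct {
    bool alive;       /* false once the brush has been destroyed */
    bool protected_;  /* "don't let anyone DeleteObject this" flag */
} Brush;

/* Models an application calling DeleteObject: the request is refused
   while the brush is protected. Returns true if the brush was destroyed. */
bool app_delete_object(Brush *b)
{
    assert(b != NULL);
    if (b->protected_) {
        return false;
    }
    b->alive = false;
    return true;
}

/* Models class registration: the window manager protects the brush. */
void wm_register_class(Brush *background)
{
    assert(background != NULL);
    background->protected_ = true;
}

/* Models class unregistration: the window manager overrides the
   protection and destroys the brush, without asking who else
   (such as the system itself) had protected it. */
void wm_unregister_class(Brush *background)
{
    assert(background != NULL);
    background->protected_ = false;  /* override the safety interlocks */
    background->alive = false;
}
```

    In this model, a system color brush starts out alive and protected, so app_delete_object leaves it alone. Pass that same brush through wm_register_class and then wm_unregister_class, and it is gone.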

    But of course, you didn't need to use a system color brush in the first place. You should have used that (HBRUSH)(COLOR_xxx + 1) pseudo-brush.

    (Note to nitpickers: This story talked about Windows 95. It does not apply to any other version of Windows. The problem may or may not exist in those other versions; I make no claims.)

  • The Old New Thing

    It's not surprising at all that people search for Yahoo


    Earlier this year, one columnist was baffled as to why "Yahoo" was the most searched-for term on Google. I wasn't baffled at all. Back in 2001, Alexa published the top ten most searched-for terms on their service, and four of the top ten were URLs.

    A lot of people simply don't care to learn the difference between the search box and the address bar. "If I type what I want into this box here, I sometimes get a strange error message. But if I type it into that box there, then I get what I want. Therefore, I'll use that box there for everything." And you know what? It doesn't bother me that they don't care. In fact, I think it's good that they don't care. Computers should adapt to people, not the other way around.

    You can try to explain to these people, "You see, this is a URL, so you type it into the address box. But that is a search phrase, so you type it into the search box."

    "You-are-what? Look, I don't care about your fancy propeller-beanie acronyms. You computer types are always talking about how computers are so easy to use, and then you make up these arbitrary rules about where I'm supposed to type things. If I want something, I type into this box and click 'Search'. And it finds it. Watch. I want Yahoo, so I type 'yahoo' into the box, and boom, there it is. I have a system that works. Why are you trying to make my life more confusing?"

    I remember attending a presentation by the MSN Explorer team on what they learned about how people use a web browser. They found many situations where people failed to accomplish their desired task because they typed the right thing into the wrong box. But instead of trying to teach people which box to type it in, they just expanded the definition of "right". You typed your query into the wrong box? No problem, we'll just pretend you typed it into the correct box. In fact, let's just get rid of all these special-purpose boxes. Whatever you want, just type it into this box, and we'll get it for you.

    I wish the phone company would learn this. Sometimes I'll dial a telephone number and I'll get an automated recording that says, "I'm sorry. You must dial a '1' before the number. Please hang up and try again." Or "I'm sorry. You must not dial a '1' before the number. Please hang up and try again." That's because in the state of Washington, there are complicated rules about when you have to dial a "1" in front of the number and when you don't. (Fortunately, the rule on when you have to dial the area code is easier to remember: If the area code you are calling is the same as the area code you are dialing from, then you can omit the area code.) For example, suppose your home number is 425-882-xxxx. Here's how you have to dial the following numbers:

    To call this number        You dial

    If you get it wrong, the voice comes on the line to tell you. Hey, since you know what I did wrong and you know what I meant to do, why not just fix it? If I dial a number and forget the "1", just insert the 1 and connect the call. If I dial a number and include the "1" when I didn't need to, just delete the 1 and connect the call. Don't make me have to look up in the book whether I need a 1 or not. (In the front of the phone book are tables showing which numbers need a "1" and which don't. I hate those tables.)
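
    Here is a minimal sketch of the "just fix it" approach. Everything in it is hypothetical: the function names are invented, it handles only ten-digit numbers with or without a leading "1", and example_needs_one is a made-up rule standing in for the phone company's real tables, which are more complicated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the tables in the front of the phone book: decides
   whether a ten-digit number must be dialed with a leading "1". */
typedef bool (*NeedsOneFn)(const char *ten_digits);

/* Invented rule for illustration only: anything outside area code 425
   needs a "1". The real rules are not this simple. */
bool example_needs_one(const char *ten_digits)
{
    return strncmp(ten_digits, "425", 3) != 0;
}

/* Rewrites what the caller dialed into what should have been dialed.
   Accepts ten digits, or "1" followed by ten digits. Writes the
   corrected number (at most 11 digits plus the terminator) to out.
   Returns false if the input has neither shape. */
bool fix_dialed_number(const char *dialed, NeedsOneFn needs_one, char out[12])
{
    assert(dialed != NULL && needs_one != NULL && out != NULL);

    size_t len = strlen(dialed);
    const char *number;
    if (len == 11 && dialed[0] == '1') {
        number = dialed + 1;   /* caller included a "1" */
    } else if (len == 10) {
        number = dialed;       /* caller omitted the "1" */
    } else {
        return false;
    }
    if (strspn(number, "0123456789") != 10) {
        return false;          /* not all digits */
    }

    if (needs_one(number)) {
        out[0] = '1';
        memcpy(out + 1, number, 11);  /* ten digits plus terminator */
    } else {
        memcpy(out, number, 11);
    }
    return true;
}
```

    The caller's choice to dial or skip the "1" is ignored entirely. The tables decide, and the call goes through either way.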

    (Yes, I know there are weird technical/legal reasons for why I have to dial the phone in four different ways depending on whom I want to call. But it's still wrong that these technical/legal reasons mean that the rules for dialing a telephone are impossibly complicated.)

  • The Old New Thing

    Sometimes you need to recalibrate your progress reports


    One of my former managers told me this story from a project he worked on many years ago. This project was broken up into multiple groups, and there was a weekly meeting where representatives from each group got together to discuss how the project was going.

    One of the groups was responsible for generating the reports and analysis. This was an important part of the project, but not a part that other groups depended on, since the reports were "pure output". At each meeting, the reporting and analysis group indicated steadily improving progress. Ten percent complete. Twenty-five percent. Fifty percent. The number increased week by week. Things looked good.

    And then one week they reported that they were eighty percent done, adding, "It almost links now."

    I don't know how everybody in the room reacted to this revelation, but I suspect it was met with stunned silence. It's a good thing nobody had a dependency on the reports.

  • The Old New Thing

    Placebo setting: QoS bandwidth reservation


    A placebo setting that has been getting a lot of play in recent years is that of QoS bandwidth reservation. The setting in question sets a maximum amount of bandwidth that can be reserved for QoS. I guess one thing people forgot to notice is the word "maximum". It doesn't set the amount of reserved bandwidth, just the maximum.

    Changing the value will in most cases have no effect on your download speed, since the limit kicks in only if you have an application that uses QoS in the first place. QoS, which stands for "quality of service", is a priority scheme for network bandwidth. A program can request a certain amount of bandwidth, say for media streaming, and when the program accesses the network, up to that much bandwidth is guaranteed to be available to the program. The setting in question controls how much bandwidth can be claimed for high priority network access. If no program is using QoS, then all your bandwidth is available to non-QoS programs. What's more, even if there is a QoS reservation active, if the program that reserved the bandwidth isn't actually using it, then the bandwidth is available to non-QoS programs.

    Consider this analogy: A restaurant seats 100 people, and it has a policy of accepting reservations for at most twenty percent of those seats. This doesn't mean that twenty seats are sitting empty all the time. If ten people have made reservations for dinner at 8pm, then ninety seats are available for drop-in customers at that time. The twenty percent policy just means that once twenty people have made reservations for dinner at 8pm, the restaurant won't accept any more reservations.

    Here's an example with made-up numbers: Suppose you are downloading a large file over your 720kbps connection. Since there is nothing else using the network, your download proceeds at 720kbps. Now suppose you fire up a program that uses QoS, say, for streaming media. (I don't know whether Windows Media Player uses QoS.) You connect to a streaming media source, and the media player does some math and determines that in order to give you smooth playback, it needs a minimum of 100kbps. (If it gets more, then great, but it needs at least that much to avoid dropouts.) The program places a reservation of that amount through QoS. With a default maximum reservation of 20% = 144kbps, this reservation request is granted. Playback of the streaming media begins, and your bandwidth is now split, with 100kbps going to your media player and the remaining 620kbps going to your download.

    Now you hit pause on the media player to answer the phone. Even though the media player has a 100kbps reservation, it's not using it, so all 720kbps of bandwidth is devoted to your download. You get off the phone and unpause the media player. Bandwidth is once again divided 100kbps for the media player and 620kbps for the download.

    Now, sure, you can set your QoS maximum reservation to zero. This means that when the media player asks for a guarantee of 100kbps, QoS will tell it, "Sorry, no can do." The media player will still play the streaming media, but since it no longer has a guarantee of bandwidth, there may be stretches where the download consumes most of the network bandwidth and the streaming media gets only 50kbps. Result: dropped frames, stuttering, or pauses for buffering.
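
    The arithmetic in the example can be written out as a small sketch. This is a toy model of the made-up numbers above, not the QoS API; the function names are invented, and it assumes a request larger than the cap is refused outright.

```c
#include <assert.h>

/* Bandwidth guaranteed to a program that asks for requested_kbps, given
   the link speed and the maximum reservable percentage. In this toy
   model, a request larger than the cap is refused (0 is guaranteed). */
int qos_granted_kbps(int link_kbps, int max_reserve_pct, int requested_kbps)
{
    assert(link_kbps >= 0 && requested_kbps >= 0);
    assert(max_reserve_pct >= 0 && max_reserve_pct <= 100);

    int cap_kbps = link_kbps * max_reserve_pct / 100;
    return requested_kbps <= cap_kbps ? requested_kbps : 0;
}

/* Bandwidth not held back by a guarantee, and therefore available to
   non-QoS traffic such as the download. A reservation takes bandwidth
   away only while the program that made it is actually using it. */
int available_to_download_kbps(int link_kbps, int granted_kbps,
                               int streaming_now)
{
    assert(granted_kbps >= 0 && granted_kbps <= link_kbps);

    int in_use_kbps = streaming_now ? granted_kbps : 0;
    return link_kbps - in_use_kbps;
}
```

    With a 720kbps link and the default 20% maximum, the cap is 144kbps, so a 100kbps request is granted and the download gets 620kbps while the stream is playing and 720kbps while it is paused. With the maximum set to zero, nothing is granted, so no bandwidth is held back and the media player has to compete with the download for it.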

    So tweak this value all you want, but understand what you're tweaking.

  • The Old New Thing

    The quiet dream of placebo settings


    Back in the Windows 95 days, people swore that increasing the value of MaxBPs in the system.ini file fixed application errors. People usually made up some pseudo-scientific explanation for why this fixed crashes. These explanations were complete rot.

    These breakpoints had nothing to do with Windows applications. They were used by 32-bit device drivers to communicate with code in MS-DOS boxes, typically the 16-bit driver they are trying to take over from or are otherwise coordinating their activities with. A bunch of these are allocated at system startup when drivers settle themselves in, and on occasion, a driver might patch a breakpoint temporarily into DOS memory, removing it when the breakpoint is hit (or when the breakpoint is no longer needed). Increasing this value had no effect on Windows applications.

    I fantasized about adding a "Performance" page to Tweak UI with an option to increase the number of "PlaceBOs". I would make up some nonsense text about this setting controlling how high in memory the system should place its "breakpoint opcodes". Placing them higher will free up memory for other purposes and reduce the frequency of "Out of memory" errors. Or something like that.

    I was reminded of this story by my pals in product support who were trying to come up with a polite way of explaining to their customer that there is no /7GB boot.ini switch. In other situations, they sometimes dream of shipping placebo.dll to a customer to solve their problem.

    (And by the way, the technical reason why the user-mode address space is limited to eight terabytes was given by commenter darwou: The absence of a 16-byte atomic compare-and-exchange instruction means that bits need to be sacrificed to encode the sequence number which avoids the ABA problem.)

  • The Old New Thing

    What is the process by which the cursor gets set?


    Commenter LittleHelper asked, "Why is the cursor associated with the class and not the window?" This question makes the implicit assumption that the cursor is associated with the class. While there is a cursor associated with each window class, it is the window that decides what cursor to use.

    The cursor-setting process is described in the documentation of the WM_SETCURSOR message:

    The DefWindowProc function passes the WM_SETCURSOR message to a parent window before processing. If the parent window returns TRUE, further processing is halted. Passing the message to a window's parent window gives the parent window control over the cursor's setting in a child window. The DefWindowProc function also uses this message to set the cursor to an arrow if it is not in the client area, or to the registered class cursor if it is in the client area.

    That paragraph pretty much captures the entire cursor-setting process. All I'm writing from here on out is just restating those few sentences.

    The WM_SETCURSOR message goes to the child window beneath the cursor. (Obviously it goes to the child window and not the parent, because the documentation says that DefWindowProc forwards the message to the window's parent. If the message went to the parent originally, then there would be nobody to forward the message to!) At this point, your window procedure can trap the WM_SETCURSOR message, set the cursor, and return TRUE. Thus, the window gets the first priority on deciding what the cursor is.

    If you don't handle the WM_SETCURSOR message, then DefWindowProc forwards the message to the parent, who in turn gets to decide whether to handle the message or forward to its parent in turn. One possibility is that one of the ancestor windows will handle the message, set the cursor, and return TRUE. In that case, the TRUE return value tells DefWindowProc that the cursor has been set and no more work needs to be done.

    The other, more likely, possibility is that none of the ancestor windows cared to set the cursor. At each return to DefWindowProc, the cursor will be set to the class cursor for the window that contains the cursor.

    Here it is in pictures. Suppose we have three windows, A, B, and C, where A is the top-level window, B a child, and C a grandchild, and none of them do anything special in WM_SETCURSOR. Suppose further that the mouse is over window C:

    SendMessage(hwndC, WM_SETCURSOR, ...)
     C's window procedure does nothing special
     DefWindowProc(hwndC, WM_SETCURSOR, ...)
      DefWindowProc forwards to parent:
       SendMessage(hwndB, WM_SETCURSOR, ...)
       B's window procedure does nothing special
       DefWindowProc(hwndB, WM_SETCURSOR, ...)
        DefWindowProc forwards to parent:
         SendMessage(hwndA, WM_SETCURSOR, ...)
         A's window procedure does nothing special
         DefWindowProc(hwndA, WM_SETCURSOR, ...)
          DefWindowProc(hwndA) cannot forward to parent (no parent)
          DefWindowProc(hwndA) sets the cursor to C's class cursor
          DefWindowProc(hwndA) returns FALSE
         A's window procedure returns FALSE
        SendMessage(hwndA, WM_SETCURSOR, ...) returns FALSE
        DefWindowProc(hwndB) sets the cursor to C's class cursor
        DefWindowProc(hwndB) returns FALSE
       B's window procedure returns FALSE
      SendMessage(hwndB, WM_SETCURSOR, ...) returns FALSE
      DefWindowProc(hwndC) sets the cursor to C's class cursor
      DefWindowProc(hwndC) returns FALSE
     C's window procedure returns FALSE
    SendMessage(hwndC, WM_SETCURSOR, ...) returns FALSE

    Observe that the WM_SETCURSOR started at the bottom (window C), bubbled up to the top (window A), and then worked its way back down to window C. On the way up, it asks each window if it wants to set the cursor, and if it makes it all the way to the top with nobody expressing an opinion, then on the way down, each window sets the cursor to C's class cursor.

    Now, of course, any of the windows along the way could have decided, "I'm setting the cursor!" and returned TRUE, in which case the message processing would have halted immediately.
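
    The bubbling can also be modeled in a few lines of portable C. This is a sketch, not the real window manager: the Window structure, the integer cursor values, and all the function names are invented, and it assumes the mouse is always in the client area.

```c
#include <assert.h>
#include <stddef.h>

/* Toy window. These are illustrative fields, not Win32 types. */
typedef struct Window {
    struct Window *parent;  /* NULL for a top-level window */
    int class_cursor;       /* cursor registered with the window class */
    int custom_cursor;      /* nonzero: this window handles WM_SETCURSOR */
} Window;

static int g_cursor;  /* the cursor currently being shown */

static int send_set_cursor(Window *w, Window *under_mouse);

/* Model of DefWindowProc: offer the message to the parent first. If no
   ancestor claims it, set the class cursor of the window under the mouse. */
static int def_window_proc(Window *w, Window *under_mouse)
{
    if (w->parent != NULL && send_set_cursor(w->parent, under_mouse)) {
        return 1;  /* an ancestor set the cursor; stop processing */
    }
    g_cursor = under_mouse->class_cursor;
    return 0;      /* FALSE: nobody expressed an opinion */
}

/* Model of a window procedure receiving WM_SETCURSOR. */
static int send_set_cursor(Window *w, Window *under_mouse)
{
    assert(w != NULL && under_mouse != NULL);
    if (w->custom_cursor != 0) {
        g_cursor = w->custom_cursor;
        return 1;  /* TRUE: "I'm setting the cursor!" */
    }
    return def_window_proc(w, under_mouse);
}

/* Delivers WM_SETCURSOR to the window under the mouse and reports which
   cursor ends up being shown. */
int cursor_after_set_cursor(Window *under_mouse)
{
    send_set_cursor(under_mouse, under_mouse);
    return g_cursor;
}
```

    With windows A, B, and C chained as in the trace and no custom cursors, the result is C's class cursor. Give B a custom cursor and B's choice wins, because C passed the message to DefWindowProc. Give C a custom cursor as well and C wins, because the window under the mouse gets first say.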

    So you see, the window really does decide what the cursor is. Yes, there is a cursor associated with the class, but it is used only if the window decides to use it. If you want to associate a cursor with the window, you can do it by handling the WM_SETCURSOR message explicitly instead of letting DefWindowProc default to the class cursor.

    LittleHelper's second question: "Many programs call SetCursor on every WM_MOUSEMOVE. Is this not recommended?"

    Although there is no rule forbidding you from using WM_MOUSEMOVE to set your cursor, it's going to lead to some problems. First, and much less serious, you won't be able to participate in the WM_SETCURSOR negotiations since you aren't doing your cursor setting there. But the real problem is that you're going to get cursor flicker. WM_SETCURSOR will get sent to your window to determine the cursor. Since you didn't do anything, it will probably turn into your class cursor. And then you get your WM_MOUSEMOVE and set the cursor again. Result: Each time the user moves the mouse, the cursor changes to the class cursor and then to the final cursor.

    Let's watch this happen. Start with the scratch program and make these changes:

    void OnMouseMove(HWND hwnd, int x, int y, UINT keyFlags)
    {
     Sleep(10); // just to make the flicker more noticeable
     SetCursor(LoadCursor(NULL, IDC_CROSS));
    }

     // Add to WndProc
     HANDLE_MSG(hwnd, WM_MOUSEMOVE, OnMouseMove);

    Run the program and move the mouse over the client area. Notice that it flickers between an arrow (the class cursor, set during WM_SETCURSOR) and the crosshairs (set during WM_MOUSEMOVE).

  • The Old New Thing

    Paradoxically, you should remove the smart card when logging on with a smart card


    To connect to the Microsoft corporate network from home, employees need to use smart card authentication. But, somewhat paradoxically, you do better if you remove the smart card.

    A colleague of mine tipped me off to this. To initiate the connection, you have to insert the smart card and provide the smart card password. Then the system connects to Microsoft and validates both the smart card and password. During this time, you can see the smart card access light blink on and off, and an "elapsed time" meter will start running.

    Once the elapsed time reaches five seconds, remove the smart card. The actual authentication happens in five seconds; the rest of the time is doing other validation, quarantining your system, confirming that you have all the necessary patches, that sort of thing. Some of those operations in turn require authentication, and if you leave your smart card in the reader, the system will try to authenticate with the smart card (slow) even though that isn't the authentication it needs.

    If you remove the card, then the system won't try to use the smart card, and the rest of the logon process will go much faster.

    This tip may not work for other people who use smart cards for authentication, but it works for me to connect to Microsoft. What used to take thirty seconds now takes just seven.

  • The Old New Thing

    It takes only one program to foul an upgrade


    "Worst software ever." That was Aaron Zupancic's cousin's reaction to the fact that Windows XP was incompatible with one program originally designed for Windows 98.

    Then again, commenter Aargh! says "The bad code should be fixed, period. If it can't be fixed, it breaks, too bad." Perhaps Aargh! can send a message to Aaron's cousin saying, "Too bad." I'm sure that'll placate Aaron's cousin.
