June, 2008

  • The Old New Thing

    2008 mid-year link clearance

    • 16 Comments

    Time for the semi-annual link clearance.

    And, as always, the obligatory plug for my column in TechNet Magazine:

    Footnotes

    ¹Yes, it isn't literally powered by RFID. I was spoofing marketing-speak. I apologize to those for whom this did not require explaining.

  • The Old New Thing

    You don't need that 90 byte whereis program any more

    • 10 Comments

    Yes, you can write a whereis program in 90 bytes, but Windows Server 2003 and Windows Vista both come with a version of WHERE.EXE, so you don't even need the batch file any more.

  • The Old New Thing

    GUIDs are globally unique, but substrings of GUIDs aren't

    • 40 Comments

    A customer needed to generate an 8-byte unique value, and their initial idea was to generate a GUID and throw away the second half, keeping the first eight bytes. They wanted to know if this was a good idea.

    No, it's not a good idea.

    The GUID generation algorithm relies on the fact that it has all 16 bytes to use to establish uniqueness, and if you throw away half of it, you lose the uniqueness. There are multiple GUID generation algorithms, but I'll pick one of them for concreteness, specifically the version described in this Internet draft.

    The first 60 bits of the GUID encode a timestamp, the precise format of which is not important.

    The next four bits are always 0001, which identify that this GUID was generated by "algorithm 1". The version field is necessary to ensure that two GUID generation algorithms do not accidentally generate the same GUID. The algorithms are designed so that a particular algorithm doesn't generate the same GUID twice, but without a version field, there would be no way to ensure that some other algorithm wouldn't generate the same GUID by some systematic collision.

    The next 14 bits are "emergency uniquifier bits"; we'll look at them later, because they are the ones that fine tune the overall algorithm.

    The next two bits are reserved and fixed at 01.

    The last 48 bits are the unique address of the computer's network card. If the computer does not have a network card, set the top bit and use a random number generator for the other 47. No valid network card will have the top bit set in its address, so there is no possibility that a GUID generated from a computer without a network card will accidentally collide with a GUID generated from a computer with a network card.

    Once you take it apart, the bits of the GUID break down like this:

    • 60 bits of timestamp,
    • 48 bits of computer identifier,
    • 14 bits of uniquifier, and
    • six fixed bits,

    for a total of 128 bits.

    The goal of this algorithm is to use the combination of time and location ("space-time coordinates" for the relativity geeks out there) as the uniqueness key. However, timekeeping is not perfect, so there's a possibility that, for example, two GUIDs are generated in rapid succession from the same machine, so close to each other in time that the timestamp would be the same. That's where the uniquifier comes in. When time appears to have stood still (if two requests for a GUID are made in rapid succession) or gone backward (if the system clock is set to a new time earlier than what it was), the uniquifier is incremented so that GUIDs generated from the "second time it was five o'clock" don't collide with those generated "the first time it was five o'clock".

    Once you see how it all works, it's clear that you can't just throw away part of the GUID since all the parts (well, except for the fixed parts) work together to establish the uniqueness. If you take any of the three parts away, the algorithm falls apart. In particular, keeping just the first eight bytes (64 bits) gives you the timestamp and four constant bits; in other words, all you have is a timestamp, not a GUID.

    Since it's just a timestamp, you can have collisions. If two computers generate one of these "truncated GUIDs" at the same time, they will generate the same result. Or if the system clock goes backward in time due to a clock reset, you'll start regenerating GUIDs that you had generated the first time it was that time.
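
    The collision can be made concrete with a small sketch. It packs the fields in the order the text lists them (which is a simplification and not the byte order of a real GUID structure), and the names GuidFields, Guid128, MakeGuid, and TruncatedGuid are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Inputs to the simplified "algorithm 1" layout described above.
// This follows the field order given in the text, not the byte order
// of a real GUID structure.
struct GuidFields {
    std::uint64_t timestamp;   // only the low 60 bits are used
    std::uint16_t uniquifier;  // only the low 14 bits are used
    std::uint64_t node;        // only the low 48 bits are used
};

// A 128-bit value split into its first and second eight bytes.
struct Guid128 {
    std::uint64_t first_half;   // 60 bits timestamp + 4 bits version
    std::uint64_t second_half;  // 14 bits uniquifier + 2 reserved + 48 bits node
};

Guid128 MakeGuid(const GuidFields& f) {
    const std::uint64_t kVersion = 0x1;   // the fixed bits 0001
    const std::uint64_t kReserved = 0x1;  // the fixed bits 01, as stated in the text
    const std::uint64_t kMask60 = (std::uint64_t{1} << 60) - 1;
    const std::uint64_t kMask48 = (std::uint64_t{1} << 48) - 1;

    Guid128 g;
    g.first_half = ((f.timestamp & kMask60) << 4) | kVersion;
    g.second_half = (static_cast<std::uint64_t>(f.uniquifier & 0x3FFF) << 50)
                  | (kReserved << 48)
                  | (f.node & kMask48);
    return g;
}

// The customer's proposal: keep the first eight bytes, discard the rest.
std::uint64_t TruncatedGuid(const Guid128& g) {
    return g.first_half;
}
```

    Feed MakeGuid the same timestamp with two different node values and the full 128-bit results differ in their second half, but TruncatedGuid returns the same number for both.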

    Upon further investigation, the customer really didn't need global uniqueness. The value merely had to be unique among a cluster of a half dozen computers. Once you understand why the GUID generation algorithm works, you can reimplement it on a smaller scale:

    • Four bits to encode the computer number,
    • 56 bits for the timestamp, and
    • four bits as a uniquifier.

    We can reduce the number of bits to make the computer unique since the number of computers in the cluster is bounded, and we can reduce the number of bits in the timestamp by assuming that the program won't be in service 200 years from now, or that if it is, the items that were using these unique values are no longer relevant. At 100 nanoseconds per tick, 2^56 ticks will take 228 years to elapse. (Extending the range beyond 228 years is left as an exercise, but it's wasted effort, because you're going to hit the 16-computer limit first!)

    You can get away with a four-bit uniquifier by assuming that the clock won't drift more than an hour out of skew (say) and that the clock won't reset more than sixteen times per hour. Since you're running under a controlled environment, you can make this one of the rules for running your computing cluster.
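
    Here is a rough sketch of what such a scaled-down generator might look like. The names PackClusterId, YearsCoveredBy56BitTimestamp, and ClusterIdGenerator are hypothetical, the caller is assumed to supply the current time in 100-nanosecond ticks, and the sketch does not persist the uniquifier across restarts, which a real deployment would probably need:

```cpp
#include <cassert>
#include <cstdint>

// Layout from the text, most significant bits first:
//   4 bits computer number | 56 bits timestamp | 4 bits uniquifier
std::uint64_t PackClusterId(unsigned computer, std::uint64_t ticks, unsigned uniquifier) {
    const std::uint64_t kMask56 = (std::uint64_t{1} << 56) - 1;
    return (static_cast<std::uint64_t>(computer & 0xF) << 60)
         | ((ticks & kMask56) << 4)
         | static_cast<std::uint64_t>(uniquifier & 0xF);
}

// Years covered by a 56-bit counter of 100-nanosecond ticks.
double YearsCoveredBy56BitTimestamp() {
    const double ticks = 72057594037927936.0;  // 2^56
    const double seconds = ticks * 100e-9;     // 100 ns per tick
    return seconds / (365.25 * 24 * 60 * 60);
}

class ClusterIdGenerator {
public:
    explicit ClusterIdGenerator(unsigned computer) : computer_(computer) {}

    // The caller supplies the current time in ticks, which keeps the
    // sketch independent of any particular clock API.
    std::uint64_t Next(std::uint64_t now_ticks) {
        // Time stood still or went backward: bump the uniquifier.
        // It wraps after sixteen bumps, which is the limit the text assumes.
        if (has_last_ && now_ticks <= last_ticks_) {
            uniquifier_ = (uniquifier_ + 1) & 0xF;
        }
        last_ticks_ = now_ticks;
        has_last_ = true;
        return PackClusterId(computer_, now_ticks, uniquifier_);
    }

private:
    unsigned computer_;
    unsigned uniquifier_ = 0;
    std::uint64_t last_ticks_ = 0;
    bool has_last_ = false;
};
```

    Note that in this sketch a burst of requests within a single tick also consumes uniquifier values, so the four-bit budget covers rapid requests and clock resets combined.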

  • The Old New Thing

    The mystery of the garbage lady

    • 11 Comments

    Last year, my good friend and colleague Sarah transferred from the Redmond offices to Microsoft UK in Reading. One of her most popular lunchtime stories is the mystery of the garbage lady, which she finally got around to posting on her blog.

    Some of my other favorite stories from her blog:

    A colleague of mine experienced the phenomenon of clouded geography in reverse. He was temporarily assigned to Microsoft UK and while living there had occasion to drive out to Wales. He pulled out his handy road map and studied it: "Okay, I need to take this highway west, over the mountain range, and then take that exit, and then I'll be there." He hopped in his car and started driving.

    After a while he started getting nervous. It was getting late, and he still hadn't reached the mountain range yet. He started worrying that the people he was meeting at the destination would be concerned when he failed to show up on time. (I guess he picked up the British habit of worrying about other people being worried.)

    And then he saw the exit, and boom, he was at his destination.

    Afterwards, he went back to the map to see what happened.

    The first issue was one of scale. His map was of all of Great Britain, and he assumed that the scale of such a map was comparable to maps of large areas of the United States. A route that goes halfway across a large map, say a map of the state of Washington, will take a few hours to cover. The UK is comparatively much more compact. From Reading, you can get to the Welsh border in 90 minutes.

    The second issue was one of geography. What was notated on the map as a mountain range was, to someone more familiar with the mountains of western North America, just a hill.

  • The Old New Thing

    The disappointment of people who need to have their hand held from beginning to end

    • 47 Comments

    Some customers already have the answer but need to have their hand held.

    My customer wants to enforce a company-wide policy of disabling the "Keep the taskbar on top of other windows" feature. We have confirmed that there is no group policy setting for controlling this setting. Further research reveals the SHAppBarMessage function. The customer wants to know if there is any way he can write code that will use this function to modify the setting.

    The customer found a map to a stream, saw that there were directions printed on it, and then asked, "Is there any way I can follow these directions and get some water?"

    The product team dutifully wrote up the four-line function to do the work the customer requested (call SHAppBarMessage with ABM_GETSTATE to get the current state, turn off the ABS_ALWAYSONTOP flag, and then call it again with ABM_SETSTATE to apply the changes), but it still frustrates me that we had to deal with this question in the first place.

    It's one thing to say, "I tried doing X and it didn't work. Here's the code I was using." It's another thing to say, "I discovered function X. Can you write code for me?"

    No lesson today. Just venting.

  • The Old New Thing

    The difference between a junior and senior position at a video card company

    • 28 Comments

    Here are the descriptions (verbatim) of two job positions open at a major video card manufacturer back in the late 1990s:

    Software Engineer - Drivers

    The way in to a "hot" company in a "hot" field. This entry level position requires some programming experience with graphics preferred.

    Senior Software Engineer - Applications (Demos)

    This senior position requires considerable experience in 3D graphics or multimedia. Design, specification, and implementation of demos for new products will all be part of the mix.

  • The Old New Thing

    Don't require your users to have a degree in philosophy, episode 3

    • 29 Comments

    While signing up for online bill payment for one of the services I use, I encountered the following check box:

    Uncheck this box if you do not wish to receive electronic communications from XYZ.

    This is not simply a negative-sense checkbox; it's a double-negative-sense checkbox! What's wrong with this:

    Send me electronic communications from XYZ.

    Oh, right, I know what's wrong with it: It's too easy for people to opt out! Marketing is all about making users ask for something they don't want.

  • The Old New Thing

    Raymond misreads acronyms: MSPP-PVP

    • 12 Comments

    I was looking up some information about the Microsoft Partner Program, commonly abbreviated MSPP. I stumbled across an internal wiki called mspp-pvp. My first reaction upon seeing the name was, "What, it's player versus player?"

    I got this image of the CEOs of the various partners wearing VR goggles and running around playing laser tag.

    No, it's the Partner Velocity Program.

  • The Old New Thing

    Just because you're using a smart pointer class doesn't mean you can abdicate understanding what it does

    • 38 Comments

    It's great when you have a tool to make programming easier, but you still must understand what it does or you're just replacing one set of problems with another set of more subtle problems. For example, we discussed earlier the importance of knowing when your destructor runs. Here's another example, courtesy of my colleague Chris Ashton. This was posted as a Suggestion Box entry, but it's pretty much a complete article on its own.

    I came across an interesting bug this weekend that I've never seen described anywhere else, I thought it might be good fodder for your blog.

    What do you suppose the following code does?

    CComBSTR bstr;
    bstr = ::SysAllocStringLen(NULL, 100);
    
    A. Allocates a BSTR 100 characters long.
    B. Leaks memory and, if you're really lucky, opens the door for an insidious memory corruption.

    Obviously I'm writing here, so the answer cannot be A. It is, in fact, B.

    The key is that CComBSTR is involved here, so operator= is being invoked. And operator=, as you might recall, does a deep copy of the entire string, not just a shallow copy of the BSTR pointer. But how long does operator= think the string is? Well, since BSTR and LPCOLESTR are equivalent (at least as far as the C++ compiler is concerned), the argument to operator= is an LPCOLESTR – so operator= naturally tries to use the wcslen length of the string, not the SysStringLen length. And in this case, since the string is uninitialized, wcslen often returns a much smaller value than SysStringLen would. As a result, the original 100-character string is leaked, and you get back a buffer that can only hold, say, 25 characters.

    The code you really want here is:

    CComBSTR bstr;
    bstr.Attach(::SysAllocStringLen(NULL, 100));
    

    Or:

    CComBSTR bstr(100);
    

    I'm still a big fan of smart pointers (surely the hours spent finding this bug would have been spent finding memory leaks caused by other incautious programmers), but this example gives pause – CComBSTR and some OLE calls just don't mix.
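
    The length mismatch Chris describes can be imitated without COM. The sketch below is only an analogy: it uses std::wstring as a stand-in for a counted string (size() playing the role of SysStringLen), the buffer is zero-filled rather than uninitialized, and the helper names are invented for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Stand-in for a counted string: size() plays the role of SysStringLen.
// The buffer is zero-filled and then the first `written` characters are set.
std::wstring MakeCountedBuffer(std::size_t counted_length, std::size_t written) {
    std::wstring s(counted_length, L'\0');
    for (std::size_t i = 0; i < written && i < counted_length; ++i) {
        s[i] = L'x';
    }
    return s;
}

// Length as seen by code that only has a character pointer.
std::size_t TerminatorLength(const std::wstring& s) {
    return std::wcslen(s.c_str());
}

// A copy made through the pointer keeps only the characters
// before the first terminator.
std::wstring CopyThroughPointer(const std::wstring& s) {
    return std::wstring(s.c_str());
}
```

    With a counted length of 100 and 25 characters written, the counted length stays 100 while the pointer-based copy holds only 25 characters, which is the same shrinkage the deep copy in operator= produces.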

    All I can add to this story is an exercise: Chris writes, "Since the string is uninitialized, wcslen often returns a much smaller value than SysStringLen would." Can it possibly return a larger value? Is there a potential read overflow here?

  • The Old New Thing

    Donations to the Microsoft Archives: Pens, CDs, and paperweights

    • 39 Comments

    Among other responsibilities, the Archives department preserves Microsoft history, be it old hardware, old software, old documentation, or ephemera. Last year, one of my colleagues was cleaning out his office because he was moving to Granada, and of the junk he was going to throw out, the Archives asked me to save the following:

    • Windows NT Workstation 3.51 for PowerPC (box and CD)
    • Windows NT Workstation 3.51 for Alpha AXP (CD only)
    • Documentation for two internal UI libraries from over a decade ago
    • Word 97/Excel 97 Service Pack 3 for Alpha AXP
    • Microsoft Excel promotional pens
    • Windows 95 launch event commemorative paperweight
    • MSN 8 commemorative gold master CD and ship gift box
    • MSN Internet Access promotional booklet and CD (unopened)
    • An official CD of an internal build of a 1999 user interface project that was later abandoned (although its spirit survives in Windows XP).
    • The official CD of pictures from the Windows XP RTM party (unopened)
    • A lava lamp prize from an internal promotional campaign

    and my favorite

    • An HP RPN calculator which was issued to employees as part of the standard office equipment (back in the day).