March, 2004

  • The Old New Thing

    Where did my Task Manager tabs and buttons go?

    • 33 Comments

    Ah, welcome to "Tiny Footprint Mode".

    This mode exists for the ultrageeks who want to put a tiny little CPU meter in the corner of the screen. To go back to normal mode, just double-click a blank space in the border.

    This is one of those geek features that has created more problems than it solved. Sure, the geeks get their cute little CPU meter in the corner, but for each geek that does this, there are thousands of normal users who accidentally go into Tiny mode and can't figure out how to get back.

    [Raymond is currently on vacation; this message was pre-recorded.]

  • The Old New Thing

    Some files come up strange in Notepad

    • 29 Comments

    David Cumps discovered that certain text files come up strange in Notepad.

    The reason is that Notepad has to edit files in a variety of encodings, and when its back is against the wall, sometimes it's forced to guess.

    Here's the file "Hello" in various encodings:

    48 65 6C 6C 6F

    This is the traditional ANSI encoding.

    48 00 65 00 6C 00 6C 00 6F 00

    This is the Unicode (little-endian) encoding with no BOM.

    FF FE 48 00 65 00 6C 00 6C 00 6F 00

    This is the Unicode (little-endian) encoding with BOM. The BOM (FF FE) serves two purposes: First, it tags the file as a Unicode document, and second, the order in which the two bytes appear indicates that the file is little-endian.

    00 48 00 65 00 6C 00 6C 00 6F

    This is the Unicode (big-endian) encoding with no BOM. Notepad does not support this encoding.

    FE FF 00 48 00 65 00 6C 00 6C 00 6F

    This is the Unicode (big-endian) encoding with BOM. Notice that this BOM is in the opposite order from the little-endian BOM.

    EF BB BF 48 65 6C 6C 6F

    This is UTF-8 encoding. The first three bytes are the UTF-8 encoding of the BOM.

    2B 2F 76 38 2D 48 65 6C 6C 6F

    This is UTF-7 encoding. The first five bytes are the UTF-7 encoding of the BOM. Notepad doesn't support this encoding.

    Notice that the UTF-7 BOM encoding is just the ASCII string "+/v8-", which is difficult to distinguish from a regular file that happens to begin with those five characters (as odd as they may be).

    The encodings that do not have special prefixes and which are still supported by Notepad are the traditional ANSI encoding (i.e., "plain ASCII") and the Unicode (little-endian) encoding with no BOM. When faced with a file that lacks a special prefix, Notepad is forced to guess which of those two encodings the file actually uses. The function that does this work is IsTextUnicode, which studies a chunk of bytes and does some statistical analysis to come up with a guess.

    And as the documentation notes, "Absolute certainty is not guaranteed." Short strings are most likely to be misdetected.
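    Checking for the prefixes listed above is the mechanical part; the guessing only starts when none of them is present. Here is a minimal sketch of that prefix check. The function name DetectBom and its return strings are made up for illustration, and this is not a claim about how Notepad itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: classify a buffer by its leading byte order mark.
// Returns "unknown" when no BOM is present, which is the case where a
// program has to fall back to guessing (for example with IsTextUnicode).
std::string DetectBom(const unsigned char* data, std::size_t size)
{
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) {
        return "UTF-8";
    }
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        return "UTF-16LE";
    }
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) {
        return "UTF-16BE";
    }
    return "unknown"; // plain ANSI and BOM-less Unicode both land here
}
```

    Feeding it the byte sequences from the list above, the three BOM-tagged variants are recognized, while "48 65 6C 6C 6F" and "48 00 65 00 ..." both come back as "unknown".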

    [Raymond is currently on vacation; this message was pre-recorded.]

  • The Old New Thing

    Why is the line terminator CR+LF?

    • 40 Comments

    This protocol dates back to the days of teletypewriters. CR stands for "carriage return" - the CR control character returned the print head ("carriage") to column 0 without advancing the paper. LF stands for "linefeed" - the LF control character advanced the paper one line without moving the print head. So if you wanted to return the print head to column zero (ready to print the next line) and advance the paper (so it prints on fresh paper), you needed both CR and LF.

    If you go to the various internet protocol documents, such as RFC 0821 (SMTP), RFC 1939 (POP), RFC 2060 (IMAP), or RFC 2616 (HTTP), you'll see that they all specify CR+LF as the line termination sequence. So the real question is not "Why do CP/M, MS-DOS, and Win32 use CR+LF as the line terminator?" but rather "Why did other people choose to differ from these standards documents and use some other line terminator?"

    Unix adopted plain LF as the line termination sequence. If you look at the stty options, you'll see that the onlcr option specifies whether a LF should be changed into CR+LF. If you get this setting wrong, you get stairstep text, where

    each
        line
            begins
    
    where the previous line left off. So even unix, when left in raw mode, requires CR+LF to terminate lines. The implicit CR before LF is a unix invention, probably as an economy, since it saves one byte per line.
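    The translation that onlcr asks for amounts to a one-pass filter over the output. A minimal sketch, assuming the text is already in memory; the function name ExpandLfToCrLf is hypothetical and this is not the actual terminal driver code:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mimicking the effect of the onlcr option:
// every LF in the output is expanded to CR+LF.
std::string ExpandLfToCrLf(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char ch : text) {
        if (ch == '\n') {
            out += '\r'; // return the "print head" to column 0 first
        }
        out += ch;
    }
    return out;
}
```

    Without this step, a device that takes LF literally only advances the paper, which is where the stairstep effect comes from.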

    The unix ancestry of the C language carried this convention into the C language standard, which requires only "\n" (which encodes LF) to terminate lines, putting the burden on the runtime libraries to convert raw file data into logical lines.

    The C language also introduced the term "newline" to express the concept of "generic line terminator". I'm told that the ASCII committee changed the name of character 0x0A to "newline" around 1996, so the confusion level has been raised even higher.

    Here's another discussion of the subject, from a unix perspective.

  • The Old New Thing

    C++ scoped static initialization is not thread-safe, on purpose!

    • 49 Comments

    The rule for static variables at block scope (as opposed to static variables with global scope) is that they are initialized the first time execution reaches their declaration.

    Find the race condition:

    int ComputeSomething()
    {
      static int cachedResult = ComputeSomethingSlowly();
      return cachedResult;
    }
    

    The intent of this code is to compute something expensive the first time the function is called, and then cache the result to be returned by future calls to the function.

    A variation on this basic technique is advocated by this web site to avoid the "static initialization order fiasco". (Said fiasco is well-described on that page so I encourage you to read it and understand it.)

    The problem is that this code is not thread-safe. Statics with local scope are internally converted by the compiler into something like this:

    int ComputeSomething()
    {
      static bool cachedResult_computed = false;
      static int cachedResult;
      if (!cachedResult_computed) {
        cachedResult_computed = true;
        cachedResult = ComputeSomethingSlowly();
      }
      return cachedResult;
    }
    

    Now the race condition is easier to see.

    Suppose two threads both call this function for the first time. The first thread gets as far as setting cachedResult_computed = true, and then gets pre-empted. The second thread now sees that cachedResult_computed is true and skips over the body of the "if" branch and returns an uninitialized variable.

    What you see here is not a compiler bug. This behavior is required by the C++ standard.

    You can write variations on this theme to create even worse problems:

    class Something { ... };
    int ComputeSomething()
    {
      static Something s;
      return s.ComputeIt();
    }
    

    This gets rewritten internally as (this time, using pseudo-C++):

    class Something { ... };
    int ComputeSomething()
    {
      static bool s_constructed = false;
      static uninitialized Something s;
      if (!s_constructed) {
        s_constructed = true;
        new(&s) Something; // construct it
        atexit(DestructS);
      }
      return s.ComputeIt();
    }
    // Destruct s at process termination
    void DestructS()
    {
     ComputeSomething::s.~Something();
    }
    

    Notice that there are multiple race conditions here. As before, it's possible for one thread to run ahead of the other thread and use "s" before it has been constructed.

    Even worse, it's possible for the first thread to get pre-empted immediately after testing s_constructed but before setting it to "true". In this case, the object s gets double-constructed and double-destructed.

    That can't be good.

    But wait, that's not all. Now look at what happens if you have two runtime-initialized local statics:

    class Something { ... };
    int ComputeSomething()
    {
      static Something s(0);
      static Something t(1);
      return s.ComputeIt() + t.ComputeIt();
    }
    

    This is converted by the compiler into the following pseudo-C++:

    class Something { ... };
    int ComputeSomething()
    {
      static char constructed = 0;
      static uninitialized Something s;
      if (!(constructed & 1)) {
        constructed |= 1;
        new(&s) Something(0); // construct it
        atexit(DestructS);
      }
      static uninitialized Something t;
      if (!(constructed & 2)) {
        constructed |= 2;
        new(&t) Something(1); // construct it
        atexit(DestructT);
      }
      return s.ComputeIt() + t.ComputeIt();
    }
    

    To save space, the compiler placed the two "x_constructed" variables into a bitfield. Now there are multiple non-interlocked read-modify-store operations on the variable "constructed".

    Now consider what happens if one thread attempts to execute "constructed |= 1" at the same time another thread attempts to execute "constructed |= 2".

    On an x86, the statements likely assemble into

      or constructed, 1
    ...
      or constructed, 2
    
    without any "lock" prefixes. On multiprocessor machines, it is possible for the two stores both to read the old value and clobber each other with conflicting values.

    On ia64 and alpha, this clobbering is much more obvious since they do not have a single read-modify-store instruction; the three steps must be explicitly coded:

      ldl t1,0(a0)     ; load
      addl t1,1,t1     ; modify
      stl t1,0(a0)     ; store
    

    If the thread gets pre-empted between the load and the store, the value stored may no longer agree with the value being overwritten.

    So now consider the following insane sequence of execution:

    • Thread A tests "constructed" and finds it zero and prepares to set the value to 1, but it gets pre-empted.
    • Thread B enters the same function, sees "constructed" is zero and proceeds to construct both "s" and "t", leaving "constructed" equal to 3.
    • Thread A resumes execution and completes its load-modify-store sequence, setting "constructed" to 1, then constructs "s" (a second time).
    • Thread A then proceeds to construct "t" as well (a second time) setting "constructed" (finally) to 3.
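    The lost-update part of this sequence goes away if the read-modify-store is a single atomic operation. Here is a minimal sketch using std::atomic, which comes from C++11 and so postdates this article. The names g_constructed and ClaimBit are hypothetical stand-ins for the compiler-generated flags, not something any compiler is known to emit.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the compiler-generated "constructed" bitfield.
std::atomic<unsigned char> g_constructed{0};

// Atomically sets the given bit and reports whether this caller was the
// one that changed it from clear to set. fetch_or returns the value the
// variable held immediately before the update.
bool ClaimBit(unsigned char bit)
{
    const unsigned char previous = g_constructed.fetch_or(bit);
    return (previous & bit) == 0;
}
```

    Only one caller can ever see ClaimBit return true for a given bit, so the double construction cannot happen. Note that this alone does not stop a second thread from using the object before the first thread has finished constructing it.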

    Now, you might think you can wrap the runtime initialization in a critical section:

    int ComputeSomething()
    {
     EnterCriticalSection(...);
     static int cachedResult = ComputeSomethingSlowly();
     LeaveCriticalSection(...);
     return cachedResult;
    }
    

    Because now you've placed the one-time initialization inside a critical section and made it thread-safe.

    But what if the second call comes from within the same thread? ("We've traced the call; it's coming from inside the thread!") This can happen if ComputeSomethingSlowly() itself calls ComputeSomething(), perhaps indirectly. Since that thread already owns the critical section, the code enters it just fine and you once again end up returning an uninitialized variable.
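    One way to avoid relying on the compiler-generated flag at all is to keep the "already computed" flag yourself and set it only after the value is ready. What follows is a minimal sketch under stated assumptions: it uses std::mutex from C++11 (which postdates this article) in place of a critical section, ComputeSomethingSlowly is a stand-in that returns a fixed value, and the g_ names are hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the expensive computation. It counts its
// calls so that the one-time behavior can be observed.
int g_slowCalls = 0;
int ComputeSomethingSlowly()
{
    ++g_slowCalls;
    return 42;
}

// Namespace-scope state with constant initializers (std::mutex has a
// constexpr default constructor), so no hidden "constructed" flag is needed.
std::mutex g_cacheLock;
bool g_cachedResultComputed = false;
int g_cachedResult = 0;

int ComputeSomething()
{
    std::lock_guard<std::mutex> guard(g_cacheLock);
    if (!g_cachedResultComputed) {
        g_cachedResult = ComputeSomethingSlowly();
        g_cachedResultComputed = true; // set only after the value is ready
    }
    return g_cachedResult;
}
```

    This does not make the recursive case work. std::mutex is not recursive, so if ComputeSomethingSlowly() called back into ComputeSomething() on the same thread, the behavior would be undefined (typically a deadlock) rather than a quiet return of an uninitialized value.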

    Conclusion: When you see runtime initialization of a local static variable, be very concerned.

  • The Old New Thing

    How do I convert a SID between binary and string forms?

    • 9 Comments

    Of course, if you want to do this programmatically, you would use ConvertSidToStringSid and ConvertStringSidToSid, but often you're studying a memory dump or otherwise need to do the conversion manually.

    If you have a SID like S-a-b-c-d-e-f-g-...

    Then the bytes are

    a       (revision)
    N       (number of dashes minus two)
    bbbbbb  (six bytes of "b" treated as a 48-bit number in big-endian format)
    cccc    (four bytes of "c" treated as a 32-bit number in little-endian format)
    dddd    (four bytes of "d" treated as a 32-bit number in little-endian format)
    eeee    (four bytes of "e" treated as a 32-bit number in little-endian format)
    ffff    (four bytes of "f" treated as a 32-bit number in little-endian format)
    etc.

    So for example, if your SID is S-1-5-21-2127521184-1604012920-1887927527-72713, then your raw hex SID is

    010500000000000515000000A065CF7E784B9B5FE77C8770091C0100

    This breaks down as follows:

    01            S-1
    05            (seven dashes, seven minus two = 5)
    000000000005  (5 = 0x000000000005, big-endian)
    15000000      (21 = 0x00000015, little-endian)
    A065CF7E      (2127521184 = 0x7ECF65A0, little-endian)
    784B9B5F      (1604012920 = 0x5F9B4B78, little-endian)
    E77C8770      (1887927527 = 0x70877CE7, little-endian)
    091C0100      (72713 = 0x00011C09, little-endian)
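    The manual procedure above can be sketched in a few lines of portable C++. The function name SidStringToHex is hypothetical (it is not a Windows API), and the input validation is deliberately minimal; real code should use ConvertStringSidToSid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts "S-a-b-c-d-..." into the hex dump of the
// binary SID. Returns an empty string if the input does not start "S-a-b".
std::string SidStringToHex(const std::string& sid)
{
    // Split on dashes; fields[0] is the literal "S".
    std::vector<std::string> fields;
    std::stringstream in(sid);
    std::string field;
    while (std::getline(in, field, '-')) {
        fields.push_back(field);
    }
    if (fields.size() < 3 || fields[0] != "S") {
        return "";
    }

    std::vector<unsigned char> bytes;
    bytes.push_back(static_cast<unsigned char>(std::stoul(fields[1]))); // a: revision
    bytes.push_back(static_cast<unsigned char>(fields.size() - 3));     // N: dashes minus two

    // b: 48-bit number, most significant byte first (big-endian).
    const std::uint64_t authority = std::stoull(fields[2]);
    for (int shift = 40; shift >= 0; shift -= 8) {
        bytes.push_back(static_cast<unsigned char>((authority >> shift) & 0xFF));
    }

    // c, d, e, ...: 32-bit numbers, least significant byte first (little-endian).
    for (std::size_t i = 3; i < fields.size(); ++i) {
        const std::uint32_t value = static_cast<std::uint32_t>(std::stoull(fields[i]));
        for (int shift = 0; shift < 32; shift += 8) {
            bytes.push_back(static_cast<unsigned char>((value >> shift) & 0xFF));
        }
    }

    const char* const digits = "0123456789ABCDEF";
    std::string hex;
    for (unsigned char b : bytes) {
        hex += digits[b >> 4];
        hex += digits[b & 0x0F];
    }
    return hex;
}
```

    Passing the example SID from above reproduces the raw hex string shown there.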

    Yeah, that's great, Raymond, but what do all those numbers mean?

    S-1-           version number (SID_REVISION)
    -5-            SECURITY_NT_AUTHORITY
    -21-           SECURITY_NT_NON_UNIQUE
    -...-...-...-  these identify the machine that issued the SID
    72713          unique user id on the machine

    Each machine generates a unique ID that it uses to stamp all the SIDs it creates (-...-...-...-). The last number is a "relative id (RID)" that represents a user created by that machine. There are a bunch of predefined RIDs; you can see them in the header file ntseapi.h, which is also where I got these names from. The system reserves RIDs up to 999, so the first non-builtin account gets assigned ID number 1000. The number 72713 means that this particular SID is the 71714th SID created by the issuer. (The machine that issued this SID is clearly a domain controller, responsible for creating the accounts of tens of thousands of users.)

    (Actually, I lied above when I said that this is the 71714th SID created by the issuer. Large servers can delegate SID creation to helpers, in which case SID issuance is no longer strictly consecutive.)

    Security isn't my area of expertise, so it's entirely possible (perhaps even likely) that I got something wrong up above. But it's mostly correct, I think.

  • The Old New Thing

    Blow the dust out of the connector

    • 42 Comments

    Okay, I'm about to reveal one of the tricks of Product Support.

    Sometimes you're on the phone with somebody and you suspect that the problem is something as simple as forgetting to plug it in, or that the cable was plugged into the wrong port. This is easy to do with those PS/2 connectors that fit both a keyboard and a mouse plug, or with network cables that fit into both the upstream and downstream ports on a router.

    Here's the trick: Don't ask "Are you sure it's plugged in correctly?"

    If you do this, they will get all insulted and say indignantly, "Of course it is! Do I look like an idiot?" without actually checking.

    Instead, say "Okay, sometimes the connection gets a little dusty and the connection gets weak. Could you unplug the connector, blow into it to get the dust out, then plug it back in?"

    They will then crawl under the desk, find that they forgot to plug it in (or plugged it into the wrong port), blow out the dust, plug it in, and reply, "Um, yeah, that fixed it, thanks."

    (Or if the problem was that it was plugged into the wrong port, then the act of unplugging it and blowing into the connector takes their eyes off the port. Then when they go to plug it in, they will look carefully and get it right the second time because they're paying attention.)

    Customer saves face, you close a support case, everybody wins.

    Corollary: Instead of asking "Are you sure it's turned on?", ask them to turn it off and back on.

  • The Old New Thing

    The ways people mess up IUnknown::QueryInterface

    • 33 Comments

    When you're dealing with application compatibility, you discover all sorts of things that worked only by accident. Today, I'll talk about some of the "creative" ways people mess up the IUnknown::QueryInterface method.

    Now, you'd think, "This interface is so critical to COM, how could anybody possibly mess it up?"

    Forgetting to respond to IUnknown.

    Sometimes you get so excited about responding to all these great interfaces that you forget to respond to IUnknown itself. We have found objects where

    IShellFolder *psf = some object;
    IUnknown *punk;
    psf->QueryInterface(IID_IUnknown, (void**)&punk);
    
    fails with E_NOINTERFACE!

    Forgetting to respond to your own interface.

    There are some methods which return an object with a specific interface. And if you query that object for its own interface, its sole reason for existing, it says "Huh?"

    IShellFolder *psf = some object;
    IEnumIDList *peidl, *peidl2;
    psf->EnumObjects(..., &peidl);
    peidl->QueryInterface(IID_IEnumIDList, (void**)&peidl2);
    

    There are some objects which return E_NOINTERFACE to the QueryInterface call, even though you're asking the object for itself! "Sorry, I don't exist," it seems they're trying to say.

    Forgetting to respond to base interfaces.

    When you implement a derived interface, you implicitly implement the base interfaces, so don't forget to respond to them, too.

    IShellView *psv = some object;
    IOleWindow *pow;
    psv->QueryInterface(IID_IOleWindow, (void**)&pow);
    
    Some objects forget and the QueryInterface fails with E_NOINTERFACE.

    Requiring a secret knock.

    In principle, the following two code fragments are equivalent:

    IShellFolder *psf;
    IUnknown *punk;
    CoCreateInstance(CLSID_xyz, ..., IID_IShellFolder, (void**)&psf);
    psf->QueryInterface(IID_IUnknown, (void**)&punk);
    
    CoCreateInstance(CLSID_xyz, ..., IID_IUnknown, (void**)&punk);
    punk->QueryInterface(IID_IShellFolder, (void**)&psf);
    

    In reality, some implementations mess up and fail the second call to CoCreateInstance. The only way to create the object successfully is to create it with the IShellFolder interface.

    Forgetting to say "no" properly.

    One of the rules for saying "no" is that you have to set the output pointer to NULL before returning. Some people forget to do that.

    IMumble *pmbl;
    punk->QueryInterface(IID_IMumble, (void**)&pmbl);
    

    If the QueryInterface succeeds, then pmbl must be non-NULL on return. If it fails, then pmbl must be NULL on return.
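    For contrast, here is a minimal sketch of a QueryInterface that follows these rules: it answers for IUnknown, for its own interface, and for the base interface, and it nulls the output pointer when saying "no". To keep it compilable without the Windows SDK, every type and constant below is a simplified, hypothetical stand-in (interface IDs are a plain enum, E_NOINTERFACE has a placeholder value, and IBase, IDerived and Widget are invented names).

```cpp
#include <cassert>

// Simplified stand-ins for the COM definitions; not the real SDK types.
typedef int HRESULT;
const HRESULT S_OK = 0;
const HRESULT E_NOINTERFACE = -1; // placeholder, not the real value
enum IID { IID_IUnknown, IID_IBase, IID_IDerived, IID_IMumble };

struct IUnknown {
    virtual HRESULT QueryInterface(IID iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IBase : IUnknown {};   // plays the role of a base interface
struct IDerived : IBase {};   // plays the role of the object's own interface

class Widget : public IDerived {
    unsigned long m_refs = 1;
public:
    HRESULT QueryInterface(IID iid, void** ppv) override
    {
        // Respond to IUnknown, to the base interface, and to our own interface.
        if (iid == IID_IUnknown || iid == IID_IBase || iid == IID_IDerived) {
            *ppv = static_cast<IDerived*>(this);
            AddRef();
            return S_OK;
        }
        *ppv = nullptr; // saying "no" properly
        return E_NOINTERFACE;
    }
    unsigned long AddRef() override { return ++m_refs; }
    unsigned long Release() override { return --m_refs; } // self-deletion omitted
};
```

    The single-inheritance layout keeps the cast trivial here; an object implementing several unrelated interfaces needs a distinct cast per interface.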

    The shell has to be compatible with all these buggy objects because if it weren't, customers would get upset and the press would have a field day. Some of the offenders are big-name programs. If they broke, people would report, "Don't upgrade to Windows XYZ, it's not compatible with <big-name program>." Conspiracy-minded folks would shout, "Microsoft intentionally broke <big-name program>! Proof of unfair business tactics!"

    [Raymond is currently on vacation; this message was pre-recorded.]

  • The Old New Thing

    Why Ctrl+Alt shouldn't be used as a shortcut modifier

    • 22 Comments

    You may have noticed that Windows doesn't use Ctrl+Alt as a keyboard shortcut anywhere. (Or at least it shouldn't.) If a chorded modifier is needed, it's usually Ctrl+Shift.

    That's because Ctrl+Alt has special meaning on many keyboards. The combination Ctrl+Alt is also known as AltGr, and it acts as an alternate shift key. For example, consider the German keyboard layout. Notice that there are three keyboard shift states (Normal, Shift, and AltGr), whereas on U.S. keyboards there are only two (Normal and Shift). For example, to type the @ character on a German keyboard, you would type AltGr+Q = Ctrl+Alt+Q. (Some languages, like Swedish, have a fourth state: Shift+AltGr. And then of course, there's the Japanese keyboard...)

    Most international keyboards remap the right-hand Alt key to act as AltGr, so instead of the finger-contorting Ctrl+Alt+Q, you can usually type RAlt+Q.

    (For reference, here are diagrams of several other keyboard layouts, courtesy of my bubble-blowing friend, Nadine Kano.)

    Sometimes a program accidentally uses Ctrl+Alt as a shortcut modifier, and its authors get bug reports like, "Every time I type the letter 'đ', the program thinks I want to start a mailmerge."

    [Raymond is currently on vacation; this message was pre-recorded.]

  • The Old New Thing

    Why are HANDLE return values so inconsistent?

    • 21 Comments

    If you look at the various functions that return HANDLEs, you'll see that some of them return NULL (like CreateThread) and some of them return INVALID_HANDLE_VALUE (like CreateFile). You have to check the documentation to see what each particular function returns on failure.

    Why are the return values so inconsistent?

    The reasons, as you may suspect, are historical.

    The values were chosen to be compatible with 16-bit Windows. The 16-bit functions OpenFile, _lopen and _lcreat return -1 on failure, so the 32-bit CreateFile function returns INVALID_HANDLE_VALUE in order to facilitate porting code from Win16.

    (Armed with this, you can now answer the following trivia question: Why do I call CreateFile when I'm not actually creating a file? Shouldn't it be called OpenFile? Answer: Yes, OpenFile would have been a better name, but that name was already taken.)

    On the other hand, there are no Win16 equivalents for CreateThread or CreateMutex, so they return NULL.

    Since the precedent had now been set for inconsistent return values, whenever a new function got added, it was a bit of a toss-up whether the new function returned NULL or INVALID_HANDLE_VALUE.

    This inconsistency has multiple consequences.

    First, of course, you have to be careful to check the return values properly.

    Second, it means that if you write a generic handle-wrapping class, you have to be mindful of two possible "not a handle" values.

    Third, if you want to pre-initialize a HANDLE variable, you have to initialize it in a manner compatible with the function you intend to use. For example, the following code is wrong:

    HANDLE h = NULL;
    if (UseLogFile()) {
        h = CreateFile(...);
    }
    DoOtherStuff();
    if (h) {
       Log(h);
    }
    DoOtherStuff();
    if (h) {
        CloseHandle(h);
    }
    
    This code has two bugs. First, the return value from CreateFile is checked incorrectly. The code above checks for NULL instead of INVALID_HANDLE_VALUE. Second, the code initializes the h variable incorrectly. Here's the corrected version:

    HANDLE h = INVALID_HANDLE_VALUE;
    if (UseLogFile()) {
        h = CreateFile(...);
    }
    DoOtherStuff();
    if (h != INVALID_HANDLE_VALUE) {
       Log(h);
    }
    DoOtherStuff();
    if (h != INVALID_HANDLE_VALUE) {
        CloseHandle(h);
    }
    

    Fourth, you have to be particularly careful with the INVALID_HANDLE_VALUE value: By coincidence, the value INVALID_HANDLE_VALUE happens to be numerically equal to the pseudohandle returned by GetCurrentProcess(). Many kernel functions accept pseudohandles, so if you mess up and accidentally call, say, WaitForSingleObject on a failed INVALID_HANDLE_VALUE handle, you will actually end up waiting on your own process. This wait will, of course, never complete, because a process is signalled when it exits, so you ended up waiting for yourself.
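    For the generic handle-wrapping case mentioned in the second point, one possible approach is to test against both "not a handle" values in a single place. A minimal sketch: the typedef and constant below are stand-ins so the code compiles without windows.h, and the helper names IsRealHandle and NormalizeHandle are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch compiles without <windows.h>. They mirror the
// Win32 definitions, where INVALID_HANDLE_VALUE is the pointer-sized value -1.
typedef void* HANDLE;
const HANDLE INVALID_HANDLE_VALUE =
    reinterpret_cast<HANDLE>(static_cast<std::intptr_t>(-1));

// Hypothetical helper: true only if h is neither of the two failure values.
bool IsRealHandle(HANDLE h)
{
    return h != nullptr && h != INVALID_HANDLE_VALUE;
}

// Hypothetical helper: fold both failure values into nullptr so that a
// wrapper class only has to remember one "empty" state.
HANDLE NormalizeHandle(HANDLE h)
{
    return IsRealHandle(h) ? h : nullptr;
}
```

    A wrapper built this way would store NormalizeHandle(CreateFile(...)) or NormalizeHandle(CreateThread(...)) and compare against nullptr everywhere else. The trade-off is that it cannot hold the GetCurrentProcess() pseudohandle, since that value is indistinguishable from INVALID_HANDLE_VALUE.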

  • The Old New Thing

    Defrauding the WHQL driver certification process

    • 81 Comments

    In a comment to one of my earlier entries, someone mentioned a driver that bluescreened under normal conditions, but once you enabled the Driver Verifier (to try to catch the driver doing whatever bad thing it was doing), the problem went away. Another commenter bemoaned that WHQL certification didn't seem to improve the quality of the drivers.

    Video drivers will do anything to outdo their competition. Everybody knows that they cheat benchmarks, for example. I remember one driver that ran the DirectX "3D Tunnel" demonstration program extremely fast, demonstrating how totally awesome their video card is. Except that if you renamed TUNNEL.EXE to FUNNEL.EXE, it ran slow again.

    There was another one that checked if you were printing a specific string used by a popular benchmark program. If so, then it only drew the string a quarter of the time and merely returned without doing anything the other three quarters of the time. Bingo! Their benchmark numbers just quadrupled.

    Anyway, similar shenanigans are not unheard of when submitting a driver to WHQL for certification. Some unscrupulous drivers will detect that they are being run by WHQL and disable various features so they pass certification. Of course, they also run dog slow in the WHQL lab, but that's okay, because WHQL is interested in whether the driver contains any bugs, not whether the driver has the fastest triangle fill rate in the industry.

    The most common cheat I've seen is drivers which check for a secret "Enable Dubious Optimizations" switch in the registry or some other place external to the driver itself. They take the driver and put it in an installer which does not turn the switch on and submit it to WHQL. When WHQL runs the driver through all its tests, the driver is running in "safe but slow" mode and passes certification with flying colors.

    The vendor then takes that driver (now with the WHQL stamp of approval) and puts it inside an installer that enables the secret "Enable Dubious Optimizations" switch. Now the driver sees the switch enabled and performs all sorts of dubious optimizations, none of which were tested by WHQL.
