April, 2004

  • The Old New Thing

    Cleaner, more elegant, and wrong

    • 84 Comments

    Just because you can't see the error path doesn't mean it doesn't exist.

    Here's a snippet from a book on C# programming, taken from the chapter on how great exceptions are.

    try {
      AccessDatabase accessDb = new AccessDatabase();
      accessDb.GenerateDatabase();
    } catch (Exception e) {
      // Inspect caught exception
    }
    
    public void GenerateDatabase()
    {
      CreatePhysicalDatabase();
      CreateTables();
      CreateIndexes();
    }
    
    Notice how much cleaner and more elegant [this] solution is.

    Cleaner, more elegant, and wrong.

    Suppose an exception is thrown during CreateIndexes(). The GenerateDatabase() function doesn't catch it, so the error is thrown back out to the caller, where it is caught.

    But when the exception left GenerateDatabase(), important information was lost: The state of the database creation. The code that catches the exception doesn't know which step in database creation failed. Does it need to delete the indexes? Does it need to delete the tables? Does it need to delete the physical database? It doesn't know.

    So if there is a problem in CreateIndexes(), you leak a physical database file and its tables forever. (Since these are presumably files on disk, they hang around indefinitely.)
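
    For illustration, here is a minimal C++ sketch of one way GenerateDatabase() could keep track of how far it got and undo the completed steps before letting the exception escape. The class body, the log, the failInCreateIndexes hook and the two Delete functions are all hypothetical; the book shows none of them. The sketch also assumes that deleting the tables removes any partly built indexes and that the undo steps do not throw.

    ```cpp
    #include <cassert>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the book's AccessDatabase. Each step only
    // appends to a log so that the rollback order can be observed.
    class AccessDatabase {
    public:
        std::vector<std::string> log;
        bool failInCreateIndexes = false;  // test hook: simulate a failing step

        void GenerateDatabase() {
            int completed = 0;  // how many steps have finished so far
            try {
                CreatePhysicalDatabase();
                completed = 1;
                CreateTables();
                completed = 2;
                CreateIndexes();
            } catch (...) {
                // Undo the finished steps in reverse order, then let the
                // caller see the original exception.
                if (completed >= 2) DeleteTables();
                if (completed >= 1) DeletePhysicalDatabase();
                throw;
            }
        }

    private:
        void CreatePhysicalDatabase() { log.push_back("create file"); }
        void CreateTables() { log.push_back("create tables"); }
        void CreateIndexes() {
            if (failInCreateIndexes) throw std::runtime_error("index creation failed");
            log.push_back("create indexes");
        }
        // Assumption: the undo steps themselves do not throw.
        void DeleteTables() { log.push_back("delete tables"); }
        void DeletePhysicalDatabase() { log.push_back("delete file"); }
    };
    ```

    The caller still catches the exception, but by then nothing is left behind on disk. The price is that the function is no longer three tidy lines.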

    Writing correct code in the exception-throwing model is in a sense harder than in an error-code model, since anything can fail, and you have to be ready for it. In an error-code model, it's obvious when you have to check for errors: When you get an error code. In an exception model, you just have to know that errors can occur anywhere.

    In other words, in an error-code model, it is obvious when somebody failed to handle an error: They didn't check the error code. But in an exception-throwing model, it is not obvious from looking at the code whether somebody handled the error, since the error is not explicit.

    Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     AddToLeague(guy);
     guy.Team = ChooseRandomTeam();
     return guy;
    }
    

    This function creates a new Guy, adds him to the league, and assigns him to a team randomly. How can this be simpler?

    Remember: Every line is a possible error.

    What if an exception is thrown by "new Guy(name)"?

    Well, fortunately, we haven't yet started doing anything, so no harm done.

    What if an exception is thrown by "AddToLeague(guy)"?

    The "guy" we created will be abandoned, but the GC will clean that up.

    What if an exception is thrown by "guy.Team = ChooseRandomTeam()"?

    Uh-oh, now we're in trouble. We already added the guy to the league. If somebody catches this exception, they're going to find a guy in the league who doesn't belong to any team. If there's some code that walks through all the members of the league and uses the guy.Team member, they're going to take a NullReferenceException since guy.Team isn't initialized yet.

    When you're writing code, do you think about what the consequences of an exception would be if it were raised by each line of code? You have to do this if you intend to write correct code.

    Okay, so how to fix this? Reorder the operations.

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     AddToLeague(guy);
     return guy;
    }
    

    This seemingly insignificant change has a big effect on error recovery. By delaying the commitment of the data (adding the guy to the league), any exceptions taken during the construction of the guy do not have any lasting effect. All that happens is that a partly-constructed guy gets abandoned and eventually gets cleaned up by GC.

    General design principle: Don't commit data until they are ready.

    Of course, this example was rather simple since the steps in setting up the guy had no side-effects. If something went wrong during set-up, we could just abandon the guy and let the GC handle the cleanup.

    In the real world, things are a lot messier. Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     guy.Team.Add(guy);
     AddToLeague(guy);
     return guy;
    }
    

    This does the same thing as our corrected function, except that somebody decided that it would be more efficient if each team kept a list of members, so you have to add yourself to the team you intend to join. What consequences does this have on the function's correctness?
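
    As a hint at what changes, here is a C++ sketch of one possible repair, not the only one. Adding the guy to the team is now a side effect that outlives the guy, so if AddToLeague fails, somebody has to take him off the team again. The Guy, Team and League types, the failAddToLeague hook and the single-team ChooseRandomTeam are all hypothetical stand-ins for whatever the real program would have.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <memory>
    #include <stdexcept>
    #include <string>
    #include <utility>
    #include <vector>

    struct Team;

    struct Guy {
        std::string name;
        Team* team = nullptr;
        explicit Guy(std::string n) : name(std::move(n)) {}
    };

    struct Team {
        std::vector<Guy*> members;
        void Add(Guy* guy) { members.push_back(guy); }
        // Erasing a raw pointer from the vector does not throw.
        void Remove(Guy* guy) {
            members.erase(std::remove(members.begin(), members.end(), guy),
                          members.end());
        }
    };

    struct League {
        Team onlyTeam;  // a real league would have several teams
        std::vector<std::shared_ptr<Guy>> members;
        bool failAddToLeague = false;  // test hook: simulate a failure

        // Hypothetical: always picks the only team instead of a random one.
        Team* ChooseRandomTeam() { return &onlyTeam; }

        void AddToLeague(const std::shared_ptr<Guy>& guy) {
            if (failAddToLeague) throw std::runtime_error("league is full");
            members.push_back(guy);
        }

        std::shared_ptr<Guy> AddNewGuy(const std::string& name) {
            auto guy = std::make_shared<Guy>(name);
            guy->team = ChooseRandomTeam();
            guy->team->Add(guy.get());  // side effect that outlives the guy
            try {
                AddToLeague(guy);       // the commit
            } catch (...) {
                guy->team->Remove(guy.get());  // undo the side effect
                throw;
            }
            return guy;
        }
    };
    ```

    If Team::Add itself throws, nothing has been committed yet, so the partly-constructed guy is simply abandoned as before.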

  • The Old New Thing

    In order to demonstrate our superior intellect, we will now ask you a question you cannot answer.

    • 54 Comments

    During the development of Windows 95, a placeholder dialog was added with the title, "In order to demonstrate our superior intellect, we will now ask you a question you cannot answer." The dialog itself asked a technical question that you need a brain the size of a planet in order to answer. (Okay, your brain didn't need to be quite that big.)

    Of course, there was no intention of shipping Windows 95 with such a dialog. The dialog was there only until other infrastructure became available, permitting the system to answer the question automatically.

    But when I saw that dialog, I was enlightened. As programmers, we often find ourselves unsure what to do next, and we say, "Well, to play it safe, I'll just ask the user what they want to do. I'm sure they'll make the right decision."

    Except that they don't. The default answer to every dialog box is Cancel. If you ask the user a technical question, odds are they're just going to stare at it blankly for a while, then try to cancel out of it. The lesson they've learned is "Computers are hard to use."

    Even Eric Raymond has discovered this. (Don't forget to read his follow-up.)

    So don't ask questions the user can't answer. It doesn't get you anywhere and it just frustrates the user.

  • The Old New Thing

    Where does the taskbar get grouped button titles from?

    • 52 Comments

    If the "Group similar taskbar buttons" box is checked (the default) and space starts to get tight on the taskbar, then the taskbar will group together buttons representing windows from the same program and give them a common name. Where does this common name come from?

    The name for grouped taskbar buttons comes from the version resource of the underlying program. You can view this directly by viewing the properties of the executable program and looking on the Version tab.

    To set this property for your own programs, attach a version resource and set the name you want to display as the FileDescription property.

  • The Old New Thing

    Not all short filenames contain a tilde

    • 50 Comments

    I'm sure everybody has seen the autogenerated short names for long file names. For the long name "Long name for file.txt", you might get "LONGNA~1.TXT" or possibly "LO18C9~1.TXT" if there are a lot of collisions.

    What you may not know is that sometimes there is no tilde at all!

    Each filesystem decides how it wants to implement short filenames. Windows 95 uses the "~n" method exclusively. Windows NT adds the hexadecimal hash overflow technique. But some filesystems (like Novell) just truncate the name. "Long name for file.txt" on a Novell server will come out to just "LONGNAME.TXT".

    So don't assume that all short names contain tildes. They don't. This means no cheating by skipping the call to GetLongPathName when you don't see any tildes, since that optimization is invalid on Novell networks.
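
    To make the difference concrete, here is a small C++ sketch of the two naming styles. Both functions are hypothetical simplifications that merely reproduce the examples above; they are not the algorithms that Windows or a Novell server actually uses, and they ignore collisions, the hash variant and illegal characters.

    ```cpp
    #include <cassert>
    #include <cctype>
    #include <string>

    // Uppercase the text and drop spaces. Illustrative only.
    static std::string Squash(const std::string& s) {
        std::string out;
        for (unsigned char c : s)
            if (c != ' ') out += static_cast<char>(std::toupper(c));
        return out;
    }

    struct Parts { std::string base, ext; };

    // Split at the last dot and keep at most three extension characters.
    static Parts Split(const std::string& longName) {
        std::string::size_type dot = longName.rfind('.');
        if (dot == std::string::npos) return {Squash(longName), ""};
        return {Squash(longName.substr(0, dot)),
                Squash(longName.substr(dot + 1)).substr(0, 3)};
    }

    static std::string Join(const std::string& base, const std::string& ext) {
        return ext.empty() ? base : base + "." + ext;
    }

    // "~n" style: six characters of the base name, a tilde and a number.
    std::string TildeShortName(const std::string& longName, int n = 1) {
        Parts p = Split(longName);
        return Join(p.base.substr(0, 6) + "~" + std::to_string(n), p.ext);
    }

    // Truncation style: the first eight characters of the base name.
    std::string TruncatedShortName(const std::string& longName) {
        Parts p = Split(longName);
        return Join(p.base.substr(0, 8), p.ext);
    }

    // The tempting shortcut: "no tilde, so it must already be the long name".
    bool ContainsTilde(const std::string& name) {
        return name.find('~') != std::string::npos;
    }
    ```

    ContainsTilde returns false for the truncated name even though it is a short name, which is exactly why the shortcut can't be trusted.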

  • The Old New Thing

    Astonishingly, rules apply to everyone.

    • 48 Comments

    Spain's Crown Prince and his fiancée are outraged that they had to go through airport security in Miami.

    "The prince and his bodyguard felt they should not be subjected to the screening, but if they do not have an escort from the State Department or the Secret Service, it is required," she added. "It is the law."

    Apparently, the Prince did not give the standard 72 hours' notice to obtain pre-clearance. (Hm, I wonder if I can get pre-clearance by submitting my itinerary 72 hours in advance.)

    What bugs me even more is that the officials in Miami are all apologetic!

    Miami-Dade County Mayor Alex Penelas sent the royal family a letter of apology on the same day, calling the situation "lamentable".

    Reminds me of an article in the New York Times Magazine a few years ago titled Life is a Contact Sport [fee required], describing a mandatory meeting for all NFL rookies to introduce them to the "real world". My favorite part was this:

    Kendrell Bell, a Pittsburgh Steelers linebacker, tells of his great awakening to the verities of income tax: "I got a million-dollar signing bonus. But then I got the check, and it was only $624,000. I thought, Oh, well, I'll get the other half later. Then I found out that's all there was. I thought, They can't do this to me. Then I got on the Internet and I found out they can."

    Shocking! Football players have to pay income tax! Where will the injustice end?

    Happy Tax Day (US).

  • The Old New Thing

    Reference counting is hard.

    • 48 Comments

    One of the big advantages of managed code is that you don't have to worry about managing object lifetimes. Here's an example of some unmanaged code that tries to manage reference counts and doesn't quite get it right. Even a seemingly-simple function has a reference-counting bug.

    template <class T>
    T *SetObject(T **ppt, T *ptNew)
    {
     if (*ppt) (*ppt)->Release(); // Out with the old
     *ppt = ptNew; // In with the new
     if (ptNew) (ptNew)->AddRef();
     return ptNew;
    }
    

    The point of this function is to take a (pointer to a) variable that points to one object and replace it with a pointer to another object. This is a function that sits at the bottom of many "smart pointer" classes. Here's an example use:

    template <class T>
    class SmartPointer {
    public:
     SmartPointer(T* p = NULL)
       : m_p(NULL) { *this = p; }
     ~SmartPointer() { *this = NULL; }
     T* operator=(T* p)
       { return SetObject(&m_p, p); }
     operator T*() { return m_p; }
     T** operator&() { return &m_p; }
    private:
     T* m_p;
    };
    
    void Sample(IStream *pstm)
    {
      SmartPointer<IStream> spstm(pstm);
      SmartPointer<IStream> spstmT;
      if (SUCCEEDED(GetBetterStream(&spstmT))) {
       spstm = spstmT;
      }
      ...
    }
    

    Oh why am I explaining this? You know how smart pointers work.

    Okay, so the question is, what's the bug here?

    Stop reading here and don't read ahead until you've figured it out or you're stumped or you're just too lazy to think about it.


    The bug is that the old object is Release()d before the new object is AddRef()'d. Consider:

      SmartPointer<IStream> spstm;
      CreateStream(&spstm);
      spstm = spstm;
    

    This assignment statement looks harmless (albeit wasteful). But is it?

    The "smart pointer" is constructed with NULL, then CreateStream creates a stream and assigns it to the "smart pointer". The stream's reference count is now one. Now the assignment statement is executed, which turns into

     SetObject(&spstm.m_p, spstm.m_p);
    

    Inside the SetObject function, ppt points to spstm.m_p, and ptNew equals the original value of spstm.m_p.

    The first thing that SetObject does is release the old pointer, which now drops the reference count of the stream to zero. This destroys the stream object. Then the ptNew parameter (which now points to a freed object) is assigned to spstm.m_p, and finally the ptNew pointer (which still points to a freed object) is AddRef()d. Oops, we're invoking a method on an object that has been freed; no good can come of that.

    If you're lucky, the AddRef() call crashes brilliantly so you can debug the crash and see your error. If you're not lucky (and you're usually not lucky), the AddRef() call interprets the freed memory as if it were still valid and increments a reference count somewhere inside that block of memory. Congratulations, you've now corrupted memory. If that's not enough to induce a crash (at some unspecified point in the future), when the "smart pointer" goes out of scope or otherwise changes its referent, the invalid m_p pointer will be Release()d, corrupting memory yet another time.

    This is why "smart pointer" assignment functions must AddRef() the incoming pointer before Release()ing the old pointer.

    template <class T>
    T *SetObject(T **ppt, T *ptNew)
    {
     if (ptNew) (ptNew)->AddRef();
     if (*ppt) (*ppt)->Release();
     *ppt = ptNew;
     return ptNew;
    }
    

    If you look at the source code for the ATL function AtlComPtrAssign, you can see that it exactly matches the above (corrected) function.
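
    If you want to watch the difference without corrupting any memory, here is a small C++ sketch. FakeStream is a hypothetical stand-in for a reference-counted object: instead of deleting itself when the count reaches zero, it just records that it would have been destroyed and notes any method call that arrives afterward. The names SetObjectBuggy and SetObjectFixed are likewise made up for the comparison.

    ```cpp
    #include <cassert>

    // Hypothetical reference-counted object that never really frees itself,
    // so the demo can observe misuse safely.
    struct FakeStream {
        int refs = 1;                   // a freshly created object has one reference
        bool destroyed = false;         // set when the count drops to zero
        bool usedAfterDestroy = false;  // set if a method is called after that

        void AddRef() {
            if (destroyed) usedAfterDestroy = true;
            ++refs;
        }
        void Release() {
            if (destroyed) usedAfterDestroy = true;
            if (--refs == 0) destroyed = true;
        }
    };

    // The original ordering: release the old object first.
    template <class T>
    T* SetObjectBuggy(T** ppt, T* ptNew) {
        if (*ppt) (*ppt)->Release();
        *ppt = ptNew;
        if (ptNew) ptNew->AddRef();
        return ptNew;
    }

    // The corrected ordering: take the new reference first.
    template <class T>
    T* SetObjectFixed(T** ppt, T* ptNew) {
        if (ptNew) ptNew->AddRef();
        if (*ppt) (*ppt)->Release();
        *ppt = ptNew;
        return ptNew;
    }
    ```

    Assigning a pointer to itself through SetObjectBuggy leaves the object marked destroyed and used after destruction. Through SetObjectFixed the count goes up to two and back down to one, and nothing is destroyed.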

    [Raymond is currently on vacation; this message was pre-recorded.]

  • The Old New Thing

    Why can't the system hibernate just one process?

    • 48 Comments

    Windows lets you hibernate the entire machine, but why can't it hibernate just one process? Record the state of the process and then resume it later.

    Because there is state in the system that is not part of the process.

    For example, suppose your program has taken a mutex, and then it gets process-hibernated. Oops, now that mutex is abandoned and is now up for grabs. If that mutex was protecting some state, then when the process is resumed from hibernation, it thinks it still owns the mutex and the state should therefore be safe from tampering, only to find that it doesn't own the mutex any more and its state is corrupted.

    Imagine all the code that does something like this:

    // assume hmtx is a mutex handle that
    // protects some shared object G
    WaitForSingleObject(hmtx, INFINITE);
    // do stuff with G
    ...
    // do more stuff with G on the assumption that
    // G hasn't changed.
    ReleaseMutex(hmtx);
    

    Nobody expects that the mutex could secretly get released during the "..." (which is what would happen if the process got hibernated). That goes against everything mutexes stand for!

    Consider, as another example, the case where you have a file that was opened for exclusive access. The program will happily run on the assumption that nobody can modify the file except that program. But if you process-hibernate it, then some other process can now open the file (the exclusive owner is no longer around), tamper with it, then resume the original program. The original program on resumption will see a tampered-with file and may crash or (worse) be tricked into a security vulnerability.

    One alternative would be to keep all objects that belong to a process-hibernated program still open. Then you would have the problem of a file that can't be deleted because it is being held open by a program that isn't even running! (And indeed, for the resumption to be successful across a reboot, the file would have to be re-opened upon reboot. So now you have a file that can't be deleted even after a reboot because it's being held open by a program that isn't running. Think of the amazing denial-of-service you could launch against somebody: Create and hold open a 20GB file, then hibernate the process and then delete the hibernation file. Ha-ha, you just created a permanently undeletable 20GB file.)

    Now what if the hibernated program had created windows? Should the window handles still be valid while the program is hibernated? What happens if you send it a message? If the window handles should not remain valid, then what happens to broadcast messages? Are they "saved somewhere" to be replayed when the program is resumed? (And what if the broadcast message was something like "I am about to remove this USB hard drive, here is your last chance to flush your data"? The hibernated program wouldn't get a chance to flush its data. Result: Corrupted USB hard drive.)

    And imagine the havoc if you could take the hibernated process and copy it to another machine, and then attempt to restore it there.

    If you want some sort of "checkpoint / fast restore" functionality in your program, you'll have to write it yourself. Then you will have to deal explicitly with issues like the above. ("I want to open this file, but somebody deleted it in the meantime. What should I do?" Or "Okay, I'm about to create a checkpoint, I'd better purge all my buffers and mark all my cached data as invalid because the thing I'm caching might change while I'm in suspended animation.")
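
    Here is a rough C++ sketch of what the do-it-yourself approach might look like. Everything in it is hypothetical: the Checkpoint structure, its text format and the ResumeResult values are invented for illustration. The point is only that the program saves its own state and, on resume, checks whether the outside world still matches instead of assuming it does.

    ```cpp
    #include <cassert>
    #include <fstream>
    #include <istream>
    #include <ostream>
    #include <sstream>
    #include <string>

    // Hypothetical state: which data file we were reading and how far we got.
    struct Checkpoint {
        std::string dataPath;
        long long offset = 0;
    };

    void SaveCheckpoint(std::ostream& out, const Checkpoint& cp) {
        out << cp.offset << '\n' << cp.dataPath << '\n';
    }

    // Returns false and leaves cp untouched if the saved data is unreadable.
    bool LoadCheckpoint(std::istream& in, Checkpoint& cp) {
        Checkpoint loaded;
        if (!(in >> loaded.offset)) return false;
        in.ignore();  // skip the newline after the offset
        if (!std::getline(in, loaded.dataPath)) return false;
        cp = loaded;
        return true;
    }

    enum class ResumeResult { Resumed, DataFileMissing, BadCheckpoint };

    // On resume, nothing from before the checkpoint can be trusted blindly:
    // the data file may have been deleted (or changed) in the meantime.
    ResumeResult Resume(std::istream& in, Checkpoint& cp) {
        if (!LoadCheckpoint(in, cp)) return ResumeResult::BadCheckpoint;
        std::ifstream data(cp.dataPath, std::ios::binary);
        if (!data) return ResumeResult::DataFileMissing;
        return ResumeResult::Resumed;
    }
    ```

    A real program would also have to decide what to do when the file still exists but its contents have changed, which this sketch does not attempt.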

  • The Old New Thing

    Why doesn't C# have "const"?

    • 45 Comments

    I was going to write about why C# doesn't have "const", but Stan Lippman already discussed this in A Question of Const, so now I don't have to.

    (And another example of synchronicity: After I wrote up this item and tossed it into the queue, Eric Gunnerson took up the topic as well.)

  • The Old New Thing

    Using the echo command to remember what you were doing.

    • 43 Comments

    Sometimes you'll start typing a complex command line and then realize that you can't execute it now. Perhaps you're in the wrong directory or you forgot to map the drive. It would be a shame to hit ESC and lose that command line.

    What I do is edit the line to insert the word "echo" in front, then hit Enter. (Note: This doesn't work if you have command line redirection.)

    This prints the command to the screen, where it can be cut/pasted later. Even more, it enters the line into the command history, so a few uparrows will bring it back.

  • The Old New Thing

    Why a really large dictionary is not a good thing

    • 42 Comments

    Sometimes you'll see somebody brag about how many words are in their spell-checking dictionary. It turns out that having too many words in a spell checker's dictionary is worse than having too few.

    Suppose you had a spell checker whose dictionary contained every word in the Oxford English Dictionary. Then you hand it this sentence:

    Therf werre eyght bokes.

    That sentence would pass with flying colors, because all of the words in the above sentence are valid English words, though most people would be hard-pressed to provide definitions.

    The English language has so many words that if you included them all, then common typographical errors would often match (by coincidence) a valid English word and therefore not be detected by the spell checker. Which would go against the whole point of a spell checker: To catch spelling errors.

    So be glad that your spell checker doesn't have the largest dictionary possible. If it did, it would end up doing a worse job.
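
    The effect is easy to reproduce with a toy checker. In this C++ sketch the two word lists are tiny hypothetical stand-ins: one for an everyday dictionary, one for a dictionary that also contains the obscure words from the sentence above. The Misspelled function and both dictionary functions are made up for the illustration.

    ```cpp
    #include <cassert>
    #include <cctype>
    #include <set>
    #include <string>
    #include <vector>

    // Hypothetical everyday word list (absurdly small, for illustration).
    std::set<std::string> EverydayDictionary() {
        return {"there", "were", "eight", "books"};
    }

    // The same list plus the obscure words from the example sentence.
    std::set<std::string> ExhaustiveDictionary() {
        std::set<std::string> words = EverydayDictionary();
        for (const char* w : {"therf", "werre", "eyght", "bokes"}) words.insert(w);
        return words;
    }

    // Returns the lowercased words of text that are not in dict.
    std::vector<std::string> Misspelled(const std::string& text,
                                        const std::set<std::string>& dict) {
        std::vector<std::string> bad;
        std::string word;
        auto flush = [&] {
            if (!word.empty()) {
                if (dict.count(word) == 0) bad.push_back(word);
                word.clear();
            }
        };
        for (unsigned char c : text) {
            if (std::isalpha(c)) word += static_cast<char>(std::tolower(c));
            else flush();  // any non-letter ends the current word
        }
        flush();
        return bad;
    }
    ```

    With the everyday list, all four words of "Therf werre eyght bokes." are flagged. With the exhaustive list, none are, and the typos sail through.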

    After I wrote this article, I found a nice discussion of the subject of spell check dictionary size on the Wintertree Software web site.

    [Raymond is currently on vacation; this message was pre-recorded.]
