April, 2004

  • The Old New Thing

    In order to demonstrate our superior intellect, we will now ask you a question you cannot answer.

    • 54 Comments

    During the development of Windows 95, a placeholder dialog was added with the title, "In order to demonstrate our superior intellect, we will now ask you a question you cannot answer." The dialog itself asked a technical question that you need a brain the size of a planet in order to answer. (Okay, your brain didn't need to be quite that big.)

    Of course, there was no intention of shipping Windows 95 with such a dialog. The dialog was there only until other infrastructure became available, permitting the system to answer the question automatically.

    But when I saw that dialog, I was enlightened. As programmers, we often find ourselves unsure what to do next, and we say, "Well, to play it safe, I'll just ask the user what they want to do. I'm sure they'll make the right decision."

    Except that they don't. The default answer to every dialog box is Cancel. If you ask the user a technical question, odds are they they're just going to stare at it blankly for a while, then try to cancel out of it. The lesson they've learned is "Computers are hard to use."

    Even Eric Raymond has discovered this. (Don't forget to read his follow-up.)

    So don't ask questions the user can't answer. It doesn't get you anywhere and it just frustrates the user.

  • The Old New Thing

    Why can't I install Windows on my USB drive?

    • 35 Comments

    A collection of limitations (both hardware and software) currently prevent Windows from booting and running off a USB drive. Some of them are described in this whitepaper from WinHEC 2003. Another reason not mentioned in this paper is that during any hot-plug operation, the USB bus is completely reinitialized. Windows really doesn't like it when it loses access to its boot device. Imagine, you plug in a USB camera, the USB bus reinitializes, Windows loses access to the boot drive, and *oops* the kernel needs to page in some data and it can't.

    Rats.

    But who knows, someday maybe it will work.

  • The Old New Thing

    Cleaner, more elegant, and wrong

    • 84 Comments

    Just because you can't see the error path doesn't mean it doesn't exist.

    Here's a snippet from a book on C# programming, taken from the chapter on how great exceptions are.

    try {
      AccessDatabase accessDb = new AccessDatabase();
      accessDb.GenerateDatabase();
    } catch (Exception e) {
      // Inspect caught exception
    }
    
    public void GenerateDatabase()
    {
      CreatePhysicalDatabase();
      CreateTables();
      CreateIndexes();
    }
    
    Notice how much cleaner and more elegant [this] solution is.

    Cleaner, more elegant, and wrong.

    Suppose an exception is thrown during CreateIndexes(). The GenerateDatabase() function doesn't catch it, so the error is thrown back out to the caller, where it is caught.

    But when the exception left GenerateDatabase(), important information was lost: The state of the database creation. The code that catches the exception doesn't know which step in database creation failed. Does it need to delete the indexes? Does it need to delete the tables? Does it need to delete the physical database? It doesn't know.

    So if there is a problem creating CreateIndexes(), you leak a physical database file and a table forever. (Since these are presumably files on disk, they hang around indefinitely.)

    Writing correct code in the exception-throwing model is in a sense harder than in an error-code model, since anything can fail, and you have to be ready for it. In an error-code model, it's obvious when you have to check for errors: When you get an error code. In an exception model, you just have to know that errors can occur anywhere.

    In other words, in an error-code model, it is obvious when somebody failed to handle an error: They didn't check the error code. But in an exception-throwing model, it is not obvious from looking at the code whether somebody handled the error, since the error is not explicit.

    Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     AddToLeague(guy);
     guy.Team = ChooseRandomTeam();
     return guy;
    }
    

    This function creates a new Guy, adds him to the league, and assigns him to a team randomly. How can this be simpler?

    Remember: Every line is a possible error.

    What if an exception is thrown by "new Guy(name)"?

    Well, fortunately, we haven't yet started doing anything, so no harm done.

    What if an exception is thrown by "AddToLeague(guy)"?

    The "guy" we created will be abandoned, but the GC will clean that up.

    What if an exception is thrown by "guy.Team = ChooseRandomTeam()"?

    Uh-oh, now we're in trouble. We already added the guy to the league. If somebody catches this exception, they're going to find a guy in the league who doesn't belong to any team. If there's some code that walks through all the members of the league and uses the guy.Team member, they're going to take a NullReferenceException since guy.Team isn't initialized yet.

    When you're writing code, do you think about what the consequences of an exception would be if it were raised by each line of code? You have to do this if you intend to write correct code.

    Okay, so how to fix this? Reorder the operations.

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     AddToLeague(guy);
     return guy;
    }
    

    This seemingly insignificant change has a big effect on error recovery. By delaying the commitment of the data (adding the guy to the league), any exceptions taken during the construction of the guy do not have any lasting effect. All that happens is that a partly-constructed guy gets abandoned and eventually gets cleaned up by GC.

    General design principle: Don't commit data until they are ready.

    Of course, this example was rather simple since the steps in setting up the guy had no side-effects. If something went wrong during set-up, we could just abandon the guy and let the GC handle the cleanup.

    In the real world, things are a lot messier. Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     guy.Team.Add(guy);
     AddToLeague(guy);
     return guy;
    }
    

    This does the same thing as our corrected function, except that somebody decided that it would be more efficient if each team kept a list of members, so you have to add yourself to the team you intend to join. What consequences does this have on the function's correctness?

  • The Old New Thing

    How to retrieve text under the cursor (mouse pointer)

    • 23 Comments

    Microsoft Active Accessibilty is the technology that exposes information about objects on the screen to accessibility aids such as screen readers. But that doesn't mean that only screen readers can use it.

    Here's a program that illustrates the use of Active Accessibility at the most rudimentary level: Reading text. There's much more to Active Accessibility than this. You can navigate the objects on the screen, read various properties, even invoke commands on them, all programmatically.

    Start with our scratch program and change these two functions:

    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
      SetTimer(hwnd, 1, 1000, RecalcText);
      return TRUE;
    }
    
    void
    PaintContent(HWND hwnd, PAINTSTRUCT *pps)
    {
      if (g_pszText) {
          RECT rc;
          GetClientRect(hwnd, &rc);
          DrawText(pps->hdc, g_pszText, lstrlen(g_pszText),
                   &rc, DT_NOPREFIX | DT_WORDBREAK);
      }
    }
    

    Of course, the fun part is the function RecalcText, which retrieves the text from beneath the cursor:

    #include <oleacc.h>
    
    POINT g_pt;
    LPTSTR g_pszText;
    
    void CALLBACK RecalcText(HWND hwnd, UINT, UINT_PTR, DWORD)
    {
      POINT pt;
      if (GetCursorPos(&pt) &&
        (pt.x != g_pt.x || pt.y != g_pt.y)) {
        g_pt = pt;
        IAccessible *pacc;
        VARIANT vtChild;
        if (SUCCEEDED(AccessibleObjectFromPoint(pt, &pacc, &vtChild))) {
          BSTR bsName = NULL;
          BSTR bsValue = NULL;
          pacc->get_accName(vtChild, &bsName);
          pacc->get_accValue(vtChild, &bsValue);
          LPTSTR pszResult;
          DWORD_PTR args[2] = { (DWORD_PTR)(bsName ? bsName : L""),
                                (DWORD_PTR)(bsValue ? bsValue : L"") };
          if (FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                            FORMAT_MESSAGE_FROM_STRING |
                            FORMAT_MESSAGE_ARGUMENT_ARRAY,
                            TEXT("Name: %1!ws!\r\n\r\nValue: %2!ws!"),
                            0, 0, (LPTSTR)&pszResult, 0, (va_list*)args)) {
            LocalFree(g_pszText);
            g_pszText = pszResult;
            InvalidateRect(hwnd, NULL, TRUE);
          }
    
          SysFreeString(bsName);
          SysFreeString(bsValue);
          VariantClear(&vtChild);
          pacc->Release();
        }
      }
    }
    

    Let's take a look at this function. We start by grabbing the cursor position and seeing if it changed since the last time we checked. If so, then we ask AccessibleObjectFromPoint to identify the object at those coordinates and give us an IAccessible pointer plus a child identifier. These two pieces of information together represent the object under the cursor.

    Now it's a simple matter of asking for the name (get_accName) and value (get_accValue) of the object and format it nicely.

    Note that we handled the NULL case of the BSTR in accordance with Eric's Complete Guide to BSTR Semantics.

    For more information about accessibility, check out Sara Ford's WebLog, in particular the bit titled What is Assistive Technology Compatibility.

  • The Old New Thing

    Why doesn't C# have "const"?

    • 45 Comments

    I was going to write about why C# doesn't have "const", but Stan Lippman already discussed this in A Question of Const, so now I don't have to.

    (And another example of synchronicity: After I wrote up this item and tossed it into the queue, Eric Gunnerson took up the topic as well.

  • The Old New Thing

    What is __purecall?

    • 39 Comments

    Both C++ and C# have the concept of virtual functions. These are functions which always invoke the most heavily derived implementation, even if called from a pointer to the base class. However, the two languages differ on the semantics of virtual functions during object construction and destruction.

    C# objects exist as their final type before construction begins, whereas C++ objects change type during the construction process.

    Here's an example:

    class Base {
    public:
      Base() { f(); }
      virtual void f() { cout << 1; }
      void g() { f(); }
    };
    
    class Derived : public Base {
    public:
      Derived() { f(); }
      virtual void f() { cout << 2; }
    };
    

    When a Derived object is constructed, the object starts as a Base, then the Base::Base constructor is executed. Since the object is still a Base, the call to f() invokes Base::f and not Derived::f. After the Base::Base constructor completes, the object then becomes a Derived and the Derived::Derived constructor is run. This time, the call to f() invokes Derived::f.

    In other words, constructing a Derived object prints "12".

    Similar remarks apply to the destructor. The object is destructed in pieces, and a call to a virtual function invokes the function corresponding to the stage of destruction currently in progress.

    This is why some coding guidelines recommend against calling virtual functions from a constructor or destructor. Depending on what stage of construction/destruction is taking place, the same call to f() can have different effects. For example, the function Base::g() above will call Base::f if called from the Base::Base constructor or destructor, but will call Derived::f if called after the object has been constructed and before it is destructed.

    On the other hand, if this sample were written (with suitable syntactic changes) in C#, the output would be "22" because a C# object is created as its final type. Both calls to f() invoke Derived::f, since the object is always a Derived. Notice that means a method can be invoked on an object before its constructor has run. Something to bear in mind.

    Sometimes your C++ program may crash with the error "R6025 - pure virtual function call". This message comes from a function called __purecall. What does it mean?

    C++ and C# both have the concept of a "pure virtual function" (which C# calls "abstract"). This is a method which is declared by the base class, but for which no implementation is provided. In C++ the syntax for this is "=0":

    class Base {
    public:
      Base() { f(); }
      virtual void f() = 0;
    };
    

    If you attempt to create a Derived object, the base class will attempt to call Base::f, which does not exist since it is a pure virtual function. When this happens, the "pure virtual function call" error is raised and the program is terminated.

    Of course, the mistake is rarely as obvious as this. Typically, the call to the pure virtual function occurs deep inside the call stack of the constructor.

    This raises the side issue of the "novtable" optimization. As we noted above, the identity of the object changes during construction. This change of identity is performed by swapping the vtables around during construction. If you have a base class that is never instantiated directly but always via a derived class, and if you have followed the rules against calling virtual methods during construction, then you can use the novtable optimization to get rid of the vtable swapping during construction of the base class.

    If you use this optimization, then calling virtual methods during the base class's constructor or destructor will result in undefined behavior. It's a nice optimization, but it's your own responsibility to make sure you conform to its requirements.

    Sidebar: Why does C# not do type morphing during construction? One reason is that it would result in the possibility, given two objects A and B, that typeof(A) == typeof(B) yet sizeof(A) != sizeof(B). This would happen if A were a fully constructed object and B were a partially-constructed object on its way to becoming a derived object.

    Why is this so bad? Because the garbage collector is really keen on knowing the size of each object so it can know how much memory to free. It does this by checking the object's type. If an object's type did not completely determine its size, this would result in the garbage collector having to do extra work to figure out exactly how big the object is, which means extra code in the constructor and destructor, as well as space in the object, to keep track of which stage of construction/destruction is currently in progress. And all this for something most coding guidelines recommend against anyway.

  • The Old New Thing

    WM_KILLFOCUS is the wrong time to do field validation

    • 31 Comments

    "I'll do my field validation when I get a WM_KILLFOCUS message."

    This is wrong for multiple reasons.

    First, you may not get your focus loss message until it's too late.

    Consider a dialog box with an edit control and an OK button. The edit control validates its contents on receipt of the WM_KILLFOCUS message. Suppose the user fills in some invalid data.

    Under favorable circumstances, the user clicks the OK button. Clicking the OK button causes focus to move away from the edit control, so the edit control's WM_KILLFOCUS runs and gets a chance to tell the user that the field is no good. Since button clicks do not fire until the mouse is released while still over the button, invalid data will pop up a message box, which steals focus, and now the mouse-button-release doesn't go to the button control. Result: Error message and IDOK action does not execute.

    Now let's consider less favorable circumstances. Instead of clicking on the OK button, the user just presses Enter or types the keyboard accelerator for whatever button dismisses the dialog. The accelerator is converted by IsDialogMessage into a WM_COMMAND with the button control ID. Focus does not change.

    So now the IDOK (or whatever) handler runs and calls EndDialog() or performs whatever action the button represents. If the dialog exits, then focus will leave the edit control as part of dialog box destruction, and only then will the validation occur, but it's too late now. The dialog is already exiting.

    Alternatively, if the action in response to the button is not dialog termination but rather starting some other procedure, then it will do it based on the unvalidated data in the dialog box, which is likely not what you want. Only when that procedure moves focus (say, by displaying a progress dialog) will the edit control receive a WM_KILLFOCUS, at which time it is too late to do anything. The procedure (using the unvalidated data) is already under way.

    There is also a usability problem with validating on focus loss. Suppose the user starts typing data into the edit control, and then the user gets distracted. Maybe they need to open a piece of email that has the information they need. Maybe they got a phone call and need to look up something in their Contacts database. Maybe they went to the bathroom and the screen saver just kicked in. The user does not want a "Sorry, that partial information you entered is invalid" error dialog, because they aren't yet finished entering the data.

    I've told you all the places you shouldn't do validation but haven't said where you should.

    Do the validation when the users indicate that they are done with data entry and want to go on to the next step. For a simple dialog, this would mean performing validation when the OK or other action verb button is clicked. For a wizard, it would be when the Next button is clicked. For a tabbed dialog, it would be when the user tabs to a new page.

    (Warnings that do not change focus are permitted, like the balloon tip that apperas if you accidentally turn on Caps Lock while typing your password.)

  • The Old New Thing

    Where does the taskbar get grouped button titles from?

    • 52 Comments

    If the "Group similar taskbar buttons" box is checked (default) and space starts to get tight on the taskbar, then then the taskbar will group together buttons represending windows from the same program and give them a common name. Where does this common name come from?

    The name for grouped taskbar buttons comes from the version resource of the underlying program. You can view this directly by viewing the properties of the executable program and looking on the Version tab.

    To set this property for your own programs, attach a version resource and set the name you want to display as the FileDescription property.

  • The Old New Thing

    Why can't the system hibernate just one process?

    • 48 Comments

    Windows lets you hibernate the entire machine, but why can't it hibernate just one process? Record the state of the process and then resume it later.

    Because there is state in the system that is not part of the process.

    For example, suppose your program has taken a mutex, and then it gets process-hibernated. Oops, now that mutex is abandoned and is now up for grabs. If that mutex was protecting some state, then when the process is resumed from hibernation, it thinks it still owns the mutex and the state should therefore be safe from tampering, only to find that it doesn't own the mutex any more and its state is corrupted.

    Imagine all the code that does something like this:

    // assume hmtx is a mutex handle that
    // protects some shared object G
    WaitForSingleObject(hmtx, INFINITE);
    // do stuff with G
    ...
    // do more stuff with G on the assumption that
    // G hasn't changed.
    ReleaseMutex(hmtx);
    

    Nobody expects that the mutex could secretly get released during the "..." (which is what would happen if the process got hibernated). That goes against everything mutexes stand for!

    Consider, as another example, the case where you have a file that was opened for exclusive access. The program will happily run on the assumption that nobody can modify the file except that program. But if you process-hibernate it, then some other process can now open the file (the exclusive owner is no longer around), tamper with it, then resume the original program. The original program on resumption will see a tampered-with file and may crash or (worse) be tricked into a security vulnerability.

    One alternative would be to keep all objects that belong to a process-hibernated program still open. Then you would have the problem of a file that can't be deleted because it is being held open by a program that isn't even running! (And indeed, for the resumption to be successful across a reboot, the file would have to be re-opened upon reboot. So now you have a file that can't be deleted even after a reboot because it's being held open by a program that isn't running. Think of the amazing denial-of-service you could launch against somebody: Create and hold open a 20GB file, then hibernate the process and then delete the hibernation file. Ha-ha, you just created a permanently undeletable 20GB file.)

    Now what if the hibernated program had created windows. Should the window handles still be valid while the program is hibernated? What happens if you send it a message? If the window handles should not remain valid, then what happens to broadcast messages? Are they "saved somewhere" to be replayed when the program is resumed? (And what if the broadcast message was something like "I am about to remove this USB hard drive, here is your last chance to flush your data"? The hibernated program wouldn't get a chance to flush its data. Result: Corrupted USB hard drive.)

    And imagine the havoc if you could take the hibernated process and copy it to another machine, and then attempt to restore it there.

    If you want some sort of "checkpoint / fast restore" functionality in your program, you'll have to write it yourself. Then you will have to deal explicitly with issues like the above. ("I want to open this file, but somebody deleted it in the meantime. What should I do?" Or "Okay, I'm about to create a checkpoint, I'd better purge all my buffers and mark all my cached data as invalid because the thing I'm caching might change while I'm in suspended animation.")

  • The Old New Thing

    Not all short filenames contain a tilde

    • 50 Comments

    I'm sure everybody has seen the autogenerated short names for long file names. For the long name "Long name for file.txt", you might get "LONGNA~1.TXT" or possibly "LO18C9~1.TXT" if there are a lot of collisions.

    What you may not know is that sometimes there is no tilde at all!

    Each filesystem decides how it wants to implement short filenames. Windows 95 uses the "~n" method exclusively. Windows NT adds the hexadecimal hash overflow technique. But some filesystems (like Novell) just truncate the name. "Long name for file.txt" on a Novell server will come out to just "LONGNAME.TXT".

    So don't assume that all short names contain tildes. They don't. This means no cheating on skipping a call to GetLongFileName if you don't see any tildes, since your optimization is invalid on Novell networks.

Page 1 of 4 (36 items) 1234