April, 2004

  • The Old New Thing

    In order to demonstrate our superior intellect, we will now ask you a question you cannot answer.


    During the development of Windows 95, a placeholder dialog was added with the title, "In order to demonstrate our superior intellect, we will now ask you a question you cannot answer." The dialog itself asked a technical question that you need a brain the size of a planet in order to answer. (Okay, your brain didn't need to be quite that big.)

    Of course, there was no intention of shipping Windows 95 with such a dialog. The dialog was there only until other infrastructure became available, permitting the system to answer the question automatically.

    But when I saw that dialog, I was enlightened. As programmers, we often find ourselves unsure what to do next, and we say, "Well, to play it safe, I'll just ask the user what they want to do. I'm sure they'll make the right decision."

    Except that they don't. The default answer to every dialog box is Cancel. If you ask the user a technical question, odds are they're just going to stare at it blankly for a while, then try to cancel out of it. The lesson they've learned is "Computers are hard to use."

    Even Eric Raymond has discovered this. (Don't forget to read his follow-up.)

    So don't ask questions the user can't answer. It doesn't get you anywhere and it just frustrates the user.

  • The Old New Thing

    Why can't I install Windows on my USB drive?


    A collection of limitations (both hardware and software) currently prevents Windows from booting and running off a USB drive. Some of them are described in this whitepaper from WinHEC 2003. Another reason not mentioned in this paper is that during any hot-plug operation, the USB bus is completely reinitialized. Windows really doesn't like it when it loses access to its boot device. Imagine, you plug in a USB camera, the USB bus reinitializes, Windows loses access to the boot drive, and *oops* the kernel needs to page in some data and it can't.


    But who knows, someday maybe it will work.

  • The Old New Thing

    Cleaner, more elegant, and wrong


    Just because you can't see the error path doesn't mean it doesn't exist.

    Here's a snippet from a book on C# programming, taken from the chapter on how great exceptions are.

    try {
      AccessDatabase accessDb = new AccessDatabase();
      accessDb.GenerateDatabase();
    } catch (Exception e) {
      // Inspect caught exception
    }

    public void GenerateDatabase()
    {
      CreatePhysicalDatabase();
      CreateTables();
      CreateIndexes();
    }

    Notice how much cleaner and more elegant [this] solution is.

    Cleaner, more elegant, and wrong.

    Suppose an exception is thrown during CreateIndexes(). The GenerateDatabase() function doesn't catch it, so the error is thrown back out to the caller, where it is caught.

    But when the exception left GenerateDatabase(), important information was lost: The state of the database creation. The code that catches the exception doesn't know which step in database creation failed. Does it need to delete the indexes? Does it need to delete the tables? Does it need to delete the physical database? It doesn't know.

    So if there is a problem in CreateIndexes(), you leak a physical database file and a table forever. (Since these are presumably files on disk, they hang around indefinitely.)
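
    One way to keep that state from getting lost is to have the function remember which steps completed and undo them before letting the exception escape. Here is a minimal C++ sketch of the idea. FakeDatabase, the log, the failIndexes switch, and every step and delete function other than CreateIndexes() are hypothetical stand-ins invented for illustration; none of them come from the book or from a real library.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a database: each step only records what it did.
struct FakeDatabase {
    std::vector<std::string> log;
    bool failIndexes = false;  // hypothetical switch to simulate a failure

    void CreatePhysicalDatabase() { log.push_back("create db"); }
    void CreateTables()           { log.push_back("create tables"); }
    void CreateIndexes() {
        if (failIndexes) throw std::runtime_error("index creation failed");
        log.push_back("create indexes");
    }
    void DeleteTables()           { log.push_back("delete tables"); }
    void DeletePhysicalDatabase() { log.push_back("delete db"); }

    void GenerateDatabase() {
        // Each completed step registers how to undo itself.
        std::vector<std::function<void()>> undo;
        undo.reserve(2);
        try {
            CreatePhysicalDatabase();
            undo.push_back([this] { DeletePhysicalDatabase(); });
            CreateTables();
            undo.push_back([this] { DeleteTables(); });
            CreateIndexes();
        } catch (...) {
            // Undo the completed steps in reverse order, then rethrow
            // so the caller still sees the original error.
            for (auto it = undo.rbegin(); it != undo.rend(); ++it) (*it)();
            throw;
        }
    }
};
```

    The caller still catches the exception, but by the time it does, the tables and the physical database have already been cleaned up. Note how much of the function is now error handling that the "cleaner" version simply left out.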

    Writing correct code in the exception-throwing model is in a sense harder than in an error-code model, since anything can fail, and you have to be ready for it. In an error-code model, it's obvious when you have to check for errors: When you get an error code. In an exception model, you just have to know that errors can occur anywhere.

    In other words, in an error-code model, it is obvious when somebody failed to handle an error: They didn't check the error code. But in an exception-throwing model, it is not obvious from looking at the code whether somebody handled the error, since the error is not explicit.

    Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     AddToLeague(guy);
     guy.Team = ChooseRandomTeam();
     return guy;
    }

    This function creates a new Guy, adds him to the league, and assigns him to a team randomly. How can this be simpler?

    Remember: Every line is a possible error.

    What if an exception is thrown by "new Guy(name)"?

    Well, fortunately, we haven't yet started doing anything, so no harm done.

    What if an exception is thrown by "AddToLeague(guy)"?

    The "guy" we created will be abandoned, but the GC will clean that up.

    What if an exception is thrown by "guy.Team = ChooseRandomTeam()"?

    Uh-oh, now we're in trouble. We already added the guy to the league. If somebody catches this exception, they're going to find a guy in the league who doesn't belong to any team. If there's some code that walks through all the members of the league and uses the guy.Team member, they're going to take a NullReferenceException since guy.Team isn't initialized yet.

    When you're writing code, do you think about what the consequences of an exception would be if it were raised by each line of code? You have to do this if you intend to write correct code.

    Okay, so how to fix this? Reorder the operations.

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     AddToLeague(guy);
     return guy;
    }

    This seemingly insignificant change has a big effect on error recovery. By delaying the commitment of the data (adding the guy to the league), any exceptions taken during the construction of the guy do not have any lasting effect. All that happens is that a partly-constructed guy gets abandoned and eventually gets cleaned up by GC.

    General design principle: Don't commit data until they are ready.

    Of course, this example was rather simple since the steps in setting up the guy had no side-effects. If something went wrong during set-up, we could just abandon the guy and let the GC handle the cleanup.
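
    The same "commit last" ordering can be sketched in C++. This is only an illustration under assumptions: the League class, its size() accessor, and the failTeamChoice switch are hypothetical names made up here so that a failure can be forced on demand, and a shared_ptr stands in for the garbage collector.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

struct Guy {
    std::string name;
    int team = -1;  // -1 means "not on a team yet"
};

// Hypothetical league used only to illustrate the ordering.
class League {
public:
    bool failTeamChoice = false;  // hypothetical switch to simulate a failure

    std::size_t size() const { return members_.size(); }

    std::shared_ptr<Guy> AddNewGuy(const std::string& name) {
        auto guy = std::make_shared<Guy>();
        guy->name = name;
        guy->team = ChooseRandomTeam();  // may throw; nothing committed yet
        members_.push_back(guy);         // commit only once the guy is ready
        return guy;
    }

private:
    int ChooseRandomTeam() const {
        if (failTeamChoice) throw std::runtime_error("no team available");
        return 0;  // a fixed team keeps the sketch deterministic
    }

    std::vector<std::shared_ptr<Guy>> members_;
};
```

    If ChooseRandomTeam() throws, the half-built guy is released when the shared_ptr goes out of scope and the league never sees him.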

    In the real world, things are a lot messier. Consider the following:

    Guy AddNewGuy(string name)
    {
     Guy guy = new Guy(name);
     guy.Team = ChooseRandomTeam();
     guy.Team.Add(guy);
     AddToLeague(guy);
     return guy;
    }

    This does the same thing as our corrected function, except that somebody decided that it would be more efficient if each team kept a list of members, so you have to add yourself to the team you intend to join. What consequences does this have on the function's correctness?

  • The Old New Thing

    How to retrieve text under the cursor (mouse pointer)


    Microsoft Active Accessibility is the technology that exposes information about objects on the screen to accessibility aids such as screen readers. But that doesn't mean that only screen readers can use it.

    Here's a program that illustrates the use of Active Accessibility at the most rudimentary level: Reading text. There's much more to Active Accessibility than this. You can navigate the objects on the screen, read various properties, even invoke commands on them, all programmatically.

    Start with our scratch program and change these two functions:

    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
      // Poll the cursor position once per second
      SetTimer(hwnd, 1, 1000, RecalcText);
      return TRUE;
    }

    void
    PaintContent(HWND hwnd, PAINTSTRUCT *pps)
    {
      if (g_pszText) {
          RECT rc;
          GetClientRect(hwnd, &rc);
          DrawText(pps->hdc, g_pszText, lstrlen(g_pszText),
                   &rc, DT_NOPREFIX | DT_WORDBREAK);
      }
    }

    Of course, the fun part is the function RecalcText, which retrieves the text from beneath the cursor:

    #include <oleacc.h>

    POINT g_pt;
    LPTSTR g_pszText;

    void CALLBACK RecalcText(HWND hwnd, UINT, UINT_PTR, DWORD)
    {
      POINT pt;
      if (GetCursorPos(&pt) &&
        (pt.x != g_pt.x || pt.y != g_pt.y)) {
        g_pt = pt;
        IAccessible *pacc;
        VARIANT vtChild;
        if (SUCCEEDED(AccessibleObjectFromPoint(pt, &pacc, &vtChild))) {
          BSTR bsName = NULL;
          BSTR bsValue = NULL;
          pacc->get_accName(vtChild, &bsName);
          pacc->get_accValue(vtChild, &bsValue);
          LPTSTR pszResult;
          DWORD_PTR args[2] = { (DWORD_PTR)(bsName ? bsName : L""),
                                (DWORD_PTR)(bsValue ? bsValue : L"") };
          // args is an array, so FORMAT_MESSAGE_ARGUMENT_ARRAY is required
          if (FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                            FORMAT_MESSAGE_FROM_STRING |
                            FORMAT_MESSAGE_ARGUMENT_ARRAY,
                            TEXT("Name: %1!ws!\r\n\r\nValue: %2!ws!"),
                            0, 0, (LPTSTR)&pszResult, 0, (va_list*)args)) {
            LocalFree(g_pszText);  // release the previous text
            g_pszText = pszResult;
            InvalidateRect(hwnd, NULL, TRUE);
          }
          // Release everything the accessibility calls handed us
          SysFreeString(bsName);
          SysFreeString(bsValue);
          VariantClear(&vtChild);
          pacc->Release();
        }
      }
    }

    Let's take a look at this function. We start by grabbing the cursor position and seeing if it changed since the last time we checked. If so, then we ask AccessibleObjectFromPoint to identify the object at those coordinates and give us an IAccessible pointer plus a child identifier. These two pieces of information together represent the object under the cursor.

    Now it's a simple matter of asking for the name (get_accName) and value (get_accValue) of the object and formatting them nicely.

    Note that we handled the NULL case of the BSTR in accordance with Eric's Complete Guide to BSTR Semantics.

    For more information about accessibility, check out Sara Ford's WebLog, in particular the bit titled What is Assistive Technology Compatibility.

  • The Old New Thing

    Why doesn't C# have "const"?


    I was going to write about why C# doesn't have "const", but Stan Lippman already discussed this in A Question of Const, so now I don't have to.

    (And another example of synchronicity: After I wrote up this item and tossed it into the queue, Eric Gunnerson took up the topic as well.)

  • The Old New Thing

    What is __purecall?


    Both C++ and C# have the concept of virtual functions. These are functions which always invoke the most heavily derived implementation, even if called from a pointer to the base class. However, the two languages differ on the semantics of virtual functions during object construction and destruction.

    C# objects exist as their final type before construction begins, whereas C++ objects change type during the construction process.

    Here's an example:

    class Base {
    public:
      Base() { f(); }
      virtual void f() { cout << 1; }
      void g() { f(); }
    };

    class Derived : public Base {
    public:
      Derived() { f(); }
      virtual void f() { cout << 2; }
    };

    When a Derived object is constructed, the object starts as a Base, then the Base::Base constructor is executed. Since the object is still a Base, the call to f() invokes Base::f and not Derived::f. After the Base::Base constructor completes, the object then becomes a Derived and the Derived::Derived constructor is run. This time, the call to f() invokes Derived::f.

    In other words, constructing a Derived object prints "12".

    Similar remarks apply to the destructor. The object is destructed in pieces, and a call to a virtual function invokes the function corresponding to the stage of destruction currently in progress.

    This is why some coding guidelines recommend against calling virtual functions from a constructor or destructor. Depending on what stage of construction/destruction is taking place, the same call to f() can have different effects. For example, the function Base::g() above will call Base::f if called from the Base::Base constructor or destructor, but will call Derived::f if called after the object has been constructed and before it is destructed.
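
    The construction-order behavior described above can be checked with a small self-contained variant of the sample. As an assumption made purely so the result can be inspected, this sketch appends to a global string named g_log (a hypothetical name) instead of writing to cout.

```cpp
#include <string>

std::string g_log;  // hypothetical trace buffer standing in for cout

class Base {
public:
    Base() { f(); }                     // object is still a Base here
    virtual ~Base() = default;
    virtual void f() { g_log += '1'; }
    void g() { f(); }                   // dispatches on the current type
};

class Derived : public Base {
public:
    Derived() { f(); }                  // object has become a Derived
    void f() override { g_log += '2'; }
};
```

    Constructing a Derived appends "1" from the Base constructor and then "2" from the Derived constructor. Calling g() on the finished object appends another "2", since by then the object is a full Derived.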

    On the other hand, if this sample were written (with suitable syntactic changes) in C#, the output would be "22" because a C# object is created as its final type. Both calls to f() invoke Derived::f, since the object is always a Derived. Notice that this means a method can be invoked on an object before its constructor has run. Something to bear in mind.

    Sometimes your C++ program may crash with the error "R6025 - pure virtual function call". This message comes from a function called __purecall. What does it mean?

    C++ and C# both have the concept of a "pure virtual function" (which C# calls "abstract"). This is a method which is declared by the base class, but for which no implementation is provided. In C++ the syntax for this is "=0":

    class Base {
    public:
      Base() { f(); }
      virtual void f() = 0;
    };

    If you attempt to create a Derived object, the base class will attempt to call Base::f, which does not exist since it is a pure virtual function. When this happens, the "pure virtual function call" error is raised and the program is terminated.

    Of course, the mistake is rarely as obvious as this. Typically, the call to the pure virtual function occurs deep inside the call stack of the constructor.

    This raises the side issue of the "novtable" optimization. As we noted above, the identity of the object changes during construction. This change of identity is performed by swapping the vtables around during construction. If you have a base class that is never instantiated directly but always via a derived class, and if you have followed the rules against calling virtual methods during construction, then you can use the novtable optimization to get rid of the vtable swapping during construction of the base class.

    If you use this optimization, then calling virtual methods during the base class's constructor or destructor will result in undefined behavior. It's a nice optimization, but it's your own responsibility to make sure you conform to its requirements.

    Sidebar: Why does C# not do type morphing during construction? One reason is that it would result in the possibility, given two objects A and B, that typeof(A) == typeof(B) yet sizeof(A) != sizeof(B). This would happen if A were a fully constructed object and B were a partially-constructed object on its way to becoming a derived object.

    Why is this so bad? Because the garbage collector is really keen on knowing the size of each object so it can know how much memory to free. It does this by checking the object's type. If an object's type did not completely determine its size, this would result in the garbage collector having to do extra work to figure out exactly how big the object is, which means extra code in the constructor and destructor, as well as space in the object, to keep track of which stage of construction/destruction is currently in progress. And all this for something most coding guidelines recommend against anyway.

  • The Old New Thing

    WM_KILLFOCUS is the wrong time to do field validation


    "I'll do my field validation when I get a WM_KILLFOCUS message."

    This is wrong for multiple reasons.

    First, you may not get your focus loss message until it's too late.

    Consider a dialog box with an edit control and an OK button. The edit control validates its contents on receipt of the WM_KILLFOCUS message. Suppose the user fills in some invalid data.

    Under favorable circumstances, the user clicks the OK button. Clicking the OK button causes focus to move away from the edit control, so the edit control's WM_KILLFOCUS runs and gets a chance to tell the user that the field is no good. Since button clicks do not fire until the mouse is released while still over the button, invalid data will pop up a message box, which steals focus, and now the mouse-button-release doesn't go to the button control. Result: Error message and IDOK action does not execute.

    Now let's consider less favorable circumstances. Instead of clicking on the OK button, the user just presses Enter or types the keyboard accelerator for whatever button dismisses the dialog. The accelerator is converted by IsDialogMessage into a WM_COMMAND with the button control ID. Focus does not change.

    So now the IDOK (or whatever) handler runs and calls EndDialog() or performs whatever action the button represents. If the dialog exits, then focus will leave the edit control as part of dialog box destruction, and only then will the validation occur, but it's too late now. The dialog is already exiting.

    Alternatively, if the action in response to the button is not dialog termination but rather starting some other procedure, then it will do it based on the unvalidated data in the dialog box, which is likely not what you want. Only when that procedure moves focus (say, by displaying a progress dialog) will the edit control receive a WM_KILLFOCUS, at which time it is too late to do anything. The procedure (using the unvalidated data) is already under way.

    There is also a usability problem with validating on focus loss. Suppose the user starts typing data into the edit control, and then the user gets distracted. Maybe they need to open a piece of email that has the information they need. Maybe they got a phone call and need to look up something in their Contacts database. Maybe they went to the bathroom and the screen saver just kicked in. The user does not want a "Sorry, that partial information you entered is invalid" error dialog, because they aren't yet finished entering the data.

    I've told you all the places you shouldn't do validation but haven't said where you should.

    Do the validation when the users indicate that they are done with data entry and want to go on to the next step. For a simple dialog, this would mean performing validation when the OK or other action verb button is clicked. For a wizard, it would be when the Next button is clicked. For a tabbed dialog, it would be when the user tabs to a new page.

    (Warnings that do not change focus are permitted, like the balloon tip that appears if you accidentally turn on Caps Lock while typing your password.)

  • The Old New Thing

    Why can't the system hibernate just one process?


    Windows lets you hibernate the entire machine, but why can't it hibernate just one process? Record the state of the process and then resume it later.

    Because there is state in the system that is not part of the process.

    For example, suppose your program has taken a mutex, and then it gets process-hibernated. Oops, now that mutex is abandoned and is now up for grabs. If that mutex was protecting some state, then when the process is resumed from hibernation, it thinks it still owns the mutex and the state should therefore be safe from tampering, only to find that it doesn't own the mutex any more and its state is corrupted.

    Imagine all the code that does something like this:

    // assume hmtx is a mutex handle that
    // protects some shared object G
    WaitForSingleObject(hmtx, INFINITE);
    // do stuff with G
    ...
    // do more stuff with G on the assumption that
    // G hasn't changed.
    ReleaseMutex(hmtx);

    Nobody expects that the mutex could secretly get released during the "..." (which is what would happen if the process got hibernated). That goes against everything mutexes stand for!

    Consider, as another example, the case where you have a file that was opened for exclusive access. The program will happily run on the assumption that nobody can modify the file except that program. But if you process-hibernate it, then some other process can now open the file (the exclusive owner is no longer around), tamper with it, then resume the original program. The original program on resumption will see a tampered-with file and may crash or (worse) be tricked into a security vulnerability.

    One alternative would be to keep all objects that belong to a process-hibernated program still open. Then you would have the problem of a file that can't be deleted because it is being held open by a program that isn't even running! (And indeed, for the resumption to be successful across a reboot, the file would have to be re-opened upon reboot. So now you have a file that can't be deleted even after a reboot because it's being held open by a program that isn't running. Think of the amazing denial-of-service you could launch against somebody: Create and hold open a 20GB file, then hibernate the process and then delete the hibernation file. Ha-ha, you just created a permanently undeletable 20GB file.)

    Now what if the hibernated program had created windows. Should the window handles still be valid while the program is hibernated? What happens if you send it a message? If the window handles should not remain valid, then what happens to broadcast messages? Are they "saved somewhere" to be replayed when the program is resumed? (And what if the broadcast message was something like "I am about to remove this USB hard drive, here is your last chance to flush your data"? The hibernated program wouldn't get a chance to flush its data. Result: Corrupted USB hard drive.)

    And imagine the havoc if you could take the hibernated process and copy it to another machine, and then attempt to restore it there.

    If you want some sort of "checkpoint / fast restore" functionality in your program, you'll have to write it yourself. Then you will have to deal explicitly with issues like the above. ("I want to open this file, but somebody deleted it in the meantime. What should I do?" Or "Okay, I'm about to create a checkpoint, I'd better purge all my buffers and mark all my cached data as invalid because the thing I'm caching might change while I'm in suspended animation.")

  • The Old New Thing

    Where does the taskbar get grouped button titles from?


    If the "Group similar taskbar buttons" box is checked (default) and space starts to get tight on the taskbar, then the taskbar will group together buttons representing windows from the same program and give them a common name. Where does this common name come from?

    The name for grouped taskbar buttons comes from the version resource of the underlying program. You can view this directly by viewing the properties of the executable program and looking on the Version tab.

    To set this property for your own programs, attach a version resource and set the name you want to display as the FileDescription property.

  • The Old New Thing

    Reference counting is hard.


    One of the big advantages of managed code is that you don't have to worry about managing object lifetimes. Here's an example of some unmanaged code that tries to manage reference counts and doesn't quite get it right. Even a seemingly-simple function has a reference-counting bug.

    template <class T>
    T *SetObject(T **ppt, T *ptNew)
    {
     if (*ppt) (*ppt)->Release(); // Out with the old
     *ppt = ptNew; // In with the new
     if (ptNew) (ptNew)->AddRef();
     return ptNew;
    }

    The point of this function is to take a (pointer to) a variable that points to one object and replace it with a pointer to another object. This is a function that sits at the bottom of many "smart pointer" classes. Here's an example use:

    template <class T>
    class SmartPointer {
    public:
     SmartPointer(T* p = NULL)
       : m_p(NULL) { *this = p; }
     ~SmartPointer() { *this = NULL; }
     T* operator=(T* p)
       { return SetObject(&m_p, p); }
     operator T*() { return m_p; }
     T** operator&() { return &m_p; }
    private:
     T* m_p;
    };

    void Sample(IStream *pstm)
    {
      SmartPointer<IStream> spstm(pstm);
      SmartPointer<IStream> spstmT;
      if (SUCCEEDED(GetBetterStream(&spstmT))) {
       spstm = spstmT;
      }
    }

    Oh why am I explaining this? You know how smart pointers work.

    Okay, so the question is, what's the bug here?

    Stop reading here and don't read ahead until you've figured it out or you're stumped or you're just too lazy to think about it.

    The bug is that the old object is Release()d before the new object is AddRef()'d. Consider:

      SmartPointer<IStream> spstm;
      CreateStream(&spstm);
      spstm = spstm;

    This assignment statement looks harmless (albeit wasteful). But is it?

    The "smart pointer" is constructed with NULL, then the CreateStream creates a stream and assigns it to the "smart pointer". The stream's reference count is now one. Now the assignment statement is executed, which turns into

     SetObject(&spstm.m_p, spstm.m_p);

    Inside the SetObject function, ppt points to spstm.m_p, and ptNew equals the original value of spstm.m_p.

    The first thing that SetObject does is release the old pointer, which now drops the reference count of the stream to zero. This destroys the stream object. Then the ptNew parameter (which now points to a freed object) is assigned to spstm.m_p, and finally the ptNew pointer (which still points to a freed object) is AddRef()d. Oops, we're invoking a method on an object that has been freed; no good can come of that.

    If you're lucky, the AddRef() call crashes brilliantly so you can debug the crash and see your error. If you're not lucky (and you're usually not lucky), the AddRef() call interprets the freed memory as if it were still valid and increments a reference count somewhere inside that block of memory. Congratulations, you've now corrupted memory. If that's not enough to induce a crash (at some unspecified point in the future), when the "smart pointer" goes out of scope or otherwise changes its referent, the invalid m_p pointer will be Release()d, corrupting memory yet another time.

    This is why "smart pointer" assignment functions must AddRef() the incoming pointer before Release()ing the old pointer.

    template <class T>
    T *SetObject(T **ppt, T *ptNew)
    {
     if (ptNew) (ptNew)->AddRef();
     if (*ppt) (*ppt)->Release();
     *ppt = ptNew;
     return ptNew;
    }

    If you look at the source code for the ATL function AtlComPtrAssign, you can see that it exactly matches the above (corrected) function.
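
    To see the corrected ordering survive self-assignment without needing COM, here is a minimal sketch using a toy reference-counted class. Counted, its refs() accessor, the destroyed flag, and the name SetObjectSafe are all hypothetical and exist only for this illustration; the function body follows the AddRef-then-Release order of the corrected version above.

```cpp
// Hypothetical reference-counted object that reports its own destruction.
class Counted {
public:
    explicit Counted(bool* destroyed) : m_destroyed(destroyed) {}
    void AddRef() { ++m_refs; }
    void Release() {
        if (--m_refs == 0) {
            *m_destroyed = true;  // tell the outside world we are gone
            delete this;
        }
    }
    long refs() const { return m_refs; }

private:
    ~Counted() = default;  // only Release() may destroy the object
    long m_refs = 1;       // the creator holds the first reference
    bool* m_destroyed;
};

// Same order as the corrected function: AddRef the new, then Release the old.
template <class T>
T* SetObjectSafe(T** ppt, T* ptNew) {
    if (ptNew) ptNew->AddRef();
    if (*ppt) (*ppt)->Release();
    *ppt = ptNew;
    return ptNew;
}
```

    Assigning a pointer to itself bumps the count to two and back to one, so the object is never destroyed out from under you. Assigning a null pointer afterward drops the last reference and destroys it, which is the only time it should go away.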

    [Raymond is currently on vacation; this message was pre-recorded.]
