January, 2005

  • The Old New Thing

    Taskbar notification balloon tips don't penalize you for being away from the keyboard


    The Shell_NotifyIcon function is used to do various things, among them, displaying a balloon tip to the user. As discussed in the documentation for the NOTIFYICONDATA structure, the uTimeout member specifies how long the balloon should be displayed.

    But what if the user is not at the computer when you display your balloon? After 30 seconds, the balloon will time out, and the user will have missed your important message!

    Never fear. The taskbar keeps track of whether the user is using the computer (with the help of the GetLastInputInfo function) and doesn't "run the clock" if it appears that the user isn't there. You will get your 30 seconds of "face time" with the user.

    But what if you want to time out your message even if the user isn't there?

    You actually have the information available to you to solve this puzzle on the web pages I linked above. See if you can put the pieces together and come up with a better solution than simulating a click on the balloon. (Hint: Look carefully at what it means if you set your balloon text to an empty string.)

    And what if you want your message to stay on the screen longer than 30 seconds?

    You can't. The notification area enforces a 30-second limit for any single balloon. If the user hasn't done anything about it after 30 seconds, they probably aren't interested. If your message is so critical that the user shouldn't be allowed to ignore it, then don't use a notification balloon. Notification balloons are for non-critical, transient messages to the user.

  • The Old New Thing

    How can code that tries to prevent a buffer overflow end up causing one?


    If you read your language specification, you'll find that the ...ncpy functions have extremely strange semantics.

    The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.

    In pictures, here's what happens in various string copying scenarios.

    strncpy(strDest, strSrc, 5)
     strSrc:  W e l c o m e \0
     strDest: W e l c o
     (observe: truncated, and no null terminator)

    strncpy(strDest, strSrc, 5)
     strSrc:  H e l l o \0
     strDest: H e l l o
     (observe: no null terminator)

    strncpy(strDest, strSrc, 5)
     strSrc:  H i \0
     strDest: H i \0 \0 \0
     (observe: null padding to end of strDest)

    Why do these functions have such strange behavior?

    Go back to the early days of UNIX. Personally, I only go back as far as System V. In System V, file names could be up to 14 characters long. Anything longer was truncated to 14. And the field for storing the file name was exactly 14 characters. Not 15. The null terminator was implied. This saved one byte.

    Here are some file names and their corresponding directory entries:

    passwd                 →  p a s s w d \0 \0 \0 \0 \0 \0 \0 \0
    newsgroups.old         →  n e w s g r o u p s . o l d
    newsgroups.old.backup  →  n e w s g r o u p s . o l d

    Notice that newsgroups.old and newsgroups.old.backup produce identical directory entries, due to truncation. The too-long name was silently truncated to 14 characters; no error was raised. This has historically been a source of unintended data loss bugs.

    The strncpy function was used by the file system to store the file name into the directory entry. This explains one part of the odd behavior of strncpy, namely why it does not null-terminate when the destination fills: the null terminator was implied by the end of the array. (It also explains the silent file name truncation behavior.)

    But why null-pad short file names?

    Because that makes scanning for file names faster. If you guarantee that all the "garbage bytes" are null, then you can use memcmp to compare them.

    For compatibility reasons, the C language committee decided to carry forward this quirky behavior of strncpy.

    So what about the title of this entry? How did code that tried to prevent a buffer overflow end up causing one?

    Here's one example. (Sadly I don't read Japanese, so I am operating only from the code.) Observe that it uses _tcsncpy to fill the lpstrFile and lpstrFileTitle buffers, being careful not to overflow them. That's great, but it also leaves off the null terminator if the string is too long. The caller may very well copy the result out of that buffer into a second buffer. But the lpstrFile buffer lacks a proper null terminator, so the apparent string extends beyond the length the caller specified. Result: The second buffer overflows.

    Here's another example. Observe that the function uses _tcsncpy to copy the result into the output buffer. This author was mindful of the quirky behavior of the strncpy family of functions and manually slapped a null terminator in at the end of the buffer.

    But what if cchTextMax = 0? Then the attempt to force a null terminator writes to the byte before the beginning of the array and corrupts a random character.

    What's the conclusion of all this? Personally, my conclusion is simply to avoid strncpy and all its friends if you are dealing with null-terminated strings. Despite the "str" in the name, these functions do not produce null-terminated strings. They convert a null-terminated string into a raw character buffer. Using them when a null-terminated string is expected is plain wrong. Not only do you fail to get proper null termination if the source is too long, but if the source is short, you get unnecessary null padding.

  • The Old New Thing

    A rant against flow control macros


    I try not to rant, but it happens sometimes. This time, I'm ranting on purpose: to complain about macro-izing flow control.

    No two people use the same macros, and when you see code that uses them you have to go dig through header files to figure out what they do.

    This is particularly gruesome when you're trying to debug a problem with some code that somebody else wrote. For example, say you see a critical section entered and you want to make sure that all code paths out of the function release the critical section. It would normally be as simple as searching for "return" and "goto" inside the function body, but if the author of the program hid those operations behind macros, you would miss them.

    HRESULT SomeFunction(Block *p)
    {
     HRESULT hr;
     VALIDATE_BLOCK(p);
     EnterCriticalSection(&g_cs);
     TRAP_FAILURE(p->DoSomething());
     LeaveCriticalSection(&g_cs);
     MUST_SUCCEED(p->DoSomethingElse());
     EnterCriticalSection(&g_cs);
     if (andSomethingElse) {
      hr = p->DoSomethingAgain();
      return hr;
     }
     LeaveCriticalSection(&g_cs);
    Cleanup:
     return hr;
    }

    [Update: Fixed missing parenthesis in code that was never meant to be compiled anyway. Some people are so picky. - 10:30am]
    [Update: Fixed missing parenthesis in code that was never meant to be compiled anyway. Some people are so picky. - 10:30am]

    Is the critical section leaked? What happens if the BLOCK fails to validate? If DoSomethingElse fails, does DoSomethingAgain get called? What's with that unused "Cleanup" label? Is there a code path that leaves the "hr" variable uninitialized?

    You won't know until you go dig up the header file that defined the VALIDATE_BLOCK, TRAP_FAILURE, and MUST_SUCCEED macros.

    (Yes, the critical section question could be avoided by using a lock object with destructor, but that's not my point. Note also that this function temporarily exits the critical section. Most lock objects don't support that sort of thing, though it isn't usually that hard to add, at the cost of a member variable.)

    When you create a flow-control macro, you're modifying the language. When I fire up an editor on a file whose name ends in ".cpp" I expect that what I see will be C++ and not some strange dialect that strongly resembles C++ except in the places where it doesn't. (For this reason, I'm pleased that C# doesn't support macros.)

    People who still prefer flow-control macros should be sentenced to maintaining the original Bourne shell. Here's a fragment:

    ADDRESS	alloc(nbytes)
        POS	    nbytes;
        REG POS	rbytes = round(nbytes+BYTESPERWORD,BYTESPERWORD);
        LOOP    INT	    c=0;
    	REG BLKPTR  p = blokp;
    	REG BLKPTR  q;
    	REP IF !busy(p)
    	    THEN    WHILE !busy(q = p->word) DO p->word = q->word OD
    		IF ADR(q)-ADR(p) >= rbytes
    		THEN	blokp = BLK(ADR(p)+rbytes);
    		    IF q > blokp
    		    THEN    blokp->word = p->word;
    	    q = p; p = BLK(Rcheat(p->word)&~BUSY);
    	PER p>q ORF (c++)==0 DONE

    Back in its day, this code was held up as an example of "death by macros", code that relied so heavily on macros that nobody could understand it. What's scary is that by today's standards, it's quite tame.

    (This rant is a variation on one of my earlier rants, if you think about it. Exceptions are a form of nonlocal control flow.)

  • The Old New Thing

    PulseEvent is fundamentally flawed


    The PulseEvent function releases one thread that is waiting for the pulsed event (or all waiting threads, if the event is a manual-reset event), then returns the event to the unset state. If no threads happen to be waiting, the event simply returns to the unset state and nothing else happens.

    And there's the flaw.

    How do you know whether the thread that you think is waiting on the event really is? Surely you can't use something like

    WaitForSingleObject(hEvent, INFINITE);

    because there is a race between the signal and the wait. The signalling thread might complete all its work and pulse the event before you get around to waiting for it.

    You can try using the SignalObjectAndWait function, which combines the signal and wait into a single operation. But even then, you can't be sure that the thread is waiting for the event at the moment of the pulse.

    While the thread is sitting waiting for the event, a device driver or part of the kernel itself might ask to borrow the thread to do some processing (by means of a "kernel-mode APC"). During that time, the thread is not in the wait state. (It's being used by the device driver.) If the PulseEvent happens while the thread is being "borrowed", then it will not be woken from the wait, because the PulseEvent function wakes only threads that were waiting at the time the PulseEvent occurs.

    Not only are you (as a user-mode program) unable to prevent kernel mode from doing this to your thread, you cannot even detect that it has occurred.

    (One place where you are likely to see this sort of thing happening is if you have the debugger attached to the process, since the debugger does things like suspend and resume threads, which result in kernel APCs.)

    As a result, the PulseEvent function is useless and should be avoided. It continues to exist solely for backwards compatibility.

    Sidebar: This whole business with kernel APCs also means that you cannot predict which thread will be woken when you signal a semaphore, an auto-reset event, or some other synchronization object that releases a single thread when signalled. If a thread is "borrowed" to service a kernel APC, then when it is returned to the wait list, it "goes back to the end of the line". Consequently, the order in which threads waiting on a kernel object are released is unpredictable and cannot be relied upon.

  • The Old New Thing

    You don't need to run away from home to join the circus


    Last week, I saw a performance of Circus Contraption at The Seattle Center with some friends. We were all left agape by the aerialists as they climbed ropes, hoisted, hung, and balanced themselves high above the ground.

    I thought back to seeing acrobats as a child at the circus and realized how much more impressive they are as you get older and realize how much strength, balance, and just plain nerves are required to accomplish these amazing feats. When you're a kid, nothing is impossible. Hanging upside-down by the crook of your ankles doesn't sound so hard when you're a kid. When you're older, the same feat makes you shiver with excitement.

    To learn more, you can read an article from the University of Washington school newspaper (click through to the "mangled" version to see pictures) or read the blog of an Aerialistas member.

    If you too want to join the circus, you don't need to leave Seattle. You can take flying lessons with Trapezius, the school where the Aerialistas train. Or, as one of my friends discovered, you can go for a broader circus training at The School of Acrobatics and New Circus Arts.

    I hadn't realized how much circus-y stuff there is in Seattle. In addition to Circus Contraption, my friend also pointed out Teatro ZinZanni, a sort of dinner theater circus thing.

  • The Old New Thing

    Using fibers to simplify enumerators, part 5: Composition


    Another type of higher-order enumeration is composition, where one enumerator actually combines the results of multiple enumerators. (Everybody knows about derivation, but composition is another powerful concept in object-oriented programming. We've seen it before when building context menus.)

    In a producer-driven enumerator, you would implement composition by calling the two enumeration functions one after the other. In a consumer-driven enumerator, you would implement composition by wrapping the two enumerators inside a large enumerator which then chooses between the two based on which enumerator was currently active.

    A fiber-based enumerator behaves more like a consumer-driven enumerator, again, with easier state management.

    Let's write a composite enumerator that enumerates everything in the root of your C: drive (no subdirectories), plus everything in the current directory (including subdirectories).

    class CompositeEnumerator : public FiberEnumerator {
    public:
     CompositeEnumerator()
       : m_eFiltered(TEXT("C:\\"))
       , m_eCd(TEXT(".")) { }

     LPCTSTR GetCurDir()
        { return m_peCur->GetCurDir(); }
     LPCTSTR GetCurPath()
        { return m_peCur->GetCurPath(); }
     const WIN32_FIND_DATA* GetCurFindData()
        { return m_peCur->GetCurFindData(); }

    private:
     void FiberProc();

     FiberEnumerator* m_peCur;
     FilteredEnumerator m_eFiltered;
     DirectoryTreeEnumerator m_eCd;
    };

    void CompositeEnumerator::FiberProc()
    {
     FEFOUND fef;
     m_peCur = &m_eFiltered;
     while ((fef = m_peCur->Next()) != FEF_DONE &&
            fef != FEF_LEAVEDIR) {
      m_peCur->SetResult(Produce(fef));
     }
     m_peCur = &m_eCd;
     while ((fef = m_peCur->Next()) != FEF_DONE) {
      m_peCur->SetResult(Produce(fef));
     }
    }
    Sidebar: Our composite enumeration is complicated by the fact that our FilteredEnumerator spits out a FEF_LEAVEDIR at the end, which we want to suppress, so we have to check for it and eat it.

    In the more common case where the enumerator is generating a flat list, it would be a simple matter of just forwarding the two enumerators one after the other. Something like this:

    void CompositeEnumerator2::FiberProc()
    {
     Enum(&m_eFiltered);  // forward each inner enumerator in turn
     Enum(&m_eCd);
    }

    void CompositeEnumerator2::Enum(FiberEnumerator *pe)
    {
     m_peCur = pe;
     FEFOUND fef;
     while ((fef = m_peCur->Next()) != FEF_DONE) {
      m_peCur->SetResult(Produce(fef));
     }
    }
    End sidebar.

    You can try out this CompositeEnumerator with the program you've been playing with for the past few days. Just change the line in main that creates the enumerator to the following:

     CompositeEnumerator e;

    Exercise: Gosh, why is the total so unusually large?

    Exercise: How many fibers are there in the program?

    Exercise: Draw a diagram showing how control flows among the various fibers in this program.

    Before you get all excited about fibers, consider the following:

    • Converting a thread to a fiber needs to be coordinated among all the components in the process so that it is converted only once and stays converted until everybody is finished. This means that if you are writing a plug-in that will go into some other process, you probably should avoid fibers, since you don't know what the other components in the process are going to do with fibers.
    • Fibers do not completely solve the one-thread-per-connection problem. They do reduce the context switching, but the memory footprint will still limit you to 2000 fibers per process (assuming a 2GB user-mode address space) since each fiber has a stack, which defaults to 1MB.

    I think that's enough about fibers for now.

  • The Old New Thing

    Using fibers to simplify enumerators, part 4: Filtering


    One type of higher-order enumeration is filtering, where one enumerator takes the output of another enumerator and removes some elements.

    In a producer-driven enumerator, you would implement filtering by substituting a new callback function that responds to callbacks on behalf of the client for items that should be filtered, and forwarding callbacks to the client for items that are not filtered.

    In a consumer-driven enumerator, you would implement filtering by wrapping the enumerator inside another enumerator which drives the inner enumerator and forwards only the items that it wishes the caller to see.

    A fiber-based enumerator behaves more like a consumer-driven enumerator, again with easier state management.

    Let's write a filter enumerator that removes all directories and suppresses recursing into them.

    class FilteredEnumerator : public FiberEnumerator {
    public:
     FilteredEnumerator(LPCTSTR pszDir) : m_e(pszDir) { }

     LPCTSTR GetCurDir()
        { return m_e.GetCurDir(); }
     LPCTSTR GetCurPath()
        { return m_e.GetCurPath(); }
     const WIN32_FIND_DATA* GetCurFindData()
        { return m_e.GetCurFindData(); }

    private:
     void FiberProc();
     DirectoryTreeEnumerator m_e;
    };

    void FilteredEnumerator::FiberProc()
    {
     FEFOUND fef;
     while ((fef = m_e.Next()) != FEF_DONE) {
      FERESULT fer;
      if (fef == FEF_DIR) {
       fer = FER_SKIP; // don't recurse into directories
      } else {
       fer = Produce(fef);
      }
      m_e.SetResult(fer);
     }
    }
    To produce items from this filtered enumerator, we run the real enumerator (m_e) and remove all directories, preventing them from being propagated to the filter's consumer and just responding "skip it" to the real enumerator.

    You can test out this filtered enumerator with the same TestWalk function we've been using for the past few days. The only change you'll need to make is to the main function:

    int __cdecl main(int argc, char **argv)
    {
     FilteredEnumerator e(TEXT("."));
     TestWalk(&e);
     return 0;
    }

    Observe that the program no longer recurses into subdirectories. It just tallies the sizes of the files in the current directory.

    Next time, composition.
