• The Old New Thing

    Taskbar notification balloon tips don't penalize you for being away from the keyboard

    • 70 Comments

    The Shell_NotifyIcon function is used to do various things, among them, displaying a balloon tip to the user. As discussed in the documentation for the NOTIFYICONDATA structure, the uTimeout member specifies how long the balloon should be displayed.

    But what if the user is not at the computer when you display your balloon? After 30 seconds, the balloon will time out, and the user will have missed your important message!

    Never fear. The taskbar keeps track of whether the user is using the computer (with the help of the GetLastInputInfo function) and doesn't "run the clock" if it appears that the user isn't there. You will get your 30 seconds of "face time" with the user.

    But what if you want to time out your message even if the user isn't there?

    You actually have the information available to you to solve this puzzle on the web pages I linked above. See if you can put the pieces together and come up with a better solution than simulating a click on the balloon. (Hint: Look carefully at what it means if you set your balloon text to an empty string.)

    And what if you want your message to stay on the screen longer than 30 seconds?

    You can't. The notification area enforces a 30 second limit for any single balloon. Because if they user hasn't done anything about it for 30 seconds, they probably aren't interested. If your message is so critical that the user shouldn't be allowed to ignore it, then don't use a notification balloon. Notification balloons are for non-critical transient messages to the user.

  • The Old New Thing

    How can code that tries to prevent a buffer overflow end up causing one?

    • 65 Comments

    If you read your language specification, you'll find that the ...ncpy functions have extremely strange semantics.

    The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.

    In pictures, here's what happens in various string copying scenarios.

    strncpy(strDest, strSrc, 5)
    strSource
    W e l c o m e \0
    strDest
    W e l c o
    observe no null terminator
     
    strncpy(strDest, strSrc, 5)
    strSource
    H e l l o \0
    strDest
    H e l l o
    observe no null terminator
     
    strncpy(strDest, strSrc, 5)
    strSource
    H i \0
    strDest
    H i \0 \0 \0
    observe null padding to end of strDest

    Why do these functions have such strange behavior?

    Go back to the early days of UNIX. Personally, I only go back as far as System V. In System V, file names could be up to 14 characters long. Anything longer was truncated to 14. And the field for storing the file name was exactly 14 characters. Not 15. The null terminator was implied. This saved one byte.

    Here are some file names and their corresponding directory entries:

    passwd
    p a s s w d \0 \0 \0 \0 \0 \0 \0 \0
    newsgroups.old
    n e w s g r o u p s . o l d
    newsgroups.old.backup
    n e w s g r o u p s . o l d

    Notice that newsgroups.old and newsgroups.old.backup are actually the same file name, due to truncation. The too-long name was silently truncated; no error was raised. This has historically been the source of unintended data loss bugs.

    The strncpy function was used by the file system to store the file name into the directory entry. This explains one part of the odd behavior of strcpy, namely why it does not null-terminate when the destination fills. The null terminator was implied by the end of the array. (It also explains the silent file name truncation behavior.)

    But why null-pad short file names?

    Because that makes scanning for file names faster. If you guarantee that all the "garbage bytes" are null, then you can use memcmp to compare them.

    For compatibility reasons, the C language committee decided to carry forward this quirky behavior of strncpy.

    So what about the title of this entry? How did code that tried to prevent a buffer overflow end up causing one?

    Here's one example. (Sadly I don't read Japanese, so I am operating only from the code.) Observe that it uses _tcsncpy to fill the lpstrFile and lpstrFileTitle, being careful not to overflow the buffers. That's great, but it also leaves off the null terminator if the string is too long. The caller may very well copy the result out of that buffer to a second buffer. But the lstrFile buffer lacks a proper null terminator and therefore exceeds the length the caller specified. Result: Second buffer overflows.

    Here's another example. Observe that the function uses _tcsncpy to copy the result into the output buffer. This author was mindful of the quirky behavior of the strncpy family of functions and manually slapped a null terminator in at the end of the buffer.

    But what if ccTextMax = 0? Then the attempt to force a null terminator dereferences past the beginning of the array and corrupts a random character.

    What's the conclusion of all this? Personally, my conclusion is simply to avoid strncpy and all its friends if you are dealing with null-terminated strings. Despite the "str" in the name, these functions do not produce null-terminated strings. They convert a null-terminated string into a raw character buffer. Using them where a null-terminated string is expected as the second buffer is plain wrong. Not only do you fail to get proper null termination if the source is too long, but if the source is short you get unnecessary null padding.

  • The Old New Thing

    A rant against flow control macros

    • 53 Comments

    I try not to rant, but it happens sometimes. This time, I'm ranting on purpose: to complain about macro-izing flow control.

    No two people use the same macros, and when you see code that uses them you have to go dig through header files to figure out what they do.

    This is particularly gruesome when you're trying to debug a problem with some code that somebody else wrote. For example, say you see a critical section entered and you want to make sure that all code paths out of the function release the critical section. It would normally be as simple as searching for "return" and "goto" inside the function body, but if the author of the program hid those operations behind macros, you would miss them.

    HRESULT SomeFunction(Block *p)
    {
     HRESULT hr;
     EnterCriticalSection(&g_cs);
     VALIDATE_BLOCK(p);
     MUST_SUCCEED(p->DoSomething());
     if (andSomethingElse) {
      LeaveCriticalSection(&g_cs);
      TRAP_FAILURE(p->DoSomethingElse());
      EnterCriticalSection(&g_cs);
     }
     hr = p->DoSomethingAgain();
    Cleanup:
     LeaveCriticalSection(&g_cs);
     return hr;
    }
    

    [Update: Fixed missing parenthesis in code that was never meant to be compiled anyway. Some people are so picky. - 10:30am]

    Is the critical section leaked? What happens if the BLOCK fails to validate? If DoSomethingElse fails, does DoSomethingAgain get called? What's with that unused "Cleanup" label? Is there a code path that leaves the "hr" variable uninitialized?

    You won't know until you go dig up the header file that defined the VALIDATE_BLOCK, TRAP_FAILURE, and MUST_SUCCEED macros.

    (Yes, the critical section question could be avoided by using a lock object with destructor, but that's not my point. Note also that this function temporarily exits the critical section. Most lock objects don't support that sort of thing, though it isn't usually that hard to add, at the cost of a member variable.)

    When you create a flow-control macro, you're modifying the language. When I fire up an editor on a file whose name ends in ".cpp" I expect that what I see will be C++ and not some strange dialect that strongly resembles C++ except in the places where it doesn't. (For this reason, I'm pleased that C# doesn't support macros.)

    People who still prefer flow-control macros should be sentenced to maintaining the original Bourne shell. Here's a fragment:

    ADDRESS	alloc(nbytes)
        POS	    nbytes;
    {
        REG POS	rbytes = round(nbytes+BYTESPERWORD,BYTESPERWORD);
    
        LOOP    INT	    c=0;
    	REG BLKPTR  p = blokp;
    	REG BLKPTR  q;
    	REP IF !busy(p)
    	    THEN    WHILE !busy(q = p->word) DO p->word = q->word OD
    		IF ADR(q)-ADR(p) >= rbytes
    		THEN	blokp = BLK(ADR(p)+rbytes);
    		    IF q > blokp
    		    THEN    blokp->word = p->word;
    		    FI
    		    p->word=BLK(Rcheat(blokp)|BUSY);
    		    return(ADR(p+1));
    		FI
    	    FI
    	    q = p; p = BLK(Rcheat(p->word)&~BUSY);
    	PER p>q ORF (c++)==0 DONE
    	addblok(rbytes);
        POOL
    }
    

    Back in its day, this code was held up as an example of "death by macros", code that relied so heavily on macros that nobody could understand it. What's scary is that by today's standards, it's quite tame.

    (This rant is a variation on one of my earlier rants, if you think about it. Exceptions are a form of nonlocal control flow.)

  • The Old New Thing

    PulseEvent is fundamentally flawed

    • 44 Comments

    The PulseEvent function releases one thread (or all threads, if manual-reset) which is/are waiting for the pulsed event, then returns the event to the unset state. If no threads happen to be waiting, then the event goes to the unset state without anything happening.

    And there's the flaw.

    How do you know whether the thread that you think is waiting on the event really is? Surely you can't use something like

    SignalSemaphore(hOtherSemaphore);
    WaitForSingleObject(hEvent, INFINITE);
    

    because there is a race between the signal and the wait. The thread that the semaphore is alerting might complete all its work and pulse the event before you get around to waiting for it.

    You can try using the SignalObjectAndWait function, which combines the signal and wait into a single operation. But even then, you can't be sure that the thread is waiting for the event at the moment of the pulse.

    While the thread is sitting waiting for the event, a device driver or part of the kernel itself might ask to borrow the thread to do some processing (by means of a "kernel-mode APC"). During that time, the thread is not in the wait state. (It's being used by the device driver.) If the PulseEvent happens while the thread is being "borrowed", then it will not be woken from the wait, because the PulseEvent function wakes only threads that were waiting at the time the PulseEvent occurs.

    Not only are you (as a user-mode program) unable to prevent kernel mode from doing this to your thread, you cannot even detect that it has occurred.

    (One place where you are likely to see this sort of thing happening is if you have the debugger attached to the process, since the debugger does things like suspend and resume threads, which result in kernel APCs.)

    As a result, the PulseEvent function is useless and should be avoided. It continues to exist solely for backwards compatibility.

    Sidebar: This whole business with kernel APCs also means that you cannot predict which thread will be woken when you signal a semaphore, an auto-reset event, or some other synchronization object that releases a single thread when signalled. If a thread is "borrowed" to service a kernel APC, then when it is returned to the wait list, it "goes back to the end of the line". Consequently, the order of objects waiting for a kernel object is unpredictable and cannot be relied upon.

  • The Old New Thing

    You don't need to run away from home to join the circus

    • 6 Comments

    Last week, I saw a performance of Circus Contraption at The Seattle Center with some friends. We were all left agape by the aerialists as they climbed ropes, hoisted, hung, and balanced themselves high above the ground.

    I thought back to seeing acrobats as a child at the circus and realized how much more impressive they are as you get older and realize how much strength, balance, and just plain nerves are required to accomplish these amazing feats. When you're a kid, nothing is impossible. Hanging upside-down by the crook of your ankles doesn't sound so hard when you're a kid. When you're older, the same feat makes you shiver with excitement.

    To learn more, you can read an article from the University of Washington school newspaper (click through to the "mangled" version to see pictures) or read the blog of an Aerialistas member.

    If you too want to join the circus, you don't need to leave Seattle. You can take flying lessons with Trapezius, the school where the Aerialistas train. Or, as one of my friends discovered, you can go for a broader circus training at The School of Acrobatics and New Circus Arts.

    I hadn't realized how much circus-y stuff there is in Seattle. In addition to Circus Contraption, my friend also pointed out Teatro ZinZanni, a sort of dinner theater circus thing.

  • The Old New Thing

    Using fibers to simplify enumerators, part 5: Composition

    • 11 Comments

    Another type of higher-order enumeration is composition, where one enumerator actually combines the results of multiple enumerators. (Everybody knows about derivation, but composition is another powerful concept in object-oriented programming. We've seen it before when building context menus.)

    In a producer-driven enumerator, you would implement composition by calling the two enumeration functions one after the other. In a consumer-driven enumerator, you would implement composition by wrapping the two enumerators inside a large enumerator which then chooses between the two based on which enumerator was currently active.

    A fiber-based enumerator behaves more like a consumer-driven enumerator, again, with easier state management.

    Let's write a composite enumerator that enumerates everything in the root of your C: drive (no subdirectories), plus everything in the current directory (including subdirectories).

    class CompositeEnumerator : public FiberEnumerator {
    public:
     CompositeEnumerator()
       : m_eFiltered(TEXT("C:\\"))
       , m_eCd(TEXT(".")) { }
    
     LPCTSTR GetCurDir()
        { return m_peCur->GetCurDir(); }
     LPCTSTR GetCurPath()
        { return m_peCur->GetCurPath(); }
     const WIN32_FIND_DATA* GetCurFindData()
        { return m_peCur->GetCurFindData(); }
    
    private:
     void FiberProc();
    
    private:
     FiberEnumerator* m_peCur;
     FilteredEnumerator m_eFiltered;
     DirectoryTreeEnumerator m_eCd;
    };
    
    void CompositeEnumerator::FiberProc()
    {
     FEFOUND fef;
    
     m_peCur = &m_eFiltered;
     while ((fef = m_peCur->Next()) != FEF_DONE &&
            fef != FEF_LEAVEDIR) {
      m_peCur->SetResult(Produce(fef));
     }
    
     m_peCur = &m_eCd;
     while ((fef = m_peCur->Next()) != FEF_DONE) {
      m_peCur->SetResult(Produce(fef));
     }
    }
    

    Sidebar: Our composite enumeration is complicated by the fact that our FilteredEnumerator spits out a FEF_LEAVEDIR at the end, but which we want to suppress, so we have to check for it and eat it.

    In the more common case where the enumerator is generating a flat list, it would be a simple matter of just forwarding the two enumerators one after the other. Something like this:

    void CompositeEnumerator2::FiberProc()
    {
     Enum(&m_eFiltered);
     Enum(&m_eCd);
    }
    
    void CompositeEnumerator2::Enum(FiberEnumerator *pe)
    {
     m_peCur = pe;
     FEFOUND fef;
     while ((fef = m_peCur->Next()) != FEF_DONE) {
      m_peCur->SetResult(Produce(fef));
     }
    }
    

    End sidebar.

    You can try out this CompositeEnumerator with the program you've been playing with for the past few days. Just change the line in main that creates the enumerator to the following:

     CompositeEnumerator e;
    

    Exercise: Gosh, why is the total so unusually large?

    Exercise: How many fibers are there in the program?

    Exercise: Draw a diagram showing how control flows among the various fibers in this program.

    Before you get all excited about fibers, consider the following:

    • Converting a thread to a fiber needs to be coordinated among all the components in the process so that it is converted only once and stays converted until everybody is finished. This means that if you are writing a plug-in that will go into some other process, you probably should avoid fibers, since you don't know what the other components in the process are going to do with fibers.
    • Fibers do not completely solve the one-thread-per-connection problem. They do reduce the context switching, but the memory footprint will still limit you to 2000 fibers per process (assuming a 2GB user-mode address space) since each fiber has a stack, which defaults to 1MB.

    I think that's enough about fibers for now.

  • The Old New Thing

    Using fibers to simplify enumerators, part 4: Filtering

    • 1 Comments

    One type of higher-order enumeration is filtering, where one enumerator takes the output of another enumerator and removes some elements.

    In a producer-driven enumerator, you would implement filtering by substituting a new callback function that responds to callbacks on behalf of the client for items that should be filtered, and forwarding callbacks to the client for items that are not filtered.

    In a consumer-driven enumerator, you would implement composition by wrapping the enumerator inside another enumerator which drives the inner enumerator and forwards items that it wishes the caller to see.

    A fiber-based enumerator behaves more like a consumer-driven enumerator, but,with easier state management.

    Let's write a filter enumerator that removes all directories and suppresses recursing into them.

    class FilteredEnumerator : public FiberEnumerator {
    public:
     FilteredEnumerator(LPCTSTR pszDir) : m_e(pszDir) { }
    
     LPCTSTR GetCurDir()
        { return m_e.GetCurDir(); }
     LPCTSTR GetCurPath()
        { return m_e.GetCurPath(); }
     const WIN32_FIND_DATA* GetCurFindData()
        { return m_e.GetCurFindData(); }
    
    private:
     void FiberProc();
    
    private:
     DirectoryTreeEnumerator m_e;
    };
    
    void FilteredEnumerator::FiberProc()
    {
     FEFOUND fef;
     while ((fef = m_e.Next()) != FEF_DONE) {
      FERESULT fer;
      if (fef == FEF_DIR) {
       fer = FER_SKIP; // don't recurse into directories
      } else {
       fer = Produce(fef);
      }
      m_e.SetResult(fer);
     }
    }
    

    To produce items from this filtered enumerator, we run the real enumerator (m_e) and remove all directories, preventing them from being propagated to the filter's consumer and just responding "skip it" to the real enumerator.

    You can test out this filtered enumerator with the same TestWalk function we've been using for the past few days. The only change you'll need to make is to the main function:

    int __cdecl main(int argc, char **argv)
    {
     ConvertThreadToFiber(NULL);
     FilteredEnumerator e(TEXT("."));
     TestWalk(&e);
     ConvertFiberToThread();
     return 0;
    }
    

    Observe that the program no longer recurses into subdirectories. It just tallies the sizes of the files in the current directory.

    Next time, composition.

  • The Old New Thing

    Using fibers to simplify enumerators, part 3: Having it both ways

    • 28 Comments
    As we discovered in the previous two entries [second], the problem with enumeration is that somebody always loses.

    Now we will use fibers to fight back. Before you decide to use fibers in your programs, make sure to read the dire warnings at the end of this article. My goal here is to show one use of fibers, not to say that fibers are the answer to all your problems. Fibers can create more problems than they solve. We'll come back to all the dire warnings later.

    As with most clever ideas, it has a simple kernel: Use a fiber to run both the caller and the enumerator each on their own stack.

    #include <windows.h>
    #include <shlwapi.h>
    #include <stdio.h>
    #include <strsafe.h>
    
    enum FEFOUND {
     FEF_FILE,          // found a file
     FEF_DIR,           // found a directory
     FEF_LEAVEDIR,      // leaving a directory
     FEF_DONE,          // finished
    };
    
    enum FERESULT {
     FER_CONTINUE,      // continue enumerating
                        // (if directory: recurse into it)
     FER_SKIP,          // skip directory (do not recurse)
    };
    
    class __declspec(novtable) FiberEnumerator {
    public:
     FiberEnumerator();
     ~FiberEnumerator();
    
     FEFOUND Next();
     void SetResult(FERESULT fer) { m_fer = fer; }
     void Skip() { SetResult(FER_SKIP); }
    
     virtual LPCTSTR GetCurDir() = 0;
     virtual LPCTSTR GetCurPath() = 0;
     virtual const WIN32_FIND_DATA* GetCurFindData() = 0;
    
    protected:
     virtual void FiberProc() = 0;
    
     static void DECLSPEC_NORETURN WINAPI
        s_FiberProc(void* pvContext);
    
     FERESULT Produce(FEFOUND fef);
    
    protected:
     void* m_hfibCaller;
     void* m_hfibEnumerator;
     FEFOUND  m_fef;
     FERESULT m_fer;
    };
    
    FiberEnumerator::FiberEnumerator()
     : m_fer(FER_CONTINUE)
    {
     m_hfibEnumerator = CreateFiber(0, s_FiberProc, this);
    }
    
    FiberEnumerator::~FiberEnumerator()
    {
     DeleteFiber(m_hfibEnumerator);
    }
    
    void DECLSPEC_NORETURN FiberEnumerator::
        s_FiberProc(void *pvContext)
    {
     FiberEnumerator* self =
        reinterpret_cast<FiberEnumerator*>(pvContext);
     self->FiberProc();
    
     // Theoretically, we need only produce Done once,
     // but keep looping in case a consumer gets
     // confused and asks for the Next() item even
     // though we're Done.
     for (;;) self->Produce(FEF_DONE);
    }
    

    This helper class does the basic bookkeeping of fiber-based enumeration. At construction, it remembers the fiber that is consuming the enumeration, as well as creating a fiber that will produce the enumeration. At destruction, it cleans up the fiber. The derived class is expected to implement the FiberProc method and call Produce() every so often.

    The real magic happens in the (somewhat anticlimactic) Produce() and Next() methods:

    FERESULT FiberEnumerator::Produce(FEFOUND fef)
    {
     m_fef = fef; // for Next() to retrieve
     m_fer = FER_CONTINUE; // default
     SwitchToFiber(m_hfibCaller);
     return m_fer;
    }
    
    FEFOUND FiberEnumerator::Next()
    {
     m_hfibCaller = GetCurrentFiber();
     SwitchToFiber(m_hfibEnumerator);
     return m_fef;
    }
    

    To Produce() something, we remember the production code, pre-set the enumeration result to its default of FER_CONTINUE, and switch to the consumer fiber. When the consumer fiber comes back with an answer, we return it from Produce().

    To get the next item, we remember the identity of the calling fiber, then switch to the enumerator fiber. This runs the enumerator until it decides to Produce() something, at which point we take the production code and return it.

    That's all there is to it. The m_fef and m_fer members are for passing the parameters and results back and forth across the fiber boundary.

    Okay, with that groundwork out of the way, writing the producer itself is rather anticlimactic.

    Since we want to make things easy for the consumer, we use the interface the consumer would have designed, with some assistance from the helper class.

    class DirectoryTreeEnumerator : public FiberEnumerator {
    public:
     DirectoryTreeEnumerator(LPCTSTR pszDir);
     ~DirectoryTreeEnumerator();
    
     LPCTSTR GetCurDir() { return m_pseCur->m_szDir; }
     LPCTSTR GetCurPath() { return m_szPath; }
     const WIN32_FIND_DATA* GetCurFindData()
        { return &m_pseCur->m_wfd; }
    
    private:
     void FiberProc();
     void Enum();
    
     struct StackEntry {
       StackEntry* m_pseNext;
       HANDLE m_hfind;
       WIN32_FIND_DATA m_wfd;
       TCHAR m_szDir[MAX_PATH];
     };
     bool Push(StackEntry* pse);
     void Pop();
    
    private:
     StackEntry *m_pseCur;
     TCHAR m_szPath[MAX_PATH];
    };
    
    DirectoryTreeEnumerator::
     DirectoryTreeEnumerator(LPCTSTR pszDir)
     : m_pseCur(NULL)
    {
     StringCchCopy(m_szPath, MAX_PATH, pszDir);
    }
    
    DirectoryTreeEnumerator::~DirectoryTreeEnumerator()
    {
     while (m_pseCur) {
       Pop();
     }
    }
    
    bool DirectoryTreeEnumerator::
          Push(StackEntry* pse)
    {
     pse->m_pseNext = m_pseCur;
     m_pseCur = pse;
     return
      SUCCEEDED(StringCchCopy(pse->m_szDir,
                     MAX_PATH, m_szPath)) &&
      PathCombine(m_szPath, pse->m_szDir, TEXT("*.*")) &&
      (pse->m_hfind = FindFirstFile(m_szPath,
           &pse->m_wfd)) != INVALID_HANDLE_VALUE;
    }
    
    void DirectoryTreeEnumerator::Pop()
    {
     StackEntry* pse = m_pseCur;
     if (pse->m_hfind != INVALID_HANDLE_VALUE) {
      FindClose(pse->m_hfind);
     }
     m_pseCur = pse->m_pseNext;
    }
    
    void DirectoryTreeEnumerator::FiberProc()
    {
     Enum();
    }
    
    void DirectoryTreeEnumerator::Enum()
    {
     StackEntry se;
     if (Push(&se)) {
      do {
       if (lstrcmp(se.m_wfd.cFileName, TEXT(".")) != 0 &&
           lstrcmp(se.m_wfd.cFileName, TEXT("..")) != 0 &&
           PathCombine(m_szPath, se.m_szDir, se.m_wfd.cFileName)) {
        FEFOUND fef = (se.m_wfd.dwFileAttributes &
                        FILE_ATTRIBUTE_DIRECTORY) ?
                        FEF_DIR : FEF_FILE;
        if (Produce(fef) == FER_CONTINUE && fef == FEF_DIR) {
         Enum(); // recurse into the subdirectory we just produced
        }
       }
      } while (FindNextFile(se.m_hfind, &se.m_wfd));
     }
     Produce(FEF_LEAVEDIR);
     Pop();
    }
    

    As you can see, this class is a mix of the two previous classes. Like the consumer-based class, information about the item being enumerated is obtained by calling methods on the enumerator object. But like the callback-based version, the loop that generates the objects themselves is a very simple recursive function, with a call to Produce in place of a callback.

    In fact, it's even simpler than the callback-based version, since we don't have to worry about the FER_STOP code. If the consumer wants to stop enumeration, the consumer simply stops calling Next().

    Most of the complexity in the class is just bookkeeping to permit abandoning the enumeration prematurely.

    Okay, let's take this fiber out for a spin. You can use the same TestWalk function as last time, but for added generality, change the first parameter from DirectoryTreeEnumerator* to FiberEnumerator*. (The significance of this will become apparent next time.)

    A little tweak needs to be made to the main function, though.

    int __cdecl main(int argc, char **argv)
    {
     ConvertThreadToFiber(NULL);
     DirectoryTreeEnumerator e(TEXT("."));
     TestWalk(&e);
     ConvertFiberToThread();
     return 0;
    }
    

    Since the enumerator is going to switch between fibers, we'd better convert the thread to a fiber so it'll have something to switch back to!

    Here's a schematic of what happens when you run this fiber-based enumerator:

    ConvertThreadToFiber
    Main fiber
    construct DirectoryTreeEnumerator Enumerator fiber
    CreateFiber (not running)
    initialize variables  
    Next(CONTINUE)  
    SwitchToFiber() starts running
      FindFirstFile etc
      Produce(FILE)
    Next(CONTINUE) "returns" FILE SwitchToFiber()
    use the result  
    Next(CONTINUE)  
    SwitchToFiber() Produce(FILE) "returns" CONTINUE
      FindNextFile etc
      Produce(DIR)
    Next(CONTINUE) "returns" DIR SwitchToFiber()
    use the result  
    Next(CONTINUE)  
    SwitchToFiber() Produce(DIR) "returns" CONTINUE
    and so on... until...
      Produce(DONE)
    Next(CONTINUE) "returns" DONE SwitchToFiber()
    cleanup  
    DeleteFiber  
    destruct DirectoryTreeEnumerator
    ConvertFiberToThread

    Observe that from each fiber's point of view, the other fiber is just a subroutine!

    Coding subtlety: Why do we capture the caller's fiber each time the Next() method is called? Why not capture it when the FiberEnumerator is constructed?

    Next time, we'll see how this fiber-based enumerator easily admits higher-order operations such as filtering and composition.

    Dire warnings about fibers

    Fibers are like dynamite. Mishandle them and your process explodes.

    The first dire warning is that fibers are expensive in terms of address space, since each one gets its own stack (typically a megabyte).

    And since each fiber has its own stack, it also has its own exception chain. This means that if a fiber throws an exception, only that fiber can catch it. (Same as threads.) That's a strong argument against using an STL std::stack object to maintain our state: STL is based on an exception-throwing model, but you can't catch exceptions raised by another fiber. (You also can't throw exceptions past a COM boundary, which severely limits how much you can use STL in a COM object.)

    One of the big problems with fibers is that everybody has to be in cahoots. You need to decide on one person who will call the ConvertThreadToFiber function since fiber/thread conversion is not reference-counted. If two people call ConvertThreadToFiber on the same thread, the first will convert it, and so will the second! This results in two fibers for the same thread, and things can only get worse from there.

    You might think, "Well, wouldn't the GetCurrentFiber function return NULL if the thread hasn't been converted to a fiber?" Try it: It returns garbage. (It's amazing how many people ask questions without taking even the slightest steps towards figuring out the answer themselves. Try writing a test program.)

    But even if GetCurrentFiber told you whether or not the thread had been converted to a fiber, that still won't help. Suppose two people want to do fibrous activity on the thread. The first converts, the second notices that the thread is already a fiber (somehow) and skips the conversion. Now the first operation completes and calls the ConvertFiberToThread function. Oh great, now the second operation is stranded doing fibrous activity without a fiber!

    Therefore, you can use fibers safely only if you control the thread and can get all your code to agree on who controls the fiber/thread conversion.

    An important consequence of the "in cahoots" rule is that you have to make sure all the code you use on a fiber is "fiber-safe" - a level of safety even beyond thread-safety. The C runtime library keeps information in per-thread state: There's errno, all sorts of bonus bookkeeping when you create a thread, or call various functions that maintain state in per-thread data (such as strerror, _fcvt, and strtok).

    In particular, C++ exception handling is managed by the runtime, and the runtime tracks this data in per-thread state (rather than per-fiber state). Therefore, if you throw a C++ exception from a fiber, strange things happen.

    (Note: Things may have changed in the C runtime lately; I'm operating from information that's a few years old.)

    Even if you carefully avoid the C runtime library, you still have to worry about any other libraries you use that use per-thread data. None of them will work with fibers. If you see a call to the TlsAlloc function, then there's a good chance that the library is not fiber-safe. (The fiber-safe version is the FlsAlloc function.)

    Another category of things that are not fiber-safe are windows. Windows have thread affinity, not fiber affinity.

  • The Old New Thing

    Using fibers to simplify enumerators, part 2: When life is easier for the caller

    • 16 Comments
    Last time, we looked at how a directory tree enumerator function would have been written if the person writing the enumerator (the producer) got to write the spec. Now let's look at what it would look like if the person consuming the enumerator wrote the spec:
    #include <windows.h>
    #include <shlwapi.h>
    #include <stdio.h>
    #include <strsafe.h>
    
    enum FEFOUND {
     FEF_FILE,          // found a file
     FEF_DIR,           // found a directory
     FEF_LEAVEDIR,      // leaving a directory
     FEF_DONE,          // finished
    };
    
    enum FERESULT {
     FER_CONTINUE,      // continue enumerating
                        // (if directory: recurse into it)
     FER_SKIP,          // skip directory (do not recurse)
    };
    
    class DirectoryTreeEnumerator {
    public:
      DirectoryTreeEnumerator(LPCTSTR pszDir);
    
      FEFOUND Next();
      void SetResult(FERESULT fer);
      void Skip() { SetResult(FER_SKIP); }
    
      LPCTSTR GetCurDir();
      LPCTSTR GetCurPath();
      const WIN32_FIND_DATA* GetCurFindData();
    private:
        ... implementation ...
    };
    

    Under this design, the enumerator spits out files, and the caller tells the enumerator when to move on to the next one, optionally indicating that an enumerated directory should be skipped rather than recursed into.

    Notice that there is no FER_STOP result code. If the consumer wants to stop enumerating, it will merely stop calling Next().

    With this design, our test function that computes the inclusive and exclusive sizes of each directory is quite simple:

    ULONGLONG FileSize(const WIN32_FIND_DATA *pwfd)
    {
      return 
        ((ULONGLONG)pwfd->nFileSizeHigh << 32) +
        pwfd->nFileSizeLow;
    }
    
    ULONGLONG TestWalk(DirectoryTreeEnumerator* penum)
    {
     ULONGLONG ullSizeSelf = 0;
     ULONGLONG ullSizeAll = 0;
     for (;;) {
      FEFOUND fef = penum->Next();
      switch (fef) {
      case FEF_FILE:
       ullSizeSelf += FileSize(penum->GetCurFindData());
       break;
    
      case FEF_DIR:
       ullSizeAll += TestWalk(penum);
       break;
    
      case FEF_LEAVEDIR:
       ullSizeAll += ullSizeSelf;
       printf("Size of %s is %I64d (%I64d)\n",
        penum->GetCurDir(), ullSizeSelf, ullSizeAll);
       return ullSizeAll;
    
      case FEF_DONE:
       return ullSizeAll;
      }
     }
     /* notreached */
    }
    
    int __cdecl main(int argc, char **argv)
    {
     DirectoryTreeEnumerator e(TEXT("."));
     TestWalk(&e);
     return 0;
    }
    

    Of course, this design puts all the work on the enumerator. Instead of letting the producer walking the tree and calling the callback as it finds things, the caller calls Next() repeatedly, and each time, the enumerator has to find the next file and return it. Since the enumerator returns, it can't store its state in the call stack; instead it has to mimic the call stack manually with a stack data structure.

    class DirectoryTreeEnumerator {
    public:
     DirectoryTreeEnumerator(LPCTSTR pszDir);
     ~DirectoryTreeEnumerator();
    
     FEFOUND Next();
     void SetResult(FERESULT fer)
      { m_es = fer == FER_SKIP ? ES_SKIP : ES_NORMAL; }
     void Skip() { SetResult(FER_SKIP); }
    
     LPCTSTR GetCurDir()
        { return m_pseCur->m_szDir; }
     LPCTSTR GetCurPath()
        { return m_szPath; }
     const WIN32_FIND_DATA* GetCurFindData()
        { return &m_pseCur->m_wfd; }
    
    private:
     struct StackEntry {
      StackEntry *m_pseNext;
      HANDLE m_hfind;
      WIN32_FIND_DATA m_wfd;
      TCHAR m_szDir[MAX_PATH];
     };
    
     StackEntry* Push(LPCTSTR pszDir);
     void StopDir();
     bool Stopped();
     void Pop();
    
     enum EnumState {
      ES_NORMAL,
      ES_SKIP,
      ES_FIRST,
     };
    
     StackEntry *m_pseCur;
     EnumState m_es;
     TCHAR m_szPath[MAX_PATH];
    };
    
    DirectoryTreeEnumerator::StackEntry*
    DirectoryTreeEnumerator::Push(
        LPCTSTR pszDir)
    {
     StackEntry* pse = new StackEntry();
     if (pse &&
         SUCCEEDED(StringCchCopy(pse->m_szDir,
                     MAX_PATH, pszDir)) &&
         PathCombine(m_szPath, pse->m_szDir,
                      TEXT("*.*")) &&
         (pse->m_hfind = FindFirstFile(m_szPath,
           &pse->m_wfd)) != INVALID_HANDLE_VALUE) {
      pse->m_pseNext = m_pseCur;
      m_es = ES_FIRST;
      m_pseCur = pse;
     } else {
      delete pse;
      pse = NULL;
     }
     return pse;
    }
    
    void DirectoryTreeEnumerator::StopDir()
    {
     StackEntry* pse = m_pseCur;
     if (pse->m_hfind != INVALID_HANDLE_VALUE) {
      FindClose(pse->m_hfind);
      pse->m_hfind = INVALID_HANDLE_VALUE;
     }
    }
    
    bool DirectoryTreeEnumerator::Stopped()
    {
     return m_pseCur->m_hfind == INVALID_HANDLE_VALUE;
    }
    
    void DirectoryTreeEnumerator::Pop()
    {
     StackEntry* pse = m_pseCur;
     m_pseCur = pse->m_pseNext;
     delete pse;
    }
    
    DirectoryTreeEnumerator::~DirectoryTreeEnumerator()
    {
     while (m_pseCur) {
      StopDir();
      Pop();
     }
    }
    
    DirectoryTreeEnumerator::
        DirectoryTreeEnumerator(LPCTSTR pszDir)
     : m_pseCur(NULL)
    {
     Push(pszDir);
    }
    
    FEFOUND DirectoryTreeEnumerator::Next()
    {
     for (;;) {
      /* Anything to enumerate? */
      if (!m_pseCur) return FEF_DONE;
    
      /* If just left a directory, pop */
      if (Stopped()) {
       Pop();
       m_es = ES_NORMAL;
      }
    
      /* If accepted a directory, recurse */
      else if (m_es == ES_NORMAL &&
          (m_pseCur->m_wfd.dwFileAttributes &
                          FILE_ATTRIBUTE_DIRECTORY)) {
       Push(m_szPath);
      }
    
      /* Any more files in this directory? */
      if (m_es != ES_FIRST &&
           !FindNextFile(m_pseCur->m_hfind,
                 &m_pseCur->m_wfd)) {
       StopDir();
       return FEF_LEAVEDIR;
      }
    
      /* Don't recurse into . or .. */
      if (lstrcmp(m_pseCur->m_wfd.cFileName,
                       TEXT(".")) == 0 ||
          lstrcmp(m_pseCur->m_wfd.cFileName,
                       TEXT("..")) == 0 ||
          !PathCombine(m_szPath, m_pseCur->m_szDir,
                       m_pseCur->m_wfd.cFileName)) {
       m_es = ES_NORMAL;
       continue;
      }
    
      /* Return this found item */
      m_es = ES_NORMAL; /* default state */
      if (m_pseCur->m_wfd.dwFileAttributes &
                          FILE_ATTRIBUTE_DIRECTORY) {
       return FEF_DIR;
      } else {
       return FEF_FILE;
      }
     }
     /* notreached */
    }
    

    Yuck-o-rama. The simple recursive function has turned into this horrible mess of state management.

    Wouldn't it be great if we could have it both ways? The caller would see a simple enumerator that spits out files (or directories). But the enumerator sees a callback that it can throw files into.

    We'll build that next time.

  • The Old New Thing

    Using fibers to simplify enumerators, part 1: When life is easier for the enumerator

    • 28 Comments

    The COM model for enumeration (enumeration objects) is biased towards making life easy for the consumer and hard for the producer. The enumeration object (producer) needs to be structured as a state machine, which can be quite onerous for complicated enumerators, for example, tree walking or composite enumeration.

    On the other hand, the callback model for producer (used by most Win32 functions) is biased towards making life easy for the enumerator and hard for the consumer. This time, it is the consumer that needs to be structured as a state machine, which is more work if the consumer is doing something complicated with each callback. (And even if not, you have to create a context structure to pass state from the caller, through the enumerator, to the callback.)

    For example, suppose we want to write a routine that walks a directory structure, allowing the caller to specify what to do at each decision point. Let's design this first using the callback paradigm:

    #include <windows.h>
    #include <shlwapi.h>
    #include <stdio.h>
    
    enum FERESULT {
     FER_CONTINUE,      // continue enumerating
                        // (if directory: recurse into it)
     FER_SKIP,          // skip this file/directory
     FER_STOP,          // stop enumerating
    };
    
    enum FEOPERATION {
     FEO_FILE,          // found a file
     FEO_DIR,           // found a directory
     FEO_LEAVEDIR,      // leaving a directory
    };
    
    typedef FERESULT (CALLBACK *FILEENUMCALLBACK)
        (FEOPERATION feo,
         LPCTSTR pszDir, LPCTSTR pszPath,
         const WIN32_FIND_DATA* pwfd,
         void *pvContext);
    
    FERESULT EnumDirectoryTree(LPCTSTR pszDir,
        FILEENUMCALLBACK pfnCB, void* pvContext);
    

    The design here is that the caller calls EnumDirectoryTree and provides a callback function that is informed of each file found and can decide how the enumeration should proceed.

    Designing this as a callback makes life much simpler for the implementation of EnumDirectoryTree.

    FERESULT EnumDirectoryTree(
        LPCTSTR pszDir,
        FILEENUMCALLBACK pfnCB, void *pvContext)
    {
     FERESULT fer = FER_CONTINUE;
     TCHAR szPath[MAX_PATH];
     if (PathCombine(szPath, pszDir, TEXT("*.*"))) {
      WIN32_FIND_DATA wfd;
      HANDLE hfind = FindFirstFile(szPath, &wfd);
      if (hfind != INVALID_HANDLE_VALUE) {
       do {
        if (lstrcmp(wfd.cFileName, TEXT(".")) != 0 &&
            lstrcmp(wfd.cFileName, TEXT("..")) != 0 &&
            PathCombine(szPath, pszDir, wfd.cFileName)) {
         FEOPERATION feo = (wfd.dwFileAttributes &
                         FILE_ATTRIBUTE_DIRECTORY) ?
                         FEO_DIR : FEO_FILE;
         fer = pfnCB(feo, pszDir, szPath, &wfd, pvContext);
         if (fer == FER_CONTINUE) {
          if (feo == FEO_DIR) {
           fer = EnumDirectoryTree(szPath, pfnCB, pvContext);
           if (fer == FER_CONTINUE) {
            fer = pfnCB(FEO_LEAVEDIR, pszDir, szPath,
                        &wfd, pvContext);
           }
          }
         } else if (fer == FER_SKIP) {
          fer = FER_CONTINUE;
         }
        }
       } while (FindNextFile(hfind, &wfd));
       FindClose(hfind);
      }
     }
     return fer;
    }
    

    Note: I made no attempt to make this function at all efficient since that's not my point here. It's highly wasteful of stack space (which can cause problems when walking deep directory trees). This function also doesn't like paths deeper than MAX_PATH; fixing this is beyond the scope of this series. Nor do I worry about reparse points, which can induce infinite loops if you're not careful.

    Well, that wasn't so hard to write. But that's because we made life hard for the consumer. The consumer needs to maintain state across each callback. For example, suppose you wanted to build a list of directories and their sizes (both including and excluding subdirectories).

    class EnumState {
    public:
     EnumState()
       : m_pdirCur(new Directory(NULL)) { }
     ~EnumState() { Dispose(); }
     FERESULT Callback(FEOPERATION feo,
        LPCTSTR pszDir, LPCTSTR pszPath,
        const WIN32_FIND_DATA* pwfd);
     void FinishDir(LPCTSTR pszDir);
    
    private:
    
     struct Directory {
      Directory(Directory* pdirParent)
       : m_pdirParent(pdirParent)
       , m_ullSizeSelf(0)
       , m_ullSizeAll(0) { }
      Directory* m_pdirParent;
      ULONGLONG m_ullSizeSelf;
      ULONGLONG m_ullSizeAll;
     };
     Directory* Push();
     void Pop();
     void Dispose();
    
     Directory* m_pdirCur;
    };
    
    EnumState::Directory* EnumState::Push()
    {
     Directory* pdir = new Directory(m_pdirCur);
     if (pdir) {
      m_pdirCur = pdir;
     }
     return pdir;
    }
    
    void EnumState::Pop()
    {
     Directory* pdir = m_pdirCur->m_pdirParent;
     delete m_pdirCur;
     m_pdirCur = pdir;
    }
    
    void EnumState::Dispose()
    {
     while (m_pdirCur) {
      Pop();
     }
    }
    
    void EnumState::FinishDir(LPCTSTR pszDir)
    {
      m_pdirCur->m_ullSizeAll +=
        m_pdirCur->m_ullSizeSelf;
      printf("Size of %s is %I64d (%I64d)\n",
       pszDir, m_pdirCur->m_ullSizeSelf,
       m_pdirCur->m_ullSizeAll);
    }
    
    ULONGLONG FileSize(const WIN32_FIND_DATA *pwfd)
    {
      return 
        ((ULONGLONG)pwfd->nFileSizeHigh << 32) +
        pwfd->nFileSizeLow;
    }
    
    FERESULT EnumState::Callback(FEOPERATION feo,
        LPCTSTR pszDir, LPCTSTR pszPath,
        const WIN32_FIND_DATA* pwfd)
    {
     if (!m_pdirCur) return FER_STOP;
    
     switch (feo) {
     case FEO_FILE:
      m_pdirCur->m_ullSizeSelf += FileSize(pwfd);
      return FER_CONTINUE;
    
     case FEO_DIR:
      if (Push()) {
       return FER_CONTINUE;
      } else {
       return FER_SKIP;
      }
    
     case FEO_LEAVEDIR:
      FinishDir(pszPath);
    
     /* Propagate size into parent */
      m_pdirCur->m_pdirParent->m_ullSizeAll +=
        m_pdirCur->m_ullSizeAll;
      Pop();
      return FER_CONTINUE;
    
     default:
      return FER_CONTINUE;
     }
     /* notreached */
    }
    
    FERESULT CALLBACK EnumState_Callback(
        FEOPERATION feo,
        LPCTSTR pszDir, LPCTSTR pszPath,
        const WIN32_FIND_DATA* pwfd,
        void* pvContext)
    {
     EnumState* pstate =
        reinterpret_cast<EnumState*>(pvContext);
     return pstate->Callback(feo, pszDir,
                pszPath, pwfd);
    }
    
    int __cdecl main(int argc, char **argv)
    {
     EnumState state;
     if (EnumDirectoryTree(TEXT("."),
            EnumState_Callback,
            &state) == FER_CONTINUE) {
      state.FinishDir(TEXT("."));
     }
     return 0;
    }
    

    Boy that sure was an awful lot of typing, and what's worse, the whole structure of the program has been obscured by the explicit state management. It sure is hard to tell at a glance what this chunk of code is trying to do. Instead, you have to stare at the EnumState class and reverse-engineer what's going on.

    (Yes, I could have simplified this code a little by using a built-in stack class, but as I have already noted in the context of smart pointers, I try to present these articles in "pure" C++ so people won't get into arguments about which class library is best.)

    Tomorrow, we'll look at how the world would be if the function EnumDirectoryTree were spec'd out by the caller rather than the enumerator!

Page 369 of 431 (4,308 items) «367368369370371»