August 2011

  • The Old New Thing

    Stupid command-line trick: Counting the number of lines in stdin

    • 42 Comments

    On Unix, you can use wc -l to count the number of lines in stdin. Windows doesn't come with wc, but there's a sneaky way to count the number of lines anyway:

    some-command-that-generates-output | find /c /v ""
    

    It is a special quirk of the find command that the null string is treated as never matching. The /v flag reverses the sense of the test, so now it matches everything. And the /c flag returns the count.

    It's pretty convoluted, but it does work.
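
    For example, here's a hypothetical session (the count shown is invented for illustration) that counts the files in the current directory:

    C:\> dir /b | find /c /v ""
    42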

    (Remember, I provide the occasional tip on batch file programming as a public service to those forced to endure it, not as an endorsement of batch file programming.)

    Now for the history: Why does the find command say that a null string matches nothing? Mathematically, the null string is a substring of every string, so if you search for the null string, it should match everything. The reason dates back to the original MS-DOS version of find.exe, which according to the comments appears to have been written in 1982. And back then, pretty much all of MS-DOS was written in assembly language. (If you look at your old MS-DOS floppies, you'll find that find.exe is under 7KB in size.) Here is the relevant code, though I've done some editing to get rid of distractions like DBCS support.

            mov     dx,st_length            ;length of the string arg.
            dec     dx                      ;adjust for later use
            mov     di, line_buffer
    lop:
            inc     dx
            mov     si,offset st_buffer     ;pointer to beg. of string argument
    
    comp_next_char:
            lodsb
            cmp     al,byte ptr [di]
            jnz     no_match
    
            dec     dx
            jz      a_matchk                ; no chars left: a match!
            call    next_char               ; updates di
            jc      no_match                ; end of line reached
            jmp     comp_next_char          ; loop if chars left in arg.
    

    If you're rusty on your 8086 assembly language, here's how it goes in pseudocode:

     int dx = st_length - 1;
     char *di = line_buffer;
    lop:
     dx++;
     char *si = st_buffer;
    comp_next_char:
     char al = *si++;
     if (al != *di) goto no_match;
     if (--dx == 0) goto a_matchk;
     if (!next_char(&di)) goto no_match;
     goto comp_next_char;
    

    In sort-of-C, the code looks like this:

     int l = st_length - 1;
     char *line = line_buffer;
    
     l++;
     char *string = st_buffer;
     while (*string++ == *line && --l && next_char(&line)) {} 
    

    The weird - 1 followed by l++ is an artifact of code that I deleted, which needed the decremented value. If you prefer, you can look at the code this way:

     int l = st_length;
     char *line = line_buffer;
     char *string = st_buffer;
     while (*string++ == *line && --l && next_char(&line)) {} 
    

    Notice that if the string length is zero, there is an integer underflow, and we end up reading off the end of the buffers. The comparison loop does stop, because we eventually hit bytes that don't match. (No virtual memory here, so there is no page fault when you run off the end of a buffer; you just keep going and reading from other parts of your data segment.)

    In other words, due to an integer underflow bug, a string of length zero was treated as if it were a string of length 65536, which doesn't match anywhere in the file.
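
    If you want to watch the wraparound happen, here's a tiny standalone sketch (modern C++, obviously not the original assembly) that mimics the 16-bit decrement:

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        uint16_t l = 0; // st_length == 0: searching for the null string
        --l;            // 16-bit underflow: wraps around to 65535
        // prints 65535; the loop behaves as if the string were
        // 65536 characters long
        std::printf("%u\n", static_cast<unsigned>(l));
        return 0;
    }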

    This bug couldn't be fixed, because by the time you got around to trying, there were already people who discovered this behavior and wrote batch files that relied on it. The bug became a feature.

    The integer underflow was eventually fixed, but the code explicitly treats null strings as never matching, in order to preserve the existing behavior.

    Exercise: Why is the loop label called lop instead of loop?

  • The Old New Thing

    Starting up inside the box

    • 41 Comments

    The shell team received two customer questions about a month apart which seemed unrelated but turned out to have the same root cause.

    I found that in Windows Vista, the xcopy command is ten times slower than it was in Windows XP. What is the source of this slowdown, and how can I fix it?
    We have an application which takes much longer to start up on Windows Vista than it did on Windows XP. We noticed that the slowdown occurs only if we set the application to autostart.

    Let's look at the second one first, since that customer provided a useful piece of information: The slowdown occurs only if they set the program to run automatically at logon. In Windows Vista, programs which are set to run automatically at logon run with reduced priority. This was done in response to application developers who went angling for a bonus and were willing to slow down the operating system overall in order to get their own program to start up faster. To counteract this tragedy of the commons, the performance team runs these programs inside a job object with reduced CPU, I/O, and paging priority—a technique the team informally calls boxing—for 60 seconds, so that the user isn't forced to sit and wait for all these startup programs to finish doing whatever "really important" stuff they want to do.

    Okay, back to the first customer, the one who reported that xcopy was taking a long time. It took a bit of back-and-forth, but eventually the customer revealed that they were performing the xcopy in a batch file which they placed in the Startup group. Once they volunteered that information, the reason for the slowdown became obvious: Their batch file was running inside the box, and consequently ran with low-priority I/O.

    There is no way to escape the box, but it so happens that logon-triggered scheduled tasks are not placed inside a box. That's your escape hatch. Don't abuse it. (Of course, now that I've told everybody how to avoid being put in a box, everybody will now take advantage of it, because eventually, nothing is special any more.)

    Oh, and if you look more closely at the Delay_Sec setting on a Windows 7 machine, you'll see that it's set to zero, so the boxing behavior is effectively disabled on Windows 7. I guess the performance team gave up. "Fine, if you want your computer to run like a dog when it starts up, then go right ahead. I won't try to save you from yourself any more."

    Bonus chatter: You can explicitly "put yourself inside a box" by using the PROCESS_MODE_BACKGROUND_BEGIN process priority mode. Programs which are intended to run in the background with minimal impact on the rest of the system can use this mode.
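
    Here's a minimal sketch of a program boxing itself (error handling trimmed; the work in the middle is a placeholder):

    #include <windows.h>

    int wmain()
    {
        // Enter background mode: lowers CPU, I/O, and memory priority
        // for the current process until we explicitly leave.
        if (!SetPriorityClass(GetCurrentProcess(),
                              PROCESS_MODE_BACKGROUND_BEGIN)) {
            return 1; // fails if we're already in background mode
        }

        // ... do the low-impact bulk work here ...

        // Leave background mode once the bulk work is done.
        SetPriorityClass(GetCurrentProcess(), PROCESS_MODE_BACKGROUND_END);
        return 0;
    }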

  • The Old New Thing

    Why does the Shift+F10 menu differ from the right-click menu?

    • 35 Comments

    The Shift+F10 key is a keyboard shortcut for calling up the context menu on the selected item. But if you look closely, you might discover that the right-click menu and the Shift+F10 menu differ in subtle ways. Shouldn't they be the same? After all, that's the point of a keyboard shortcut, right?

    Let's set aside the possibility that a program might be intentionally making them different, in violation of UI guidelines. For example, a poorly-designed program might use the WM_RBUTTONUP message as the trigger to display the context menu instead of using the WM_CONTEXTMENU message, in which case Shift+F10 won't do anything at all. Or the poorly-designed program may specifically detect that the WM_CONTEXTMENU message was generated from the keyboard and choose to display a different menu. (This is on top of the common error of forgetting to display a keyboard-invoked context menu at the currently selected item.) If somebody intentionally makes them different, then they'll be different.

    Okay, so the program is not intentionally creating a distinction between mouse-initiated and keyboard-initiated context menus. Shift+F10 and right-click both generate the WM_CONTEXTMENU message, and therefore the same menu-displaying code is invoked. The subtle difference is that when you press Shift+F10, the shift key is down, and as we all know, holding the shift key while calling up a context menu is a Windows convention for requesting the extended context menu rather than the normal context menu.

    You get a different menu not because the program is going out of its way to show you a different menu, but because the use of the shift key accidentally triggers the extended behavior. It's like why when you look at yourself in the mirror, your eyes are always open, or why when you call your own phone number, the line is always busy. To avoid this, use the Menu key (confusingly given the virtual key name VK_APPS) to call up the context menu. (This is the key that has a picture of a menu on it, usually to the right of your space bar.) When you press that key, the code which decides whether to show a normal or extended context menu will see that the shift key is not held down, and it'll go for the normal context menu.
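
    To make the convention concrete, here's a sketch (assumed code, not the shell's actual implementation) of a WM_CONTEXTMENU handler that asks a shell context menu for extended verbs when the shift key is down:

    #include <windows.h>
    #include <shlobj.h>

    // pcm is an IContextMenu obtained elsewhere; pt is where to show the menu.
    void ShowItemContextMenu(HWND hwnd, IContextMenu *pcm, POINT pt)
    {
        UINT flags = CMF_NORMAL;
        if (GetKeyState(VK_SHIFT) < 0) {
            flags |= CMF_EXTENDEDVERBS; // shift held: extended menu, please
        }

        HMENU hmenu = CreatePopupMenu();
        if (hmenu &&
            SUCCEEDED(pcm->QueryContextMenu(hmenu, 0, 1, 0x7FFF, flags))) {
            int cmd = TrackPopupMenuEx(hmenu, TPM_RETURNCMD,
                                       pt.x, pt.y, hwnd, NULL);
            if (cmd > 0) {
                CMINVOKECOMMANDINFO ici = { sizeof(ici) };
                ici.hwnd = hwnd;
                ici.lpVerb = MAKEINTRESOURCEA(cmd - 1); // offset from idCmdFirst
                ici.nShow = SW_SHOWNORMAL;
                pcm->InvokeCommand(&ici);
            }
        }
        if (hmenu) DestroyMenu(hmenu);
    }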

    Of course, you can also press Shift+AppMenu, but then you'll have come full circle.

  • The Old New Thing

    Why can you set each monitor to a different color depth?

    • 31 Comments

    Random832 seemed horrified by the fact that it is possible to run multiple monitors, with different color formats on each monitor. "Seriously, why does it let you do that?"

    Well, of course you can do that. Why shouldn't it let you do that?

    When multiple monitors were introduced to Windows, video hardware was nowhere near as advanced as it is today. One common discovery was that your computer, which came with a video card in one of the expansion slots, actually had a video chip baked into the motherboard which had been disabled in the BIOS. In other words, your computer was actually multiple-monitor-capable; it's just that the capability was disabled.

    Once you got it enabled, you would discover that the onboard video adapter was not as good as the one in the expansion slot. (Duh. If it were as good as the one in the expansion slot, then the manufacturer would have saved a lot of money and not bothered shipping a video card in the expansion slot!) Usually, the onboard video card didn't have a lot of video RAM. You still want to run it at 1024×768 (hey, that was high resolution in those days), but in order to do that, you need to reduce the color depth. On the other hand, the card in the expansion slot has a huge amount of video RAM (four megabytes!), so you take advantage of it by running at a higher color depth.

    You're now getting the most out of your machine; each video card is cranked up as high as it can go. (The lame-o card doesn't go very high, of course.) What could possibly be wrong with that?

    Bonus chatter: It so happened that some of these "secret video card" motherboards had a feature where they would disable the ROM BIOS on the on-board video card if they detected a plug-in video card. To get multi-monitor support on these recalcitrant machines, one of my colleagues wrote a tool that you used like this:

    • Turn off the computer and remove the plug-in video card.
    • Boot the computer with only the lame-o on-board video.
    • Run the tool, which captures the ROM BIOS.
    • Turn off the computer, put the plug-in video card back in, and boot the computer again.
    • Run the tool again and tell it "Hey, take that ROM BIOS you captured and map it into memory where a ROM BIOS would normally go, and then initialize the video card from it and add it as my second monitor."

    It was a bit of a hassle, but it let you squeak crappy multi-monitor support out of these lame-o motherboards.

  • The Old New Thing

    Why doesn't the Open Files list in the Shared Folders snap-in show all my open files?

    • 30 Comments

    A customer wanted a way to determine which users were using specific files on their server. They fired up the Shared Folders MMC snap-in and went to the Open Files list. They found that the results were inconsistent. Some file types like .exe and .pdf did show up in the list when they were open, but other file types like .txt did not. The customer asked for an explanation of the inconsistency and for a list of which file types work and which ones don't.

    The customer is confusing two senses of the term open file. From the file system point of view, an open file is one that has an outstanding handle reference. This is different from the user interface concept of "There is an open window on my screen showing the contents of the file."

    The Open Files list shows files which are open in the file system sense, not in the user interface sense.

    Whether a file shows up in the Open Files list depends on the application that is used to open the file (in the user interface sense). Text files are typically opened by Notepad, and Notepad reads the entire contents of the file into memory and closes the file handle. Therefore, the file is open (in the file system sense) only when it is in the process of being loaded or saved.
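
    Here's a sketch of that "load it all, then close the handle" pattern (illustrative, not Notepad's actual code); notice that the handle is open only inside this function:

    #include <windows.h>
    #include <vector>

    std::vector<char> LoadWholeFile(const wchar_t *path)
    {
        std::vector<char> contents;
        HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) return contents;

        LARGE_INTEGER size;
        if (GetFileSizeEx(h, &size) && size.QuadPart > 0) {
            // Single read for simplicity; assumes the file fits in 4GB.
            contents.resize(static_cast<size_t>(size.QuadPart));
            DWORD read = 0;
            ReadFile(h, contents.data(), static_cast<DWORD>(contents.size()),
                     &read, NULL);
            contents.resize(read);
        }
        CloseHandle(h); // from here on, the file is no longer "open"
                        // in the file system sense
        return contents;
    }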

    There is no comprehensive list of which types of files fall into which category because the behavior is not a function of the file type but rather a function of the application being used to view the file. (If you open a .txt file in Word, I believe it will keep the file system handle open until you close the document window.)

    The customer seemed satisfied with the explanation. They ran some experiments and observed, "Hey, check it out, if I load a really big text file into Notepad, I can see it show up in the Open Files list momentarily." They never did come back with any follow-up questions, so I don't know how they went about solving the original problem. (Maybe they used a SACL to audit who was opening the files.)

  • The Old New Thing

    Microspeak: Dogfood

    • 28 Comments

    Everybody knows about the Microspeak term dogfood. It refers to the practice of taking the product you are working on and using it in production.¹ For the Windows team, it means installing a recent build of Windows on your own computer as well as onto a heavily-used file server. For the Office team, it means using a recent build of Office for all your documents. For the Exchange team, it means moving the entire product team to a server running a recent build of Exchange. You get the idea.

    Purists would restrict the use of the word dogfood to refer to a product group using its own product, but in practice the meaning has been generalized to encompass using a prerelease product in a production environment. The Windows team frequently dogfoods recent builds of Office and Exchange Server. Actually, the Exchange Server case is one of double-dogfood,² for not only is the server running a prerelease version of Exchange Server, it's doing so atop a prerelease version of Windows!

    Dogfooding does have its costs. For example, the prerelease version of Exchange Server might uncover a bug in the prerelease version of Windows. While the problem is investigated, the Windows division can't send email. These outages are comparatively rare, although they are quite frustrating when they occur. But you have to understand that the whole purpose of dogfooding is to find exactly these sorts of problems so that our customers won't!

    Footnotes

    ¹ Despite the efforts of our CIO, the term ice-creaming has not caught on.

    ² I made up the term "double-dogfood" just now. It is not part of Microspeak.

  • The Old New Thing

    You don't make something easier to find by hiding it even more deeply

    • 28 Comments

    Commenter rolfhub suggested that, to help people recover from accidentally going into Tiny Footprint Mode, the Task Manager could display a right-click context menu with an entry to return to normal mode.

    My initial reaction to this was "Huh? Who right-clicks on nothing?" Tiny Footprint Mode is itself already a bad secret hidden setting. Having the exit from the mode be a right-click menu on a blank space is a double-secret hidden setting.

    If I had dictatorial control over all aspects of the shell, I would put a Restore button in the upper right corner to let people return to normal mode.

  • The Old New Thing

    The ways people mess up IUnknown::QueryInterface, episode 4

    • 27 Comments

    One of the rules for IUnknown::QueryInterface is so obvious that nobody even bothers to state it explicitly as a rule: "If somebody asks you for an interface, and you return S_OK, then the pointer you return must point to the interface the caller requested." (This feels like the software version of dumb warning labels.)

    During compatibility testing for Windows Vista, we found a shell extension that behaved rather strangely. Eventually, the problem was traced to a broken IUnknown::QueryInterface implementation which depended subtly on the order in which interfaces were queried.

    The shell asked for the IExtractIconA and IExtractIconW interfaces in the following order:

    // not the actual code but it gets the point across
    IExtractIconA *pxia;
    IExtractIconW *pxiw;
    punk->QueryInterface(IID_IExtractIconA, &pxia);
    punk->QueryInterface(IID_IExtractIconW, &pxiw);
    

    One particular shell extension would return the same pointer to both queries; i.e., after the above code executed, pxia == pxiw even though neither interface derived from the other. The two interfaces are not binary-compatible, because IExtractIconA::GetIconLocation operates on ANSI strings, whereas IExtractIconW::GetIconLocation operates on Unicode strings.

    The shell called pxiw->GetIconLocation, but the object interpreted the szIconFile parameter as an ANSI string buffer; as a result, when the shell went to look at it, it saw gibberish.

    Further experimentation revealed that if the order of the two QueryInterface calls were reversed, then pxiw->GetIconLocation worked as expected. In other words, the first interface you requested "locked" the object into that interface, and asking for any other interface just returned a pointer to the locked interface. This struck me as very odd; coding up the object this way seems to be harder than doing it the right way!

    // this code is wrong - see discussion above
    class CShellExtension : public IExtractIcon
    {
     enum { MODE_UNKNOWN, MODE_ANSI, MODE_UNICODE };
     UINT m_mode; // set to MODE_UNKNOWN in the constructor (not shown)
      HRESULT QueryInterface(REFIID riid, void **ppv)
      {
       *ppv = NULL;
       if (riid == IID_IUnknown) *ppv = this;
       else if (riid == IID_IExtractIconA) {
        if (m_mode == MODE_UNKNOWN) m_mode = MODE_ANSI;
        *ppv = this;
       } else if (riid == IID_IExtractIconW) {
        if (m_mode == MODE_UNKNOWN) m_mode = MODE_UNICODE;
        *ppv = this;
       }
       if (*ppv) AddRef();
       return *ppv ? S_OK : E_NOINTERFACE;
      }
      ... AddRef / Release ...
    
      HRESULT GetIconLocation(UINT uFlags, LPTSTR szIconFile, UINT cchMax,
                              int *piIndex, UINT *pwFlags)
      {
       if (m_mode == MODE_ANSI) lstrcpynA((char*)szIconFile, "foo", cchMax);
       else lstrcpynW((WCHAR*)szIconFile, L"foo", cchMax);
       ...
      }
      ...
    };
    

    Instead of implementing both IExtractIconA and IExtractIconW, my guess is that they implemented just one of the interfaces and made it alter its behavior based on which interface it thinks it needs to pretend to be. It never occurred to them that the single interface might need to pretend to be two different things at the same time.

    The right way of supporting two interfaces is to actually implement two interfaces and not write a single morphing interface.

    class CShellExtension : public IExtractIconA, public IExtractIconW
    {
      HRESULT QueryInterface(REFIID riid, void **ppv)
      {
       *ppv = NULL;
       if (riid == IID_IUnknown ||
           riid == IID_IExtractIconA) {
        *ppv = static_cast<IExtractIconA*>(this);
       } else if (riid == IID_IExtractIconW) {
        *ppv = static_cast<IExtractIconW*>(this);
       }
       if (*ppv) AddRef();
       return *ppv ? S_OK : E_NOINTERFACE;
      }
      ... AddRef / Release ...
    
      HRESULT GetIconLocation(UINT uFlags, LPSTR szIconFile, UINT cchMax,
                              int *piIndex, UINT *pwFlags)
      {
       lstrcpynA(szIconFile, "foo", cchMax);
       return GetIconLocationCommon(uFlags, piIndex, pwFlags);
      }
    
      HRESULT GetIconLocation(UINT uFlags, LPWSTR szIconFile, UINT cchMax,
                              int *piIndex, UINT *pwFlags)
      {
       lstrcpynW(szIconFile, L"foo", cchMax);
       return GetIconLocationCommon(uFlags, piIndex, pwFlags);
      }
      ...
    };
    

    We worked around this in the shell by simply changing the order in which we perform the calls to IUnknown::QueryInterface and adding a comment explaining why the order of the calls is important.

    (This is another example of how the cost of a compatibility fix is small potatoes. The cost of deciding whether or not to apply the fix far exceeds the cost of just doing it for everybody.)

    A different shell extension had a compatibility problem that also was traced back to a dependency on the order in which the shell asked for interfaces. The shell extension registered as a context menu extension, but when the shell tried to create it, it got E_NOINTERFACE back:

    CoCreateInstance(CLSID_YourAwesomeExtension, NULL,
                     CLSCTX_INPROC_SERVER, IID_IContextMenu, &pcm);
    // returns E_NOINTERFACE?
    

    This was kind of bizarre. I mean, the shell extension went to the effort of registering itself as a context menu extension, but when the shell said, "Okay, it's show time, let's do the context menu dance!" it replied, "Sorry, I don't do that."

    The vendor explained that the shell extension relies on the order in which the shell asks for interfaces. The shell used to create and initialize the extension like this:

    // error checking and other random bookkeeping removed
    IShellExtInit *psei;
    IContextMenu *pcm;
    
    CoCreateInstance(CLSID_YourAwesomeExtension, NULL,
                     CLSCTX_INPROC_SERVER, IID_IShellExtInit, &psei);
    psei->Initialize(...);
    psei->QueryInterface(IID_IContextMenu, &pcm);
    psei->Release();
    // use pcm
    

    We changed the order in a manner that you would think should be equivalent:

    CoCreateInstance(CLSID_YourAwesomeExtension, NULL,
                     CLSCTX_INPROC_SERVER, IID_IContextMenu, &pcm);
    pcm->QueryInterface(IID_IShellExtInit, &psei);
    psei->Initialize(...);
    psei->Release();
    

    (Of course, it's not written in so many words in the code; the various parts are spread out over different components and helper functions, but this is the sequence of calls the shell extension sees.)

    The vendor explained that their shell extension will not respond to any shell extension interfaces (aside from IShellExtInit) until it has been initialized, because it is at that point that they decide which extensions they want to support. Unfortunately, this violates the first of the four explicit rules for IUnknown::QueryInterface, namely that the set of interfaces must be static. (It's okay to have an object expose different interfaces conditionally, as long as it understands that once it says yes or no to a particular interface, it is committed to answering the same way for the lifetime of the object.)
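
    For contrast, here's a sketch (hypothetical code, not the vendor's) of conditional interfaces done legally: the object commits to its answers when it is created, so QueryInterface answers consistently for the object's entire lifetime.

    class CConditionalExtension : public IShellExtInit, public IContextMenu
    {
    public:
      // The decision is made once, at creation, and never changes.
      CConditionalExtension(bool supportsContextMenu)
       : m_supportsContextMenu(supportsContextMenu) {}

      HRESULT QueryInterface(REFIID riid, void **ppv)
      {
       *ppv = NULL;
       if (riid == IID_IUnknown || riid == IID_IShellExtInit) {
        *ppv = static_cast<IShellExtInit*>(this);
       } else if (riid == IID_IContextMenu && m_supportsContextMenu) {
        *ppv = static_cast<IContextMenu*>(this);
       }
       if (*ppv) AddRef();
       return *ppv ? S_OK : E_NOINTERFACE;
      }
      ... AddRef / Release / IShellExtInit / IContextMenu methods ...

    private:
      bool m_supportsContextMenu;
    };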

  • The Old New Thing

    A shell extension is a guest in someone else's house; don't go changing the code page

    • 26 Comments

    A customer reported a problem with their shell extension:

    We want to format a floating point number according to the user's default locale. We do this by calling snprintf to convert the value from floating point to text with a period (U+002E) as the decimal separator, then using GetNumberFormat to apply the user's preferred grouping character, decimal separator, etc. We found, however, that if the user is running in (say) German, the snprintf function sometimes (but not always) follows the German locale and uses a comma (U+002C) as the decimal separator with no thousands separator. This format prevents the GetNumberFormat function from working, since it requires the decimal separator to be U+002E. What is the recommended way of formatting a floating point number according to the user's locale?

    The recommended way of formatting a floating point number according to the user's locale is indeed to use a function like snprintf to convert it to text with U+002E as the decimal separator (and other criteria), then use GetNumberFormat to apply the user's locale preferences.

    The snprintf function follows the C/C++ runtime locale to determine how the floating point number should be converted, and the default C runtime locale is the so-called "C" locale which indeed uses U+002E as the decimal separator. Since you're getting U+002C as the decimal separator, somebody must have called setlocale to change the locale from "C" to a German locale, most likely by passing "" as the locale, which means "follow the locale of the environment."

    Our shell extension is running in Explorer. Under what conditions will Explorer call setlocale(LC_NUMERIC, "")? What should we do if the locale is not "C"?

    As it happens, Explorer never calls setlocale. It leaves the locale set to the default value of "C". Therefore, the call to snprintf should have generated a string with U+002E as the decimal separator. Determining who was calling setlocale was tricky since the problem was intermittent, but after a lot of work, we found the culprit: some other shell extension loaded before the customer's shell extension and decided to change the carpet by calling setlocale(LC_ALL, "") in its DLL_PROCESS_ATTACH, presumably so that its calls to snprintf would follow the environment locale. What made catching the miscreant more difficult was that the rogue shell extension didn't restore the locale when it was unloaded (not that that would have been the correct thing to do either), so by the time the bad locale was detected, the culprit was long gone!

    That other DLL used a global setting to solve a local problem. Given the problem "How do I get my calls to snprintf to use the German locale settings?" they decided to change all calls to snprintf to use the German locale settings, even the calls that didn't originate from the DLL itself. What if the program hosting the shell extension had done a setlocale(LC_ALL, "French")? Tough noogies; the rogue DLL just screwed up the host program, which wants to use French locale settings but is now being forced to use German ones. The program probably won't notice that somebody secretly replaced its coffee with Folgers Crystals. It'll be a client who notices that the results are not formatted correctly. The developers of the host program, of course, won't be able to reproduce the problem in their labs, since they don't have the rogue shell extension, and the problem will be classified as "unsolved."

    What both the rogue shell extension and the original customer's shell extension should be using is the _l variety of string formatting functions (in this case _snprintf_l, although _snprintf_s_l is probably better). The _l variety lets you pass an explicit locale which will be used to format that particular string. (You create one of these _locale_t objects by calling _create_locale with the same parameters you would have passed to setlocale.) Using the _l technique solves two problems:

    1. It lets you apply a local solution to a local problem. The locale you specify applies only to the specific call; the process's default locale remains unchanged.
    2. It allows you to ensure that you get the locale you want even if the host process has set a different locale.

    If either the customer's DLL or the rogue DLL had followed this principle of not using a global setting to solve a local problem, the conflict would not have arisen.
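
    Putting the pieces together, here's a minimal sketch of the recommended pattern (the function name is made up, and buffer sizes and error handling are simplified for illustration):

    #include <windows.h>
    #include <locale.h>
    #include <stdio.h>

    void FormatDoubleForUser(double value, char *result, int cchResult)
    {
        // A per-call locale object; the process-wide locale is untouched.
        _locale_t cLocale = _create_locale(LC_NUMERIC, "C");

        // Format with U+002E as the decimal separator, no matter what
        // locale the host process (or a rogue DLL) has set.
        char raw[64];
        _snprintf_s_l(raw, sizeof(raw), _TRUNCATE, "%f", cLocale, value);
        _free_locale(cLocale);

        // Now let the system apply the user's preferred decimal separator,
        // grouping character, and so on.
        GetNumberFormatA(LOCALE_USER_DEFAULT, 0, raw, NULL, result, cchResult);
    }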

  • The Old New Thing

    Ow, I'm too safe!

    • 26 Comments

    One of my friends is a geek, and, naturally, fully researches everything he does, from cement pouring to bicycle parts, perhaps a bit obsessively. He made sure to get five-point restraints for his children's car seats, for example. And he naturally tightens the belts snugly when putting his children in the car.

    At one point, as he was strapping his daughter in, she complained, "Ow! I'm too safe!"

    Because as far as she was concerned, "being safe" was a synonym for "having a tight seat belt." I leave you to figure out how she came to this conclusion.
