July, 2012

  • The Old New Thing

    Reading the output of a command into a batch file variable

    • 28 Comments

    It's Day Two of Batch File Week. Don't worry, it'll be over in a few days.

    There is no obvious way to read the output of a command into a batch file variable. In unix-style shells, this is done via backquoting.

    x=`somecommand`
    

    The Windows command processor does not have direct backquoting, but you can fake it by abusing the FOR command. Here's the evolution:

    The /F flag to the FOR command says that it should open the file you pass in parentheses and set the loop variable to the contents of each line.

    for /f %%i in (words.txt) do echo [%%i]
    

    The loop variable in the FOR command takes one percent sign if you are executing it directly from the command prompt, but two percent signs if you are executing it from a batch file. I'm going to assume you're writing a batch file, so if you want to practice from the command line, remember to collapse the double percent signs to singles.

    I'm cheating here because I know that words.txt contains one word per line. By default, the FOR command sets the loop variable to the first word of each line. If you want to capture the entire line, you need to change the delimiter.

    for /f "delims=" %%i in (names.txt) do echo [%%i]
    

    There are other options for capturing, say, the first and third word or whatever. See the FOR /? online help for details.

    Now, parsing files is not what we want, but it's closer. You can put the file name in single quotes to say "Instead of opening this file and reading the contents, I want you to run this command and read its output." For example, suppose you have a program called printappdir which outputs a directory, and you want a batch file that changes to that directory.

    for /f "delims=" %%i in ('printappdir') do cd "%%i"
    

    We ask the FOR command to run the printappdir program and execute the command cd "%%i" for each line of output. Since the program has only one line of output, the loop executes only once, and the result is that the directory is changed to the path that the printappdir program prints.

    If you want to capture the output into a variable, just update the action:

    for /f "delims=" %%i in ('printappdir') do set RESULT=%%i
    echo The directory is %RESULT%
    

    If the command has multiple lines of output, then this will end up saving only the last line, since previous lines get overwritten by subsequent iterations.

    But what if the line you want to save isn't the last line? Or what if you don't want the entire line?

    If the command has multiple lines of output and you're interested only in a particular one, you can filter it in the FOR command itself...

    for /f "tokens=1-2,14" %%i in ('ipconfig') do ^
        if "%%i %%j"=="IPv4 Address." set IPADDR=%%k
    

    The above command runs the ipconfig command and extracts the first, second, and fourteenth words into loop variables starting with %%i. In other words, %%i gets the first word, %%j gets the second word, and %%k gets the fourteenth word. (Exercise: What if you want to extract more than 26 words?)

    The loop then checks each line to see if it begins with "IPv4 Address.", and if so, it saves the fourteenth word (the IP address itself) into the IPADDR variable.

    How did I know that the IP address was the fourteenth word? I counted!

       IPv4 Address. . . . . . . . . . . : 192.168.1.1
       ---- -------- - - - - - - - - - - - -----------
         1      2    3 4 5 6 7 8 9  11  13      14
                                  10  12
    

    That's also why my test includes the period after Address: The first dot comes right after the word Address without an intervening space, so it's considered part of the second "word".

    Somebody thought having the eye-catching dots would look pretty, but didn't think about how it makes parsing a real pain in the butt. (Note also that the above script works only for US-English systems, since the phrase IPv4 Address will change based on your current language.)

    Instead of doing the searching yourself, you can have another program do the filtering, which is important if the parsing you want is beyond the command prompt's abilities.

    for /f "tokens=14" %%i in ('ipconfig ^| findstr /C:"IPv4 Address"') do ^
      set IPADDR=%%i
    

    This alternate version makes the findstr program do the heavy lifting, and then saves the fourteenth word. (But this version will get fooled by the line Autoconfiguration IPv4 Address.)

    Yes, I know that you can do this in PowerShell:

    foreach ($i in Get-WmiObject Win32_NetworkAdapterConfiguration) {
      if ($i.IPaddress) { $i.IPaddress[0] }
    }
    

    You're kind of missing the point of Batch File Week.

  • The Old New Thing

    Raymond's subjective, unfair, and completely wrong impressions of the opening ceremonies of a major athletic event which took place recently

    • 41 Comments

    Like many other people, I watched the opening ceremonies of a major athletic event which took place a few days ago. (The organization responsible for the event has taken the step of blocking the mention of the name of the city hosting the event and the year the event takes place, or the name of the event itself except in editorial news pieces or journalistic statements of fact, of which this is neither, so I will endeavour to steer clear of the protected marks.)

    I wish somebody had let me know in advance that the opening ceremonies came with a reading list. I hope that at least the British history majors enjoyed it.

    NBC, the media organization which obtained the rights to broadcast the event in the United States, explained that they were not streaming the opening or closing ceremonies live because they "do not translate well online because they require context, which our award-winning production team will provide." And now we learned what sort of contextualization their award-winning production team provided: For Tim Berners-Lee, their valuable context was, "I don't know who that guy is." (The Guardian provides a snarky backgrounder.)

    During the entry of the various national teams, the standard activity is to make fun of their outfits.

    Dear Czech Republic: Spandex shorts and blue rain galoshes? It's as if you're trying to look hideous.

    Dear Germany: Wha??? I'm speechless.

    Dear United States of America: I hope you enjoy your shore leave. (Somebody seriously has a navy fetish going on.)

    Dear Sweden: I know it's late, but you're not supposed to wear your jammy-jams to the opening ceremony. Jag säger bara...

    Dear gracious hosts: Oh, now I get it. It's the 100th anniversary of the sinking of the Titanic. But still, you could've chosen a better tribute than wearing dresses from 1912.

  • The Old New Thing

    Why don't any commands work after I run my batch file? I'm told that they are not recognized as an internal or external command, operable program, or batch file.

    • 30 Comments

    I sort of forgot to celebrate CLR Week last year, so let's say that CLR Week is "on hiatus" until next year. To fill the summertime time slot, I'm going to burn off a busted pilot: This week is Batch File Week 2012. Remember, nobody actually enjoys batch programming. It's just something you have to put up with in order to get something done. Batch programming is the COBOL of Windows. (Who knows, if people actually like Batch File Week [fat chance], maybe it'll come back as a regular series.)

    We'll open Batch File Week with a simple puzzle.

    A customer reported that after running their batch file, almost no commands worked any more!

    C:\> awesomebatchfile.bat
    ... awesome batch file does its work ...
    
    C:\> reg query "HKLM\Software\Clients\Mail" /ve
    'reg' is not recognized as an internal or external command,
    operable program or batch file.
    

    Wha? Maybe I can run regedit.

    C:\> regedit
    'regedit' is not recognized as an internal or external command,
    operable program or batch file.
    

    OMG OMG OMG OMG.

    C:\> notepad
    'notepad' is not recognized as an internal or external command,
    operable program or batch file.
    

    Okay, first, sit down and take a deep breath. Maybe take a Chill Pill.

    My first question was "Does awesomebatchfile.bat modify the PATH variable?" (This was, strictly speaking, a psychic debugging question, but a rather obvious one.)

    The customer replied, "Nope. Here, I'll send you the whole thing."

    And there it was, right there at the top of awesomebatchfile.bat:

    set path=C:\awesomedir
    if NOT "%1"=="" set path=%1
    cd /d %path%
    echo Awesomeness commencing in the %path% directory!
    ...
    

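    The customer figured it would be convenient to have a variable called path, unaware that this variable has special meaning to the command interpreter. The customer didn't make the connection that their seemingly private variable called path was the very same variable as the system variable of that name (by convention capitalized as PATH), since environment variable names are not case-sensitive. Once the batch file replaced its value with a single directory, the command interpreter no longer had anywhere to look for reg, regedit, or notepad.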

  • The Old New Thing

    Psychic debugging: Why your IContextMenu::InvokeCommand never gets called

    • 12 Comments

    A customer reported a problem with their shell context menu extension.

    I have implemented the IContextMenu shell extension, but when the user selects my custom menu item, my IContextMenu::InvokeCommand is never called. Can anyone please let me know what the problem could be and how to fix it?

    Since there really isn't much information provided in this request, I was forced to invoke my psychic powers. Actually, given what you know about shell context menu hosting, you probably know the answer too.

    My psychic powers tell me that you gave your menu item the wrong ID, or you returned the wrong value from IContextMenu::QueryContextMenu.

    If the menu IDs do not lie in the range you described by the return value from IContextMenu::QueryContextMenu, then when the user chooses the menu item, the item ID will not map to your shell extension. In our sample composite context menu, observe that CCompositeContextMenu::ReduceOrdinal relies on the component context menu handlers putting their menu IDs in the range idCmdFirst through idCmdFirst + return_value - 1. If the two don't line up, then CCompositeContextMenu::ReduceOrdinal won't realize that the menu item the user selected corresponds to you.

    We never did hear back from the customer, so the world may never know whether my psychic prediction was correct.

    Bonus chatter: When possible, use a static verb registration instead of an IContextMenu handler. They are much simpler to implement while still providing a good amount of expressive power.

    You can provide additional information in your registration to control things like the conditions under which your verb should be shown. You can even register cascading submenus statically.

  • The Old New Thing

    A brief and also incomplete history of Windows localization

    • 32 Comments

    The process by which Windows has been localized has changed over the years.

    Back in the days of 16-bit Windows, Windows was developed with a single target language: English.

    Just English.

    After Windows was complete and masters were sent off to the factory for duplication, the development team handed the source code over to the localization teams. "Hey, by the way, we shipped a new version of Windows. Go localize it, will ya?"

    While the code that was written for the English version was careful to put localizable content in resources, there were often English-specific assumptions hard-coded into the source code. For example, it may have assumed that the text reading direction was left-to-right or assumed that a single character fit in a single byte. (Unicode hadn't been invented yet.)

    The first assumption is not true for languages such as Hebrew and Arabic (which read right-to-left), and to a lesser degree Chinese and Japanese (which read top-to-bottom in certain contexts). The second assumption is not true for languages like Chinese, Japanese, and Korean, which use DBCS (double-byte character sets).

    The localization teams made the necessary code changes to make Windows work in these other locales and merged them back into the master code base. The result was that there were three different versions of the code for Windows, commonly known as Western, Middle-East, and Far-East. If you wanted Windows to support Chinese, you had to buy the Far-East version of Windows. And since the code was different for the three versions, they had different sets of bugs, and workarounds for one version didn't always work on the others. (Patches didn't exist back then, there being no mechanism for distributing them.)

    If you ran into a problem with a Western language, like say, German, then you were out of luck, since there was no German Windows code base; it used the same Western code base. Windows 95 tried out a crazy idea: Translate Windows into German during the development cycle, to help catch these only-on-German problems while there was still time to do something about it. This, of course, created significant additional expense, since you had to have translators available throughout the product cycle instead of hiring them just once at the end. I remember catching a few translation errors during Windows 95: A menu item Sort was translated as Art (as in "What sort of person would do this?") rather than Sortieren ("put in a prearranged order"). And a command line tool asked the user a yes/no question, prompting "J/N" (Ja/Nein), but if you wanted to answer in the affirmative, you had to type "Y".

    The short version of the answer to the question "Why can't the localizers change the code if they have to?" is "Because the code already shipped. What are you going to do, recall every copy of Windows?"

    At least in Windows 95, the prohibition on changing code was violated if circumstances truly demanded it, but doing so was very expensive. The only one I can think of is the change to remove the time zone highlighting from the world map. And the change was done in the least intrusive possible way: Patching four bytes in the binary to make the highlight and not-highlight colors the same. You dare not do something like introduce a new variable; who knows what kinds of havoc could result!

    Having all these different versions of Windows made servicing very difficult, because you had to develop and test a different patch for each code base. Over the years, the Windows team has developed techniques for identifying these potential localization problems earlier in the development cycle. For a time, Windows was "early-localized" into German and Japanese, so as to cover the Western and Far-East scenarios. Arabic was added later, expanding coverage to the Mid-East cases, and Hindi was added in Windows 7 to cover languages which are Unicode-only.

    Translating each internal build of Windows has its pros and cons: The advantage is that it can find issues when there is still time to make code changes to address them. The disadvantage is that code can change while you are localizing, and those code changes can invalidate the work you've done so far, or render it pointless. For example, somebody might edit a dialog you already spent time translating, forcing you to go back and re-translate it, or at least verify that the old translation still works. Somebody might take a string that you translated and start using it in a new way. Unless they let you know about the new purpose, you won't know that the translation needs to be re-evaluated and possibly revised.

    The localization folks came up with a clever solution which gets most of the benefits while avoiding most of the drawbacks: They invented pseudo-localization, which simulates what Michael Kaplan calls "an eager and hardworking yet naïve intern localizer who is going to translate every single string." This was so successful that they hired a few more naïve intern localizers: one which performed "Mirrored pseudo-localization" (covering languages which read right-to-left) and another which performed "East Asian pseudo-localization" (covering Chinese, Japanese, and Korean).

    But the rule prohibiting code changes remains in effect. Changing any code resets escrow, which means that the ship countdown clock gets reset back to its original value and all the testing performed up until that point needs to be redone in order to verify that the change did not affect them.

  • The Old New Thing

    One way to make sure you pass an array of the correct size

    • 61 Comments

    Another entry in the very sporadic series of "very strange code I've seen." The code has been changed to protect the guilty, but the essence has been preserved.

    class Store
    {
    public:
        // Retrieve "count" colors from item "itemId" into "values"
        bool GetItemColors(int itemId, int count, COLORREF *values);
    
        // Set "count" colors from "values" into item "itemId"
        bool SetItemColors(int itemId, int count, const COLORREF *values);
    };
    
    bool CopyUpToFourColors(Store *store1, Store *store2, int itemId, int count)
    {
        COLORREF size1[1];
        COLORREF size2[2];
        COLORREF size3[3];
        COLORREF size4[4];
    
        COLORREF *buffer = ((count == 1) ? size1 :
                           ((count == 2) ? size2 :
                           ((count == 3) ? size3 :
                           ((count == 4) ? size4 :
                                           nullptr))));
    
        if (buffer == nullptr)
            return false;
    
        if (!store1->GetItemColors(itemId, count, buffer))
            return false;
    
        if (!store2->SetItemColors(itemId, count, buffer))
            return false;
    
        return true;
    }
    
  • The Old New Thing

    Taking flexitarianism to another, perhaps unintended, level

    • 22 Comments

    Our cafeteria has been trying to encourage flexitarianism, which it defines as eating one meat-free meal per week. But in their effort to make the concept more appealing, they may have lost sight of the goal.

    Italian Sausage Calzone
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Vegetarian
    Option

    (The "Vegetarian Option" magnet was probably intended for the Asparagus, Mushroom and Spinach Pizette just above it.)

    One of my colleagues suggested that the sign was applying the transitive property of vegetarianism: "If you eat that which eats plants, you too eat the plants."

    Fool me twice: The following day, the "Vegetarian Option" magnet was placed on the sign for the meatball pizza. Maybe they're trying to make vegetarians sick?

    Resolution: The cafeteria people apologized for the misplaced magnets (which ended up in the wrong place due to the slow but persistent force of gravity). They implemented an immediate short-term solution of simply changing the order of the items on the menu so that the vegetarian option is at the bottom. (That way, a sliding magnet still ends up in the right place.) The long-term solution is to print the "Vegetarian Option" marker on the menu itself.

  • The Old New Thing

    What's the story behind the WM_SYNCPAINT message?

    • 11 Comments

    Danail wants to know the story behind the WM_SYNCPAINT message.

    The documentation pretty much tells the story. When a window has been hidden, shown, moved or sized, the system may determine that it needs to send a WM_SYNCPAINT message to the windows of other threads. This message must be passed to DefWindowProc, which will send the WM_NCPAINT and WM_ERASEBKGND messages to the window as necessary.

    When you call the SetWindowPos function, the window manager updates the window size, position, whatever, and then it goes around repainting the windows that were affected by the operation. By default, the SetWindowPos function does a quick-repaint of the windows before returning. After the function returns, the normal WM_PAINT message does the real work of painting the window. The quick-repaint is done so that there is immediate feedback that the window did change its size, position, whatever.

    This quick-repaint is done by sending a WM_NCPAINT and WM_ERASEBKGND message to the windows that were affected by the SetWindowPos operation. This normally happens without incident, but if one of the windows affected by the SetWindowPos operation belongs to another thread, the window manager needs to get into the context of that other thread to finish the job. That's where WM_SYNCPAINT comes in. The WM_SYNCPAINT message means, "Hey, I was going around quick-painting a bunch of windows, but I couldn't quick-paint you (or any other windows on your thread) because I was on the wrong thread. Could you finish quick-painting yourself (and all the other windows that need quick-painting)? Thanks."

    Another way of looking at this is that it is a way for the window manager to teleport itself into another thread so it can finish its work. "Lah di dah, quick-painting all the windows, oh crap, I can't quick-paint that window because it's on the wrong thread. Let me inject myself into that other process [trivial, since I'm the window manager, I'M IN YR PROCESS REEDING YR MSGS], and now I can send a message to myself [WM_SYNCPAINT], and when that other copy of me receives it, he'll finish where I left off."

    If you don't like any of this teleportation or multiple-copies-of-yourself imagery, you can say that the WM_SYNCPAINT message means, "Quick-paint this window as part of a quick-paint operation begun on another thread."

    If you don't want this quick-paint to take place, you can follow the instructions in the documentation and pass the SWP_DEFERERASE flag to suppress the WM_SYNCPAINT message.

  • The Old New Thing

    The format of icon resources

    • 12 Comments

    It's been a long time since my last entry in the continuing sporadic series on resource formats. Today we'll look at icons.

    Recall that an icon file consists of two parts, an icon directory (consisting of an icon directory header followed by a number of icon directory entries), and then the icon images themselves.

    When an icon is stored in resources, each of those parts gets its own resource entry.

    The icon directory (the header plus the directory entries) is stored as a resource of type RT_GROUP_ICON. The format of the icon directory in resources is slightly different from the format on disk:

    #pragma pack(push, 2) // resource data is packed on 2-byte boundaries

    // Declared first so that GRPICONDIR can refer to it.
    typedef struct GRPICONDIRENTRY
    {
        BYTE  bWidth;
        BYTE  bHeight;
        BYTE  bColorCount;
        BYTE  bReserved;
        WORD  wPlanes;
        WORD  wBitCount;
        DWORD dwBytesInRes;
        WORD  nId;
    } GRPICONDIRENTRY;

    typedef struct GRPICONDIR
    {
        WORD idReserved;
        WORD idType;
        WORD idCount;
        GRPICONDIRENTRY idEntries[];
    } GRPICONDIR;

    #pragma pack(pop)
    

    All the members mean the same thing as in the corresponding ICONDIR and IconDirectoryEntry structures, except for that mysterious nId (which replaces the dwImageOffset from the IconDirectoryEntry). To unravel that mystery, we need to look at where the rest of the icon file went.

    In the icon file format, the dwImageOffset represented the location of the icon bitmap within the file. When the icon file is converted to a resource, each icon bitmap is split off into its own resource of type RT_ICON. The resource compiler auto-assigns the resource IDs, and it is those resource IDs that are stored in the nId member.

    For example, suppose you have an icon file with four images. In your resource file you say

    42 ICON myicon.ico
    

    The resource compiler breaks the file into five resources:

    Resource type   Resource Id   Contents
    RT_GROUP_ICON   42            GRPICONDIR.idCount = 4
                                  GRPICONDIRENTRY[0].nId = 124
                                  GRPICONDIRENTRY[1].nId = 125
                                  GRPICONDIRENTRY[2].nId = 126
                                  GRPICONDIRENTRY[3].nId = 127
    RT_ICON         124           Pixels for image 0
    RT_ICON         125           Pixels for image 1
    RT_ICON         126           Pixels for image 2
    RT_ICON         127           Pixels for image 3

    Why does Windows break the resources into five pieces instead of just dumping them all inside one giant resource?

    Recall how 16-bit Windows managed resources. Back in 16-bit Windows, a resource was a handle into a table, and obtaining the bits of the resource involved allocating memory and loading it from the disk. Recall also that 16-bit Windows operated under tight memory constraints, so you didn't want to load anything into memory unless you really needed it.

    Therefore, looking up an icon in 16-bit Windows went like this:

    • Find the icon group resource, load it, and lock it.
    • Study it to decide which icon image is best.
    • Unlock and free the icon group resource since we don't need it any more.
    • Find and load the icon image resource for the one you chose.
    • Return that handle as the icon handle.

    Observe that once we decide which icon image we want, the only memory consumed is the memory for that specific image. We never load the images we don't need.

    Drawing an icon went like this:

    • Lock the icon handle to get access to the pixels.
    • Draw the icon.
    • Unlock the icon handle.

    Since icons were usually marked discardable, they could get evicted from memory if necessary, and they would get reloaded the next time you tried to draw them.

    Although Win32 does not follow the same memory management model for resources as 16-bit Windows, it preserved the programming model (find, load, lock) to make it easier to port programs from 16-bit Windows to 32-bit Windows. And in order not to break code which loaded icons from resources directly (say, because they wanted to replace the icon selection algorithm), the breakdown of an icon file into a directory + images was also preserved.

    You now know enough to solve this customer's problem:

    I have an icon in a resource DLL, and I need to pass its raw data to another component. However, the number of bytes reported by SizeofResource is only 48 instead of 5KB which is the amount actually stored in the resource DLL. I triple-checked the resource DLL and I'm sure I'm looking at the right icon resource.

    Here is my code:

    HRSRC hrsrcIcon = FindResource(hResources,
                         MAKEINTRESOURCE(IDI_MY_ICON), RT_GROUP_ICON);
    DWORD cbIcon = SizeofResource(hResources, hrsrcIcon);
    HGLOBAL hIcon = LoadResource(hResources, hrsrcIcon);
    void *lpIcon = LockResource(hIcon);
    
  • The Old New Thing

    Why do some font names begin with an at-sign?

    • 21 Comments

    It was a simple question.

    For some reason, my font selection dialog (CFontDialog) shows a bunch of font names beginning with the at-sign (@). These fonts don't work correctly if I use them. Any idea what they are? (I tried searching the Internet, but search engines don't seem to let you search for @ so it's hard to make much headway.)

    (And that's why I wrote "at-sign" in the subject instead of using the @ character.)

    Fonts which begin with an @-sign are vertically-oriented fonts. They are used in languages like Chinese, Japanese, and (less often) Korean. The idea is that if you want to generate vertical text, you start with the horizontal version of the font and compose your document, then switch to the vertical version for printing.

    I wasn't able to detect that your browser supports the @SimSun font, so I'll give an example with fake Chinese characters. Pretend that the shapes and Latin letters are actually Chinese characters. First, you compose your document with the horizontal font:

    ▴❤❦Quo123▴‌̥ 

    When it's time to print, switch to the vertical version of the font.

    ◀❥❧℺ᴝᴑ123◀°

    Hm, it looks like the Chinese characters got rotated 90° to the left, so they're all lying on their side. The result is not really all that readable, but wait, here's the trick: After the paper comes out of the printer, rotate the paper right 90°:

    ◀❥❧℺ᴝᴑ123◀°

    Notice that the vertical version of a font does not simply rotate every character 90°. Non-CJK characters typically remain in their original orientation (which means that when you turn the paper, they will come out rotated). And some CJK characters change form between horizontal and vertical variants, like the period in the example above, so it's not a simple rule like "rotate all CJK characters and leave non-CJK characters alone."

    This is basically a hack to get rudimentary vertical font support in software that doesn't support vertical text natively. (Web browsers support vertical text natively with the proposed writing-mode property.)

    If you don't want vertical fonts to show up in your font dialog, pass the CF_NOVERTFONTS flag. Of course, if you pass that flag, then your users won't be able to use the vertical-font trick any more.

    Bonus head-to-head competition: You can read how Michael Kaplan blogged this exact same subject in his own Kaplanesque way.
