October, 2010

  • The Old New Thing

    Why does each drive have its own current directory?

    • 39 Comments

    Commenter Dean Earley asks, "Why is there a 'current directory' AND an current drive? Why not merge them?"

    Pithy answer: Originally, each drive had its own current directory, but now they don't, but it looks like they do.

    Okay, let's unwrap that sentence. You actually know enough to answer the question yourself; you just have to put the pieces together.

    Set the wayback machine to DOS 1.0. Each volume was represented by a drive letter. There were no subdirectories. This behavior was carried forward from CP/M.

    Programs from the DOS 1.0 era didn't understand subdirectories; they referred to files by just drive letter and file name, for example, B:PROGRAM.LST. Let's fire up the assembler (compilers were for rich people) and assemble a program whose source code is on the A drive, but sending the output to the B drive.

    A>asm foo       the ".asm" extension on "foo" is implied
    Assembler version blah blah blah
    Source File: FOO.ASM
    Listing file [FOO.LST]: NUL throw away the listing file
    Object file [FOO.OBJ]: B: send the object file to drive B

    Since we gave only a drive letter in response to the Object file prompt, the assembler defaults to a file name of FOO.OBJ, resulting in the object file being generated as B:FOO.OBJ.

    Okay, now let's introduce subdirectories into DOS 2.0. Suppose you have want to assemble A:\SRC\FOO.ASM and put the result into B:\OBJ\FOO.OBJ. Here's how you do it:

    A> B:
    B> CD \OBJ
    B> A:
    A> CD \SRC
    A> asm foo
    Assembler version blah blah blah
    Source File: FOO.ASM
    Listing file [FOO.LST]: NUL
    Object file [FOO.OBJ]: B:
    

    The assembler reads from A:FOO.ASM and writes to B:FOO.OBJ, but since the current directory is tracked on a per-drive basis, the results are A:\SRC\FOO.ASM and B:\OBJ\FOO.OBJ as desired. If the current directory were not tracked on a per-drive basis, then there would be no way to tell the assembler to put its output into a subdirectory. As a result, DOS 1.0 programs were effectively limited to operating on files in the root directory, which means that nobody would put files in subdirectories (because their programs couldn't access them).

    From a DOS 1.0 standpoint, changing the current directory on a drive performs the logical equivalent of changing media. "Oh look, a completely different set of files!"

    Short attention span.

    Remembering the current directory for each drive has been preserved ever since, at least for batch files, although there isn't actually such a concept as a per-drive current directory in Win32. In Win32, all you have is a current directory. The appearance that each drive has its own current directory is a fake-out by cmd.exe, which uses strange environment variables to create the illusion to batch files that each drive has its own current directory.

    Dean continues, "Why not merge them? I have to set both the dir and drive if i want a specific working dir."

    The answer to the second question is, "They already are merged. It's cmd.exe that tries to pretend that they aren't." And if you want to set the directory and the drive from the command prompt or a batch file, just use the /D option to the CHDIR command:

    D:\> CD /D C:\Program Files\Windows NT
    C:\Program Files\Windows NT> _
    

    (Notice that the CHDIR command lets you omit quotation marks around paths which contain spaces: Since the command takes only one path argument, the lack of quotation marks does not introduce ambiguity.

  • The Old New Thing

    The evolution of the ICO file format, part 1: Monochrome beginnings

    • 36 Comments

    This week is devoted to the evolution of the ICO file format. Note that the icon resource format is different from the ICO file format; I'll save that topic for another day.

    The ICO file begins with a fixed header:

    typedef struct ICONDIR {
        WORD          idReserved;
        WORD          idType;
        WORD          idCount;
        ICONDIRENTRY  idEntries[];
    } ICONHEADER;
    

    idReserved must be zero, and idType must be 1. The idCount describes how many images are included in this ICO file. An ICO file is really a collection of images; the theory is that each image is an alternate representation of the same underlying concept, but at different sizes and color depths. There is nothing to prevent you, in principle, from creating an ICO file where the 16×16 image looks nothing like the 32×32 image, but your users will probably be confused.

    After the idCount is an array of ICONDIRECTORY entries whose length is given by idCount.

    struct IconDirectoryEntry {
        BYTE  bWidth;
        BYTE  bHeight;
        BYTE  bColorCount;
        BYTE  bReserved;
        WORD  wPlanes;
        WORD  wBitCount;
        DWORD dwBytesInRes;
        DWORD dwImageOffset;
    };
    

    The bWidth and bHeight are the dimensions of the image. Originally, the supported range was 1 through 255, but starting in Windows 95 (and Windows NT 4), the value 0 is accepted as representing a width or height of 256.

    The wBitCount and wPlanes describe the color depth of the image; for monochrome icons, these value are both 1. The bReserved must be zero. The dwImageOffset and dwBytesInRes describe the location (relative to the start of the ICO file) and size in bytes of the actual image data.

    And then there's bColorCount. Poor bColorCount. It's supposed to be equal to the number of colors in the image; in other words,

    bColorCount = 1 << (wBitCount * wPlanes)

    If wBitCount * wPlanes is greater than or equal to 8, then bColorCount is zero.

    In practice, a lot of people get lazy about filling in the bColorCount and set it to zero, even for 4-color or 16-color icons. Starting in Windows XP, Windows autodetects this common error, but its autocorrection is slightly buggy in the case of planar bitmaps. Fortunately, almost nobody uses planar bitmaps any more, but still, it would be in your best interest not to rely on the autocorrection performed by Windows and just set your bColorCount correctly in the first place. An incorrect bColorCount means that when Windows tries to find the best image for your icon, it may choose a suboptimal one because it based its decision on incorrect color depth information.

    Although it probably isn't true, I will pretend that monochrome icons existed before color icons, because it makes the storytelling easier.

    A monochome icon is described by two bitmaps, called AND (or mask) and XOR (or image, or when we get to color icons, color). Drawing an icon takes place in two steps: First, the mask is ANDed with the screen, then the image is XORed. In other words,

    pixel = (screen AND mask) XOR image

    By choosing appropriate values for mask and image, you can cover all the possible monochrome BLT operations.

    mask image result operation
    0 0 (screen AND 0) XOR 0 = 0 blackness
    0 1 (screen AND 0) XOR 1 = 1 whiteness
    1 0 (screen AND 1) XOR 0 = screen nop
    1 1 (screen AND 1) XOR 1 = NOT screen invert

    Conceptually, the mask specifies which pixels from the image should be copied to the destination: A black pixel in the mask means that the corresponding pixel in the image is copied.

    The mask and image bitmaps are physically stored as one single double-height DIB. The image bitmap comes first, followed by the mask. (But since DIBs are stored bottom-up, if you actually look at the bitmap, the mask is in the top half of the bitmap and the image is in the bottom half).

    In terms of file format, each icon image is stored in the form of a BITMAPINFO (which itself takes the form of a BITMAPINFOHEADER followed by a color table), followed by the image pixels, followed by the mask pixels. The biCompression must be BI_RGB. Since this is a double-height bitmap, the biWidth is the width of the image, but the biHeight is double the image height. For example, a 16×16 icon would specify a width of 16 but a height of 16 × 2 = 32.

    That's pretty much it for classic monochrome icons. Next time we'll look at color icons.

    Still, given what you know now, the following story will make sense.

    A customer contacted the shell team to report that despite all their best efforts, they could not get Windows to use the image they wanted from their .ICO file. Windows for some reason always chose a low-color icon instead of using the high-color icon. For example, even though the .ICO file had a 32bpp image available, Windows always chose to use the 16-color (4bpp) image, even when running on a 32bpp display.

    A closer inspection of the offending .ICO file revealed that the bColorCount in the IconDirectoryEntry for all the images was set to 1, regardless of the actual color depth of the image. The table of contents for the .ICO file said "Yeah, all I've got are monochrome images. I've got three 48×48 monochrome images, three 32×32 monochrome images, and three 16×16 monochrome images." Given this information, Windows figured, "Well, given those choices, I guess that means I'll use the monochrome one." It chose one of images (at pseudo-random), and went to the bitmap data and found, "Oh, hey, how about that, it's actually a 16-color image. Okay, well, I guess I can load that."

    In summary, the .ICO file was improperly authored. Patching each IconDirectoryEntry in a hex editor made the icon work as intended. The customer thanked us for our investigation and said that they would take the issue up with their graphic design team.

  • The Old New Thing

    Secret passages on Microsoft main campus

    • 33 Comments

    They aren't really "secret passages" but they are definitely underutilized, and sometimes they provide a useful shortcut.

    At the northwest corner of Building 50, there are two doors. One leads to a stairwell that takes you to the second floor. That's the one everybody uses. The other door is a service entrance that takes you to the cafeteria. If your office is on the second or third floor in the northwest corner, it's faster to use the service hallway to get to the cafeteria than it is to walk to the core of the building and take the main stairs.

    There is a service tunnel that runs from the first floor of Building 86 (entrance next to the first floor cafeteria elevator) through the loading dock to the Central Garage, where you can continue to Building 85 or 84. This is not really any faster than the regular route, but it does have the advantage of being underground and mostly indoors, which is a major benefit when it is cold or raining.

    What is your favorite secret passage at your workplace?

  • The Old New Thing

    Belated happy first birthday, Windows 7

    • 33 Comments

    On Friday, the marketing folks informed me that they decided to put me on the Microsoft Careers United States home page in recognition of Windows 7's first birthday. It's an honor and to be honest a bit scary to be chosen to be the face of Windows on a day of such significance. (They told me that had narrowed it down to me and "some Director of Test". Sorry, Director of Test; maybe they'll pick you for Windows 7's second birthday.)

    I think my picture is still there (they didn't tell me how long it was going to be up), but here's a screen capture just to prove it to my relatives:

    Here's looking at you, kid.

    (Thank goodness they cropped out my withered hand.)

    I wondered what would happen if I clicked Find jobs like mine. What did they consider to be jobs like mine? Alas, it just takes you to the job search page with no criteria filled in. Maybe every job at Microsoft is like mine?

    One of my colleagues teased me, "Did you really legally change your last name to Windows?"

  • The Old New Thing

    Why is the origin at the upper left corner?

    • 32 Comments

    Via the Suggestion Box, Dirk Declercq asks why the default client-area coordinate origin is at the upper left corner instead of the lower left corner. (I assume he also intends for the proposed client-area coordinate system to have the y-coordinate increasing as you move towards the top of the screen.)

    Well, putting the client area origin at the lower left would have resulted in the client coordinate space not being a simple translation of the screen coordinate space. After all, the screen origin is at the upper left, too. Windows was originally developed on left-to-right systems, where the relationship between client coordinates and screen coordinates was a simple translation. Having the y-coordinate increase as you move down the screen but move up the client would have just been one of those things you did to be annoying.

    Okay, so why not place the screen origin at the lower left, too?

    Actually, OS/2 does this, and DIBs do it as well. And then everybody wonders why their images are upside-down.

    Turns out that the people who designed early personal computers didn't care much for mathematical theory. The raster gun of a television set starts at the upper left corner, continues to the right, and when it reaches the right-hand edge of the screen, it jumps back to the left edge of the screen to render the second scan line. Why did television sets scan from the top down instead of from the bottom up? Beats me. You'll have to ask the person who invented the television (who, depending on whom you ask, is Russian or American or German or Scottish or some other nationality entirely), or more specifically, whoever invented the scanning model of image rendering, why they started from the top rather than from the bottom.

    Anyway, given that the video hardware worked from top to bottom, it was only natural that the memory for the video hardware work the same way. (The Apple II famously uses a peculiar memory layout in order to save a chip.)

    Who knows, maybe if the design of early computers had been Chinese, we would be wondering why the origin was in the upper right corner with the pixels in column-major order.

    Bonus chatter: Even mathematicians can't get their story straight. Matrices are typically written with the origin element at the upper left. Which reminds me of a story from the old Windows 95 days. The GDI folks received a defect report from the user interface team, who backed up their report with a complicated mathematical explanation. The GDI team accepted the change request with the remark, "We ain't much fer book lernin."

  • The Old New Thing

    Wildly popular computer game? The Windows product team has you covered

    • 30 Comments

    In Windows 95, the most heavily-tested computer game was DOOM. Not because the testers spent a lot of time developing test plans and test harnesses and automated run-throughs. Nope, it was because it was by far the most popular game the Windows 95 team members played to unwind.

    It was a huge breakthrough when DOOM finally ran inside a MS-DOS box and didn't require booting into MS-DOS mode any more. Now you could fire up DOOM without having to exit all your other programs first.

    I've learned that in Windows Vista, the most heavily tested game was World of Warcraft. Most members of the DirectX application compatibility team are WoW players, in addition to a large number in the Windows division overall.

    So if you have a wildly popular computer game for the PC, you can be pretty sure that the Windows team will be all over it. "For quality control purposes, I assure you."

    Related story: How to make sure your network card works with Windows.

  • The Old New Thing

    The memcmp function reports the result of the comparison at the point of the first difference, but it can still read past that point

    • 27 Comments

    This story originally involved a more complex data structure, but that would have required too much explaining (with relatively little benefit since the data structure was not related to the moral of the story), so I'm going to retell it with double null-terminated strings as the data structure instead.

    Consider the following code to compare two double-null-terminated strings for equality:

    size_t SizeOfDoubleNullTerminatedString(const char *s)
    {
      const char *start = s;
      for (; *s; s += strlen(s) + 1) { }
      return s - start + 1;
    }
    
    BOOL AreDoubleNullTerminatedStringsEqual(
        const char *s, const char *t)
    {
     size_t slen = SizeOfDoubleNullTerminatedString(s);
     size_t tlen = SizeOfDoubleNullTerminatedString(t);
     return slen == tlen && memcmp(s, t, slen) == 0;
    }
    

    "Aha, this code is inefficient. Since the memcmp function stops comparing as soon as it finds a difference, I can skip the call to SizeOfDoubleNullTerminatedString(t) and simply write

    BOOL AreDoubleNullTerminatedStringsEqual(
        const char *s, const char *t)
    {
     return memcmp(s, t, SizeOfDoubleNullTerminatedString(s)) == 0;
    }
    

    because we can never read past the end of t: If the strings are equal, then tlen will be equal to slen anyway, so the buffer size is correct. And if the strings are different, the difference will be found at or before the end of t, since it is not possible for a double-null-terminated string to be a prefix of another double-null-terminated string. In both cases, we never read past the end of t."

    This analysis is based on a flawed assumption, namely, that memcmp compares byte-by-byte and does not look at bytes beyond the first point of difference. The memcmp function makes no such guarantee. It is permitted to read all the bytes from both buffers before reporting the result of the comparison.

    In fact, most implementations of memcmp do read past the point of first difference. Your typical library will try to compare the two buffers in register-sized chunks rather than byte-by-byte. (This is particularly convenient on x86 thanks to the block comparison instruction rep cmpsd which compares two memory blocks in DWORD-sized chunks, and x64 doubles your fun with rep cmpsq.) Once it finds two chunks which differ, it then studies the bytes within the chunks to determine what the return value should be.

    (Indeed, people with free time on their hands or simply enjoy a challenge will try to outdo the runtime library with fancy-pants memcmp algorithms which compare the buffers in larger-than-normal chunks by doing things like comparing via SIMD registers.)

    To illustrate, consider an implementation of memcmp which uses 4-byte chunks. Typically, memory comparison functions do some preliminary work to get the buffers aligned, but let's ignore that part since it isn't interesting. The inner loop goes like this:

    while (length >= 4)
    {
     int32 schunk = *(int32*)s;
     int32 tchunk = *(int32*)t;
     if (schunk != tchunk) {
       -- difference found - calculate and return result
     }
     length -= 4;
     s += 4;
     t += 4;
    }
    
    Let's compare the strings s = "a\0b\0\0" and t = "a\0\0". The size of the double-null-terminated string s is 4, so the memory comparison goes like this: First we read four bytes from s into schunk, resulting in (on a little-endian machine) 0x00620061. Next, we read four bytes from t into tchunk, resulting in 0x??000061. Oops, we read one byte past the end of the buffer.

    If t happened to sit right at the end of a page, and the next page was uncommitted memory, then you take an access violation while trying to read tchunk. Your optimization turned into a crash.

    Remember, when you say that a buffer is a particular size, the basic ground rules of programming say that it really has to be that size.

  • The Old New Thing

    How do I programmatically invoke Aero Peek on a window?

    • 27 Comments

    A customer wanted to know if there was a way for their application to invoke the Aero Peek feature so that their window appeared and all the other windows on the system turned transparent.

    No, there is no such programmatic interface exposed. Aero Peek is a feature for the user to invoke, not a feature for applications to invoke so they can draw attention to themselves.

    Yes, I realize you wrote a program so awesome that all other programs pale in comparison, and that part of your mission is to make all the other programs literally pale in comparison to your program.

    Sorry.

    Maybe you can meet up with that other program that is the most awesome program in the history of the universe and share your sorrows over a beer.

  • The Old New Thing

    The evolution of the ICO file format, part 2: Now in color!

    • 26 Comments

    Last time, we looked at the format of classic monochrome icons. But if you want to include color images, too? (Note that it is legal—and for a time it was common—for a single ICO file to offer both monochrome and color icons. After all, a single ICO file can offer both 16-color and high-color images; why not also 2-color images?)

    The representation of color images in an ICO file is almost the same as the representation of monochrome images: All that changes is that the image bitmap is now in color. (The mask remains monochrome.)

    In other words, the image format consists of a BITMAPINFOHEADER where the bmWidth is the width of the image and bmHeight is double the height of the image, followed by the bitmap color table, followed by the image pixels, followed by the mask pixels.

    Note that the result of this is a bizarre non-standard bitmap. The height is doubled because we have both an image and a mask, but the color format changes halfway through!

    Other restrictions: Supported color formats are 4bpp, 8bpp, 16bpp, and 0RGB 32bpp. Note that 24bpp is not supported; you'll have to convert it to a 0RGB 32bpp bitmap. Supported values for biCompression for color images are BI_RGB and (if your bitmap is 16bpp or 32bpp) BI_BITFIELDS.

    The mechanics of drawing the icon are the same as for a monochrome image: First, the mask is ANDed with the screen, then the image is XORed. In other words,

    pixel = (screen AND mask) XOR image

    On the other hand, XORing color pixels is not really a meaningful operation. It's not like people say "Naturally, fuchsia XOR aqua equals yellow. Any idiot knows that." Or "Naturally, blue XOR eggshell equals apricot on 8bpp displays (because eggshell is palette index 56, blue is palette index 1, and palette index 57 is apricot) but is equal to #F0EA29 on 32bpp displays." The only meaningful color to XOR against is black, in which case you have "black XOR Q = Q for all colors Q".

    mask image result operation
    0 Q (screen AND 0) XOR Q = Q copy from icon
    1 0 (screen AND 1) XOR 0 = screen nop
    1 Q (screen AND 1) XOR Q = screen XOR Q dubious

    For pixels you want to be transparent, set your mask to white and your image to black. For pixels you want to come from your icon, set your mask to black and your image to the desired color.

    We now have enough information to answer a common question people have about icons. After that break, we'll return to the evolution of the ICO file format.

    For further reading: Icons in Win32.

  • The Old New Thing

    Which ferry should we take from Germany back to Denmark? Oh, it's this one, except for that one word I don't understand

    • 26 Comments

    Writing that entry last Friday reminded me that I never did share my stories of that emergency vacation, save for a cryptic list of learnings. Here's a little story about the ferry crossing.

    Part of the trip involved driving from Copenhagen (København) to Munich (München), which means taking a ferry. (It's faster than driving down the peninsula.) Realizing that we would also have to take the ferry on the return trip on Saturday, we grabbed a ferry schedule during a rest stop in Denmark.

    What we didn't realize until it was time for the return trip was that the schedule was (naturally) in Danish, a language none of us could read or speak. As the closest thing in the car to an expert in northern European languages (it wasn't a close vote), I was called upon to do my best to puzzle out the timetable. The choice was between the Puttgarten–Rødby ferry and the Rostock–Gedser ferry. If we could make it, the Rostock–Gedser ferry would have been a better choice since it is a more direct route.

    I was able to decode that the Puttgarten–Rødby ferry ran every half hour around the clock, whereas the Rostock–Gedser ferry stopped running at 11:30pm. (This sounds like a no-brainer, except that the Puttgarten–Rødby timetable didn't give the actual run times; it just said afgang hver halve time, sejltid 0:45.) We consulted the map, estimated our driving time, and calculated that if we stayed focused and didn't dawdle at the rest breaks, we could make the 11:30pm Rostock–Gedser ferry.

    In the fine print of the schedule I was able to translate that the late-night Rostock–Gedser ferry run does not operate on ... um ... lørdag.

    What's lørdag?

    We decided to play it safe and go to Puttgarten.

    It was a good call, because lørdag is Danish (and as I later learned, Swedish) for Saturday.

    Bonus ferry story: After paying the fare on the Danish side for the morning crossing into Germany, the toll booth agent quickly mumbled, "Syv, sieben, seven" as we pulled away. He said it so quickly and unexpectedly, the English-speaking people in the car didn't hear the "seven" since they tuned out what the guy said once they determined that it wasn't English. I was paying more attention, figuring there was an off-chance the agent would speak German to us, and I picked out the sieben, and then upon mentally rewinding and replaying what he said, I also recognized the "seven."

    But why did he keep saying "seven" in various languages? As we reached the waiting area, we figured it out: Because he was telling us to line up in row number seven.

    Bonus ferry storylet: As we waited for the ferry, I noticed that the car behind us had a sticker saying that it was purchased from... a town just a few miles from where I grew up! How the heck did a car from a small town in New Jersey end up in Denmark? We asked the people standing next to the car—it was a nice day, so people stood outside—and they explained that they bought it from somebody who moved to Denmark from the States, and he brought his car with him. Strange coincidences abound.

    Of course, an even stranger coincidence would have been if I happened to have known the original owner. (Not the case here.)

Page 1 of 3 (27 items) 123