October, 2003

  • The Old New Thing

    Answer to previous exercise


    If the program starts with the mouse already in the client area without moving, why do you get a beep?

    Because of the spurious WM_MOUSEMOVE message that is generated when a window is created. In this case, the spurious message is a good thing, since it lets us do our mouse work at window creation.
  • The Old New Thing

    Why is there no WM_MOUSEENTER message?


    There is a WM_MOUSELEAVE message. Why isn't there a WM_MOUSEENTER message?

    Because you can easily figure that out for yourself.

    When you receive a WM_MOUSELEAVE message, set a flag that says, "The mouse is outside the window." When you receive a WM_MOUSEMOVE message and the flag is set, then the mouse has entered the window. (And clear the flag while you're there.)

    Let's illustrate this with our sample program, just to make sure you get the idea:

    BOOL g_fMouseInClient;

    void OnMouseMove(HWND hwnd, int x, int y, UINT keyFlags)
    {
        if (!g_fMouseInClient) {
            g_fMouseInClient = TRUE;
            MessageBeep(0); // the mouse just entered
            TRACKMOUSEEVENT tme = { sizeof(tme) };
            tme.dwFlags = TME_LEAVE;
            tme.hwndTrack = hwnd;
            TrackMouseEvent(&tme); // request WM_MOUSELEAVE
        }
    }

    // Add to WndProc
        case WM_MOUSELEAVE: g_fMouseInClient = FALSE; return 0;
        HANDLE_MSG(hwnd, WM_MOUSEMOVE, OnMouseMove);

    This program beeps whenever the mouse enters the client area.

    Exercise: If the program starts with the mouse already in the client area without moving, why do you get a beep?
  • The Old New Thing

    Why doesn't the clock in the taskbar display seconds?


    Early beta versions of the taskbar clock did display seconds, and they even blinked the colon like some clocks do. But we had to remove it, because that blinking colon and the constantly updating time were killing our benchmark numbers.

    On machines with only 4MB of memory (which was the minimum memory requirement for Windows 95), saving even 4K of memory had a perceptible impact on benchmarks. Blinking the clock every second prevented not only the code paths related to text rendering from ever being paged out, but also the taskbar's window procedure, plus the memory for stacks and data, plus all the context structures related to the Explorer process. Add up all the memory that was being forced continuously present, and you had significantly more than 4K.

    So out it went, and our benchmark numbers improved. The fastest code is code that doesn't run.
  • The Old New Thing

    Other uses for bitmap brushes


    Bitmap brushes used to be these little 8x8 monochrome patterns that you could use for hatching and maybe little houndstooth patterns if you were really crazy. But you can do better.

    CreatePatternBrush lets you pass in any old bitmap, even a huge one, and it will create a brush from it. The bitmap will automatically be tiled, so this is a quick way to get bitmap tiling. Let GDI do all the math for you!

    This is particularly handy when you're stuck with a mechanism where you are forced to pass an HBRUSH but you really want to pass an HBITMAP. Convert the bitmap to a brush and return that brush instead.

    For example, let's take our scratch program and give it a custom tiled background by using a pattern brush.

    HBRUSH CreatePatternBrushFromFile(LPCTSTR pszFile)
    {
        HBRUSH hbr = NULL;
        HBITMAP hbm = (HBITMAP)LoadImage(g_hinst, pszFile,
                       IMAGE_BITMAP, 0, 0, LR_LOADFROMFILE);
        if (hbm) {
            hbr = CreatePatternBrush(hbm);
            DeleteObject(hbm); // done with the bitmap
        }
        return hbr;
    }

    BOOL
    InitApp(LPSTR lpCmdLine)
    {
        BOOL fSuccess = FALSE;
        HBRUSH hbr = CreatePatternBrushFromFile(lpCmdLine);
        if (hbr) {
            WNDCLASS wc;
            wc.style = 0;
            wc.lpfnWndProc = WndProc;
            wc.cbClsExtra = 0;
            wc.cbWndExtra = 0;
            wc.hInstance = g_hinst;
            wc.hIcon = NULL;
            wc.hCursor = LoadCursor(NULL, IDC_ARROW);
            wc.hbrBackground = hbr;
            wc.lpszMenuName = NULL;
            wc.lpszClassName = CLASSNAME;
            fSuccess = RegisterClass(&wc);
            // Do not delete the brush - the class owns it now
        }
        return fSuccess;
    }

    With a corresponding adjustment to WinMain:

        if (!InitApp(lpCmdLine)) return 0;

    Pass the path to a *.bmp file on the command line, and bingo, the window will tile its background with that bitmap. Notice that we did not have to change anything other than the class registration. No muss, no fuss, no bother.

    Here's another way you can use bitmap brushes to save yourself a lot of work. Start with a new scratch program and change it as follows:

    HBRUSH g_hbr; // the pattern brush we created

    HBRUSH CreatePatternBrushFromFile(LPCTSTR pszFile)
    { ... same as above ... }

    void PaintContent(HWND hwnd, PAINTSTRUCT *pps)
    {
        BeginPath(pps->hdc);
        Ellipse(pps->hdc, 0, 0, 200, 100);
        EndPath(pps->hdc);
        HBRUSH hbrOld = SelectBrush(pps->hdc, g_hbr);
        FillPath(pps->hdc); // fills with the selected brush
        SelectBrush(pps->hdc, hbrOld);
    }

    And add the following code to WinMain before the call to CreateWindowEx:

        g_hbr = CreatePatternBrushFromFile(lpCmdLine);
        if (!g_hbr) return 0;

    This time, since we are managing the brush ourselves, we need to remember to

        DeleteObject(g_hbr);

    at the end of WinMain before it returns.

    This program draws an ellipse filled with your bitmap. The FillPath function uses the currently-selected brush, so we select our bitmap brush (instead of a boring solid brush) and draw with that. Result: A pattern-filled ellipse. Without a bitmap brush, you would have had to do a lot of work manually clipping the bitmap (and tiling it) to the ellipse.
  • The Old New Thing

    Why is address space allocation granularity 64K?


    You may have wondered why VirtualAlloc allocates memory at 64K boundaries even though page granularity is 4K.

    You have the Alpha AXP processor to thank for that.

    On the Alpha AXP, there is no "load 32-bit integer" instruction. To load a 32-bit integer, you actually load two 16-bit integers and combine them.

    So if allocation granularity were finer than 64K, a DLL that got relocated in memory would require two fixups per relocatable address: one to the upper 16 bits and one to the lower 16 bits. And things get worse if this changes a carry or borrow between the two halves. (For example, moving an address by 4K from 0x1234F000 to 0x12350000 forces both the low and high parts of the address to change. Even though the amount of motion was far less than 64K, it still had an impact on the high part due to the carry.)
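
    To see the carry problem concretely, here is a small portable C sketch. It is not Alpha code, and the helper names (high_half, low_half, relocation_touches_high_half, check_carry_example) are made up for illustration. It just checks which half of an address changes when the address moves.

```c
#include <assert.h>
#include <stdint.h>

/* Upper and lower 16-bit halves of a 32-bit address. */
uint32_t high_half(uint32_t addr) { return addr >> 16; }
uint32_t low_half(uint32_t addr)  { return addr & 0xFFFFu; }

/* Nonzero if moving addr by delta changes the upper 16 bits, meaning the
   fixup would also have to patch the instruction that loads the high half. */
int relocation_touches_high_half(uint32_t addr, uint32_t delta)
{
    return high_half(addr + delta) != high_half(addr);
}

void check_carry_example(void)
{
    /* The 4K move from the text: 0x1234F000 -> 0x12350000. */
    assert(low_half(0x1234F000u) == 0xF000u);
    assert(low_half(0x1234F000u + 0x1000u) == 0x0000u);
    assert(relocation_touches_high_half(0x1234F000u, 0x1000u));

    /* A move by a multiple of 64K never disturbs the low half. */
    assert(low_half(0x1234F000u + 0x10000u) == low_half(0x1234F000u));
}
```

    With 64K granularity only the second situation can occur, so a single fixup to the high half is always enough.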

    But wait, there's more.

    The Alpha AXP actually combines two signed 16-bit integers to form a 32-bit integer. For example, to load the value 0x1234ABCD, you would first use the LDAH instruction to load the value 0x1235 into the high word of the destination register. Then you would use the LDA instruction to add the signed value -0x5433. (Since 0x5433 = 0x10000 - 0xABCD.) The result is then the desired value of 0x1234ABCD.

    LDAH t1, 0x1235(zero) // t1 = 0x12350000
    LDA  t1, -0x5433(t1)  // t1 = t1 - 0x5433 = 0x1234ABCD
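
    The split into two signed halves can be sketched in portable C as follows. This is only a model of the arithmetic, not anything a real toolchain does verbatim, and the names (sign_extend_16, ldah_lda_pair, split_address, combine) are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 16 bits of x as a signed 16-bit quantity. */
int32_t sign_extend_16(uint32_t x)
{
    int32_t low = (int32_t)(x & 0xFFFFu);
    return low >= 0x8000 ? low - 0x10000 : low;
}

typedef struct {
    int32_t ldah_disp; /* signed 16-bit operand for LDAH (scaled by 65536) */
    int32_t lda_disp;  /* signed 16-bit operand for LDA */
} ldah_lda_pair;

/* Split a 32-bit value into the two signed operands. */
ldah_lda_pair split_address(uint32_t value)
{
    ldah_lda_pair p;
    p.lda_disp = sign_extend_16(value);
    /* If the low half is negative, the high half is bumped up by one. */
    p.ldah_disp = sign_extend_16(
        (uint32_t)(((int64_t)value - p.lda_disp) >> 16));
    return p;
}

/* Recombine the operands using 32-bit wraparound arithmetic. */
uint32_t combine(ldah_lda_pair p)
{
    return (uint32_t)p.ldah_disp * 0x10000u + (uint32_t)p.lda_disp;
}

void check_split_example(void)
{
    ldah_lda_pair p = split_address(0x1234ABCDu);
    assert(p.ldah_disp == 0x1235);
    assert(p.lda_disp == -0x5433);
    assert(combine(p) == 0x1234ABCDu);
}
```

    Note that the high operand depends on whether the low half lands in the "upper half" of the 64K block, which is exactly why the two halves cannot be fixed up independently.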

    So if a relocation caused an address to move between the "lower half" of a 64K block and the "upper half", additional fixing-up would have to be done to ensure that the arithmetic for the top half of the address was adjusted properly. Since compilers like to reorder instructions, that LDAH instruction could be far, far away, so the relocation record for the bottom half would have to have some way of finding the matching top half.

    What's more, the compiler is clever and if it needs to compute addresses for two variables that are in the same 64K region, it shares the LDAH instruction between them. If it were possible to relocate by a value that wasn't a multiple of 64K, then the compiler would no longer be able to do this optimization since it's possible that after the relocation, the two variables no longer belonged to the same 64K block.
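
    Here is a rough C model of that sharing condition. The addresses and the helper names (ldah_value, can_share_ldah, check_sharing_example) are hypothetical, chosen only to show how a relocation that is not a multiple of 64K can split two neighbors apart.

```c
#include <assert.h>
#include <stdint.h>

/* High 16-bit value an LDAH would need for addr, given that the low half
   is added afterwards as a signed 16-bit displacement (hence the rounding). */
uint32_t ldah_value(uint32_t addr)
{
    return ((addr + 0x8000u) >> 16) & 0xFFFFu;
}

/* Nonzero if a single LDAH can serve both addresses. */
int can_share_ldah(uint32_t a, uint32_t b)
{
    return ldah_value(a) == ldah_value(b);
}

void check_sharing_example(void)
{
    uint32_t a = 0x12340100u, b = 0x12340200u; /* made-up variable addresses */

    assert(can_share_ldah(a, b));

    /* Relocating by a multiple of 64K keeps them together. */
    assert(can_share_ldah(a + 0x30000u, b + 0x30000u));

    /* Relocating by 0x7E80 puts them on opposite sides of the split. */
    assert(!can_share_ldah(a + 0x7E80u, b + 0x7E80u));
}
```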

    Forcing memory allocations at 64K granularity solves all these problems.

    If you have been paying really close attention, you'd have seen that this also explains why there is a 64K "no man's land" near the 2GB boundary. Consider the method for computing the value 0x7FFFABCD: Since the lower 16 bits are in the upper half of the 64K range, the value needs to be computed by subtraction rather than addition. The naïve solution would be to use

    LDAH t1, 0x8000(zero) // t1 = 0x80000000, right?
    LDA  t1, -0x5433(t1)  // t1 = t1 - 0x5433 = 0x7FFFABCD, right?

    Except that this doesn't work. The Alpha AXP is a 64-bit processor, and 0x8000 does not fit in a 16-bit signed integer, so you have to use -0x8000, a negative number. What actually happens is

    LDAH t1, -0x8000(zero) // t1 = 0xFFFFFFFF`80000000
    LDA  t1, -0x5433(t1)   // t1 = t1 - 0x5433 = 0xFFFFFFFF`7FFFABCD

    You need to add a third instruction to clear the high 32 bits. The clever trick for this is to add zero and tell the processor to treat the result as a 32-bit integer and sign-extend it to 64 bits.

    ADDL t1, zero, t1    // t1 = t1 + 0, with L suffix
                         // L suffix means sign extend result from 32 bits to 64
                         // t1 = 0x00000000`7FFFABCD
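
    If you want to follow the register values yourself, this C sketch models the three instructions on a 64-bit register. It is a simplified model under the assumption that LDAH and LDA add a sign-extended 16-bit displacement (scaled by 65536 for LDAH) and that ADDL sign-extends the low 32 bits of the sum. The function names are lowercase stand-ins, not a real emulator API.

```c
#include <assert.h>
#include <stdint.h>

/* LDAH: base + (sign-extended 16-bit displacement * 65536). */
uint64_t ldah(uint64_t base, int16_t disp)
{
    return base + (uint64_t)((int64_t)disp * 65536);
}

/* LDA: base + sign-extended 16-bit displacement. */
uint64_t lda(uint64_t base, int16_t disp)
{
    return base + (uint64_t)(int64_t)disp;
}

/* ADDL: 32-bit add whose result is sign-extended to 64 bits. */
uint64_t addl(uint64_t a, uint64_t b)
{
    uint32_t low = (uint32_t)(a + b);
    return (low & 0x80000000u) ? (0xFFFFFFFF00000000ull | low) : low;
}

void check_danger_zone_example(void)
{
    uint64_t t1 = ldah(0, -0x8000);
    assert(t1 == 0xFFFFFFFF80000000ull);

    t1 = lda(t1, -0x5433);
    assert(t1 == 0xFFFFFFFF7FFFABCDull);

    t1 = addl(t1, 0); /* the extra third instruction */
    assert(t1 == 0x000000007FFFABCDull);
}
```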

    If addresses within 64K of the 2GB boundary were permitted, then every memory address computation would have to insert that third ADDL instruction just in case the address got relocated to the "danger zone" near the 2GB boundary.

    This was an awfully high price to pay to get access to that last 64K of address space (a 50% performance penalty for all address computations to protect against a case that in practice would never happen), so roping off that area as permanently invalid was a more prudent choice.
  • The Old New Thing

    In Explorer, you can right-click the icon in the caption

    In Explorer, you can right-click the icon in the caption to get the context menu for the folder you are viewing. (Very handy for "Search" or "Command Prompt Here".) Apparently not enough people realize this. In Windows 95, we tried to make it so most icons on the screen did something interesting when you right-clicked them.
  • The Old New Thing

    I'm doing this instead of writing a book


    Some commenters mentioned that I should write a book. It turns out that writing a book is hard.

    A few years ago, MS Press actually approached me about writing a book for them. But I declined because the fashion for technical books is to take maybe fifty pages of information and pad it to a 700-page book, and I can't write that way. None of my topics would ever make it to a 100-page chapter. They're just little page-and-a-half vignettes. And it's not like the world needs yet another book on Win32 programming.

    So I'll just continue to babble here. It's easier.

  • The Old New Thing

    Low-tech usability testing

    My pal Jason Moore discusses using paper prototypes as a fast way to get usability feedback. I found it interesting that by going low-tech, you actually get better feedback, because people are more willing to criticize a paper model than running code. (And another advantage of the paper model is that you can make changes on the fly. If during the session you get the idea, "Maybe if I did it this way," you can grab a piece of paper, write on it, and insert it into the session instantly. Try doing that with running code.)
  • The Old New Thing

    Stupid memory-mapping tricks


    Shared memory is not just for sharing memory with other processes. It also lets you share memory with yourself in sneaky ways.

    For example, this sample program (all error checking and cleanup deleted for expository purposes) shows how you can map the same shared memory into two locations simultaneously. Since they are the same memory, modifications to one address are reflected at the other.

    #include <windows.h>
    #include <stdio.h>

    void __cdecl main(int argc, char **argv)
    {
        HANDLE hfm = CreateFileMapping(INVALID_HANDLE_VALUE, NULL,
                        PAGE_READWRITE, 0, sizeof(DWORD), NULL);
        // Map the same section twice to get two views of one block
        LPDWORD pdw1 = (LPDWORD)MapViewOfFile(hfm, FILE_MAP_WRITE,
                                              0, 0, sizeof(DWORD));
        LPDWORD pdw2 = (LPDWORD)MapViewOfFile(hfm, FILE_MAP_WRITE,
                                              0, 0, sizeof(DWORD));
        printf("Mapped to %x and %x\n", pdw1, pdw2);
        printf("*pdw1 = %d, *pdw2 = %d\n", *pdw1, *pdw2);
        /* Now watch this */
        *pdw1 = 42;
        printf("*pdw1 = %d, *pdw2 = %d\n", *pdw1, *pdw2);
    }

    This program prints

    Mapped to 280000 and 290000
    *pdw1 = 0, *pdw2 = 0
    *pdw1 = 42, *pdw2 = 42

    (Missing asterisks added, 8am - thanks to commenter Tom for pointing this out.)

    The addresses may vary from run to run, but observe that the memory did get mapped to two different addresses, and changing one value to 42 magically changed the other.

    This is a nifty consequence of the way shared memory mapping works. I stumbled across it while investigating how I could copy large amounts of memory without actually copying it. The solution: Create a shared memory block, map it at one location, write to it, then unmap it from the old location and map it into the new location. Presto: The memory instantly "moved" to the new location. This is a major win if the memory block is large, since you didn't have to allocate a second block, copy it, then free the old block - the memory block doesn't even get paged in.

    It turns out I never actually got around to using this trick, but it was a fun thing to discover anyway.
  • The Old New Thing

    Researchers discover link between music and drinking

    A British scientific study shows that a bit of classical music can persuade diners to buy more fancy coffees, pricey wines and luxurious desserts. "North has shown that playing German or French music can persuade diners to buy wine from those countries." I found this to be true in my experience. If you get two thousand people in a tent and play live oom-pah music, they end up drinking lots of German beer.