October, 2012

  • The Old New Thing

    Using WM_COPYDATA to marshal message parameters since the window manager otherwise doesn't know how

    Miral asks for the recommended way of passing messages across processes if they require custom marshaling.

    There is no one recommended way of doing the custom marshaling, although some are hackier than others.

    Probably the most architecturally beautiful way of doing it is to use a mechanism that does perform automatic marshaling, like COM and MIDL. Okay, it's not actually automatic, but it does allow you to just give MIDL your structures and some information about how they should be interpreted, and the MIDL compiler autogenerates the marshaler. You can then pass the data back and forth by simply invoking COM methods and letting COM do the work.

    Architecturally beautiful often turns into forcing me to learn more than I really wanted to learn, so here's a more self-contained approach: Take advantage of the WM_COPY­DATA message. This is sort of the poor-man's marshaler. All it knows how to marshal is a blob of bytes. It's your responsibility to take what you want to marshal and serialize it into a blob of bytes. WM_COPY­DATA will get the bytes to the other side, and then the recipient needs to deserialize the blob of bytes back into your data. But at least WM_COPY­DATA does the tricky bit of getting the bytes from one side to the other.

    Let's start with our scratch program and have it transfer data to another copy of itself. Make the following changes:

    #include <strsafe.h>

    HWND g_hwndOther;

    #define CDSCODE_WINDOWPOS 42 // lpData -> WINDOWPOS

    void OnWindowPosChanged(HWND hwnd, LPWINDOWPOS pwp)
    {
     if (g_hwndOther) {
      // Describe the blob: what it is, how big it is, and where it lives.
      COPYDATASTRUCT cds;
      cds.dwData = CDSCODE_WINDOWPOS;
      cds.cbData = sizeof(WINDOWPOS);
      cds.lpData = pwp;
      SendMessage(g_hwndOther, WM_COPYDATA,
                  reinterpret_cast<WPARAM>(hwnd),
                  reinterpret_cast<LPARAM>(&cds));
     }
     FORWARD_WM_WINDOWPOSCHANGED(hwnd, pwp, DefWindowProc);
    }

    void OnCopyData(HWND hwnd, HWND hwndFrom, PCOPYDATASTRUCT pcds)
    {
     switch (pcds->dwData) {
     case CDSCODE_WINDOWPOS:
      // Check the size before touching the data.
      if (pcds->cbData == sizeof(WINDOWPOS)) {
       LPWINDOWPOS pwp = static_cast<LPWINDOWPOS>(pcds->lpData);
       TCHAR szMessage[256];
       StringCchPrintf(szMessage, 256,
        TEXT("From window %p: x=%d, y=%d, cx=%d, cy=%d, flags=%s %s"),
        hwndFrom, pwp->x, pwp->y, pwp->cx, pwp->cy,
        (pwp->flags & SWP_NOMOVE) ? TEXT("nomove") : TEXT("move"),
        (pwp->flags & SWP_NOSIZE) ? TEXT("nosize") : TEXT("size"));
       SetWindowText(hwnd, szMessage);
      }
      break;
     }
    }

    // WndProc
        HANDLE_MSG(hwnd, WM_WINDOWPOSCHANGED, OnWindowPosChanged);
        HANDLE_MSG(hwnd, WM_COPYDATA, OnCopyData);

    // WinMain
        // If there is another window called "Scratch", then it becomes
        // our recipient.
        g_hwndOther = FindWindow(TEXT("Scratch"), TEXT("Scratch"));

        hwnd = CreateWindow(
            TEXT("Scratch"),                /* Class Name */
            g_hwndOther ? TEXT("Sender") : TEXT("Scratch"),
            WS_OVERLAPPEDWINDOW,            /* Style */
            CW_USEDEFAULT, CW_USEDEFAULT,   /* Position */
            CW_USEDEFAULT, CW_USEDEFAULT,   /* Size */
            NULL,                           /* Parent */
            NULL,                           /* No menu */
            hinst,                          /* Instance */
            0);                             /* No special parameters */

    Just to make it easier to tell the two windows apart, I call the one sending the message "Sender". (Note that my method for finding the other window is pretty rudimentary, because that's not the point of the example.)

    Whenever the sender window receives a WM_WINDOW­POS­CHANGED message, it sends a copy of the WINDOW­POS structure to the recipient, which then displays it in its own title bar. Things to observe:

    • The value you put into dwData can be anything you like. It's just another DWORD of data. Traditionally, it's used like a "message number", used to communicate what type of data is being sent. In our case, we choose 42 to mean "The lpData points to a WINDOW­POS structure."
    • The cbData is the number of bytes you want to send, and lpData points to the buffer. In our case, the number of bytes is always the same, but variable-sized data is also fine.
    • The lpData can point anywhere, as long as the memory is valid for the lifetime of the Send­Message call. In this case, I just point it at the data given to me by the window manager. Of course, if you allocated memory to put into lpData, then the responsibility for freeing it also belongs to you.
    • For safety's sake, I validate that when I get a CDS­CODE_WINDOW­POS request, the associated data really is the size of a WINDOW­POS structure. This helps protect against a rogue caller who tries to crash the application by sending a CDS­CODE_WINDOW­POS with a size less than sizeof(WINDOW­POS), thereby triggering a buffer overflow. (Exercise: Under what other conditions can the size be incorrect? How would you fix that?)
    • The WM_COPY­DATA copies data in only one direction. It does not provide a way to pass information back to the sender. (Exercise: How would you pass information back?)
    • The hwndFrom parameter is a courtesy parameter, like dwData. There is currently no attempt to verify that the window really is that of the sender. (In practice, all that could really be verified is that the window belongs to the thread that is doing the sending, but right now, not even that level of validation is performed.)
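
    The points above about variable-sized data and size validation can be sketched without any Win32 calls. What follows is a minimal illustration, not part of the scratch program: the length-prefixed layout and the SerializeName and DeserializeName helpers are hypothetical names made up for this sketch. The sender flattens a string into a blob of bytes (the thing you would point lpData at, with cbData set to the blob size), and the recipient rejects any blob whose size does not agree with its contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>
#include <vector>

// Hypothetical wire format (not part of any Windows API):
// a 32-bit byte count followed by exactly that many bytes of text.
std::vector<std::uint8_t> SerializeName(const std::string& name)
{
    // Assumption: names are short enough for a 32-bit length prefix.
    assert(name.size() <= std::numeric_limits<std::uint32_t>::max());
    const std::uint32_t cch = static_cast<std::uint32_t>(name.size());

    std::vector<std::uint8_t> blob(sizeof(cch) + name.size());
    std::memcpy(blob.data(), &cch, sizeof(cch));
    if (cch != 0) {
        std::memcpy(blob.data() + sizeof(cch), name.data(), cch);
    }
    return blob;
}

// The recipient treats the blob as untrusted: it must be large enough to
// hold the length prefix, and the remaining bytes must match that length
// exactly. Returns false (leaving *name untouched) if either check fails.
bool DeserializeName(const void* data, std::size_t cb, std::string* name)
{
    std::uint32_t cch;
    if (data == nullptr || cb < sizeof(cch)) {
        return false;
    }
    std::memcpy(&cch, data, sizeof(cch));
    if (cb - sizeof(cch) != cch) {
        return false;
    }
    name->assign(static_cast<const char*>(data) + sizeof(cch), cch);
    return true;
}
```

    In a real sender you would point cds.lpData at blob.data() and set cds.cbData to blob.size() (cast to DWORD) before calling SendMessage, and the recipient would hand pcds->lpData and pcds->cbData to DeserializeName. The sketch assumes both processes agree on byte order and on the size of the length prefix, which holds for two copies of the same program on the same machine.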

    The WM_COPY­DATA message is suitable for small-to-medium-sized amounts of memory. Though if the amount of memory is so small that it fits into a WPARAM and LPARAM, then even WM_COPY­DATA is overkill.

    If you're going to be passing large chunks of memory, then you may want to consider using a shared memory handle instead. The shared memory handle also has the benefit of being shared, which means that the recipient can modify the shared memory block, and the sender can see the changes. (Yes, this is one answer to the second exercise, but see if you can find another answer that stays within the spirit of the exercise.)

  • The Old New Thing

    What's the difference between F5 and F8 at the boot screen?


    Ian B wondered what the difference is between pressing F5 and F8 while Windows is booting.

    I have no idea either. My strategy is to just mash on the function keys, space bar, DEL key, anything else I can think of. Keep pressing them all through the boot process, and maybe a boot menu will show up.

    The F5 hotkey was introduced in Windows 95, where the boot sequence hotkeys were as follows:

    • ESC - Boot in text mode.
    • F5 - Boot in Safe Mode.
    • Shift+F5 - Boot to Safe Mode MS-DOS.
    • Ctrl+F5 - Boot to Safe Mode MS-DOS with drive compression disabled.
    • Alt+F5 - Boot with LOADTOP=0 for Japanese systems.
    • F6 - Boot in Safe Mode with networking.
    • F4 - Boot to previous version of MS-DOS.
    • Ctrl+F4 - Boot to previous version of MS-DOS with drive compression disabled.
    • F8 - Boot to menu.
    • Shift+F8 - Boot with step-by-step confirmation.
    • Ctrl+F8 - Boot with step-by-step confirmation with drive compression disabled.

    Man, that's an insane number of boot options all buried behind obscure function keys. Boy am I glad we got rid of them. This frees up room in my brain for things like Beanie Baby trivia.

    Bonus chatter: The next generation of computers boots so fast that there's no time to hit any of these hotkeys!

  • The Old New Thing

    The TEMP directory is like a public hot tub whose water hasn't been changed in over a year


    A customer reported that they couldn't install product X. When they ran the installer, they got the error message

    setup.exe - Application Error
    The application was unable to start correctly (0xc00000ba). Click OK to close the application.

    The product X setup team weren't sure what to make of this, and they asked if anybody had any ideas.

    The error code 0xc00000ba is STATUS_FILE_IS_A_DIRECTORY, which means that something was supposed to be a file, but instead it was a directory. The path-searching algorithm is not a backtracking algorithm, so once it finds something wrong, it just stops rather than backing up and trying the next directory.

    This was enough of a clue to direct further investigation, which revealed that the customer had a directory named C:\Users\Bob\AppData\Local\Temp\version.dll\. The customer responded, "There are plenty of directories with names of DLLs in my TEMP directory, but getting rid of this one fixes the issue. Thanks!"

    (Puzzle: Why are there so many directories with the names of DLLs? Psychic answer.)

    I slipped something past you a little while back. Did you notice?

    Okay, I gave it away in the subject line. The setup program is running from the TEMP directory. That should already set off alarm bells.

    The TEMP directory is a dumping ground of random junk. A downloader may have put a DLL there and forgotten to delete it. (Or worse, expected it to stay there forever.) And that DLL might be from an incompatible version of some DLL your setup program uses. (I have seen applications ship their own custom versions of system DLLs! Yeah, because the x86 version of shlwapi.dll from Windows 2000 is drop-in compatible with the version of shlwapi.dll that comes with Windows 7.) Who knows what other yucky things have been lying around in that directory. Since the application directory is the first directory searched when the system looks for a DLL, a rogue DLL in the TEMP directory is a trap waiting to be sprung. (A similar issue applies to a shared Downloads directory.)

    It's like the horror movie trope where the frightened pretty girl runs into a room, slams the door shut, then breathes a sigh of relief, believing herself to be safe. But she didn't check that the room was empty! (In other words, she created her airtight hatchway around an insecure room.)

    The product X setup team decided to change their installer so that it created a subdirectory of TEMP and extracted the main setup program there. That way, it got a fresh hot tub with clean water.
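
    The fix the setup team chose can be sketched in portable C++17. This is an illustration under stated assumptions rather than what the product X installer actually does: CreateFreshSetupDirectory and the "setup-" prefix are hypothetical, and std::filesystem stands in for whatever API the real installer uses. The key property is that a directory is accepted only if this call created it, so nothing that was already lying around in TEMP can end up next to the extracted setup program.

```cpp
#include <cassert>
#include <filesystem>
#include <random>
#include <stdexcept>
#include <string>

// Hypothetical helper: creates a brand-new, empty subdirectory of the
// temporary directory and returns its path.
std::filesystem::path CreateFreshSetupDirectory()
{
    namespace fs = std::filesystem;
    std::random_device rd;
    const fs::path base = fs::temp_directory_path();

    for (int attempt = 0; attempt < 100; ++attempt) {
        const fs::path candidate = base / ("setup-" + std::to_string(rd()));
        // create_directory returns false if the directory already exists,
        // so a name that somebody else got to first is never accepted.
        if (fs::create_directory(candidate)) {
            assert(fs::is_empty(candidate));
            return candidate;
        }
    }
    throw std::runtime_error("could not create a fresh setup directory");
}
```

    The installer would then extract the setup program and its DLLs into the returned directory and run it from there. A production installer would presumably also want to restrict who can write to the new directory; that part is outside this sketch.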

    Remember, the directory is the application bundle. If you drop your application into a random directory, you've just added everything in that directory to your bundle. And if you don't secure your application directory, you're allowing anybody to add components to your bundle. That's one of the reasons why the Logo program encourages (requires?) applications to install into the Program Files directory: The ACLs on the Program Files directory allow write access only to administrators and installers. This makes the application bundle secure by default. If you want to make your application bundle insecure, you have to go out of your way.

  • The Old New Thing

    Keyboard layouts aren't like Beetlejuice - they won't appear just because you say their name


    A customer reported a bug in Windows Vista Home Editions:

    We are handling a Ctrl+V keyboard event and want to interpret it in the context of a US-English keyboard.

    // This keyState represents no keys pressed except for Ctrl
    BYTE keyState[256] = {0};
    keyState[VK_CONTROL] = 0x80;
    // This is the handle for the US-English keyboard
    HKL hkl = (HKL) 0x04090409;
    // output variables
    wchar_t outChar[2];
    WORD outWord;
    ToUnicodeEx('V', 47, keyState, outChar, 2, 0, hkl);
    ToAsciiEx('V', 47, keyState, &outWord, 0, hkl);
    VkKeyScanEx('V', hkl);

    On Windows XP and versions of Windows Vista other than Home editions, the three calls all succeed, whereas on Windows Vista Home Editions, the calls fail. On the other hand, if instead of using the US-English keyboard, we use the current keyboard layout:

    HKL hkl = GetKeyboardLayout(GetCurrentThreadId());

    then Windows Vista Home Editions behave the same as Windows XP and non-Home editions of Vista.

    This suggests that the Home Editions of Vista support keyboard queries only for the currently active keyboard layout, which renders useless the last parameter to those three functions.

    Notice how the customer's sample code just synthesizes a keyboard layout handle from thin air. While it is true that the format of keyboard layout handles is documented, that doesn't mean that you can just make one up and start using it.

    It's like saying, "I know that Contoso uses the email address format Firstname.Lastname@contoso.com, but I just tried to send email to Bob.Smith@contoso.com, and it bounced."

    Does Bob work at Contoso?

    "No. Does that matter?"

    The customer's code blindly assumes that the US-English keyboard layout is loaded rather than calling Load­Keyboard­Layout to actually load it. As a result, if the keyboard layout is not loaded, the call will fail because you passed an invalid keyboard layout handle.

    The customer liaison asked, "Is this documented somewhere that the HKL has to be created only from the functions and cannot be assigned a value?"

    Um, yeah, it's right there in the documentation of the hkl parameter to the To­Unicode­Ex function. (Emphasis mine.)

    dwhkl [in, optional]

    Type: HKL

    The input locale identifier used to translate the specified code. This parameter can be any input locale identifier previously returned by the Load­Keyboard­Layout function.

    Identical text appears in the documentation of the hkl parameter to the To­Ascii­Ex and Vk­Key­Scan­Ex functions as well.

    The difference observed on Windows Vista Home Editions, then, is that on those systems, in the configurations the customer happens to be using, US-English is not a preloaded keyboard layout.

  • The Old New Thing

    Irony patrol: Recycling bins


    Microsoft has a large corporate recycling effort. Every office, every mail room, every kitchenette, every conference room has a recycling bin. The dining facilities earned Green Restaurant Certification, and there is a goal of making the cafeterias a zero-landfill facility by 2012. (Hey, that's this year!)

    A few years ago, I found one room in my building that didn't have a recycling bin, and you'd think it'd be one of the rooms near the top of the list for needing one.

    The room without a recycling bin was the copy machine room.

    As a result, people were throwing their unwanted cover sheets and other paper waste into the regular garbage.

    I decided to be somebody. I took a recycling bin from an unused office and moved it into the copy room.

    Bonus recycling bin irony: For many years, each office had three recycling bins, each labeled for its intended contents: white paper, mixed paper, and aluminum cans. Improvements in automated sorting technology removed the need to separate these recyclables manually, and in 2008, all three recycling bins were replaced with a single recycle bin, which was labeled with the simple three-arrow recycling logo.

    The irony is that Microsoft was going to toss all the old recycling bins into a landfill because they couldn't find anybody who wanted them.

    Alert Microsoft employee Tom Roth found the right people in Building Facilities and got them to stop the trucks as they were about 100 feet from dumping 40,000 perfectly good plastic bins into a landfill. Tom's son Justin works in the recycling industry, and he used his contacts to get the word out, and soon requests for recycling bins were coming in from all over the state of Washington. It took three months, but they eventually found homes for all of the recycle bins.

    Ironic disaster averted.

  • The Old New Thing

    In the conversion to 64-bit Windows, why were some parameters not upgraded to SIZE_T?


    James wonders why many functions kept DWORD for parameter lengths instead of upgrading to SIZE_T or DWORD_PTR.

    When updating the interfaces for 64-bit Windows, there were a few guiding principles. Here are two of them.

    • Don't change an interface unless you really need to.
    • Do you really need to?

    Changing an interface causes all sorts of problems when porting. For example, if you change the parameters to a COM interface, then you introduce a breaking change in everybody who implements it. Consider this hypothetical interface:

    // namedobject.idl
    interface INamedObject : IUnknown
    {
        HRESULT GetName([out, string, size_is(cchBuf)] LPWSTR pszBuf,
                        [in] DWORD cchBuf);
    };

    And here's a hypothetical implementation:

    // contoso.cpp
    class CBasicNamedObject : public INamedObject
    {
        HRESULT GetName(LPWSTR pszBuf, DWORD cchBuf)
        {
            return StringCchPrintfW(pszBuf, cchBuf, L"Basic");
        }
    };

    Okay, now it's time to 64-bit-ize this puppy. So you do the natural thing: Grow the DWORD parameter to DWORD_PTR. Since DWORD_PTR maps to DWORD on 32-bit systems, this is a backward-compatible change.

    // namedobject.idl
    interface INamedObject : IUnknown
    {
        HRESULT GetName([out, string, size_is(cchBuf)] LPWSTR pszBuf,
                        [in] DWORD_PTR cchBuf);
    };

    Then you recompile the entire operating system and find that the compiler complains, "Cannot instantiate abstract class: CBasicNamedObject." Oh, right, that's because the INamedObject::GetName method in the implementation no longer matches the method in the base class, so the method in the base class is not overridden. Fortunately, you have access to the source code for contoso.cpp, and you can apply the appropriate fix:

    // contoso.cpp
    class CBasicNamedObject : public INamedObject
    {
        HRESULT GetName(LPWSTR pszBuf, DWORD_PTR cchBuf)
        {
            return StringCchPrintfW(pszBuf, cchBuf, L"Basic");
        }
    };

    Yay, everything works again. A breaking change led to a compiler error, which led you to the fix. The only consequence (so far) is that the number of "things that code being ported from 32-bit Windows to 64-bit Windows needs to watch out for" has been incremented by one. Of course, too much of this incrementing, and the list of things becomes so long that developers are going to throw up their hands and say "Porting is too much work, screw it." Don't forget, the number of breaking API changes in the conversion from 16-bit to 32-bit Windows was only 117.

    You think you fixed the problem, but you didn't. Because there's another class elsewhere in the Contoso project.

    class CSecureNamedObject : public CBasicNamedObject
    {
        HRESULT GetName(LPWSTR pszBuf, DWORD cchBuf)
        {
            if (IsAccessAllowed())
                return CBasicNamedObject::GetName(pszBuf, cchBuf);
            else
                return E_ACCESSDENIED;
        }
    };

    The compiler did not raise an error on CSecure­Named­Object because that class is not abstract. The INamed­Object::Get­Name method from the INamed­Object interface is implemented by CBasic­Named­Object. All abstract methods have been implemented, so no "instantiating abstract class" error.

    On the other hand, the CSecure­Named­Object method wanted to override the base method, but since its parameter list didn't match, it ended up creating a separate method rather than an override. (The override pseudo-keyword not yet having been standardized.) As a result, when somebody calls the INamed­Object::Get­Name method on your CSecure­Named­Object, they don't get the one with the security check, but rather the one from CBasic­Named­Object. Result: Security check bypassed.
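
    The trap is easy to reproduce in a few lines of standard C++. The sketch below uses hypothetical stand-ins (std::string and plain integer types instead of COM and Windows types), so it is a model of the situation, not the code from the Contoso project. The base method's parameter has been widened, the derived class still declares the old parameter type, and the call through the interface quietly lands on the base implementation.

```cpp
#include <cassert>
#include <string>

struct INamedObject {
    virtual ~INamedObject() {}
    // After the hypothetical 64-bit change: parameter widened to 64 bits.
    virtual std::string GetName(unsigned long long cchBuf) = 0;
};

struct CBasicNamedObject : INamedObject {
    // Updated to match the new signature, so this one really overrides.
    std::string GetName(unsigned long long) override { return "Basic"; }
};

struct CSecureNamedObject : CBasicNamedObject {
    // Still declares the old 32-bit parameter. This is a brand-new method
    // that hides the base one; it does not override it.
    std::string GetName(unsigned int) { return "E_ACCESSDENIED"; }
};

void DemonstrateBypass()
{
    CSecureNamedObject secure;
    INamedObject& asInterface = secure;

    // Callers that go through the interface never reach the "secure" method.
    assert(asInterface.GetName(260) == "Basic");

    // Only a direct call on the derived type finds the new method.
    assert(secure.GetName(260u) == "E_ACCESSDENIED");
}
```

    With a compiler that supports C++11, writing override on CSecureNamedObject::GetName turns this silent mismatch into a compile-time error, because the declaration no longer overrides anything.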

    These are the worst types of breaking changes: The ones where the compiler doesn't tell you that something is wrong. Your code compiles, it even basically runs, but it doesn't run correctly. Now, sure, the example I gave would have been uncovered in security testing, but I chose that just for drama. Go ahead and substitute something much more subtle. Like say, invalidating the entire desktop when you pass NULL to Invalidate­Rect.

    Okay, so let's look back at those principles. Do we really need to change this interface? The only case where expanding to SIZE_T would make a difference is if an object had a name longer than 4 billion characters, the most a 32-bit DWORD can count. Is that a realistic end-user scenario? Not really. Therefore, don't change it.

    Remember, you want to make it easier for people to port their program to 64-bit Windows, not harder. The goal is to make customers happy, not create the world's most architecturally pure operating system. And customers aren't happy when the operating system can't run their programs (because every time the vendor tries to port it, they keep stumbling over random subtle behavior changes that break their program).

  • The Old New Thing

    Keyboard shortcut for resizing all columns in a listview control to fit


    The keyboard shortcut for resizing all columns in a report-mode (also known as Details mode) list view control to fit the current content width is Ctrl+Num+. That's the + key on the numeric keypad. (If you're using Explorer, you can also right-click the column header and choose Size All Columns to Fit.)

    Note that this command is a verb, not a state, so it takes into account the contents of the listview at the time you press the hotkey. If the contents change and you want the columns to resize based on the new contents, you'll have to press the hotkey again.

  • The Old New Thing

    The cries of "Oh no!" emerge from each office as the realization slowly dawns


    Today is the (approximate) 15th anniversary of the Bedlam Incident. To commemorate that event, here's a story of another email incident gone horribly awry.

    Some time ago, an email message was sent to a large mailing list. It came from somebody in the IT department and said roughly, "This is a mail sent on behalf of Person X to check if your XYZ server has migrated to the new datacenter. Please visit http://blahblah and confirm that your server name is of the form XXYYNNZZ. If not, please Reply to All."


    The seasoned Microsoft employees (and the new employees who paid attention during new employee orientation) recognized the monster that was about to be unleashed, and the cries of "Oh no!" could be heard emerging from each office as the realization dawned.

    And then it started. All the replies from people saying "I'm still on the old datacenter." And then the replies from people saying, "Stop replying!"

    What's frustrating is that you can't do anything about the catastrophe that is unfolding. Any attempt to reply to the message telling people to stop replying only makes the problem worse. All you can do is stand back and wait for the fire to burn itself out.

    Ten minutes later, Person X sent a message to the mailing list. "Please DO NOT Reply all to this email thread. I am working with the IT department to see if there is another way to get this information."

    It took another ten minutes for the messages to finally stop, but that seems to have shut things down. Now it's time for blame and speculation!

    We were never told whose brilliant idea it was to try to gather information by sending mail to 7000 people telling them to reply all. One theory was that Person X went to the IT department saying "Hi, I'd like to collect information XYZ from this large list of people. Can you help?" And some new hire in the IT department said, "Sure, I can get that information for you. I'll just send everybody some email!"

    After the dust settled, somebody made a tally of the damage.

    Number of people on the mailing list: around 7000.
    Number of replies: 70.
    Of those, number of replies saying "stop replying": 17.

    To commemorate this event, a colleague of mine who maintains a popular internal tool pushed out an upgraded version. The new version had a checkbox on the main page:

    I have not been migrated to the new datacenter.

    Bonus chatter: It so happens that the message was sent at the beginning of the summer, and most of the "I'm still on the old datacenter" replies came from summer interns. Maybe it was a test.

  • The Old New Thing

    You can't use the WM_USER message in a dialog box


    Today, I'm not actually going to say anything new. I'm just going to collate information I've already written under a better title to improve search engine optimization.

    A customer reported that they did the following but found that it didn't work:

    #define MDM_SETITEMCOUNT WM_USER
    INT_PTR CALLBACK MyDlgProc(HWND hdlg, UINT wm, WPARAM wParam, LPARAM lParam)
    {
      switch (wm) {
      case MDM_SETITEMCOUNT:
        SetDlgItemInt(hdlg, IDC_ITEMCOUNT, (UINT)wParam, FALSE);
        return TRUE;
      }
      return FALSE;
    }

    "I send the MDM_SET­ITEM­COUNT message to my dialog, but the value doesn't stick. At random times, the value resets back to zero."

    As we saw some time ago, window messages in the WM_USER range belong to the window class. In the case of a dialog box, the window class is the dialog class, and the owner of the class is the window manager itself. An application which tries to use the WM_USER message is using window messages it does not own.

    It so happens that the dialog manager already defined the WM_USER message:

    #define DM_GETDEFID         (WM_USER+0)

    We saw this problem some time ago when we tried to find a message we could use for custom use in a dialog box.

    What the customer is seeing is that whenever the dialog manager sends a DM_GET­DEF­ID message to the dialog box to get the default control ID, the MyDlgProc function mistakenly thinks that it's a MDM_SET­ITEM­COUNT message and sets the item count to whatever happens to be in the wParam (which happens to be zero). On top of that, it claims to have handled the message, which means that the current value of DWL_MSG­RESULT is returned to the sender (probably zero), so the dialog manager thinks that there is no default ID on the dialog.

    The solution, as noted in that same article, is to use WM_APP instead of WM_USER. Because you don't have permission to define messages in the WM_USER range if you aren't the owner of the window class.
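
    To see why the value appears to reset at random times, here is a toy model of the situation in standard C++. It is only a sketch: ToyDialog and CountAfterDefIdQuery are hypothetical, and the constants are local copies of the winuser.h values for WM_USER (0x0400) and WM_APP (0x8000), renamed so they cannot clash with the real macros. The model sends the private "set item count" message and then the dialog manager's DM_GETDEFID, whose wParam is zero.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the winuser.h values (renamed to avoid the real macros).
constexpr std::uint32_t kWmUser = 0x0400;          // WM_USER
constexpr std::uint32_t kWmApp = 0x8000;           // WM_APP
constexpr std::uint32_t kDmGetDefId = kWmUser + 0; // DM_GETDEFID

// Hypothetical toy dialog procedure: when it sees its private message it
// stores wParam as the item count and claims to have handled the message.
struct ToyDialog {
    explicit ToyDialog(std::uint32_t msg)
        : setItemCountMessage(msg), itemCount(0) {}

    bool DlgProc(std::uint32_t msg, std::uintptr_t wParam)
    {
        if (msg == setItemCountMessage) {
            itemCount = wParam;
            return true;
        }
        return false;
    }

    std::uint32_t setItemCountMessage; // message number chosen by the app
    std::uintptr_t itemCount;
};

// Sets the count to 5, then lets the "dialog manager" ask for the default
// ID, and reports the count that survives.
std::uintptr_t CountAfterDefIdQuery(std::uint32_t setItemCountMessage)
{
    ToyDialog dlg(setItemCountMessage);
    dlg.DlgProc(setItemCountMessage, 5);
    assert(dlg.itemCount == 5);
    dlg.DlgProc(kDmGetDefId, 0);
    return dlg.itemCount;
}
```

    With the private message defined as WM_USER, CountAfterDefIdQuery returns 0 because the DM_GETDEFID query is mistaken for a request to set the count; with WM_APP it returns 5.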

  • The Old New Thing

    Usage guidance for a popcorn machine in the kitchenette


    My colleague KC Lemson tipped me off to a sign hanging next to a popcorn machine in one of the kitchens:


    Please do not leave
    popcorn kettle on
    while unattended.
    The fire truck will
    come, and they don't
    want any popcorn.


    A friend of mine happened to have a chat with a fire fighter who used to be assigned to the fire station nearest to Microsoft's main campus. According to him, the top three reasons for being called to a Microsoft building are (in no particular order)

    • Burnt popcorn.
    • Somebody has a panic attack and mistakes it for a heart attack.
    • Somebody pulls the fire alarm because they are under a lot of stress because they missed a deadline, and they think pulling the fire alarm will buy them some time. (Where do they think they are, college?)

    This week is Fire Prevention Week.
