• The Old New Thing

    A function pointer cast is a bug waiting to happen

    • 35 Comments

    A customer reported an application compatibility bug in Windows.

    We have some code that manages a Win32 button control. During button creation, we subclass the window by calling Set­Window­Subclass. On the previous version of Windows, the subclass procedure receives the following messages, in order:

    • WM_WINDOWPOSCHANGING
    • WM_NCCALCSIZE
    • WM_WINDOWPOSCHANGED

    We do not handle any of these messages and pass them through to Def­Subclass­Proc. On the latest version of Windows, we get only the first two messages, and comctl32 crashes while it's handling the third message before it gets a chance to call us. It looks like it's reading from invalid memory.

    The callback function goes like this:

    LRESULT ButtonSubclassProc(
        HWND hwnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam,
        UINT_PTR idSubclass,
        DWORD_PTR dwRefData);
    

    We install the subclass function like this:

    SetWindowSubclass(
        hwndButton,
        reinterpret_cast<SUBCLASSPROC>(ButtonSubclassProc),
        id,
        reinterpret_cast<DWORD_PTR>(pInfo));
    

    We found that if we changed the callback function declaration to

    LRESULT CALLBACK ButtonSubclassProc(
        HWND hwnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam,
        UINT_PTR idSubclass,
        DWORD_PTR dwRefData);
    

    and install the subclass function like this:

    SetWindowSubclass(
        hwndButton,
        ButtonSubclassProc,
        id,
        reinterpret_cast<DWORD_PTR>(pInfo));
    

    then the problem goes away. It looks like the new version of Windows introduced a compatibility bug; the old code works fine on all previous versions of Windows.

    Actually, you had the problem on earlier versions of Windows, too. You were just lucky that the bug wasn't a crashing bug. But now it is.

    This is a classic case of mismatching the calling convention. The SUB­CLASS­PROC function is declared as requiring the CALLBACK calling convention (which on x86 maps to __stdcall), but the code declared it without any calling convention at all, and the ambient calling convention was __cdecl. When they went to compile the code, they got a compiler error that said something like this:

    error C2664: 'SetWindowSubclass' : cannot convert parameter 2 from 'LRESULT (__cdecl *)(HWND,UINT,WPARAM,LPARAM,UINT_PTR,DWORD_PTR)' to 'SUBCLASSPROC'

    "Since the compiler was unable to convert the parameter, let's give it some help and stick a cast in front. There, that shut up the compiler. Those compiler guys are so stupid. They can't even figure out how to convert one function pointer to another. I bet they need help wiping their butts when they go to the bathroom."

    And there you go, you inserted a cast to shut up the compiler and masked a bug instead of fixing it.

    The only thing you can do with a function pointer after casting it is to cast it back to its original type.¹ If you try to use it as the cast type, you will crash. Maybe not today, maybe not tomorrow, but someday.

    In this case, the calling convention mismatch resulted in the stack being mismatched when the function returns. It looks like earlier versions of Windows managed to hobble along long enough before things got resynchronized (by an EBP frame restoration, most likely) so the damage didn't spread very far. But the new version of Windows, possibly one compiled with more aggressive optimizations, ran into trouble before things resynchronized, and thus occurred the crash.

    The compiler was yelling at you for a reason.

    It so happened that the Windows application compatibility team had already encountered this problem in their test labs, and a shim had already been developed to auto-correct this mistake. (Actually, the shim also corrects another mistake they hadn't noticed yet: They forgot to call Remove­Window­Subclass when they were done.)

    ¹I refer here to pointers to static functions. Pointers to member functions are entirely different animals.

  • The Old New Thing

    Your program loads libraries by their short name and you don't even realize it

    • 20 Comments

    In the discussion of the problems that occur if you load the same DLL by both its short and long names, Xepol asserted that any program which loads a DLL by its short name "would have ONLY itself to blame for making stupid, unpredictable, asinine assumptions" and that Windows should "change the loader to NOT load any dll with a short name where the short name and long name do not match."

    The problem with this rule, well, okay, one problem is that you're changing the rules after the game has started. Programs have been allowed to load DLLs by their short name in the past; making it no longer possible creates an application compatibility hurdle.

    Second, it may not be possible to obtain the long name for a DLL. If the user has access to the DLL but not to one of the directories in the path to the DLL, then the attempt to get its long name will fail. The "directory you can access hidden inside a directory you cannot access" directory structure is commonly used as a less expensive alternative to Access Based Enumeration. Maybe your answer to this is to say that this is not allowed; people who set up their systems this way deserve to lose.

    Third, the application often does not control the path being passed to Load­Library, so it doesn't even know that it's making a stupid, unpredictable, asinine assumption. For example, the program may call Get­Module­File­Name to obtain the directory the application was installed to, then attempt to load satellite DLLs from that same directory. If the current directory was set by passing a short name to Set­Current­Directory (try it: cd C:\Progra~1) then the program will unwittingly be making a stupid, unpredictable, asinine decision. (But since the decision is being made consistently, there was no problem up until now.)

    When you call Co­Create­Instance, there is nearly always a Load­Library happening behind the scenes, because the object you are trying to create is coming from a DLL that somebody else registered. If they registered it with a short name, then any application that wanted to use that object runs afoul of the new rule, even though the application did nothing wrong.

    Now, you might argue that this just means that the component provider is the one who made a stupid, unpredicable, asinine assumption. But that may not have been a stupid assumption at the time: 16-bit applications see only short names, so it might have been that that's the only thing they can do. Or the component may have made a conscious decision to register under short names; this was common in the Windows 95 era because file names often passed through components that didn't understand long file names. (Examples: 16-bit applications, backup applications.)

    Even if you decide that what was once a reasonable decision is now asinine, you have to get there from here. Do you declare all 16-bit applications asinine (because they can only use those asinine short file name)? Even for the 32-bit components, how do you convince them to switch over? Can you even get them to switch over? Until they do, their component will not be accessible: The shell extension won't be loaded until they fix their registration. A program which anticipated this problem and always loaded DLLs by their short names is now broken. (After all, you had to make an arbitrary decision to use long names exclusively or short names exclusively, and someone may have chosen the other branch of the decision tree.)

    And long name/short name is really just a special case of a single file having multiple names. Other ways of creating multiple names include hard links, symbolic links, and junction points. If a file has two names due to hard links, which one is the "preferred" name and which is the "asinine" name? (And what if the "preferred" name is not available to the user? Suppose you decide that the preferred name is the one that comes first alphabetically. Can I launch a denial-of-service attack by creating a hard link to C:\Windows\system32\shell32.dll called C:\A\A.DLL? All attempts to link to C:\Windows\system32\shell32.dll will fail because it is now the "asinine" name.) Does this rule against loading DLLs by "asinine" names negate part of the functionality of the compatibility paths introduced in Windows Vista (since, under this new rule, you can't use them to load DLLs or execute programs because they entail traversing a symbolic link).

    Perhaps these concerns could be addressed in a manner that didn't have all these problems, but it's an awful lot of complexity. In general, simple rules are preferred to complex rules. Especially if the complexity is to address an issue that was not a serious problem to begin with.

  • The Old New Thing

    Why don't the file timestamps on an extracted file match the ones stored in the ZIP file?

    • 35 Comments

    A customer liaison had the following question:

    My customer has ZIP files stored on a remote server being accessed from a machine running Windows Server 2003 and Internet Explorer Enhanced Security Configuration. When we extract files from the ZIP file, the last-modified time is set to the current time rather than the time specified in the ZIP file. The problem goes away if we disable Enhanced Security Configuration or if we add the remote server to our Trusted Sites list. We think the reason is that if the file is in a non-trusted zone, the ZIP file is copied to a temporary location and is extracted from there, and somehow the date information is lost.

    The customer is reluctant to turn off Enhanced Security Configuration (which is understandable) and doesn't want to add the server as a trusted site (somewhat less understandable). Their questions are

    • Why is the time stamp changed during the extract? If we copy the ZIP file locally and extract from there, the time stamp is preserved.
    • Why does being in an untrusted zone affect the behavior?
    • How can we avoid this behavior without having to disable Enhanced Security Configuration or adding the server as a trusted site?

    The customer has an interesting theory (that the ZIP file is copied locally) but it's the wrong theory. After all, copying the ZIP file locally doesn't modify the timestamps stored inside it.

    Since the ZIP file is on an untrusted source, a zone identifier is being applied to the extracted file to indicate that the resulting file is not trustworthy. This permits Explorer to display a dialog box that says "Do you want to run this file? It was downloaded from the Internet, and bad guys hang out there, bad guys who try to give you candy."

    And that's why the last-modified time is the current date: Applying the zone identifier to the extracted file modifies its last-modified time, since the file on disk is not identical to the one in the ZIP file. (The one on disk has the "Oh no, this file came from a stranger with candy!" label on it.)

    The recommended solution is to add the server containing trusted ZIP files to your trusted sites list. Since the customer is reluctant to do this (for unspecified reasons), there are some other alternatives, though they are considerably riskier. (These alternatives are spelled out in KB article 883260: Description of how the Attachment Manager works.)

    You can disable the saving of zone information from the Group Policy Editor, under Administrative Templates, Windows Components, Attachment Manager, Do not preserve zone information in file attachments. This does mean that users will not be warned when they attempt to use a file downloaded from an untrusted source, so you have to trust your users not to execute that random executable they downloaded from some untrusted corner of the Internet.

    You can use the Inclusion list for low, moderate, and high risk file types policy to add ZIP as a low-risk file type. This is not quite as drastic as suppressing zone information for all files, but it means that users who fall for the "Please unpack the attached ZIP file and open the XYZ icon" trick will not receive a "Do you want to eat this candy that a stranger gave to you?" warning prompt before they get pwned.

    But like I said, it's probably best to add just the server containing the trusted ZIP files to your trusted sites list. If the server contains both trusted and untrusted data (maybe that's why the customer doesn't want to put it on the trusted sites list), then you need to separate the trusted data from the untrusted data and put only the trusted server's name in your trusted sites list.

  • The Old New Thing

    Like a chicken talking to a duck

    • 48 Comments

    Many years ago, I called the home of my Chinese-speaking nieces. This was before they started learning English and before I started learning their dialect of Chinese. After the call was over, the eldest niece asked, "Who was that on the phone?"

    "That was Uncle Raymond."

    "Oh, I want to talk to Uncle Raymond!"

    Her mother replied, "That'd be like a chicken talking to a duck."

    A chicken talking to a duck is a Chinese idiom referring to two people who cannot communicate due to a language barrier.

    It seems that ducks are somehow central to the concept of language unintelligibility. Is there a duck-related language idiom in your language?

  • The Old New Thing

    That's not a duck

    • 19 Comments

    One of the audio features added to Windows 7 goes by the formal name stream attenuation, but it is more commonly known to people in the audio world as ducking. Ducking is the process of lowering the volume of background sounds in order to draw more attention to the foreground sound. For example, when you're watching a big battle scene in a summer action movie, your ears are assaulted with the sounds of weapons fire, objects exploding left and right, but when the hero turns to his girlfriend-of-the-moment, the sound level of all the death and destruction drops a bit so you can hear him say something tender, or maybe inspiring, or maybe inspiringly tender. Something like that. And then when the moment is over, the sound level returns to normal and once again the sound of things blowing up overwhelms your eardrums.

    Sorry, I got a bit carried away. Where was I? Oh right, ducking.

    Ducking is the process of temporarily reducing the volume of background sounds in order to make foreground sounds easier to hear. Here's a presentation from the 2008 PDC by Larry Osterman which covers the technical part of ducking.

    When the feature was added to Windows 7, the icon on the Communications tab of the Sound control panel was a yellow rubber duck. Those audio folks think they're so cute.

    Sadly (or perhaps fortunately), the icon was changed to a telephone handset.

    The subject line of this article is an inside joke: When visiting one of my friends, I would sometimes speak to his three-year-old daughter in German, because they say that exposure to multiple languages is a good thing. At one time, I asked her a question, I forget what it was, and she responded, in English, "That's not a duck!"

    None of us could figure out what she was talking about, but it has been a catch phrase among us for over a decade.

  • The Old New Thing

    The Importance of Being Snooki

    • 2 Comments

    What if Jersey Shore were actually a play written by Oscar Wilde? Algernon Moncrieff and Jack Worthing from Broadway's current production of The Importance of Being Earnest explore that premise in a five-part series titled "Jersey Shore" Gone Wilde, drawing its dialogue from things actually said on "Jersey Shore".

    The dialogue and delivery are funny enough. But what really sells it is the acting: The eye rolls, the nervous glances, the blank stares...

    Warning: NSFW for crude language, so I didn't embed the videos. You'll have to click through.

  • The Old New Thing

    Hidden compatibility constraints of redirecting program execution via a stub

    • 20 Comments

    One of the "obvious" solutions to the issue of how much work you're willing to do to save 68KB of disk space was to replace one of the copies with a stub that launches the other copy.

    If you try this obvious solution, you may run into some compatibility issues.

    First of all, there are programs which launch Notepad and then wait on the process handle so they can wait until the user closes Notepad. Your stub program cannot just do a Create­Process on the target, because programs which perform a wait will find the wait satisfied when your stub program exits.

    Okay, so your stub program has to wait for the real copy of Notepad to exit before it can exit itself.

    Once you fix that, you'll find another problem: Programs call Get­Exit­Code­Process to see how Notepad exited. Your stub program therefore cannot just perform an Exit­Process; it has to do a Get­Exit­Code­Process on the real Notepad and pass that exit code to your own Exit­Process.

    Once you fix that, you'll find another problem: There are programs which execute a process and then look for windows owned by that process. (Yes, there can be more than one, but Notepad is a simple program that creates only one top-level unowned window.) Those programs will get the process ID of your stub program and be unable to find the Notepad window (since it belongs to the real Notepad program, which has a different process ID). I'm not sure how to fix that one.

    Yes, you can write a stub that launches another program, but that solves the "save disk space" problem by introducing other problems.

    Remember, even though people are supposed to stick to documented behavior (since that is all that is contractual), in practice any implementation detail becomes a compatibility constraint.

  • The Old New Thing

    Not quite understanding why you wash your hands before playing the piano

    • 15 Comments

    My niece wanted to play my piano, and I asked her to wash her hands. She said, "I don't need that, I have Magic Soap," and she produced a bottle of hand sanitizer.

    Um, the purpose of washing your hands isn't so the piano doesn't get sick.

    My piano-instructor cousin-in-law tells me that her young students often say, "My hands are not dirty. They are just a little sticky because I just ate a chocolate muffin on my way here. I don't think I need to wash my hands. Why do you always ask me to wash my hands before the piano lesson?"

  • The Old New Thing

    Why is there a RestoreLastError function that does the same thing as SetLastError?

    • 18 Comments

    Matt Pietrek noticed that Set­Last­Error and Restore­Last­Error do exactly the same thing and wondered why there's a separate function for it.

    It's to assist in debugging and diagnostics.

    Say you're debugging a problem and when you call Get­Last­Error you get ERROR_ACCESS_DENIED. It would really help a lot if you could figure out who set the error code to ERROR_ACCESS_DENIED. If you set a breakpoint on Set­Last­Error, you find that people call Set­Last­Error for two different reasons:

    1. To report an error.
    2. To restore the error code to what it was before they did something that might change the last error code.

    That second one needs a little explanation. You might have a logging function that goes like this:

    // Remember, code in italics is wrong
    void LogSomething(blah blah)
    {
     DWORD dwError = GetLastError();
     ... do logging stuff ...
     SetLastError(dwError);
    }
    
    // or if you prefer RAII
    
    class PreserveLastError
    {
    public:
        PreserveLastError() : m_dwLastError(GetLastError()) {} 
        ~PreserveLastError() { SetLastError(m_dwLastError); }
    private:
        DWORD m_dwLastError;
    };
    
    void LogSomething(blah blah)
    {
     PreserveLastError preserve;
    
     ... do logging stuff ...
    }
    

    It's important that functions which perform logging, assertion checking, and other diagnostic operations are nonintrusive. You don't want a bug to go away when you turn on logging because the logging code somehow perturbed the system. Therefore, your logging function saves the value of Get­Last­Error() and sets that back as the error code when it's done, so that any errors that took place during logging do not escape and inadvertently affect the rest of the program.

    Now let's go back to the code that's trying to figure out who set the error code to ERROR_ACCESS_DENIED. You set up your debugging diagnostic tool and tell it to record everybody who calls Set­Last­Error() and pay particular attention to everybody who sets the error to ERROR_ACCESS_DENIED. You then run your scenario, your program encounters the failure you're trying to debug, and you ask the diagnostic tool, "Tell me who set the error code to ERROR_ACCESS_DENIED." The diagnostic tool says, "Ah, I have that in my history. The function that set the error code to ERROR_ACCESS_DENIED is... Log­Something!"

    Of course, Log­Something wasn't really the originator of the ERROR_ACCESS_DENIED; it was just restoring things to how it found them. The real ERROR_ACCESS_DENIED came from somebody else, and the log function was just being careful not to disturb it.

    ...
      if (!FunctionX()) {
        LogSomething("Function X failed");
      } else {
        LogSomething("Function X succeeded");
       FunctionY(); // also does some logging
      }
      FunctionZ(); // also does some logging
      Assert(EverythingOkay()); // assertion fires
      // GetLastError() returns ERROR_ACCESS_DENIED
    ...
    

    All those calls to logging functions in between called Get­Last­Error() and got ERROR_ACCESS_DENIED back, then when the logging was complete, they called Set­Last­Error(ERROR_ACCESS_DENIED) to put things back. Your diagnostic error-tracing tool gleefully points the finger at your logging function: "Look! Look! This guy set the error code to ERROR_ACCESS_DENIED!"

    Enter Restore­Last­Error. This function does the same thing as Set­Last­Error, but its use is a message to diagnostic tools that "Sure, you may see me set an error code, but it wasn't my idea. I'm just trying to put things back the way I found them. Keep looking backwards in your history."

    (The message also works forward in time: If you want to catch ERROR_ACCESS_DENIED in the act, you might set a breakpoint on Set­Last­Error, and then get frustrated that the breakpoint keeps getting hit by your logging function. Switching the logging function to Restore­Last­Error keeps the breakpoint on Set­Last­Error from firing spuriously.)

    The corrected version of the Log­Something function is therefore something like this:

    void LogSomething(blah blah)
    {
     DWORD dwError = GetLastError();
     ... do logging stuff ...
     RestoreLastError(dwError);
    }
    
    // or if you prefer RAII
    
    class PreserveLastError
    {
    public:
        PreserveLastError() : m_dwLastError(GetLastError()) {} 
        ~PreserveLastError() { RestoreLastError(m_dwLastError); }
    private:
        DWORD m_dwLastError;
    };
    
  • The Old New Thing

    Microspeak: Hipo

    • 23 Comments

    A friend of mind was asked out of the blue, "What does hypo mean?"

    She started to flash back to high school English class and Greek word roots.

    "I've started to hear it everywhere. Like Everyone in that meeting is a hypo or We need to reach out to hypos."

    My friend realized that she had mis-heard the question. It was not about the Greek root hypo but rather the bizarro Microspeak word hipo, shorthand for high-potential employee.

    As I researched this term (which I had never encountered before), I found that it fell into that special category of Microspeak known as if you have to ask, I'm not going to tell you. Identifying and developing high-potential employees is one of the charter activities of the ExPo project. And if you look through the ExPo Web site, you'll find that nowhere do they tell you what HiPo and ExPo stand for. "If you have to ask, I'm not going to tell you."

    My friend suggested that "ExPos are people who have a lot of potential and enjoy showing it off to others."

Page 123 of 446 (4,451 items) «121122123124125»