December, 2012

  • The Old New Thing

    Why was Pinball removed from Windows Vista?

    • 115 Comments

    Windows XP was the last client version of Windows to include the Pinball game that had been part of Windows since Windows 95. There is apparently speculation that this was done for legal reasons.

    No, that's not why.

    One of the things I did in Windows XP was port several million lines of code from 32-bit to 64-bit Windows so that we could ship Windows XP 64-bit Edition. But one of the programs that ran into trouble was Pinball. The 64-bit version of Pinball had a pretty nasty bug where the ball would simply pass through other objects like a ghost. In particular, when you started the game, the ball would be delivered to the launcher, and then it would slowly fall towards the bottom of the screen, through the plunger, and out the bottom of the table.

    Games tended to be really short.

    Two of us tried to debug the program to figure out what was going on, but given that this was code written several years earlier by an outside company, and that nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!

    We had several million lines of code still to port, so we couldn't afford to spend days studying the code trying to figure out what obscure floating point rounding error was causing collision detection to fail. We just made the executive decision right there to drop Pinball from the product.

    If it makes you feel better, I am saddened by this as much as you are. I really enjoyed playing that game. It was the location of the one Windows XP feature I am most proud of.

    Update: Hey everybody asking that the source code be released: The source code was licensed from another company. If you want the source code, you have to go ask them.

  • The Old New Thing

    Why can't you rename deleted items in the Recycle Bin?

    • 30 Comments

    I misread a question from commenter Comedy Gaz, so let's try it again. (Good thing I held one last Suggestion Box Monday of the year in reserve.)

    Why can't you rename deleted items in the Recycle Bin?

    Okay, first of all, "Why would you want to do this?"

    I see no explanation for how this could possibly escape the 100-point hole every feature starts out in. I mean, these are items you deleted. Why do you care what their names are? Are you renaming it so you can find it again later? Why would you go to the effort of locating an item in the Recycle Bin, and then not bother recovering it? It's like calling the Lost and Found at Grand Central Terminal, and saying, "Hi, I left my umbrella on the Danbury train last Tuesday. It's blue with white snowflakes. Yes, that's the one. No, I don't want to come in and get it. Could you just dye it green, and paint a yellow smiley face on it? Thanks."

    The purpose of the Recycle Bin is not to provide another place where you can organize your data. The purpose of the Recycle Bin is to give you one last chance to recover the data you deleted by mistake!

    What would be the point of writing the code to allow the name to be edited, update the name in the Recycle Bin databases (watch out for the cross-process race conditions!), then locate all the other open Recycle Bin windows and tell them, "Hey, if you were showing the name of deleted item number 51462, please go refresh it, because it has a new name"? That's a lot of code to be written and tested (and re-tested every build) for a pretty dubious scenario in the first place. (And why stop at just the file name? Why not let people edit the Original Location and Date Deleted too?)

    From an information-theoretical standpoint, renaming an item in the Recycle Bin would be a falsification of the historical record. The information about the items in the Recycle Bin describes the item at the time it was deleted. Its name, the folder it was deleted from, the date it was deleted. If you could change the name of an item in the Recycle Bin, then that record would be incorrect. "This icon represents the file that you deleted from folder Q, except that the name I'm showing you isn't actually the name. It's some bogus name that somebody edited."

    It'd be like asking the church to go update its registry to change your birth name. "Yes, I know that I was baptized with the name Amélie Bernadette, but please change your files so it says that I was baptized with the name Chloë Dominique. Thanks."

    The church isn't going to do that because that would now be lying. You were baptized with the name Amélie Bernadette. You are welcome to change your name to Chloë Dominique, but that doesn't change the fact that you were baptized with the name Amélie Bernadette.

  • The Old New Thing

    You too can use your psychic powers: Spaces in paths

    • 29 Comments

    I'm going to be lazy today and make you solve the problem.

    Did a recent security hotfix change the way we handle UNC paths with spaces? Normally, if we open the Run dialog and type \\server\share\Support Library\, the Support Library folder opens. But recently it stopped working. Instead of opening the Support Library folder, we get the Open With dialog:

    Open with

    Choose the program you want to use to open this file:

    File: Support

      Contoso Chess Challenger 5.0
      Contoso Music Studio
      Fabrikam Chart 2.0
      Litware 2010
      Woodgrove Bank Online

      Happy Holidays!

      Always use the selected program to open this kind of file

    Can you figure out what happened recently that introduced this problem? You have all the information you need.

  • The Old New Thing

    A question about endian-ness turns out to be the wrong question

    • 29 Comments

    Via a customer liaison, we received what seemed like a simple question: "How can I detect whether a Windows machine is big-endian or little-endian?"

    You could actually answer this question (say by coughing up a code fragment that stores a 16-bit value to memory and then takes it apart into bytes to see how it got stored, or by simply hard-coding it based on the target architecture you are compiling for), but you'd be making the mistake of answering the question instead of solving the problem.
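    For reference, the first kind of code fragment mentioned above could look something like this minimal sketch in portable C. The function name is_little_endian is made up for illustration. It stores a 16-bit value, copies it out as bytes, and looks at which byte landed first in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the machine running this code stores
   the low-order byte first (little-endian), 0 if it stores the high-order
   byte first (big-endian). */
static int is_little_endian(void)
{
    const uint16_t value = 0x0102;        /* high byte 0x01, low byte 0x02 */
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);  /* take the value apart into bytes */
    assert(bytes[0] == 0x01 || bytes[0] == 0x02);
    return bytes[0] == 0x02;              /* low byte first => little-endian */
}
```

    All this reports is the byte order of the machine that runs the code, which is why it answers the question without necessarily solving anybody's problem.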

    The customer liaison explained, "My customer is having a problem that is caused by a bug in the SAP BI connector. According to the Knowledge Base article, the problem occurs when the SAP BI server is installed on a big-endian system."

    Okay, with that background, we immediately recognize that the question is wrong. The problem occurs when the SAP BI server is running on a big-endian system. It doesn't matter what the endian-ness of the Windows machine is, so any mechanism for detecting whether the Windows machine is big-endian or little-endian is barking up the wrong tree.

    But it turns out that the customer never even had to do this detection at all. If you read the Knowledge Base article, it says that the problem is already fixed.

    The fix for this issue was first released in Cumulative Update 4 for SQL Server 2008 Service Pack 1.

    So just make sure you're running Cumulative Update 4 for SQL Server 2008 Service Pack 1 or higher (which, if you've been making any attempt at keeping your server up to date, you've been doing for three years), and the problem will go away.

    The customer liaison thanked us for our assistance, but nevertheless asked for the code that would detect the endian-ness of the Windows system. I asked, "How will that help you solve your problem?" but before the customer liaison answered, some other people just gave the customer code that detects the machine endian-ness.

    Even though that will do absolutely nothing to solve the customer's problem.

    That was the last we heard from the customer liaison. I'm hoping that they actually installed the service pack and solved their problem. And I'm afraid what they're going to do with that code fragment.

  • The Old New Thing

    Why do some shortcuts not support editing the command line and other properties?

    • 27 Comments

    Ben L observed that some shortcuts do not permit the command line and other options to be edited. "Where is this feature controlled? Is there a way to override this mode?" This question is echoed by "Anonymous (Mosquito buzzing around)" (and don't think we don't know who you are), who in a huge laundry list of questions adds, "Why does the Game Explorer limit customizing command line, target, etc?"

    These questions are looking at the situation backwards. The issue is not "Why do these shortcuts block editing the command line?" The issue is "Why do some shortcuts allow editing the command line?"

    Recall that shortcuts are references to objects in the shell namespace. Shell namespace objects are abstract. Some of them refer to files, but others refer to non-file objects, like control panels, printers, and dial-up networking connectoids. And in the abstract, these objects support verbs like Open and Rename. But there is no requirement that a shell namespace object support "Run with command line argument".

    If you have a shortcut to an executable, then the LNK file handler says, "Okay, this is a special case. Executables support command line arguments, so I will run the executable with the command line arguments set by the IShellLink::SetArguments method."

    Note that the shortcut target and arguments are separate properties. The LNK file property sheet hides this from you by calling IShell­Link::Get­Path and IShell­Link::Get­Arguments, then taking the two strings and combining them into a single Target field for display. When you save the changes, the LNK file property sheet takes the Target, figures out which part is the executable and which part is the arguments, and calls IShell­Link::Set­Path and IShell­Link::Set­Arguments on the two parts.

    In other words, the command line is all a ruse.
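    Here is a rough sketch of the "figure out which part is the executable and which part is the arguments" step, written in portable C rather than taken from the shell. The actual rule the property sheet uses is not described here, so the rule below is an assumption: a path that starts with a double quote ends at the matching quote, and an unquoted path ends at the first space. The name split_target and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split a combined Target string into the executable
   path and the arguments, using the assumed quoting rule described above.
   Returns 0 on success, -1 if either output buffer is too small. */
static int split_target(const char *target,
                        char *path, size_t path_size,
                        char *args, size_t args_size)
{
    const char *start = target;
    const char *end;
    const char *rest;
    size_t path_len;

    assert(target != NULL && path != NULL && args != NULL);

    if (*start == '"') {
        start++;                          /* skip the opening quote */
        end = strchr(start, '"');
        if (end == NULL) {
            end = start + strlen(start);  /* unterminated quote: take it all */
            rest = end;
        } else {
            rest = end + 1;               /* resume after the closing quote */
        }
    } else {
        end = strchr(start, ' ');
        if (end == NULL) {
            end = start + strlen(start);  /* no space: no arguments */
        }
        rest = end;
    }

    path_len = (size_t)(end - start);
    while (*rest == ' ') {
        rest++;                           /* drop the separator spaces */
    }

    if (path_len + 1 > path_size || strlen(rest) + 1 > args_size) {
        return -1;
    }
    memcpy(path, start, path_len);
    path[path_len] = '\0';
    strcpy(args, rest);
    return 0;
}
```

    Under that assumed rule, a Target of "C:\Program Files\App\app.exe" /fast (with the quotes) splits into the path C:\Program Files\App\app.exe and the arguments /fast. The two pieces are what would then be handed to IShellLink::SetPath and IShellLink::SetArguments.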

    This special action is performed only for executable targets, because those are the only things that accept arguments. If you create a shortcut to a control panel, you'll find that the Target is not editable. If you create a shortcut to a printer, you'll find that the Target is not editable. If you create a shortcut to a dial-up networking connectoid, you'll find that the Target is not editable. Having a non-editable command line is the normal case. The file system is the weirdo.

    Shortcuts to advertised applications and shortcuts to items in the Games folder are not shortcuts to executables. They are shortcuts into the shell namespace for various types of virtual data. An advertised application is a shell namespace object that represents "an installed application". It is not a pointer directly to the executable, but rather a reference to an entry in the MSI database, which in turn contains information about how to install the program, repair it, update it, and run it. The shell doesn't even know what the command line is. To launch an advertised shortcut, the shell asks the MSI database for the command line, and it then executes the command line that MSI returns. The value set by IShellLink::SetArguments never enters the picture. Similarly, the entries in the Games Folder are not executables; they are entries in the games database.

    I can see how this can be confusing, because when you click on these shortcuts, a program runs, but these shortcuts are not shortcuts directly to programs. As a result, the code that takes a Target and Arguments and combines them into a command line does not get a chance to run.

  • The Old New Thing

    The QuickCD PowerToy, a brief look back

    • 27 Comments

    One of the original Windows 95 PowerToys was a tool called QuickCD. Though that wasn't its original name.

    The original name of the QuickCD PowerToy was FlexiCD. You'd think that it was short for "Flexible CD Player", but you'd be wrong. FlexiCD was actually named after its author, whose name is Felix, but who uses the "Flexi" anagram as a whimsical nickname. We still called him Felix, but he would occasionally use the Flexi nickname to sign off an email message, or use it whenever he had to create a userid for a Web site (if Web sites which required user registration existed in 1994).

    You can still see remnants of FlexiCD in the documentation. The last sample INF file on this page was taken from the QuickCD installer.

  • The Old New Thing

    Have you found any TheDailyWTF-worthy code during the development of Windows 95?

    • 25 Comments

    Mott555 is interested in some sloppy/ugly code or strange workarounds or code comments during the development of Windows 95, like "anything TheDailyWTF-worthy."

    I discovered that a particular program churned the hard drive a lot when you opened it. I decided to hook up the debugger to see what the problem was. What I discovered was code that went roughly like this, in pseudo-code:

    int TryToCallFunctionX(a, b, c)
    {
      for each file in (SystemDirectory,
                        WindowsDirectory,
                        ProgramFilesDirectory(RecursiveSearch),
                        KitchenSink,
                        Uncle.GetKitchenSink)
      {
        hInstance = LoadLibrary(file);
        fn = GetProcAddress(hInstance, "FunctionX");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "__imp__FunctionX");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "FunctionX@12");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        fn = GetProcAddress(hInstance, "__imp__FunctionX@12");
        if (fn != nullptr) {
            int result = fn(a,b,c);
            FreeLibrary(hInstance);
            return result;
        }
        FreeLibrary(hInstance);
      }
      return 0;
    }
    

    The code enumerated every file in the system directory, Windows directory, Program Files directory, and possibly also the kitchen sink and their uncle's kitchen sink. It tried to load each one as a library and checked whether it had an export called FunctionX. For good measure, it also tried __imp__FunctionX, FunctionX@12, and __imp__FunctionX@12. If it found any match, it called the function.

    As it happens, every single call to Get­Proc­Address failed. The function they were trying to call was an internal function in the window manager that wasn't exported. I guess they figured, "Hm, I can't find it in user32. Maybe it moved to some other DLL," and went through every DLL they could think of.

    I called out this rather dubious programming technique, and word got back to the development team for that program. They came back and admitted, "Yeah, we were hoping to call that function, but couldn't find it, and the code you found is stuff we added during debugging. We have no intention of actually shipping that code."

    Well, yeah, but still, what possesses you to try such a crazy technique, even if only for debugging?

  • The Old New Thing

    It rather involved being on the other side of this airtight hatchway: Writing to the application directory

    • 25 Comments

    We received a security vulnerability report that went roughly like this:

    There is a security vulnerability in the X component. It loads shell32.dll from the current directory, thereby making it vulnerable to a current directory attack. Here is a sample program that illustrates the problem. Copy a rogue shell32.dll into the current directory and run the program. Observe that the rogue shell32.dll is loaded instead of the system one.

    If you actually followed the instructions, what you saw depended on your definition of "run the program." Let's assume that the program has been placed in the directory C:\sample\sample.exe.

    1. Setting the current directory to the application directory.
      cd /d C:\sample
      copy \\rogue\server\shell32.dll
      c:\sample\sample.exe
      
      In this case, the attack succeeds.
    2. Setting the current directory to an unrelated directory.
      cd /d %USERPROFILE%
      copy \\rogue\server\shell32.dll
      c:\sample\sample.exe
      
      In this case, the attack fails.
    3. Running the application from Explorer.
      copy \\rogue\server\shell32.dll C:\sample
      double-click sample.exe in Explorer
      
      In this case, the attack succeeds.

    Let's look at case 3 first. In case 3, what is the current directory? When you launch a program from Explorer, the current directory is set to the directory of the thing you double-clicked. Therefore, case 3 is identical to case 1. That's one less case to have to study.

    We also see that the attack is not strictly a current directory attack, because the attack failed in case 2 even though a rogue shell32.dll was in the current directory.

    What we're actually seeing is an application directory attack.

    Recall that the application directory is searched ahead of the system directory. Therefore, you can override a file in the system directory by putting it in your application directory. This is part of the directory as a bundle principle. If you packaged a DLL with your application, then presumably that's the one you want, even if a future version of Windows decides to create a DLL of the same name.
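    To make the ordering concrete, here is a toy model in portable C. It is a simplified sketch, not the loader: it assumes just three search locations, in the order application directory, system directory, current directory, and it leaves out everything else the real search involves. The names fake_file, same_name, dir_has, and resolve_dll are invented for this example, and the "file system" is a table supplied by the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One entry in the pretend file system. */
struct fake_file {
    const char *dir;   /* directory that holds the file */
    const char *name;  /* file name within that directory */
};

/* Case-insensitive string equality, since Windows file names ignore case. */
static int same_name(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Does the pretend file system have `name` in `dir`? */
static int dir_has(const struct fake_file *files, size_t count,
                   const char *dir, const char *name)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (same_name(files[i].dir, dir) && same_name(files[i].name, name)) {
            return 1;
        }
    }
    return 0;
}

/* Returns the directory the DLL would load from under the assumed order,
   or NULL if none of the three directories has it. */
static const char *resolve_dll(const char *dll_name, const char *app_dir,
                               const char *system_dir, const char *current_dir,
                               const struct fake_file *files, size_t count)
{
    const char *order[3];
    size_t i;

    assert(dll_name != NULL && files != NULL);
    order[0] = app_dir;      /* searched first: the directory of the EXE */
    order[1] = system_dir;   /* searched next: the system directory */
    order[2] = current_dir;  /* searched only after both of the above */

    for (i = 0; i < 3; i++) {
        if (dir_has(files, count, order[i], dll_name)) {
            return order[i];
        }
    }
    return NULL;
}
```

    In this model, a rogue shell32.dll sitting next to the program wins over the system copy, while a rogue copy that exists only in an unrelated current directory loses to the system copy. That matches the outcomes of cases 1 and 2 above.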

    The vulnerability report sort of acknowledged that this was an application directory attack rather than a current directory attack when they explained why this is a serious problem:

    By placing a rogue copy of shell32.dll in the C:\Program Files\Microsoft Office\Office12 directory, an attacker can inject arbitrary code into all Office applications.

    If the attack were really a current directory attack, the attacker would have put a rogue copy of shell32.dll in the directory containing your Excel spreadsheet, not the directory containing EXCEL.EXE.

    And that's where you reach the airtight hatchway: Normal users do not have write permission into the C:\Program Files\Microsoft Office\Office12 directory. You need administrator privileges to create files there. And if you have administrator privileges, then you already pwn the machine. It's not really a vulnerability that you can do anything you want once you pwn the machine.

    Of course, this non-vulnerability does expose a security issue you need to bear in mind when you run your own programs: Your application's directory is its airtight hatchway. Make sure you control who you let in! If you leave your application directory world-writeable, then you've effectively left your airtight hatchway unlocked. This is one reason why the Microsoft Logo guidelines recommend (require?) that programs be installed into the Program Files directory: The default security descriptor for subdirectories of Program Files does not grant write permission to normal users. It's secure by default.

    There are many variations of this type of vulnerability report, and they nearly always are mischaracterized as a current directory attack. They usually go like this:

    There is a DLL planting vulnerability in LITWARE.EXE. Place a rogue DLL named SHELL32.DLL in the same directory as LITWARE.EXE. When LITWARE.EXE is run, the rogue DLL is loaded from the current directory, resulting in code injection.

    The person who submits the report has confused the application directory with the current directory, probably because they never considered that the two might be different.

    C:\> mkdir C:\test
    C:\> cd C:\test
    C:\test> copy \\trusted\server\LITWARE.EXE
    C:\test> copy \\rogue\server\SHELL32.DLL
    C:\test> LITWARE
    -- observe that the rogue DLL is loaded
    -- proof of current directory attack
    

    They never tried this:

    C:\> mkdir C:\test
    C:\> cd C:\test
    C:\test> copy \\trusted\server\LITWARE.EXE
    C:\> mkdir C:\test2
    C:\> cd C:\test2
    C:\test2> copy \\rogue\server\SHELL32.DLL
    C:\test2> ..\test\LITWARE
    -- observe that the rogue DLL is not loaded
    

    That second experiment shows that the attack is not a current directory attack at all. It's an application directory attack.

    Each time one of these reports comes in, we have to perform the same evaluation to confirm that it really is an application directory attack and not a current directory attack. (This means, among other things, repeating the test on every version of Windows, and every version of LitWare, and every combination of the two, just to make sure all the possibilities have been covered. The odds are strong that it will all turn into a false alarm, but who knows. Maybe there's something about the interaction between LitWare 5.2 SP2 and Windows XP SP3 that triggers a new code path that does indeed try to load shell32.dll from the current directory. And it's that specific combination of circumstances the person was trying to report, but did a bad job of expressing.)

  • The Old New Thing

    How am I supposed to free the information returned by the GetSecurityInfo function?

    • 25 Comments

    The Get­Security­Info function returns a copy of the security descriptor for a kernel object, along with pointers to specific portions you request. More than once, a customer has been confused by the guidelines for how to manage the memory returned by the function.

    Let's look at what the function says:

    ppsidOwner [out, optional]
    A pointer to a variable that receives a pointer to the owner SID in the security descriptor returned in ppSecurity­Descriptor. The returned pointer is valid only if you set the OWNER_SECURITY_INFORMATION flag. This parameter can be NULL if you do not need the owner SID.

    Similar verbiage can be found for the other subcomponent parameters. The final parameter is described as

    ppSecurity­Descriptor [out, optional]
    A pointer to a variable that receives a pointer to the security descriptor of the object. When you have finished using the pointer, free the returned buffer by calling the Local­Free function.

    Okay, so it's clear that you need to free the security descriptor with Local­Free. But how do you free the owner, group, DACL, and SACL?

    Read the documentation again. The important part is the phrase "in the security descriptor returned in ppSecurityDescriptor."

    ppsidOwner [out, optional]
    A pointer to a variable that receives a pointer to the owner SID in the security descriptor returned in ppSecurity­Descriptor. The returned pointer is valid only if you set the OWNER_SECURITY_INFORMATION flag. This parameter can be NULL if you do not need the owner SID.

    In case that wasn't clear, the point is reiterated in the remarks.

    If the ppsidOwner, ppsidGroup, ppDacl, and ppSacl parameters are non-NULL, and the Security­Info parameter specifies that they be retrieved from the object, those parameters will point to the corresponding parameters in the security descriptor returned in ppSecurity­Descriptor.

    In other words, you are getting a pointer into the security descriptor. No separate memory allocation is made. The memory for the owner SID is freed when you free the security descriptor. It's like the last parameter to Get­Full­Path­Name, which receives a pointer to the file part of the full path. There is no separate memory allocation for that pointer; it's just a pointer back into the main buffer.
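    The same ownership pattern can be sketched in portable C. This is an analogy under stated assumptions, not the Windows implementation: copy_path is a hypothetical function that makes a single allocation and also hands back a convenience pointer into that allocation, the way ppsidOwner points into the security descriptor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a function that returns one allocation plus an
   optional convenience pointer into it. The caller frees only the returned
   buffer, and *file_part becomes invalid at that same moment. */
static char *copy_path(const char *path, char **file_part)
{
    size_t size;
    char *buffer;
    char *slash;

    assert(path != NULL);
    size = strlen(path) + 1;
    buffer = malloc(size);
    if (buffer == NULL) {
        return NULL;
    }
    memcpy(buffer, path, size);

    if (file_part != NULL) {
        slash = strrchr(buffer, '\\');
        /* Not a new allocation: just an address inside buffer. */
        *file_part = (slash != NULL) ? slash + 1 : buffer;
    }
    return buffer;
}
```

    A caller would use the file part for as long as it needs it and then call free on the returned buffer exactly once. Calling free on the file part would be a bug, for the same reason that you don't separately free the owner SID.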

    You can think of the ppsidOwner parameter as a convenience parameter. The Get­Security­Info function offers to do the work of calling Get­Security­Descriptor­Owner for you. You can think of the function as operating like this:

    DWORD WINAPI GetSecurityInfo(...)
    {
        ... blah blah get the security info ...
    
        // Just out of courtesy:
        // Fetch the owner if the caller requested it
        if (ppsidOwner != NULL &&
            (SecurityInfo & OWNER_SECURITY_INFORMATION)) {
            BOOL fDefaulted;
            GetSecurityDescriptorOwner(pSecurityDescriptor,
                                       ppsidOwner,
                                       &fDefaulted);
        }
    
        ...
    }
    

    That's why the documentation says that you need to pass a non-null ppSecurity­Descriptor if you request any of the pieces of the security descriptor: If you don't, then you won't be able to free the memory for it.

    Bonus chatter: If the ppSecurity­Descriptor is so important, why is it marked "optional"?

    It really should be a mandatory parameter, but older versions of Windows didn't enforce the rule, so the parameter is grandfathered in as optional, even though no self-respecting program should ever pass in NULL. If you pass NULL for the ppSecurity­Descriptor, the function happily allocates the security descriptor and then, "Oh wait, the caller didn't give me a way to receive the pointer to the security descriptor, so I guess I won't give it to him."

    DWORD WINAPI GetSecurityInfo(...)
    {
        ... blah blah get the security info ...
    
        if (ppSecurityDescriptor != NULL) {
            *ppSecurityDescriptor = pSecurityDescriptor;
        }
        ...
    }
    

    Result: Memory leak.

    You might say that the last parameter was designed by somebody wearing kernel-colored glasses.

  • The Old New Thing

    Why do BackupRead and BackupWrite require synchronous file handles?

    • 24 Comments

    The BackupRead and BackupWrite functions require that the handle you provide be synchronous. (In other words, that it not be opened with FILE_FLAG_OVERLAPPED.)

    A customer submitted the following question:

    We have been using asynchronous file handles with BackupRead. Every so often, the call to BackupRead will fail, but we discovered that as a workaround, we can just retry the operation, and it will succeed the second time. This solution has been working for years.

    Lately, we've been seeing crashes when trying to back up files, and the stack traces in the crash dumps appear to be corrupted. The issue appears to happen only on certain networks, and the problem goes away if we switch to a synchronous handle.

    Do you have any insight into this issue? Why were the Backup­Read and Backup­Write functions designed to require synchronous handles?

    The Backup­Read and Backup­Write functions have historically issued I/O against the handles provided on the assumption that they are synchronous. As we saw a while ago, doing so against an asynchronous handle means that you're playing a risky game: If the I/O completes synchronously, then nobody gets hurt, but if the I/O goes asynchronous, then the temporary OVERLAPPED structure on the stack will be updated by the kernel when the I/O completes, which could very well be after the function that created it has already returned. The result: A stack smash. (Related: Looking at the world through kernel-colored glasses.)

    This oversight in the code (blindly assuming that the handle is a synchronous handle) was not detected until 10 years after the API was originally designed and implemented. During that time, backup applications managed to develop very tight dependencies on the undocumented behavior of the backup functions. The backup folks tried fixing the bug but found that it ended up introducing massive compatibility issues. On top of that, there was no real business case for extending the Backup­Read and Backup­Write functions to accept asynchronous handles.

    As a result, there was no practical reason for changing the function's behavior. Instead, the requirement that the handle be synchronous was added to the documentation, along with additional text explaining that if you pass an asynchronous handle, you will get "subtle errors that are very difficult to debug."

    In other words, the requirement that the handles be synchronous exists for backward compatibility.
