February, 2009

  • The Old New Thing

    What does the COM Surrogate do and why does it always stop working?


    The dllhost.exe process goes by the name COM Surrogate, and the only time you're likely even to notice its existence is when it crashes and you get the message "COM Surrogate has stopped working." What is this COM Surrogate and why does it keep crashing?

    The COM Surrogate is a fancy name for "sacrificial process for a COM object that is run outside of the process that requested it." Explorer uses the COM Surrogate when extracting thumbnails, for example. If you go to a folder with thumbnails enabled, Explorer will fire off a COM Surrogate and use it to compute the thumbnails for the documents in the folder. It does this because Explorer has learned not to trust thumbnail extractors; they have a poor track record for stability. Explorer has decided to absorb the performance penalty in exchange for the improved reliability resulting from moving these dodgy bits of code out of the main Explorer process. When the thumbnail extractor crashes, the crash destroys the COM Surrogate process instead of Explorer.

    In other words, the COM Surrogate is the "I don't feel good about this code, so I'm going to ask COM to host it in another process. That way, if it crashes, it's the COM Surrogate sacrificial process that crashes instead of me" process. And when it crashes, it just means that Explorer's worst fears were realized.

    In practice, if you get these types of crashes when browsing folders containing video or media files, the problem is most likely a flaky codec.

    Now that you know what the COM Surrogate does, you can answer this question from a customer:

    I'm trying to delete a file, but I'm told that "The action can't be completed because the file is open in COM Surrogate." What is going on?
  • The Old New Thing

    Foreground activation permission is like love: You can't steal it, it has to be given to you


    This is the blog entry that acted as the inspiration for the last topic in my 2008 PDC talk.

    When somebody launches a second copy of your single-instance program, you usually want the second copy to send its command line to the first instance (and deal with the current directory somehow), and then you want the first instance to come to the foreground. But a common problem people run into is that when the first instance calls SetForegroundWindow, it fails.

    The problem with this design is that as far as the window manager is concerned, what happened is that the first instance received a message and then decided to steal foreground. That message wasn't an input message, so the window manager sees no reason for the first instance to have any right to take the foreground. There is no evidence that the first instance is coming to the foreground in response to some user action.

    There are a variety of ways of addressing this problem. The easiest way is simply to have the second instance make the call to SetForegroundWindow. The second program has permission to take the foreground because you just launched it. And if a program can take the foreground, it can also give it away, in this case, by setting the first program as the foreground window.

    Another way to do this is to have the second program call the AllowSetForegroundWindow function with the process ID of the first program before it sends the magic message. The AllowSetForegroundWindow function lets a program say to the window manager, "It's okay; he's with me." And then when the first program finally gets around to calling SetForegroundWindow, the window manager says, "Oh, this guy's okay. That other program vouched for him."

    If you are transferring foreground activation to a COM server, you can use the corresponding COM function CoAllowSetForegroundWindow.

    In all cases, note that you can't give away something that's not yours. If you don't have permission to take foreground, then calling AllowSetForegroundWindow (or one of its moral equivalents) will have no effect: You just told the window manager, "It's okay; he's with me," and the window manager replied, "Who the hell are you?"

    Pre-emptive snarky comment: "There are some really sneaky people who found a way to circumvent the rules and steal foreground activation." Well yeah, and there are some really sneaky people who find ways to steal love, so there you have it. If everything is right with the world, both groups of people will eventually be found out and made to suffer for their malfeasance.

    Update (02/21): Deleted all comments that showed ways of circumventing the rules. Duh, people.

  • The Old New Thing

    How does Raymond decide what to post on any particular day?


    Occasionally somebody asks about the timing of an entry I've written and wants to know how far ahead with this blog thing I really am.

    To give you an idea of how far in advance I write my blog entries, I wrote this particular entry on February 13, 2008. Generally, the articles are published in the order I wrote them; this particular entry ended up on February 27, 2009 because that was the next available open day. If the big news topic of February 27th, 2009 happens to be related to this entry, it's just a coincidence.

    Now, with a buffer of over a year, I do have quite a bit of leeway in choosing when any particular article is published. Although articles in general just get slotted in for the next open day, I will occasionally arrange for one to come out on a thematically-related day. Sometimes the connection is blatant, like writing about time and time zones on the Fridays before Daylight Saving Time transitions, or writing about Hallowe'en on, well, Hallowe'en. More often the connection is low-key, like telling a story about the economics of parking tickets on a day when parking is free in Seattle or warning about fake guacamole on the Friday before the Super Bowl. And sometimes the connection is impossibly obscure, like taking a story that I know a friend will like and slotting it in on their birthday, anniversary, or some other day meaningful to them.

    (To answer Paul's specific question: I found the article on guacamole and said, "That would be a great article to use for the Super Bowl." Since I already had a Super Bowl article picked out for 2007, the guacamole article was slotted in for 2008.)

    There are other patterns you may have picked up on:

    • Mondays are usually spent "answering viewer mail" (i.e., taking topics from the Suggestion Box).
    • Less technical articles appear towards the beginning of the week; more technical articles appear later in the week. Tuesday in particular tends to get the funny stories.
    • There is usually one e-mail related posting per month.
    • There is usually one Microspeak posting per month.
    • Sometimes an entire week is devoted to a theme, such as the annual CLR Week.
    • An unusually short or unusually long technical post is usually balanced by an amusing non-technical post. (I have a stash of about a hundred of these "light diversions".)
    • I tend to avoid technical topics on major holidays.

    But generally, it's just a FIFO queue.

    Oh, and right now, the queue is full up through the beginning of June 2010.

  • The Old New Thing

    Why do my file properties sometimes show an Archive check box and sometimes an Advanced button?


    When you view the properties of a file and go to the General page, there are some check boxes at the bottom for file attributes. There's one for Read-only and one for Hidden, and then it gets weird. Sometimes you get Archive and sometimes you get an Advanced button. What controls which one you get?

    It depends on whether there is anything interesting in the Advanced dialog.

    If the volume supports either compression or encryption (or both), then you will get an Advanced dialog with check boxes for Archive, Compress and Encrypt. On the other hand, if the volume supports neither compression nor encryption, then you will just get an Archive check box, since it looks kind of silly having an Advanced button that shows you a dialog box with just one check box on it. (Note that these features can also be disabled by group policy, so it's not purely a file system decision.)
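    The rule above can be written down as a small decision function. This is only a sketch of the described behavior, not Explorer's actual code: the two flag values are the ones the Windows SDK defines for the file system flags reported by GetVolumeInformation, while the function name and the two policy parameters are hypothetical stand-ins for the group policy settings mentioned above.

```python
# File system flag values as defined in the Windows SDK headers.
FILE_FILE_COMPRESSION = 0x00000010
FILE_SUPPORTS_ENCRYPTION = 0x00020000


def attribute_controls(fs_flags, compression_allowed=True, encryption_allowed=True):
    """Return which control appears below Read-only and Hidden.

    compression_allowed and encryption_allowed are hypothetical parameters
    standing in for whatever policy settings disable those features.
    """
    can_compress = bool(fs_flags & FILE_FILE_COMPRESSION) and compression_allowed
    can_encrypt = bool(fs_flags & FILE_SUPPORTS_ENCRYPTION) and encryption_allowed
    if can_compress or can_encrypt:
        # There is something interesting to put in the Advanced dialog.
        return "Advanced button"
    # A dialog with a single check box would look silly, so show it inline.
    return "Archive check box"
```

    A volume that reports neither flag gets the Archive check box, and so does a volume that reports both flags when both features are disabled by policy.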

    In Windows, the most commonly encountered file system that does not support compression or encryption is probably FAT, and the most commonly encountered one that does is almost certainly NTFS, so in a rough sense, you can say that FAT gives you an Archive check box and NTFS gives you an Advanced button.

  • The Old New Thing

    Why doesn't the file system have a function that tells you the number of files in a directory?


    There are any number of bits of information you might want to query from the file system, such as the number of files in a directory or the total size of the files in a directory. Why doesn't the file system keep track of these things?

    Well, of course, one answer is that it certainly couldn't keep track of every possible fragment of information anybody could possibly want, because that would be an infinite amount of information. But another reason is simply a restatement of the principle we learned last time: Because the file system doesn't keep track of information it doesn't need.

    The file system doesn't care how many files there are in the directory. It also doesn't care how many bytes of disk space are consumed by the files in the directory (and its subdirectories). Since it doesn't care, it doesn't bother maintaining that information, and consequently it avoids all the annoying problems that come with attempting to maintain the information.

    For example, one thing I noticed about many of the proposals for maintaining the size of a directory in the file system is that very few of them addressed the issue of hard links. Suppose a directory contains two hard links to the same underlying file. Should that file be double-counted? If a file has 200 hard links, then a change to the size of the file would require updating the size field in 200 directories, not just one as one commenter postulated. (Besides, the file size isn't kept in the directory entry anyway.)
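    To see the hard link ambiguity concretely, here is a minimal sketch in Python (not anything the file system itself does) that computes the size of one directory two ways: once by adding up every directory entry, and once by counting each underlying file a single time. The function name is made up for illustration, and it assumes that the device and file index numbers reported by os.stat identify the underlying file.

```python
import os


def directory_sizes(path):
    """Return (naive_total, deduplicated_total) in bytes for files directly in path."""
    naive_total = 0
    unique_total = 0
    seen = set()
    with os.scandir(path) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            # os.stat is used rather than entry.stat() because the latter
            # does not fill in st_dev and st_ino on Windows.
            info = os.stat(entry.path, follow_symlinks=False)
            naive_total += info.st_size
            identity = (info.st_dev, info.st_ino)
            if identity not in seen:
                # First time this underlying file has been encountered.
                seen.add(identity)
                unique_total += info.st_size
    return naive_total, unique_total
```

    For a directory holding two hard links to the same 1000-byte file, the first number is 2000 and the second is 1000. Neither answer is obviously the right one, which is the point.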

    Another issue most people ignored was security. If you're going to keep track of the recursive directory size, you have to make sure to return values consistent with each user's permissions. If a user does not have permission to see the files in a particular directory, you'd better not include the sizes of those files in the "recursive directory size" value when that user goes asking for it. That would be an information disclosure security vulnerability. Now all of a sudden that single 64-bit value is now a complicated set of values, each with a different ACL that controls which users each value applies to. And if you change the ACL on a file, the file system would have to update the file sizes for each of the directories that contains the file, because the change in ACL may result in a file becoming visible to one user and invisible to another.

    Yet another cost many people failed to take into account is just the amount of disk I/O, particularly writes, that would be required. Generating additional write I/O is a bad idea in general, particularly on media with a limited number of write cycles like USB thumb drives. One commenter did note that this metadata could not be lazy-written because a poorly-timed power outage would result in the cached value being out of sync with the actual value.

    Indeed the added cost of all the metadata writes is one of the reasons why Windows Vista no longer updates the Last Access time by default.

    Bonus chatter: My colleague Aaron Margosis points out a related topic over on the ntdebugging blog: NTFS Misreports Free Space? on the difficulties of accurate accounting, especially in the face of permissions which don't grant you total access to the drive.

  • The Old New Thing

    A different type of writing exercise, this time in preparation for buying a house


    One of my colleagues was overwhelmed by how many times papers need to be signed when you buy a house. A seemingly endless stack of papers. Sign and date here, initial here, initial here, now sign this, and this, and this, and sign and date here, and sign here, and initial here... By the time it's over, your arm is about to fall off.

    Some years later, my colleague was about to buy a new house and began to dread the signature-fest that would invariably ensue at the closing. Another ten-foot-tall stack of papers that needed to be signed and initialed.

    In preparation, my colleague actually did hand exercises to build up stamina and strength. I'm not sure what the exercises consisted of, but they were probably a mix of strength-building wrist exercises and some stints of extended longhand writing.

    Finally closing day came. My colleague walked into the agent's office, sat down, and prepared for the worst, only to be surprised when the stack of papers was only a dozen pages long. What happened to the ten-foot-tall stack of papers that took hours to sign and initial?

    The difference is that my colleague paid cash for the new house rather than taking out a loan. That ten-foot-tall stack of papers? Nearly all of them were related to the mortgage, not to the actual sale of the house. The paperwork associated with a house sale proper is comparatively light.

    But all was not lost. At least my colleague had pretty strong wrists now.

  • The Old New Thing

    Smart quotes: The hidden scourge of text meant for computer consumption


    Smart quotes—you know, those fancy quotation marks that curl “like this” ‘and this’ instead of standing up straight "like this" 'and this'—are great for text meant for humans to read. Like serifs and other typographic details, they act as subtle cues that aid in reading.

    But don't let a compiler or interpreter see them.

    In most programming languages, quotation marks have very specific meanings. They might be used to enclose the text of a string, they might be used to introduce a comment, they might even be a scope resolution operator. But in all cases, the language specification indicates that the role is played by the quotation mark U+0022 or apostrophe U+0027. From the language's point of view, the visually similar characters U+2018, U+2019, and U+02BC (among others) are completely unrelated.

    I see this often on Web sites, where somebody decided to "edit" the content to make it "look better" by curlifying the quotation marks, turning what used to be working code into a big pile of syntax errors.

    I even see it in email. Somebody encounters a crash in a component under development and connects a debugger and sends mail to the component team describing the problem and including the information on how to connect to the debugger like this:

    WinDbg –remote npipe:server=abc,pipe=def

    Or maybe like this:

    Remote.exe “abc” “def”

    And you, as a member of the team responsible for that component, copy the text out of the email (to ensure there are no transcription errors) and paste it into a command line.

    C:\> Remote.exe "abc" "def"

    and you get the error

    Unable to connect to server ôabcö

    What happened? You got screwed over by smart quotes. The person who sent the email had smart quotes turned on in their email editor, and it turned "abc" into “abc”. You then got lulled into a false sense of security by the best fit behavior of WideCharToMultiByte, which says I can't represent “ and ” in the console code page, but I can map them to " which is a close visual approximation, so I'll use that instead. As a result, the value you see on the command line shows straight quotes, but that's just a façade behind which the ugly smart quotes are lurking.
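    As an aside, the odd characters in the error message are consistent with a code page mismatch. The sketch below assumes that the smart quotes reached the program as Windows-1252 bytes and that the error text was displayed in the OEM 437 console code page; the article does not say which code pages were actually in use, and the function name is invented for this illustration.

```python
def shown_in_console(text, ansi="cp1252", oem="cp437"):
    """Encode text in one code page and display the bytes in another."""
    return text.encode(ansi).decode(oem)


# The smart quotes become bytes 0x93 and 0x94, which code page 437
# displays as two accented letters.
print(shown_in_console("\u201cabc\u201d"))  # prints ôabcö
```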

    I've even seen people hoist by their own smartly-quoted petard.

    I can't seem to access a file called aaa bbb.txt. The command

    type “aaa bbb.txt”

    results in the strange error message

    The system cannot find the file specified.
    Error occurred while processing: "a.
    The system cannot find the file specified.
    Error occurred while processing: x.txt".

    Why can't I access this file?

    Somehow they managed to type smart quotes into their own command line.

    So watch out for those smart quotes. When you're sending email containing code or command lines, make sure your editor didn't "make it pretty" and in the process destroy it.
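    One way to catch the problem before it bites is to scan text for typographic look-alikes before pasting it anywhere a compiler or command interpreter will see it. The following is a minimal sketch; the table of characters is a small illustrative selection rather than a complete list, and the function names are made up for this example.

```python
import unicodedata

# Typographic characters that "smart" editors like to substitute, mapped to
# the ASCII character a compiler or command interpreter actually expects.
LOOKALIKES = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u02bc": "'",   # modifier letter apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
}


def find_lookalikes(text):
    """Return (index, code point, Unicode name) for each suspicious character."""
    return [(i, f"U+{ord(ch):04X}", unicodedata.name(ch))
            for i, ch in enumerate(text) if ch in LOOKALIKES]


def straighten(text):
    """Replace each look-alike with its plain ASCII counterpart."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


pasted = "Remote.exe \u201cabc\u201d \u201cdef\u201d"
for index, code_point, name in find_lookalikes(pasted):
    print(index, code_point, name)
print(straighten(pasted))
```

    Reporting the offending positions is usually safer than silently straightening them, since a curly quote inside a string literal may be there on purpose.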

    Exercise: What is wrong with the WinDbg command line above?

    Bonus chatter: PowerShell is a notable exception to this principle, for it treats all flavors of smart quotes and smart dashes as if they were dumb quotes and dumb dashes. From what I can tell, you have this guy to thank (or blame).

  • The Old New Thing

    How do I programmatically show and hide the Quick Launch bar?


    Commenter Mihai wants to know how to show or hide the Quick Launch bar programmatically.

    That's not something a program should be doing. Whether the Quick Launch bar is shown or hidden is an end user setting, and programs should not be overriding the user's preferences. Explorer consciously does not expose an interface for showing and hiding taskbar bands because it would just be a target for abuse. Much like the program that wants to uninstall other programs, the taskbar would become a battleground among programs that each wanted to force themselves on and force their opponents off.

    The user is the arbiter of what goes into the Taskbar.

    I'm told that Windows Vista added a new ITrayDeskBand interface that does indeed let you turn taskbar bands on and off. (I don't know whether it works for Quick Launch. Heck, I don't even know if it works at all! Not my area of expertise.) The story I heard was that so many programs were doing exactly what they shouldn't be doing—namely forcing their feature on, overriding the user's preference—that the Taskbar folks decided, "If you can't stop people from doing a bad thing, at least make them do the bad thing under your supervision. That way you have just one evil thing to support instead of everybody's home-grown undocumented hack." It's sort of the Taskbar Needle Exchange Program.

  • The Old New Thing

    What the various registry data types mean is different from how they are handled


    You can tag your registry data with any of a variety of types, such as REG_DWORD or REG_BINARY or REG_EXPAND_SZ. What do these mean, really?

    Well, that depends on what you mean by mean, specifically, who is doing the interpreting.

    At the bottom, the data stored in the registry are opaque chunks of data. The registry itself doesn't care if you lie and write two bytes of data to something you tagged as REG_DWORD. (Try it!) The type is just another user-defined piece of metadata. The registry dutifully remembers the two bytes you stored, and when the next person comes by asking for the data, those two bytes come out, along with the type REG_DWORD. Garbage in, garbage out. The registry doesn't care that what you wrote doesn't make any sense any more than the NTFS file system driver cares that you wrote an invalid XML document to the file config.xml. Its job is just to remember what you wrote and produce it later upon request.

    There is one place where the registry does pay attention to the type, and that's when you use one of the types that involve strings. If you use the RegQueryValueA function to read data which is tagged with one of the string types (such as REG_SZ), then the registry code will read the raw data from its database, and then call WideCharToMultiByte to convert it to ANSI. But that's the extent of its assistance.

    Just as the registry doesn't care whether you really wrote four bytes when you claimed to be writing a REG_DWORD, it also doesn't care whether the various string types actually are of the form they claim to be. If you forget to include the null terminator in your byte count when you write the data to the registry, then the null terminator will not be stored to the registry, and the next person to read from it will not read back a null terminator.

    This simplicity in design pushes the responsibility onto the code that uses the registry. If you read a registry value and the data is tagged with the REG_EXPAND_SZ type, then it's up to you to expand it if that's what you want to do. The REG_EXPAND_SZ value is just part of the secret handshake between the code that wrote the data and the code that is reading it, a secret handshake which is well-understood by convention. After all, if RegQueryValueEx automatically expanded the value, then how could you read the original unexpanded value?

    Windows Vista added a new function RegGetValue which tries to take care of most of the cumbersome parts of reading registry values. You can tell it what data types you are expecting (and it will fail if the data is of an incompatible type), and it coerces the data to match its putative type. For example, it auto-expands REG_EXPAND_SZ data, and if a blob of registry data marked REG_SZ is missing a null terminator, RegGetValue will add one for you. Better late than never.
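    For code that reads raw registry data itself, the defensive handling described above might look something like the following sketch. It works on a type tag and a byte buffer that are assumed to have already been read from the registry, so it involves no registry calls at all. The function name and the env parameter are hypothetical; the numeric type values are the ones defined in the Windows SDK headers, and string data is assumed to be stored as UTF-16LE.

```python
import re
import struct

# Numeric values of the registry type tags, as defined in the Windows SDK.
REG_SZ, REG_EXPAND_SZ, REG_BINARY, REG_DWORD = 1, 2, 3, 4


def decode_registry_value(reg_type, raw, env=None):
    """Interpret raw registry bytes according to their type tag, defensively."""
    if reg_type == REG_DWORD:
        # The registry stores whatever it was given, so check the size.
        if len(raw) != 4:
            raise ValueError(f"REG_DWORD with {len(raw)} bytes, expected 4")
        return struct.unpack("<I", raw)[0]
    if reg_type in (REG_SZ, REG_EXPAND_SZ):
        if len(raw) % 2:
            raise ValueError("string data is not a whole number of UTF-16 code units")
        text = raw.decode("utf-16-le")
        # The writer may or may not have stored the terminator; accept both.
        text = text.split("\0", 1)[0]
        if reg_type == REG_EXPAND_SZ and env is not None:
            # Expand %NAME% references; unknown names are left untouched.
            text = re.sub(r"%([^%]+)%",
                          lambda m: env.get(m.group(1), m.group(0)), text)
        return text
    # REG_BINARY and anything else: hand back the bytes unchanged.
    return raw
```

    Passing env=None returns the unexpanded string, which preserves the caller's ability to read the original value.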

  • The Old New Thing

    The checkbox: The mating call of the loser


    (Cultural note: The phrase the mating call of the loser is a term of derision. I used it here to create a more provocative headline even though it's stronger than I really intended, but good writing is bold.)

    When given a choice between two architectures, some people say that you should give users a checkbox to select which one should be used. That is the ultimate cowardly answer. You can't decide between two fundamentally different approaches, and instead of picking one, you say "Let's do both!", thereby creating triple, perhaps quadruple the work compared to just choosing one or the other.

    It's like you're remodeling a book library and somebody asks you, "Should we use Dewey Decimal or Library of Congress?" Your answer, "Let's do both and let the user choose!"

    Imagine if there were a checkbox somewhere in the Control Panel that let you specify how Windows XP-styled controls were implemented. Your choices are either to require applications to link to a new UxCtrl.DLL (let's call this Method A) or to link to COMCTL32.DLL with a custom manifest (let's call this Method B). Well, it means that every component that wanted styled common controls would have to come in two versions, one that linked to UxCtrl and used the new class names in its dialog boxes and calls to CreateWindow, and one that used a manifest and continued to use the class names under their old names.

    #ifdef USE_METHODA
     hwnd = CreateWindow(TEXT("UxButton"), ...);
    #else
     hwnd = CreateWindow(TEXT("Button"), ...);
    #endif

    DLG_WHATEVER DIALOG 36, 44, 230, 94
    CAPTION "Whatever"
    BEGIN
    #ifdef USE_METHODA
        CONTROL         "Whatever",IDC_WHATEVER,"UxButton",
                        BS_AUTOCHECKBOX | WS_TABSTOP, 14,101,108,9
    #else
        CONTROL         "Whatever",IDC_WHATEVER,"Button",
                        BS_AUTOCHECKBOX | WS_TABSTOP, 14,101,108,9
    #endif
    END

    At run time, every program would have to check this global setting and spawn off either the "Method A" binary or the "Method B" binary.

    Now you might try to pack this all into a single binary with something like this:

    if (IsMethodBSelected()) { // hypothetical check of the system switch
     hwnd = CreateWindow(TEXT("Button"), ...);
    } else {
     hwnd = CreateWindow(TEXT("UxButton"), ...);
    }

    But it's not actually that simple because a lot of decisions take place even before your program starts running. For example, if your program specifies a load-time link to UXCTRL.DLL, then that DLL will get loaded before your program even runs, even if the system switch is set to use Method B. A single-binary program that tries to choose between the two methods at runtime will have to do some activation context juggling and delay-loading. Hardly a slam dunk.

    Okay, so now you have two versions of every program. And you also have to decide what should happen if somebody writes and ships a program that uses Method A exclusively, even when the system switch is set to Method B. Does everything still work within that program as if Method A were the system setting, while the rest of the system uses Method B? (If you go this route, then you've completely undermined the point of Method B. The whole point of Method B is to allow programs that rely on specific class names to continue working, but this rogue Method A program is running around using the wrong class names!)

    Now the entire operating system and application compatibility work needs to be done with the checkbox set both to Method A and to Method B, because the compatibility impact of each of the methods is quite different. Okay, that's double the work. Where is triple and quadruple?

    Well, the two different versions of the program need to be kept in sync, since you want them to behave identically. This can of course be managed with judicious use of #ifdefs or runtime branches. But you have to remember both ways of doing things and be mindful of the two methods each time you modify the program. Somebody else might come in and "fix a little bug" or "add a little feature" to your program, unaware of how your program manages the shuffle of Method A versus Method B. The mental effort necessary to remember two different ways of doing the same thing plus having to expend that effort to correct mistakes in the code, that's the triple.

    The quadruple? I'm not sure, maybe the ongoing fragility of such a scheme, especially one that, at the end of the day, is a choice between two things that have no real impact on the typical end user.

    Engineering is about making tradeoffs. If you refuse to make the choice, then you're taking the cowardly route and ultimately are creating more work for your team. Instead of solving problems, you're creating them.

    All because you're too chicken to make a hard decision.
