• The Old New Thing

    The cost of trying too hard: Splay trees


    Often, it doesn't pay off to be too clever. Back in the 1980's, I'm told the file system group was working out what in-memory data structures to use to represent the contents of a directory so that looking up a file by name was fast. One of the experiments they tried was the splay tree. Splay trees were developed in 1985 by Sleator and Tarjan as a form of self-rebalancing tree that provides O(log n) amortized cost for locating an item in the tree, where n is the number of items in the tree. (Amortized costing means roughly that the cost of M operations is O(M log n). The cost of an individual operation is O(log n) on average, but an individual operation can be very expensive as long as it's made up for by previous operations that came in "under budget".)

    If you're familiar with splay trees you may already see what's about to happen.

    A very common operation in a directory is enumerating and opening every file in it, say, because you're performing a content search through all the files in the directory or because you're building a preview window. Unfortunately, when you sequentially access all the elements in a splay tree in order, this leaves the tree totally unbalanced. If you enumerate all the files in the directory and open each one, the result is a linear linked list sorted in reverse order. Locating the first file in the directory becomes an O(n) operation.
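
    The effect described above can be reproduced with a small experiment. What follows is a minimal sketch of a bottom-up splay tree, not the file system group's code; the names SplayTree, depth_of, and depth_after_in_order_scan are made up for illustration. It inserts n keys in shuffled order, looks each one up in ascending order, and then measures how far the smallest key sits from the root.

```python
import random


class Node:
    __slots__ = ("key", "left", "right", "parent")

    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class SplayTree:
    """Illustrative bottom-up splay tree keyed on comparable values."""

    def __init__(self):
        self.root = None

    def _rotate(self, x):
        # Rotate x above its parent while preserving search-tree order.
        p = x.parent
        g = p.parent
        if p.left is x:
            p.left = x.right
            if x.right is not None:
                x.right.parent = p
            x.right = p
        else:
            p.right = x.left
            if x.left is not None:
                x.left.parent = p
            x.left = p
        p.parent = x
        x.parent = g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

    def _splay(self, x):
        # Move x to the root using zig, zig-zig, and zig-zag steps.
        while x.parent is not None:
            p = x.parent
            g = p.parent
            if g is not None:
                if (g.left is p) == (p.left is x):
                    self._rotate(p)  # zig-zig: rotate the parent first
                else:
                    self._rotate(x)  # zig-zag: rotate x twice
            self._rotate(x)

    def insert(self, key):
        node = Node(key)
        if self.root is None:
            self.root = node
            return
        cur = self.root
        while True:
            if key < cur.key:
                if cur.left is None:
                    cur.left = node
                    break
                cur = cur.left
            else:
                if cur.right is None:
                    cur.right = node
                    break
                cur = cur.right
        node.parent = cur
        self._splay(node)

    def find(self, key):
        # A successful lookup splays the found node to the root.
        cur = self.root
        while cur is not None and cur.key != key:
            cur = cur.left if key < cur.key else cur.right
        if cur is not None:
            self._splay(cur)
        return cur

    def depth_of(self, key):
        # Count edges from the root to key without splaying (measurement only).
        depth, cur = 0, self.root
        while cur.key != key:
            cur = cur.left if key < cur.key else cur.right
            depth += 1
        return depth


def depth_after_in_order_scan(n, seed=0):
    keys = list(range(n))
    random.Random(seed).shuffle(keys)
    tree = SplayTree()
    for k in keys:
        tree.insert(k)
    for k in range(n):  # "enumerate and open every file", in sorted order
        tree.find(k)
    return tree.depth_of(0)
```

    With n = 64, depth_after_in_order_scan returns 63: the largest key is at the root and every other key hangs off its predecessor's left pointer, so reaching the first key means walking the whole chain.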

    From a purely algorithmic analysis point of view, the O(n) behavior of that file open operation is not a point of concern. After all, in order to get to this point, you had to perform n operations to begin with, so that very expensive operation was already "paid for" by the large number of earlier operations. However, in practice, people don't like it when the cost of an operation varies so widely from use to use. If you arrive at a client's office five minutes early for a month and then show up 90 minutes late one day, your client will probably not be very impressed by your explanation of "Well, I was early for so much, I'm actually still ahead of schedule according to amortized costing."

    The moral of the story: Sometimes trying too hard doesn't work.

    (Postscript: Yes, there have been recent research results that soften the worst-case single-operation whammy of splay trees, but these results weren't available in the 1980's. Also, remember that consistency in access time is important.)

  • The Old New Thing

    ReadProcessMemory is not a preferred IPC mechanism


    Occasionally I see someone trying to use the ReadProcessMemory function as an inter-process communication mechanism. This is ill-advised for several reasons.

    First, you cannot use ReadProcessMemory across security contexts, at least not without doing some extra work. If somebody uses "runas" to run your program under a different identity, your two processes will not be able to use ReadProcessMemory to transfer data back and forth.

    You could go to the extra work to get ReadProcessMemory working by adjusting the privileges on your process to grant PROCESS_VM_READ permission to the owner of the process you are communicating with, but this throws the doors wide open. Any process running with that identity can read the data you wanted to share, not just the process you are communicating with. If you are communicating with a process of lower privilege, you just exposed your data to lower-privilege processes other than the one you are interested in.

    What's more, once you grant PROCESS_VM_READ permission, you grant it to your entire process. Not only can that process read the data you're trying to share, it can read anything else that is mapped into your address space. It can read all your global variables, it can read your heap, it can read variables out of your stack. It can even corrupt your stack!

    What? Granting read access can corrupt your stack?

    If a process grows its stack into the stack guard page, the unhandled exception filter catches the guard exception and extends the stack. But when it happens inside a private "catch all exceptions" handler, such as the one that the IsBadReadPtr function uses, it is handled privately and doesn't reach the unhandled exception filter. As a result, the stack is not grown; a new stack guard page is not created. When the stack normally grows to and then past the point of the prematurely-committed guard page, what would normally be a stack guard exception is now an access violation, resulting in the death of the thread and with it likely the process.

    You might think you could catch the stack access violation and try to shut down the thread cleanly, but that is not possible for multiple reasons. First, structured exception handling executes on the stack of the thread that encountered the exception. If that thread has a corrupted stack, it becomes impossible to dispatch that exception since the stack that the exception filters want to run on is no longer viable.

    Even if you could somehow run these exception filters on some sort of "emergency stack", you still can't fix the problem. At the point of the exception, the thread could be in the middle of anything. Maybe it was inside the heap manager with the heap lock held and with heap data structures in a state of flux. In order for the process to stay alive, the heap data structures need to be made consistent and the heap lock released. But you don't know how to do that.

    There are plenty of other inter-process communication mechanisms available to you. One of them is anonymous shared memory, which I discussed a few years ago. Anonymous shared memory still has the problem that any process running under the same token as the one you are communicating with can read the shared memory block, but at least the scope of the exposure is limited to the data you explicitly wanted to share.

    (In a sense, you can't do any better than that. The process you are communicating with can do anything it wants with the data once it gets it from you. Even if you somehow arranged so that only the destination process can access the memory, there's nothing stopping that destination process from copying it somewhere outside your shared memory block, at which point your data can be read from the destination process by anybody running with the same token anyway.)

  • The Old New Thing

    At least there's a funny side to spam


    Poorly-drawn cartoons inspired by actual spam subject lines!

    It's pretty much what the title says. Don't forget to read the fan mail.

    Sometimes it's even funny.

  • The Old New Thing

    Understanding what things mean in context: Dispatch interfaces


    Remember that you have to understand what things mean in context. For example, the IActiveMovie3 interface has a method called get_MediaPlayer. If you come into this method without any context, you might expect it to return a pointer to an IMediaPlayer interface, yet the header file says that it returns a pointer to an IDispatch interface instead. If you look at the bigger picture, you'll see why this makes sense.

    IActiveMovie3 is an IDispatch interface. As you well know, the IDispatch interface's target audience is scripting languages, primarily classic Visual Basic (and to a lesser degree, JScript). Classic Visual Basic is a dynamically-typed language, wherein nearly all variables are merely "objects", the precise type of which is not known until run-time. A statically-typed language will complain at compile time that you are invoking a method on an object that doesn't support that method or that you are passing the wrong number or type of operands to a method. A dynamically-typed language, on the other hand, doesn't check until the line of code is actually executed whether the method exists, and if it does, whether you called it correctly.

    When working with IDispatch and dynamically-typed languages, therefore, the natural unit of currency for objects is the IDispatch. All objects take the form of IDispatch. Objects that produce other objects will produce IDispatch interfaces, because that's what the scripting engine is expecting.

    That's why the get_MediaPlayer method returns an IDispatch. Because that's what the scripting engine expects. And, if you are familiar with the context, it's also what you should expect.
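
    As a rough illustration of that late-bound model, here is a minimal Python sketch. It is not the COM API: LateBound, ActiveMovie, and MediaPlayer are hypothetical stand-ins, and invoke only loosely plays the role of IDispatch::Invoke. The point is that members are found by name at the moment of the call, and any object handed back comes wrapped in the same generic shell rather than as a specific typed interface.

```python
class LateBound:
    """Hypothetical stand-in for an object reached through a dispatch interface."""

    def __init__(self, target):
        self._target = target

    def invoke(self, name, *args):
        # The member is looked up by name only when the call executes,
        # which is when a dynamically-typed script learns it is missing.
        try:
            member = getattr(self._target, name)
        except AttributeError:
            raise LookupError(f"unknown member: {name}") from None
        result = member(*args) if callable(member) else member
        # Plain values pass through; objects come back wrapped the same way,
        # mirroring a property that returns a generic dispatch pointer.
        if isinstance(result, (str, int, float, bool, type(None))):
            return result
        return LateBound(result)


class MediaPlayer:  # hypothetical object model, for illustration only
    def __init__(self):
        self.volume = 50

    def play(self):
        return "playing"


class ActiveMovie:  # hypothetical object model, for illustration only
    def __init__(self):
        self.MediaPlayer = MediaPlayer()
```

    A caller would write LateBound(ActiveMovie()).invoke("MediaPlayer") and get back another LateBound, on which invoke("play") succeeds and invoke("rewind") fails only at the moment it is attempted.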

    A tell-tale sign of this context comes from the name "get_MediaPlayer". This name does not follow the COM function naming convention but rather is a constructed name for the C/C++ binding of the "get" property. C/C++ bindings are the assembly language of OLE automation: You're operating with the nuts and bolts of OLE automation, and if you want to play at this level, you're going to have to know how to use a screwdriver.

  • The Old New Thing

    France, she is, how you say, on sale!


    Marketplace reports on the start of the winter sale season in France. By law, retailers are permitted sales only twice a year, so the onset of sale season generates quite a bit of shopping madness. There is also a proposal to allow more sale periods, but opponents argue that doing so would harm smaller businesses. Coming from the land of sale fatigue (we just emerged from the after-Christmas sale season and are entering the Winter White Sale season, after which comes the President's Day season...), I find a certain appeal to the idea of limiting how often things can "go on sale". Who can forget the oriental rug stores that are perpetually going out of business? It's become such a joke that The New York Times flatly refuses to run ads for "Going Out of Business" sales at oriental rug stores.

  • The Old New Thing

    Why do words beginning with "home" get treated as URLs?


    Vitaly from the Suggestion Box asked (with grammatical editing),

    Could you explain why Windows starts the web browser if the file name passed to ShellExecute starts with "home"?

    First thing to note is that this URL-ization happens only after the ShellExecuteEx function has tried all the other possible interpretations. If a file named "homestar" is found in the current directory or on the PATH or in the App Paths, then that file will be chosen, as you would expect. Only when the ShellExecuteEx function is about to give up does it try to "do what you mean".

    What you're seeing is autocorrection kicking in yet again. If you go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\URL\Prefixes, you can see the various autocorrection rules that ShellExecute consults when it can't figure out what you are trying to do. For example, if the thing you typed begins with "www", it will stick "http://" in front and try again. This is why you can just type "www.microsoft.com" into the Run dialog instead of having to type the cumbersome "http://www.microsoft.com".
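
    The order of operations described above can be sketched as follows. This is an illustration of the idea, not the shell's actual implementation: resolve and is_known_target are hypothetical names, the table is hard-coded rather than read from the registry, and apart from the "www" rule the scheme attached to each prefix is an assumption.

```python
# Prefix table modeled on the registry rules. The "www" rule is as described
# in the text; the schemes for the other prefixes are assumptions.
URL_PREFIXES = {
    "www": "http://",
    "home": "http://",
    "ftp": "ftp://",
    "gopher": "gopher://",
}


def resolve(text, is_known_target):
    """Return (kind, value) for something typed into a Run-style prompt.

    is_known_target is a hypothetical callback standing in for the normal
    lookups (current directory, PATH, App Paths).
    """
    # Ordinary interpretations always win.
    if is_known_target(text):
        return ("file", text)
    # Only when about to give up: try the prefix rules.
    lowered = text.lower()
    for prefix, scheme in URL_PREFIXES.items():
        if lowered.startswith(prefix):
            return ("url", scheme + text)
    return ("error", text)
```

    Under these assumptions, "homestar" resolves to the file when a file of that name is found, and becomes "http://homestar" only when nothing else matches.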

    Most of the autocorrection rules are pretty self-evident. Something beginning with "ftp" is probably an FTP site. Something beginning with "www" is probably a web site. But why are strings beginning with "home" also treated as web sites?

    For one thing, several web sites have domains whose names begin with "home". Furthermore, some internet service providers set up their DNS so that non-fully-qualified domain names go to servers that the ISP set up specifically to provide customer services. For example, "mail" would send you to a web-based mail system, and "home" would send you to the ISP's home page.

    The use of "home" has fallen out of fashion of late, so the auto-correction rule isn't all that useful any more, but the rule stays around because it doesn't really hurt anybody, and compatibility concerns advise against removing a feature if it isn't hurting anyone and you aren't absolutely certain that nobody is still using it. (Heck, if you look at the key, you can see an entry for "gopher". Like anybody uses gopher any more.)

  • The Old New Thing

    When web sites rely on security holes


    Perhaps the biggest risk when making a change in the name of security is all the things that may have been relying on the previously-lax security settings. After all, disabling an insecure feature is easy. The hard part is disabling it while retaining compatibility with people who were relying on that feature. In the security investigations I've been involved with, perhaps the largest chunk of my time is spent trying to find a way to mitigate the security hole without breaking existing customers. (And it's the Line of Business scenario that is the biggest question mark.)

    Here's a real-life example: Consider a sports web site which sells a service to subscribers wherein the site creates a pop-up window whenever a game's score has changed or some other significant event has occurred. That way, you can leave your browser minimized and go about your day, but when something happens in the game, it will pop up an alert.

    The round of security changes in Windows XP SP2 broke this site because the rules on positioning of pop-up windows were tightened so that pop-up windows could not appear outside the browser itself. This prevents pop-up windows from being used to cover important browser elements (such as the status bar, the address bar, or a security dialog) and makes it harder for pop-ups to masquerade as system dialogs. But it also broke this company's business model.

    And of course, if Microsoft does something that causes you to lose money, you sue.

    There were probably corporations that had internal web sites that relied on the ability to position pop-ups without restriction. Those corporations no doubt also complained about this change in the name of security.

    As with most security changes that have compatibility consequences, a "safety valve" was added to return to the old insecure behavior for those customers who were relying on it. In this case, you can put the affected sites in the Trusted Sites zone and enable the "Allow script-initiated windows without size or position constraints" setting. But this is just a stop-gap, re-opening the security hole to let this site continue to operate the way it does. The real fix is not to rely on the security hole.

  • The Old New Thing

    The decoy visual style


    During the development of Windows XP, the visual design team were very cloak-and-dagger about what the final visual look was going to be. They had done a lot of research and put a lot of work into their designs and wanted to make sure that they made a big splash at the E3 conference when Luna was unveiled. Nobody outside the visual styles team, not even me, knew what Luna was going to look like.

    On the other hand, the programmers who were setting up the infrastructure for visual styles needed to have something to test their code against. And something had to go out in the betas.

    The visual styles team came up with two styles. In secret, they worked on Luna. In public, they worked on a "decoy" visual style called "Mallard". (For non-English speakers: A mallard is a type of duck commonly used as the model for decoys.) The ruse was so successful that people were busy copying the decoy and porting it to their own systems. (So much for copyright protection.)

  • The Old New Thing

    The decoy display control panel


    Last time, we saw one example of a "decoy" used in the service of application compatibility with respect to the Printers Control Panel. Today we'll look at another decoy, this time for the Display Control Panel.

    When support for multiple monitors was being developed, a major obstacle was that a large number of display drivers hacked the Display Control Panel directly instead of using the documented extension mechanism. For example, instead of adding a separate page to the Display Control Panel's property sheet for, say, virtual desktops, they would just hack into the "Settings" page and add their button there. Some drivers were so adventuresome as to do what seemed like a total rewrite of the "Settings" page. They would take all the controls, move them around, resize them, hide some, show others, add new buttons of their own, and generally speaking treat the page as a lump of clay waiting to be molded into their own image. (Here's a handy rule of thumb: If your technique works only if the user speaks English, you probably should consider the possibility that what you're doing is relying on an implementation detail rather than something that will be officially supported going forward.)

    In order to support multiple monitors, the Settings page on the Display Control Panel underwent a major overhaul. But when you tried to open the Display Control Panel on a system that had one of these aggressive drivers installed, it would crash because the driver ran around rearranging things like it always did, even though the things it was manipulating weren't what the developers of the driver intended!

    The solution was to create a "decoy" Settings page that looked exactly like the classic Windows 95 Settings page. The decoy page's purpose in life was to act as bait for these aggressive display drivers and allow itself to be abused mercilessly, letting the driver have its way. Meanwhile, the real Settings page (which is the one that was shown to the user), by virtue of having been overlooked, remained safe and unharmed.

    There was no attempt to make this decoy Settings page do anything interesting at all. Its sole job was to soak up mistreatment without complaining. As a result, those drivers lost whatever nifty features their shenanigans were trying to accomplish, but at least the Display Control Panel stayed alive and allowed the user to do what they were trying to do in the first place: Adjust their display settings.

  • The Old New Thing

    Beware the MSJ subscription scam


    Stephen Toub from MSDN Magazine alerts us to the MSJ subscription scam. Somebody has been sending out (via paper mail) a fake subscription offer for Microsoft Systems Journal, a magazine that ceased publication back in 2000. Read Stephen's article for more details as well as a copy of the scam letter itself. (The address for the "publisher" is a rented mailbox at what appears to be a UPS Store in the Beaumont Centre mall.) Under no circumstances send these people any money!
