April, 2006

  • The Old New Thing

    Why is the Microsoft Protection Service called "msmpsvc"?


    (This is the first in a series of short posts on where Microsoft products got their names.)

    The original name for the malware protection service was "mpsvc", the "Microsoft Protection Service", but it was later discovered that this filename was already used by malware! As a result, the name of the service had to be changed by sticking an "ms" in front, making it "msmpsvc.exe".

    Therefore, technically, its name is the "Microsoft Microsoft Protection Service". (This is, of course, not to be confused with "mpssvc.exe", which is, I guess, the "Microsoft Protection Service Service".)

    Fortunately, the Marketing folks can attempt to recover by deciding that "msmpsvc" stands for "Microsoft Malware Protection Service". But you and I will know what it really stands for.

  • The Old New Thing

    What's the deal with the house in front of Microsoft's RedWest campus?


    Here is my understanding. It may be incomplete or even flat-out wrong.

    The house belongs to a couple who was unwilling to sell their property when Microsoft's real estate people were buying up the land on which to build the RedWest campus. (I'm told it was originally a chicken farm.) Eventually, a deal was struck: The couple would sell the property to Microsoft but retain the right to live there until the end of their natural lives. Furthermore, Microsoft would assume responsibility for maintaining the lawn and landscaping.

    When Microsoft needed to build an underground parking garage beneath their property, the house was put on a truck and carried across the street, where it rested for the duration of the construction, after which it was returned to its original location. I imagine the couple was put up in a very nice hotel for the duration of the construction. (Heck, maybe they got a nice kitchen remodel out of the deal, who knows?)

    And while I'm spreading rumors about the Microsoft RedWest campus, here's another one: If you pay a visit to the campus, you will find a nature trail that leads through the wetlands that adjoin the campus. I was told that the wetlands preservation area was part of the environmental impact mitigation plan that was necessary to obtain approval for the construction. The students at the nearby school will occasionally take field trips there.

    (I'm going to cover lighter issues for a while just to take a break from the network interoperability topic that has raged for over a week now.)

  • The Old New Thing

    You'd think it'd be easy to give away a ticket to the symphony


    I'm sort of the ringleader of a group of friends who go in together on a block of tickets to the Seattle Symphony. I bought a pair of tickets in the block, one for myself, and one for a rotating guest. And for some reason, I had a hard time finding a guest for last weekend's concert.

    Of course, six of my friends have already been ruled out as guests because they're already coming! I asked a dozen other friends; they were all enthusiastic for the opportunity but had to decline for one reason or another. Such busy social calendars.

    • "I will be out of town { on a business trip (2x) | to visit my parents/in-laws (1x × 2) | for a chess tournament }."
    • "My parents/in-laws are visiting from out of town (2x × 2 + 1x)."
    • "I'm attending/organizing a birthday party (2x)."
    • "I'm going to a dinner party thrown by my girlfriend."
    • "I'd just fall asleep." At least this one was honest.

    I did eventually find a taker for my ticket, and all the people who couldn't make it can go eat their hearts out.

    I hadn't seen Mstislav Rostropovich conduct in a long time. He's older now (duh) and appears to have lost some weight, turning him into a somewhat more frail old man. Being nearly eighty years old may also be a factor... His musical stature, on the other hand, has not diminished in the least. (And he still conducts with his mouth open. Some things never change.)

    After I read the story behind the composition of the Festive Overture, I found the piece even more impressive. Shostakovich's First Symphony was significantly harder to grasp (his language has always eluded me), and it wasn't helped by the audience's mistaking the grand pause near the end of the second movement for its conclusion, or its laughter when the piece resumed. (Maestro Rostropovich seemed kind of annoyed by that.) The Prokofiev was wonderfully done, and the normally expressionless Assistant Principal Second Violin Michael Miropolsky got to show off some of his wit while acting as an interpreter when Maestro Rostropovich introduced the encore. The ovation was so resounding that the conductor had to take the concertmaster off the stage with him to tell everybody, "The show's over."

    (One of the people in our symphony group has a friend who performs in the Cascade Symphony Orchestra, which Mr. Miropolsky conducts. Apparently, when he gets a microphone in his hand, Mr. Miropolsky is quite a funny guy.)

  • The Old New Thing

    Be very careful if you decide to change the rules after the game has ended


    One suggestion for addressing the network compatibility problem was returning an error code like ERROR_PLEASE_RESTART which means "Um, the server ran into a problem. Please start over." This is basically the same as the "do nothing" option, because the server is already returning an error code specifically for this problem, namely, STATUS_INVALID_LEVEL. Now, sure, that error doesn't actually mean "Please try again." It actually means "Sorry, I can't do that." This is the error code that is supposed to come back if you ask a server to go into fast mode and it doesn't support fast mode.

    But the effect from a coding standpoint is the same. "If FindNextFile returns the error xyz, then the server ran into a problem and you should start over." Call xyz "ERROR_PLEASE_RESTART" or "STATUS_INVALID_LEVEL" or "PURPLE_LILACS". No matter what you pick, the net effect is the same: Existing code must be changed to specifically check for this new error code and react accordingly. Programs that aren't updated will behave strangely.
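
    To make the point concrete, here is a minimal sketch in portable C that simulates a server which fails once partway through an enumeration. Everything in it is hypothetical: the SIM_* codes, the SimServer structure, and the two caller functions are stand-ins invented for illustration, not the Win32 API or the real error values. It only shows that a caller written before the new code existed has no way to react to it.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical result codes for this sketch (not the real Win32 values). */
    enum { SIM_OK = 0, SIM_NO_MORE_FILES = 1, SIM_PLEASE_RESTART = 2 };

    /* A simulated server that reports one transient failure at index fail_at. */
    typedef struct {
        int total_files;  /* files actually in the directory */
        int fail_at;      /* index at which the server fails once; -1 = never */
        int position;     /* index of the next file to return */
        int failed_once;  /* nonzero after the one-time failure has happened */
    } SimServer;

    int sim_find_next(SimServer *s)
    {
        assert(s != NULL);
        if (s->position == s->fail_at && !s->failed_once) {
            s->failed_once = 1;
            return SIM_PLEASE_RESTART;
        }
        if (s->position >= s->total_files)
            return SIM_NO_MORE_FILES;
        s->position++;
        return SIM_OK;
    }

    void sim_restart(SimServer *s)
    {
        assert(s != NULL);
        s->position = 0;
    }

    /* Caller written under the old rules: any unknown error fails the operation.
       Returns the number of files seen, or -1 on failure. */
    int enumerate_legacy(SimServer *s)
    {
        int count = 0;
        for (;;) {
            int err = sim_find_next(s);
            if (err == SIM_OK) { count++; continue; }
            if (err == SIM_NO_MORE_FILES) return count;
            return -1;  /* the new code lands here, whatever it is called */
        }
    }

    /* Caller updated for the new rule: it starts over when told to. */
    int enumerate_updated(SimServer *s)
    {
        int count = 0;
        for (;;) {
            int err = sim_find_next(s);
            if (err == SIM_OK) { count++; continue; }
            if (err == SIM_NO_MORE_FILES) return count;
            if (err == SIM_PLEASE_RESTART) { sim_restart(s); count = 0; continue; }
            return -1;
        }
    }
    ```

    With a 150-file directory that fails once at index 100, enumerate_legacy gives up with -1 while enumerate_updated starts over and reports all 150 files. Renaming SIM_PLEASE_RESTART changes neither outcome.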

    And that's the issue faced by today's topic: When do you decide that a problem requires you to change the rules of the game after it has ended?

    Programs out there were written to one of many sets of rules. Most of them were written to Windows XP's rules. Some were written to Windows 2000's rules. Even older programs may have been written to Windows 95's or Windows 3.1's rules. One aspect of backwards compatibility is accommodating programs that broke the rules and got away with it. But here, the issue is not fixing broken programs; it's keeping correct programs correct.

    If you introduce a new error code and specify an unusual recommended action (i.e., something other than "fail the operation"), then all programs written prior to the introduction of this rule have suddenly become "wrong" through no fault of their own. Depending on how "wrong" they are, the severity of the problem can range from inconvenient to fatal. In the Explorer case, the directory comes up wrong the first time but fixes itself if you refresh. But if a .NET object's enumerator suddenly threw a new ServerFailedMustRestartEnumeration exception, you're probably going to see lots of programs crash with unhandled exception failures.

    At this point, the usual suspects come to the surface: How will users get updated programs that conform to the new rules? The original program's author may no longer be alive. The source code may have been lost. Or the knowledge necessary to understand the source code may have been lost. ("This program was written by an outside contractor five years ago. We have the source code but nobody here can make heads nor tails of it.") Or the program's author may simply not consider updating that program to Windows Vista to be a priority. (After all, why bother updating version 1.0 of a program when version 2.0 is available?)

    Mind you, Microsoft does change the rules from time to time. Pre-emptive multi-tasking changed many rules. The new power management policies in Windows Vista certainly changed the rules for a lot of programs. But even when the rules change, an effort is usually made to continue emulating the old rules for old programs. Because those programs are following a different set of rules, and it's not nice to change the rules after the game has ended.

  • The Old New Thing

    News for dummies now available in podcast form


    I'm probably the only person who uses the "News for dummies" links in the navigation pane on this page, and now I'm going to use them even less. The Swedish news for dummies recently became available in podcast form [RSS], joining the German news for dummies, which has been available as a podcast [RSS] since the beginning of the year. (Both the Swedes and Germans have, of course, other podcast offerings beyond the news for dummies.)

    Of the two, I prefer the Swedish news for dummies. The German news for dummies is still your standard news report, but spoken m-i-n-d   b-o-g-g-l-i-n-g-l-y   s-l-o-w-l-y. Which means I can hear every syllable clearly without actually understanding anything, since they still use all that fancy German vocabulary. The Swedish news for dummies, on the other hand, proceeds at a methodical (as opposed to mind-numbing) pace and, more importantly, explains the news using simpler words, words which I am much more likely to know.

    (What's the lesson here? I dunno. If you want to speak Swedish with me, talk like those Klartext folks. If you want to speak German with me, do not talk like those folks over at Deutsche Welle's Langsam gesprochene Nachrichten because I won't understand you. But your best choice is probably English, since your English will be about twenty times better than my Swedish or German.)

  • The Old New Thing

    Computing over a high-latency network means you have to bulk up


    One of the big complaints about Explorer we've received from corporations is how often it accesses the network. If the computer you're accessing is in the next room, then accessing it a large number of times isn't too much of a problem since you get the response back rather quickly. But if the computer you're talking to is halfway around the world, then even if you can communicate at the theoretical maximum possible speed (namely, the speed of light), it'll take 66 milliseconds for your request to reach the other computer and another 66 milliseconds for the reply to come back. In practice, the signal takes longer than that to make its round trip. A latency of a half second is not unusual for global networks. A latency of one to two seconds is typical for satellite networks.
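
    For the curious, the speed-of-light figure can be checked with a back-of-the-envelope calculation. The inputs are assumptions, not numbers from the text: "halfway around the world" is taken as roughly 20,000 km along the surface, and the speed of light is rounded to 300,000 km/s.

    ```latex
    t_{\text{one-way}} = \frac{d}{c}
      \approx \frac{20\,000\ \text{km}}{300\,000\ \text{km/s}}
      \approx 0.067\ \text{s} \approx 67\ \text{ms},
    \qquad
    t_{\text{round trip}} = 2\, t_{\text{one-way}} \approx 133\ \text{ms}
    ```

    That agrees with the roughly 66 milliseconds each way quoted above, and it is a floor that no real network reaches.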

    Note that latency and bandwidth are independent metrics. Bandwidth is how fast you can shovel data, measured in data per unit time (e.g. bits per second); latency is how long it takes the data to reach its destination, measured in time (e.g. milliseconds). Even though these global networks have very high bandwidth, the high latency is what kills you.

    (If you're a physicist, you're going to see the units "data per unit time" and "time" and instinctively want to multiply them together to see what the resulting "data" unit means. Bandwidth times latency is known as the "pipe". When doing data transfer, you want your transfer window to be the size of your pipe.)
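
    As a worked example of that product, assume a 100 Mbit/s link (a made-up figure for illustration) and the half-second latency mentioned earlier:

    ```latex
    \text{pipe} = \text{bandwidth} \times \text{latency}
      = 100\ \text{Mbit/s} \times 0.5\ \text{s}
      = 50\ \text{Mbit} = 6.25\ \text{MB}
    ```

    Under those assumptions, about 6.25 MB of data is in flight at any moment, so a much smaller transfer window would leave the link idle while it waits for acknowledgments.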

    High latency means that you should try to issue as few I/O requests as possible, although it's okay for each of those requests to be rather large if your bandwidth is also high. Significant work went into reducing the number of I/O requests issued by Explorer during common operations such as enumerating the contents of a folder.

    Enumerating the contents of a folder in Explorer is more than just getting the file names. The file system shell folder needs other file metadata such as the last-modification time and the file size in order to build up its SHITEMID, which is the unit of item identification in the shell namespace. One of the other pieces of information that the shell needs is the file's index, a 64-bit value that is different for each file on a volume. Now, this information is not returned by the "slow" FindNextFile function. As a result, the shell would have to perform three round-trip operations to retrieve this extra information:

    • CreateFile(),
    • GetFileInformationByHandle() (which returns the file index in the BY_HANDLE_FILE_INFORMATION structure), and finally
    • CloseHandle().

    If you assume a 500ms network latency, then these three additional operations add a second and a half for each file in the directory. If a directory has even just forty files, that's a whole minute spent just obtaining the file indices. (As we saw last time, FindNextFile does its own internal batching to avoid this problem when doing traditional file enumeration.)

    And that's where this "fast mode" came from. The "fast mode" query is another type of bulk query to the server which returns all the normal FindNextFile information as well as the file indices. As a result, the file index information is piggybacked on top of the existing FindNextFile-like query. That's what makes it fast. In "fast mode", enumerating 200 files from a directory would take just a few seconds (two "bulk queries" that return the FindNextFile information and the file indices in one go, plus some overhead for establishing and closing the connection). In "slow mode", getting the normal FindNextFile information takes a few seconds, but getting the file indices would add another 1.5 seconds for each file, for an additional 1.5 × 200 = 300 seconds, or five minutes.
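
    The arithmetic above can be captured in a few lines of C. This is only a rough cost model under stated assumptions: each operation costs exactly one round trip at the given latency, connection overhead is ignored, and the function names are invented for this sketch.

    ```c
    #include <assert.h>

    /* Number of bulk queries needed to cover `files` entries, `batch` at a time. */
    long bulk_queries(long files, long batch)
    {
        assert(files >= 0 && batch > 0);
        return (files + batch - 1) / batch;  /* ceiling division */
    }

    /* Extra time in "slow mode": three round trips per file
       (open, query the file index, close). */
    long slow_mode_extra_ms(long files, long latency_ms)
    {
        assert(files >= 0 && latency_ms >= 0);
        return files * 3 * latency_ms;
    }

    /* Time in "fast mode": one round trip per bulk query,
       with the file indices piggybacked on the results. */
    long fast_mode_ms(long files, long batch, long latency_ms)
    {
        assert(latency_ms >= 0);
        return bulk_queries(files, batch) * latency_ms;
    }
    ```

    With 200 files, a batch size of 100, and 500ms latency, slow_mode_extra_ms yields 300,000 ms (the five minutes above), while fast_mode_ms yields 1,000 ms before connection overhead.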

    I think most people would agree that reducing the time it takes to obtain the SHITEMIDs for all the files in a directory from five minutes to a few seconds is a big improvement. That's why the shell is so anxious to use this new "fast mode" query.

    If your program is going to be run by multinational corporations, you have to take high-latency networks into account. And this means bulking up.

    Sidebar: Some people have accused me of intentionally being misleading with the characterization of this bug. Any misleading on my part was unintentional. I didn't have all the facts when I wrote up that first article, and even now I still don't have all the facts. For example, FindNextFile using bulk queries? I didn't learn that until Tuesday night when I was investigating an earlier comment—time I should have been spending planning Wednesday night's dinner, mind you. (Yes, I'm a slacker and don't plan my meals out a week at a time like organized people do.)

    Note that the exercise is still valuable as a thought experiment. Suppose that FindNextFile didn't use bulk queries and that the problem really did manifest itself only after the 101st round-trip query. How would you fix it?

    I should also point out that the bug in question is not my bug. I just saw it in the bug database and thought it would be an interesting springboard for discussion. By now, I'm kind of sick of it and will probably not bother checking back to see how things have settled out.

  • The Old New Thing

    Sometimes you just have to make a snap decision


    Saturday afternoon, my phone rings.


    "Quick! We're on our way to the nursery. Do you want to come?"

    I recognize the voice as one of my friends who recently bought a house and presumably is doing some spring landscaping. But I have to answer fast. Time for a snap decision.


    My friend seems surprised that I give my answer so quickly.

    "Oh! Well then! Bye."

    If you tell me I have to answer fast, you shouldn't act all offended if I give a quick answer.

  • The Old New Thing

    It's more efficient when you buy in bulk


    The Windows XP kernel does not turn every call to FindNextFile into a packet on the network. Rather, the first time an application calls FindNextFile, the kernel issues a bulk query to the server and returns the first result to the application. Thereafter, when the application calls FindNextFile, it returns the next result from the buffer. If the buffer is empty, then FindNextFile issues a new bulk query to refill the buffer.

    This is a significant performance improvement when reading the entire contents of large directories because it reduces the number of round trips to the server. We'll see next time that the gain can be quite significant on certain types of servers.

    But it also means that the suggestion of "Well, why not ask for 101 files and see if you get an error" won't help any. (Actually I think the magic number was really 128, not 100, but let's keep calling it 100 since that's what I started with.) The number 100 was not some magic value on the server. That number was actually our own unwitting choice: The bulk query asks for 100 files at a time! If we changed the bulk query to ask for 101 files, then the problem would just appear at the 102nd file.
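
    The buffering behavior described in this article can be sketched as a small simulation in portable C. This is not the kernel's code: the Enumerator structure, the function names, and the simplification that discovering the end of the directory costs no extra round trip are all assumptions made for illustration.

    ```c
    #include <assert.h>
    #include <stddef.h>

    #define MAX_BATCH 256

    /* Simulated client-side state for one directory enumeration. */
    typedef struct {
        int total_files;        /* files on the simulated server */
        int batch_size;         /* files requested per bulk query */
        int next_on_server;     /* next file index the server will send */
        int buffer[MAX_BATCH];  /* results of the most recent bulk query */
        int buffered;           /* number of valid entries in buffer */
        int cursor;             /* next buffer entry to hand to the caller */
        int round_trips;        /* bulk queries issued so far */
    } Enumerator;

    void enum_init(Enumerator *e, int total_files, int batch_size)
    {
        assert(e != NULL && total_files >= 0);
        assert(batch_size > 0 && batch_size <= MAX_BATCH);
        e->total_files = total_files;
        e->batch_size = batch_size;
        e->next_on_server = 0;
        e->buffered = 0;
        e->cursor = 0;
        e->round_trips = 0;
    }

    /* Returns 1 and stores the next file index in *out, or 0 when done. */
    int enum_next(Enumerator *e, int *out)
    {
        if (e->cursor == e->buffered) {
            /* Buffer exhausted: issue one bulk query to refill it. */
            if (e->next_on_server >= e->total_files)
                return 0;
            e->round_trips++;
            e->buffered = 0;
            e->cursor = 0;
            while (e->buffered < e->batch_size &&
                   e->next_on_server < e->total_files)
                e->buffer[e->buffered++] = e->next_on_server++;
        }
        *out = e->buffer[e->cursor++];
        return 1;
    }

    /* Enumerates the whole directory and reports how many bulk queries it took. */
    int count_round_trips(int total_files, int batch_size)
    {
        Enumerator e;
        int file;
        enum_init(&e, total_files, batch_size);
        while (enum_next(&e, &file))
            ;
        return e.round_trips;
    }
    ```

    In this model, a 101-file directory needs two bulk queries at a batch size of 100 but only one at a batch size of 101. The second query is then first needed at the 102nd file, which is the shift described above.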

  • The Old New Thing

    USER and GDI compatibility in Windows Vista


    My colleague Nick Kramer, who works over on WPF, has the first of what will be a series of articles on USER and GDI compatibility in Windows Vista. The changes to tighten security, improve support for East Asian languages, and take the desktop to a new level with the Desktop Window Manager (among others) make for quite an interesting compatibility risk list.

    And since I mentioned the DWM, you would do well to check out Greg Schechter who has been writing about the Desktop Window Manager, how it works, how it fits into the rest of the system, all that stuff. I know some people have been posting comments asking for information about the DWM. You would be much better served asking Greg since he actually works on it, whereas all I know about the DWM is how to spell it.

    [Links fixed: 9am.]

  • The Old New Thing

    Adding flags to APIs to work around driver bugs doesn't scale


    Some people suggested, as a solution to the network interoperability compatibility problem, adding a flag to IShellFolder::EnumObjects to indicate whether the caller wanted to use fast or slow enumeration.

    Adding a flag to work around a driver bug doesn't actually solve anything in the long term.

    Considering all the video driver bugs that Windows has had to work around in the past, if the decision had been made to surface all those bugs and their workarounds to applications, then functions like ExtTextOut would have several dozen flags to control various optimizations that work on all drivers except one. A call to ExtTextOut would turn into something like this:

    ExtTextOut(hdc, x, y, ETO_OPAQUE /* | ...several dozen driver-workaround flags... */,
               &rcOpaque, lpsz, cch, NULL);

    where each of those strange flags is there to indicate that you want to obtain the performance benefits enabled by each of those flags because you know that you aren't running on a version of the video driver that has the particular bug each of those flags was created to protect against.

    And then (still talking hypothetically) with Windows Vista, you find that your program runs slower than on Windows XP: Suppose a bug is found in a video driver where strings longer than 1024 characters come out garbled. Windows Vista therefore contains code to break all strings up into 1024-character chunks, but as an optimization you can pass the ETO_PASS_LONG_STRINGS_TO_DRIVER flag to tell GDI not to use this workaround. Your Windows XP program doesn't use this flag, so it now runs slower on Windows Vista. You'll have to ship an update to your program just to get back to where you were.

    It's not limited to flags either. By this philosophy of "Don't try to cover up for driver bugs and just make applications deal with them", you would have had the following strange paragraph in the FindNextFile documentation:

    If the FindNextFile function returns FALSE and sets the error code to ERROR_NO_MORE_FILES, then there were no more matching files. Some very old Lan Manager servers (circa 1994) report this error condition prematurely. If you are enumerating files from an old Lan Manager server and the FindNextFile function indicates that there are no more files, call the function a second time to confirm that there really are no more files.

    Perhaps it's just me, but I don't believe that workarounds for driver issues should become contractual. I would think that one of the goals of an operating system would be to smooth out these bumps and present a uniform programming model to applications. Applications have enough trouble dealing with their own bugs; you don't want them to have to deal with driver bugs, too.
