September 2005

  • The Old New Thing

    Why is there no all-encompassing superset version of Windows?

    • 55 Comments

    Sometimes, I am asked why there is no single version of Windows that contains everything. Instead, as you move up the ladder, say, from Windows XP Professional to Windows Server 2003, you gain server features and lose workstation features. Why lose features when you add others?

    Because it turns out no actual customer wants to keep the workstation features on their servers. Only developers want to have this "all-encompassing" version of Windows, and making it available to them would result in developers testing their programs on a version of Windows no actual customer owns.

    I think one of my colleagues who works in security support explained it best:

    When customers ask why their server has Internet Explorer, NetMeeting, Media Player, Games, Instant Messenger, etc., installed by default, it's hard for the support folks to come up with a good answer. Many customers view each additional installed component as additional risk, and want their servers to have the least possible amount of stuff installed.

    If you're the CIO of a bank, the thought that your servers are capable of playing Quake must give you the heebie-jeebies.

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    Reading the output of a command from batch

    • 48 Comments

    The FOR command has become the batch language's looping construct. If you ask for help via FOR /?, you can see all the ways it has become overloaded. For example, you can read the output of a command by using the FOR command.

    FOR /F "tokens=*" %i IN ('ver') DO echo %i
    

    The /F switch in conjunction with the single quotation marks indicates that the quoted string is a command to run, whose output is then to be parsed and returned in the specified variable (or variables). The option "tokens=*" says that the entire line should be collected. There are several other options that control the parsing, which I leave you to read on your own. (Note: If you put this command in a batch file, you need to double the percent signs: %%i instead of %i.)

    The kludgy batch language gets even kludgier. Why is the batch language such a grammatical mess? Backwards compatibility.

    Any change to the batch language cannot break compatibility with the millions of batch programs already in existence. Such batch files are burned onto millions of CDs (you'd be surprised how many commercial programs use batch files, particularly as part of their installation process). They're also run by corporations around the world to get their day-to-day work done. And then there are the batch files written by people like you and me to do a wide variety of things. Any change to the batch language must keep these batch files running.

    Of course, one could invent a brand new batch language, let's call it Batch² for the sake of discussion, and thereby be rid of the backwards compatibility constraints. But with that decision come different obstacles.

    Suppose you have a 500-line batch file and you want to add one little feature to it, but that new feature is available only in Batch². Does this mean that you have to do a complete rewrite of your batch program into Batch²? Your company has spent years tweaking this batch file. (And by "tweaking" I might mean "turning into a plate of spaghetti".) Do you want to take the risk of introducing who-knows-how-many bugs and breaking various obscure features as part of the rewrite into Batch²?

    Suppose you decide to bite the bullet and rewrite. Oh, but Batch² is available only in more recent versions of Windows. Do you tell your customers, "We don't support the older versions of Windows any more"? Or do you bite another bullet and say, "We support only versions of Windows that have Batch²"?

    I'm not saying that it won't happen. (In fact, I'm under the impression that there are already efforts to design a new command console language with an entirely new grammar. Said effort might even be presenting at the PDC in a few days.) I'm just explaining why the classic batch language is such a mess. Welcome to evolution.

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    Things to do at Microsoft when the power goes out

    • 46 Comments

    When the power goes out, the first thing you notice is how quiet everything becomes. The hum of the computers in the building stops. You hear... nothing.

    Bask in its peaceful silence.

    The next thing you do is turn off all the machines in your office, because when the power eventually returns, you don't want to stress the power grid and the network with a hundred thousand computers all firing themselves up and joining the network at the same time.

    Of course, another thing you need to do is find your way around. This can be quite a challenge if you're in a lab with no windows and no emergency lighting: It suddenly becomes pitch black! Laptop computers prove useful at this point. Fire up notepad and maximize it, resulting in an all-white screen. Use that screen as a flashlight to navigate through the lab turning off computers and eventually leading yourself out of the lab to daylight.

    Next time, a story of an employee-induced power outage.

  • The Old New Thing

    Precision is not the same as accuracy

    • 44 Comments

    Accuracy is how close you are to the correct answer; precision is how much resolution you have for that answer.

    Suppose you ask me, "What time is it?"

    I look up at the sun, consider for a moment, and reply, "It is 10:35am and 22.131 seconds."

    I gave you a very precise answer, but not a very accurate one.

    Meanwhile, you look at your watch, one of those fashionable watches with notches only at 3, 6, 9 and 12. You furrow your brow briefly and decide, "It is around 10:05." Your answer is more accurate than mine, though less precise.

    Now let's apply that distinction to some of the time-related functions in Windows.

    The GetTickCount function has a precision of one millisecond, but its accuracy is typically much worse, dependent on your timer tick rate, typically 10ms to 55ms. The GetSystemTimeAsFileTime function looks even more impressive with its 100-nanosecond precision, but its accuracy is not necessarily any better than that of GetTickCount.
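
    You can watch this for yourself with a little test program (a sketch of mine, not code from the post): it spins until GetSystemTimeAsFileTime reports a new value and prints how far the clock jumped. Although the units are 100 nanoseconds, the jumps typically come a whole timer tick at a time.

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        FILETIME ft;
        GetSystemTimeAsFileTime(&ft);
        ULONGLONG prev = ((ULONGLONG)ft.dwHighDateTime << 32) | ft.dwLowDateTime;

        for (int changes = 0; changes < 5; ) {
            GetSystemTimeAsFileTime(&ft);
            ULONGLONG now = ((ULONGLONG)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
            if (now != prev) {
                // The difference is in 100ns units; 10000 of them make 1ms.
                printf("clock advanced %I64u ms\n", (now - prev) / 10000);
                prev = now;
                changes++;
            }
        }
        return 0;
    }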

    If you're looking for high accuracy, then you'd be better off playing around with the QueryPerformanceCounter function. You have to make some tradeoffs, however. For one, the precision of the result is variable; you need to call the QueryPerformanceFrequency function to see what the precision is. Another tradeoff is that QueryPerformanceCounter's higher accuracy comes at a cost: the call itself can be slower.
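
    For illustration, here's a minimal sketch of the usual pattern (my example, not code from the post): ask QueryPerformanceFrequency for the counter's rate, take a reading on either side of the operation you want to time, and divide.

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        LARGE_INTEGER freq, start, end;

        // The counter's rate varies from machine to machine, so ask for it
        // rather than assuming one. Zero means no high-resolution counter.
        if (!QueryPerformanceFrequency(&freq)) {
            printf("high-resolution counter not available\n");
            return 1;
        }

        QueryPerformanceCounter(&start);
        Sleep(100); // stand-in for the operation being timed
        QueryPerformanceCounter(&end);

        // Elapsed ticks divided by ticks-per-second gives seconds.
        double elapsed = (double)(end.QuadPart - start.QuadPart)
                       / (double)freq.QuadPart;
        printf("counter rate: %I64d ticks per second\n", freq.QuadPart);
        printf("elapsed: %.6f seconds\n", elapsed);
        return 0;
    }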

    What QueryPerformanceCounter actually does is up to the HAL (with some help from ACPI). The performance folks tell me that, in the worst case, you might get it from the rollover interrupt on the programmable interrupt timer. This in turn may require a PCI transaction, which is not exactly the fastest thing in the world. It's better than GetTickCount, but it's not going to win any speed contests. In the best case, the HAL may conclude that the RDTSC counter runs at a constant frequency, so it uses that instead. Things are particularly exciting on multiprocessor machines, where you also have to make sure that the values returned from RDTSC on each processor are consistent with each other! And then, for good measure, throw in a handful of workarounds for known buggy hardware.

  • The Old New Thing

    More undocumented behavior and the people who rely on it: Output buffers

    • 42 Comments

    For functions that return data, the contents of the output buffer if the function fails are typically left unspecified. If the function fails, callers should assume nothing about the contents.

    But that doesn't stop them from assuming it anyway.

    I was reminded of this topic after reading Michael Kaplan's story of one customer who wanted the output buffer contents to be defined even on failure. The reason the buffer is left untouched is that many programs assume the buffer is unchanged on failure, even though there is no documentation supporting this behavior.

    Here's one example of code I've seen (reconstructed) that relies on the output buffer being left unchanged:

    HKEY hk = hkFallback;
    RegOpenKeyEx(..., &hk);
    RegQueryValue(hk, ...);
    if (hk != hkFallback) RegCloseKey(hk);
    

    This code fragment starts out with a fallback key then tries to open a "better" key, assuming that if the open fails, the contents of the hk variable will be left unchanged and therefore will continue to have the original fallback value. This behavior is not guaranteed by the specification for the RegOpenKeyEx function, but that doesn't stop people from relying on it anyway.
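
    For contrast, here is a sketch of the defensive version (the key path and access mask are placeholders of mine, not part of the original fragment): consult the return value instead of the output parameter.

    #include <windows.h>

    void QueryWithFallback(HKEY hkFallback)
    {
        HKEY hk = hkFallback;
        HKEY hkBetter;

        // Adopt the "better" key only if the open actually succeeded;
        // on failure, hkBetter is assumed to be garbage.
        if (RegOpenKeyEx(HKEY_CURRENT_USER, TEXT("Software\\Example"), 0,
                         KEY_QUERY_VALUE, &hkBetter) == ERROR_SUCCESS) {
            hk = hkBetter;
        }

        // ... query values from hk as before ...

        if (hk != hkFallback) RegCloseKey(hk);
    }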

    Here's another example from actual shipping code. Observe that the CRegistry::Restore method is documented as "If the specified key does not exist, the value of 'Value' is unchanged." (Let's ignore for now that the documentation uses registry terminology incorrectly; the parameter specified is a value name, not a key name.) If you look at what the code actually does, it loads the buffer with the original value of "Value", then calls the RegQueryValueEx function twice and ignores the return value both times! The real work happens in the CRegistry::RestoreDWORD function. At the first call, observe that it initializes the type variable, then calls the RegQueryValueEx function and assumes that it does not modify the &type parameter on failure. Next, it calls the RegQueryValueEx function a second time, this time assuming that the output buffer &Value remains unchanged in the event of failure, because that's what CRegistry::Restore expects.

    I don't mean to pick on that code sample. It was merely a convenient example of the sorts of abuses that Win32 needs to sustain on a regular basis for the sake of compatibility. Because, after all, people buy computers in order to run programs on them.

    One significant exception to the "output buffers are undefined on failure" rule is output buffers returned by COM interface methods. COM rules are that output buffers are always initialized, even on failure. This is necessary to ensure that the marshaller doesn't crash. For example, the last parameter to the IUnknown::QueryInterface method must be set to NULL on failure.
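
    As a sketch of that rule in action (CWidget and IID_IWidget are hypothetical names, not from the post), a typical QueryInterface implementation takes care to null out the output pointer on the failure path:

    #include <unknwn.h>

    HRESULT CWidget::QueryInterface(REFIID riid, void **ppv)
    {
        if (riid == IID_IUnknown) {
            *ppv = static_cast<IUnknown*>(this);
        } else if (riid == IID_IWidget) {
            *ppv = static_cast<IWidget*>(this);
        } else {
            // Required on failure, so the marshaller never sees garbage.
            *ppv = NULL;
            return E_NOINTERFACE;
        }
        AddRef();
        return S_OK;
    }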

  • The Old New Thing

    The double-Ctrl+Alt+Del feature is really a kludge

    • 39 Comments

    Most people who care about such things know that you can press Ctrl+Alt+Del twice from the Welcome screen and sometimes you will get a classic logon dialog. (Note: "Sometimes". It works only if the last operation was a restart or log-off, for complicated reasons that are irrelevant to this discussion.)

    The ability to do the double-Ctrl+Alt+Del was added as a fallback, just in case there turned out to be some important logon scenario that the new Welcome screen failed to cover simply because the designers had overlooked it. Scenarios such as smartcard or fingerprint logon.

    In other words, it's a kludge.

    In the time since Windows XP came out, the logon folks have kept an eye out to see if there indeed were any scenarios that weren't covered by the Welcome screen. I think the only one that came up was Kerberos authentication.

    Now that (once they fix the Kerberos problem) they have covered all the bases, the designers are probably going to feel more confident about the new logon design, and the double-Ctrl+Alt+Del panic button will likely be removed.

    So don't get too attached to it.

    This is why the Welcome screen shows the Administrator account if there are no other members of the Administrators group on the system: If it didn't show the Administrator account, you would be locked out of your own computer.

    "No I'm not. I can use the double-Ctrl+Alt+Del trick to log on as the Administrator."

    Well, okay, that works today, but you're relying on a panic button that might not be there tomorrow.

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    But I have Visual Basic Professional

    • 39 Comments

    Back in 1995, I was participating in a chat room on MSN on the subject of device driver development. One of the people in the chat room asked, "Can I write a device driver in Visual Basic?"

    I replied, "Windows 95 device drivers are typically written in low-level languages such as C or even assembly language."

    Undaunted, the person clarified: "But I have Visual Basic Professional."

  • The Old New Thing

    Spider Solitaire unseats the reigning champion

    • 38 Comments

    A few months ago, the usability research team summarized some statistics they had been collecting on the subject of what people spend most of their time doing on the computer at home. Not surprisingly, surfing the Internet was number one. Number two was playing games, and in particular, I found it notable that the number one game is no longer Klondike Solitaire (known to most Windows users as just plain "Solitaire").

    That title now belongs to Spider Solitaire. The top three games (Spider Solitaire, Klondike Solitaire, and Freecell) together account for more than half of all game-playing time.

    Personally, I'm a Freecell player.

    Exercise: Why aren't games like Unreal Tournament or The Sims in the top three?

  • The Old New Thing

    Windows Server 2003 can take you back in time

    • 34 Comments

    If you are running Windows Server 2003, you owe it to yourself to enable the Volume Shadow Copy service. What this service does is periodically (according to a schedule you set) capture a snapshot of the files you specify so they can be recovered later. The copies are lazy: If a file doesn't change between snapshots, a new copy isn't made. Up to 64 versions of a file can be recorded in the snapshot database. Bear this in mind when setting your snapshot schedule. If you take a snapshot twice a day, you're good for a month, but if you take a snapshot every minute, you get only an hour's worth of snapshots. You are trading off snapshot frequency against how far back your snapshot history reaches.

    Although I can count on one hand the number of times the Volume Shadow Copy service has saved my bacon, each time I needed it, it saved me at least a day's work. Typically, it's because I wasn't paying attention and deleted the wrong file. Once it was because I made some changes to a file, ended up making a bigger mess of things, and would have been better off just returning to the version I had the previous day.

    I just click on "View previous versions of this folder" in the Tasks Pane, pick the snapshot from yesterday, and drag yesterday's version of the file to my desktop. Then I can take that file and compare it to the version I have now and reconcile the changes. In the case of a deleted file, I just click the "Restore" button and back to life it comes. (Be careful about using "Restore" for a file that still exists, however, because that will overwrite the current version with the snapshot version.)

    One tricky bit about viewing snapshots is that it works only on network drives. If you want to restore a file from a local hard drive, you'll need to either connect to the drive from another computer or (what I do) create a loopback connection and restore it via the loopback.
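
    If you'd rather script the recovery, snapshots exposed over the network can also be addressed directly by pathname using an @GMT-formatted component. Here's a hedged sketch; the server, share, timestamp, and file names are all made up, and the timestamps that actually work are the ones listed in the previous-versions UI.

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        // Copy yesterday's snapshot of the file next to the live copy.
        if (CopyFile(
                TEXT("\\\\server\\share\\@GMT-2005.09.27-12.00.00\\report.doc"),
                TEXT("C:\\recovered\\report-yesterday.doc"),
                FALSE)) { // FALSE = okay to overwrite an existing destination
            printf("recovered\n");
        } else {
            printf("copy failed, error %lu\n", GetLastError());
        }
        return 0;
    }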

    Note that the Volume Shadow Copy service is not a replacement for backups. The shadow copies are kept on the drive itself, so if you lose the drive, you lose the shadow copies too.

    Given the ability of the Volume Shadow Copy service to go back in time and recover previous versions of a file, you're probably not surprised that the code name for the feature was "Timewarp".

    John, a colleague in security, points out that shadow copies provide a curious backdoor to the quota system. Although you have access to shadow copies of your file, they do not count against your quota. Counting them against your quota would be unfair since it is the system that created these files, not you. (Of course, this isn't a very useful way to circumvent quota, because the system will also delete shadow copies whenever it feels the urge.)

  • The Old New Thing

    Why doesn't Microsoft give every employee a UPS?

    • 30 Comments

    One reaction to my story about the oldest computer at Microsoft still doing useful work was shock (shock!) that Microsoft suffers from power outages.

    In the Pacific Northwest, winter windstorms are quite common, and it is not unexpected for a windstorm to blow down tall trees (which are also quite common), which in turn take out power lines. And if those power lines supply the Microsoft main campus, then main campus loses power.

    All the critical computers have UPSs so that they can make a soft landing when the power goes out, but it's hardly the case that every single computer in every office and lab gets a UPS. That would be prohibitively expensive and wouldn't accomplish much anyway. Sure, each of the five computers in your office might stay alive for another fifteen minutes, but this assumes that you're actually in your office to shut them down cleanly when the power goes out. And if your machine is stopped in the debugger, no amount of automated shutdown software will help: a machine frozen in the debugger cannot shut itself down.

    In other words, the cost-benefit of giving every employee a UPS for each machine in their office simply doesn't pan out.

    In the last few days of 1999, the main Windows development building was prepared for a wholesale catastrophe. Generator trucks were brought in so that the entire building could be kept up and running should the power fail as part of a worldwide Year 2000 meltdown. Those trucks were huge and no doubt extremely expensive.

    And thankfully were never needed.

    Those who were in Los Angeles last week for the PDC might be amused to learn that the PDC technical staff, fearing a repeat of Monday's blackout, rented a generator truck to provide emergency backup power for all the machines on stage for Bill Gates' and Jim Allchin's keynote addresses. The power may go out in Los Angeles, but the PDC keynote must go on!

    More musings about power outages next time.
