Holy cow, I wrote a book!
(This is the first in a series of short posts on where Microsoft
products got their names.)
The original name for the malware protection service was "mpsvc"
the "Microsoft Protection Service", but it was discovered later
that filename was already used by malware!
As a result, the name of the service had to be changed by sticking
an "ms" in front, making it "msmpsvc.exe".
Therefore, technically, its name is the "Microsoft Microsoft Protection Service".
(This is, of course, not to be confused with
"mpssvc.exe", which is, I guess, the
"Microsoft Protection Service Service".)
Fortunately, the Marketing folks can attempt to recover by
deciding that "msmpsvc" stands for
"Microsoft Malware Protection Service".
But you and I will know what it really stands for.
What's the deal with
the house in front of Microsoft's RedWest campus?
Here is my understanding.
It may be incomplete or even flat-out wrong.
The house belongs to a couple who was unwilling to sell their property
when Microsoft's real estate people were buying up the land
on which to build the RedWest campus.
(I'm told it was originally a chicken farm.)
Eventually, a deal was struck:
The couple would sell the property to Microsoft but retain
the right to live there until the end of their natural lives.
Furthermore, Microsoft would assume responsibility for maintaining
the lawn and landscaping.
When Microsoft needed to build an underground parking garage
beneath their property,
the house was put on a truck and carried across the street,
where it rested for the duration of the construction,
after which it was returned to its original location.
I imagine the couple was put up in a very nice hotel for
the duration of the construction.
(Heck, maybe they got a nice kitchen remodel out of the deal.)
And while I'm spreading rumors about the Microsoft RedWest
campus, here's another one:
If you pay a visit to the campus,
you will find a nature trail that leads through
the wetlands that adjoin the campus.
I was told that the wetlands preservation area was part of the
environmental impact mitigation plan that was necessary to obtain
approval for the construction.
The students at
the nearby school will occasionally take field trips there.
(I'm going to cover lighter issues for a while just to take a break
from the network interoperability topic that has raged for over a week.)
I'm sort of the ringleader of a group of friends
who go in together on
a block of tickets to the symphony.
I bought a pair of tickets in the block,
one for myself, and one for a rotating guest.
And for some reason, I had a hard time finding a guest for
last weekend's concert.
Of course, six of my friends have already been ruled out as guests
because they're already coming!
I asked a dozen other friends; they were all enthusiastic for
the opportunity but had to decline for one reason or another.
Such busy social calendars.
I did eventually find a taker for my ticket, and all the people
who couldn't make it can go next time.
I hadn't seen Mstislav Rostropovich conduct in a long time.
He's older now (duh) and appears to have lost some weight,
turning him into a somewhat more frail old man.
Being nearly eighty years old may also be a factor...
His musical stature, on the other hand, has not diminished in the least.
(And he still conducts with his mouth open.
Some things never change.)
After I read
the story behind the composition of the piece,
I found it even more impressive.
Shostakovich's First Symphony
was significantly harder to grasp—his language
has always eluded me—and it wasn't helped by the audience's
mistaking the grand pause near the end of the second
movement for its conclusion, or its laughter when the piece resumed.
(Maestro Rostropovich seemed kind of annoyed by that.)
The Prokofiev was wonderfully done, and
the normally expressionless
Assistant Principal Second Violin
got to show off some of his wit while acting as an interpreter
when Maestro Rostropovich introduced the encore.
The ovation was so resounding that the conductor had to take the
concertmaster off the stage with him to tell everybody, "The show's over."
(One of the people in our symphony group has a friend who performs in the
Cascade Symphony Orchestra, which Mr. Miropolsky conducts.
Apparently, when he gets a microphone in his hand, Mr. Miropolsky
is quite a funny guy.)
One suggestion for addressing the network compatibility problem
was returning an error code like ERROR_PLEASE_RESTART
which means "Um, the server ran into a problem. Please start over."
This is basically the same as the "do nothing" option,
because the server is already returning an error code specifically
for this problem, namely, STATUS_INVALID_LEVEL.
Now, sure, that error doesn't actually mean "Please try again."
It actually means
"Sorry, I can't do that."
This is the error code that is supposed to come back if you ask a server to go
into fast mode and it doesn't support fast mode.
But the effect from a coding standpoint is the same.
"If FindNextFile returns the error xyz,
then the server ran into a problem and you should start over."
Call xyz "ERROR_PLEASE_RESTART" or call it "STATUS_INVALID_LEVEL".
No matter what you pick,
the net effect is the same:
Existing code must be changed to specifically check for this
new error code and react accordingly.
Programs that aren't updated will behave strangely.
And that's the issue faced by today's topic:
When do you decide that a problem requires you to change the rules
of the game after it has ended?
Programs out there were written to one of many sets of rules.
Most of them were written to Windows XP's rules.
Some were written to Windows 2000's rules.
Even older programs may have been written to Windows 95's
or Windows 3.1's rules.
One aspect of backwards compatibility is accommodating programs
that broke the rules and got away with it.
But here, the issue is not fixing broken programs;
it's keeping correct programs correct.
If you introduce a new error code and assign it an unusual
meaning (i.e., something other than "fail the operation"),
then all programs written prior to the introduction of this
rule have suddenly become "wrong" through no fault of their own.
Depending on how "wrong" they are,
the severity of the problem can range from inconvenient to fatal.
In the Explorer case, the directory comes up wrong the first time
but fixes itself if you refresh.
But if a .NET object's enumerator suddenly threw a new exception,
you're probably going to see lots of programs crash with
unhandled exception failures.
At this point, the usual suspects come to the surface:
How will users get updated programs that conform to the new rules?
The original program's author may no longer be alive.
The source code may have been lost.
Or the knowledge necessary to understand the source code may have been lost.
("This program was written by an outside contractor five years ago.
We have the source code but nobody here can make heads or tails of it.")
Or the program's author may simply not consider updating that program
to Windows Vista to be a priority.
(After all, why bother updating version 1.0 of a program when
version 2.0 is available?)
Mind you, Microsoft does change the rules from time to time.
Pre-emptive multi-tasking changed many rules.
The new power management policies in Windows Vista
certainly changed the rules for a lot of programs.
But even when the rules change,
an effort is usually
made to continue emulating the old rules for old programs.
Because those programs are following a different set of rules,
and it's not nice to change the rules after the game has ended.
I'm probably the only person who uses the
"News for dummies" links in the navigation pane
on this page, and now I'm going to use them even more.
Swedish news for dummies
recently became available
in podcast form,
joining the German news for dummies,
which has been available
as a podcast
since the beginning of the year.
(Both the Swedes and the Germans have, of course, other
programs beyond the news for dummies.)
Of the two, I prefer the Swedish news for dummies.
The German news for dummies is still your standard news report,
which means I can hear every syllable clearly
without actually understanding anything,
since they still use all that fancy German vocabulary.
The Swedish news for dummies, on the other hand,
proceeds at a methodical (as opposed to mind-numbing) pace
and, more importantly, explains the news using simpler words,
words which I am much more likely to know.
(What's the lesson here? I dunno.
If you want to speak Swedish with me, talk like those Klartext folks.
If you want to speak German with me, do not talk
like those folks over at Deutsche Welle's Langsam gesprochene Nachrichten
because I won't understand you.
But your best choice is probably English, since your English will be
about twenty times better than my Swedish or German.)
One of the big complaints about Explorer we've received
from corporations is how often it accesses the network.
If the computer you're accessing is in the next room,
then accessing it a large number of times isn't too much
of a problem since you get the response back rather quickly.
But if the computer you're talking to is halfway around the
world, then even if you can communicate at the theoretical
maximum possible speed (namely, the speed of light),
it'll take 66 milliseconds for your request to reach the
other computer and another 66 milliseconds for the reply to come back.
In practice, the signal takes longer than that to make its round trip.
A latency of a half second is not unusual for global networks.
A latency of one to two seconds is typical for satellite networks.
Note that latency and bandwidth are independent metrics.
Bandwidth is how fast you can shovel data, measured
in data per unit time (e.g. bits per second);
latency is how long it takes the data to reach its destination,
measured in time (e.g. milliseconds).
Even though these global networks have very high bandwidth,
the high latency is what kills you.
(If you're a physicist, you're going to see the units "data per unit time"
and "time" and instinctively want to multiply them together to see
what the resulting "data" unit means.
Bandwidth times latency is known as the "pipe".
When doing data transfer, you want your transfer window to be
the size of your pipe.)
High latency means that you should try to issue as few I/O requests
as possible, although it's okay for each of those requests to be
rather large if your bandwidth is also high.
Significant work went into reducing
the number of I/O requests issued by Explorer during
common operations such as enumerating the contents of a folder.
Enumerating the contents of a folder in Explorer is more than
just getting the file names.
The file system shell folder
needs other file metadata such as the last-modification time
and the file size in order to build up its SHITEMID,
which is the unit of item identification in the shell namespace.
One of the other pieces of information that the shell needs is
the file's index, a 64-bit value that is different for each file on a volume.
Now, this information is not returned by the "slow" query.
As a result, the shell would have to perform three round-trip
operations (open the file, query its information, and close the handle)
to retrieve this extra information.
If you assume a 500ms network latency,
then these three additional operations add a second and a half
for each file in the directory.
If a directory has even just forty files, that's a whole minute
spent just obtaining the file indices.
(As we saw last time,
FindNextFile does its own internal batching to
avoid this problem when doing traditional file enumeration.)
And that's where this "fast mode" came from.
The "fast mode" query is another type of bulk query to the
server which returns all the normal FindNextFile
information as well as the file indices.
As a result, the file index information is piggybacked on top
of the existing FindNextFile-like query.
That's what makes it fast.
In "fast mode", enumerating 200 files from a directory would take
just a few seconds (two "bulk queries" that return the
FindNextFile information and the file indices at one go,
plus some overhead for establishing and closing the connection).
In "slow mode",
getting the normal FindNextFile information takes
a few seconds, but getting the file indices would add another 1.5 seconds
for each file, for an additional 1.5 × 200
= 300 seconds, or five minutes.
I think most people would agree that
reducing the time it takes to obtain the SHITEMIDs for
all the files in a directory
from five minutes to a few seconds is a big improvement.
That's why the shell is so anxious to use this new "fast mode" query.
If your program is going to be run by multinational corporations,
you have to take high-latency networks into account.
And this means bulking up.
Some people have accused me of intentionally being misleading
with the characterization of this bug.
Any misleading on my part was unintentional.
I didn't have all the facts when I wrote up that first article,
and even now I still don't have all the facts.
For example, did you know that FindNextFile uses bulk queries?
I didn't learn that until Tuesday night when I was investigating
an earlier comment—time I should have been spending planning
Wednesday night's dinner, mind you.
(Yes, I'm a slacker and don't plan my meals out a week at a time
like organized people do.)
Note that the exercise is still valuable as a thought experiment.
Suppose that FindNextFile didn't use bulk queries
and that the problem really did manifest itself only after the
101st round-trip query.
How would you fix it?
I should also point out that the bug in question is not my bug.
I just saw it in the bug database and thought it would be an interesting
springboard for discussion.
By now, I'm kind of sick of it and will probably not bother
checking back to see how things have settled out.
Saturday afternoon, my phone rings.
"Quick! We're on our way to the nursery. Do you want to come?"
I recognize the voice as one of my friends who recently bought a house
and presumably is doing some spring landscaping.
But I have to answer fast.
Time for a snap decision.
My friend seems surprised that I give my answer so quickly.
"Oh! Well then! Bye."
If you tell me I have to answer fast,
you shouldn't act all offended if I give a quick answer.
The Windows XP kernel
does not turn every call to
FindNextFile into a packet on the network.
Instead, the first time an application calls
FindFirstFile,
it issues a bulk query to the server and returns the first result
to the application.
Thereafter, when an application calls
FindNextFile, it returns the next result from the buffer.
If the buffer is empty, then
FindNextFile issues a new bulk query to re-fill the buffer.
This is a significant performance improvement when reading the entire
contents of large directories
because it reduces the number of round trips to the server.
We'll see next time that the gain can be quite significant on certain
types of servers.
But it also means that the suggestion of "Well, why not ask for 101 files
and see if you get an error" won't help any.
(Actually I think the magic number was really 128, not 100, but let's
keep calling it 100 since that's what I started with.)
The number 100 was not some magic value on the server.
That number was actually our own unwitting choice:
The bulk query asks for 100 files at a time!
If we changed the bulk query to ask for 101 files, then the
problem would just appear at the 102nd file.
A colleague who works over on WPF
has the first of what will be a series of articles on
USER and GDI compatibility in Windows Vista.
The changes to tighten security,
improve support for East Asian languages,
and take the desktop to a new level with the Desktop Window Manager
(among others) make for quite an interesting compatibility risk list.
And since I mentioned the DWM,
you would do well to check out Greg's blog; he
has been writing about the Desktop Window Manager,
how it works, how it fits into the rest of the system,
all that stuff.
I know some people have been posting comments asking
for information about the DWM.
You would be much better served asking Greg since he actually works
on it, whereas all I know about the DWM is how to spell it.
[Links fixed: 9am.]
Some people suggested,
as a solution to the network interoperability compatibility problem,
adding a flag to IShellFolder::EnumObjects to indicate
whether the caller wanted to use fast or slow enumeration.
Adding a flag to work around a driver bug doesn't actually solve anything
in the long term.
Considering all the video driver bugs that Windows has had to
work around in the past, if the decision had been made to surface
all those bugs and their workarounds to applications, then
functions like ExtTextOut would have several dozen
flags to control various optimizations that work on all drivers except the buggy ones.
A call to ExtTextOut would turn into something like this
(flag names invented for the sake of the example):
ExtTextOut(hdc, x, y, ETO_OPAQUE |
           ETO_NO_CLIPPING_WORKAROUND |
           ETO_NO_LONGSTRING_WORKAROUND |
           ETO_NO_DITHER_WORKAROUND,
           &rcOpaque, lpsz, cch, NULL);
where each of those strange flags is there to indicate that
you want to obtain the performance benefits enabled by each
of those flags because you know that you aren't running on
a version of the video driver that has the particular bug each
of those flags was created to protect against.
And then (still talking hypothetically)
with Windows Vista, you find that your program runs
slower than on Windows XP: Suppose a bug is found in a
video driver where strings longer than 1024 characters come out corrupted.
Windows Vista therefore contained code to break all strings up
into 1024-character chunks, but as an optimization you could
set a flag to tell GDI not to use this workaround.
Your Windows XP program doesn't use this flag,
so it now runs slower on Windows Vista.
You'll have to ship an update to your program just to get back
to where you were.
It's not limited to flags either.
By this philosophy of "Don't try to cover up for driver bugs
and just make applications deal with them", you would
have had the following strange paragraph in the FindNextFile documentation:
If the FindNextFile function returns FALSE
and sets the error code to ERROR_NO_MORE_FILES,
then there were no more matching files.
Some very old Lan Manager servers (circa 1994) report this error condition prematurely.
If you are enumerating files from an old Lan Manager server
and the FindNextFile function indicates that there are
no more files, call the function a second time to confirm that there
really are no more files.
Perhaps it's just me,
but I don't believe that
workarounds for driver issues should become contractual.
I would think that
one of the goals of an operating system would be to smooth out
these bumps and present a uniform programming model to applications.
Applications have enough trouble dealing with their own bugs;
you don't want them to have to deal with driver bugs, too.