Holy cow, I wrote a book!
If you are running Windows Server 2003,
you owe it to yourself to
enable the Volume Shadow Copy service.
What this service does is periodically
(according to a schedule you set)
capture a snapshot of the files you specify
so they can be recovered later.
The copies are lazy: If a file doesn't change
between snapshots, a new copy isn't made.
Up to 64 versions of a file can be recorded in the snapshot database.
Bear this in mind when setting your snapshot schedule.
If you take a snapshot twice a day, you're good for a month,
but if you take a snapshot every minute,
you get only an hour's worth of snapshots.
You are trading off snapshot frequency against how far back
your snapshot history reaches.
Although I can count on one hand the number of times
the Volume Shadow Copy service has saved my bacon,
each time I needed it, it saved me at least a day's work.
Typically, it's because I wasn't paying attention and deleted
the wrong file.
Once it was because I made some changes to a file and ended up
making a bigger mess of things
and would have been better off just returning to the version I had
the previous day.
I just click on "View previous versions of this folder" in the
Tasks Pane, pick the snapshot from yesterday, and drag yesterday's
version of the file to my desktop.
Then I can take that file and compare it to the version I have now
and reconcile the changes.
In the case of a deleted file,
I just click the "Restore" button and back to life it comes.
(Be careful about using "Restore" for a file that still exists,
however, because that will overwrite the current
version with the snapshot version.)
One tricky bit about viewing snapshots is that it works only on
network shares.
If you want to restore a file from a local hard drive,
you'll need to either connect to the drive from another computer
or (what I do) create a loopback connection and restore it
via the loopback.
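If you haven't seen the trick before, a loopback connection is just a
network connection back to your own machine. A minimal sketch, assuming
the default C$ administrative share is available and X: is free:

net use X: \\%COMPUTERNAME%\C$

You can then browse drive X: and use "View previous versions of this
folder" as if it were any other remote share.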
Note that the Volume Shadow Copy service is not a replacement for backups.
The shadow copies are kept on the drive itself, so if you lose the
drive, you lose the shadow copies too.
Given the ability of the Volume Shadow Copy service to go back in time
and recover previous versions of a file,
you're probably not surprised that the code name for the feature
was "Timewarp".
John, a colleague in security, points out that
shadow copies provide a curious backdoor to the quota system.
Although you have access to shadow copies of your files, they
do not count against your quota.
Counting them against your quota would be unfair since it is
the system that created these files, not you.
(Of course, this isn't a very useful way to circumvent quota,
because the system will also delete shadow copies whenever it
feels the urge.)
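If you prefer the command line to the configuration dialog, the
vssadmin tool that comes with Windows Server 2003 can enumerate and
create snapshots. A quick sketch, run from an administrator account:

vssadmin list shadows /for=C:
vssadmin create shadow /for=C: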
A few months ago, the usability research team summarized some
statistics they had been collecting on the subject of what
people spend most of their time doing on the computer at home.
Not surprisingly, surfing the Internet was number one.
Number two was playing games.
In particular, I found it notable that the number
one game is no longer Klondike Solitaire
(known to most Windows users as just plain "Solitaire").
That title now belongs to Spider Solitaire.
The top three games (Spider Solitaire,
Klondike Solitaire, and Freecell) together
account for more than half of all game-playing time.
Personally, I'm a Freecell player.
Why aren't games like
Unreal Tournament or
The Sims in the top three?
Accuracy is how close you are to the correct answer;
precision is how much resolution you have for that answer.
Suppose you ask me, "What time is it?"
I look up at the sun, consider for a moment, and reply,
"It is 10:35am and 22.131 seconds."
I gave you a very precise answer, but not a very accurate one.
Meanwhile, you look at your watch, one of those fashionable watches
with notches only at 3, 6, 9 and 12. You furrow your brow briefly
and decide, "It is around 10:05."
Your answer is more accurate than mine, though less precise.
Now let's apply that distinction to
some of the time-related functions in Windows.
The GetTickCount function
has a precision of one millisecond, but its accuracy is typically
much worse, dependent on your timer tick rate, typically 10ms to 55ms.
The GetSystemTimeAsFileTime function
looks even more impressive with its 100-nanosecond precision,
but its accuracy is not necessarily any better than that of GetTickCount.
If you're looking for high accuracy, then you'd be better off
playing around with
the QueryPerformanceCounter function.
You have to make some tradeoffs, however.
For one, the precision of the result is variable; you need to call
the QueryPerformanceFrequency function
to see what the precision is.
Another tradeoff is that the higher accuracy
of QueryPerformanceCounter can be slower to obtain.
What QueryPerformanceCounter actually does is up to the HAL
(with some help from ACPI).
The performance folks tell me that,
in the worst case, you might get it from
the rollover interrupt on the programmable interval timer.
This in turn may require a PCI transaction, which is not exactly
the fastest thing in the world.
It's better than GetTickCount, but it's not going
to win any speed contests.
In the best case, the HAL may conclude that the RDTSC counter runs
at a constant frequency, so it uses that instead.
Things are particularly exciting on multiprocessor machines, where
you also have to make sure that the values returned from RDTSC on
each processor are consistent with each other!
And then, for good measure,
throw in a handful of workarounds for known buggy hardware.
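To make the precision story concrete, here's a minimal sketch of timing
an operation with QueryPerformanceCounter, using
QueryPerformanceFrequency to convert ticks to milliseconds. (The Sleep
call is just a stand-in for whatever work you want to measure.)

#include <windows.h>
#include <stdio.h>

int main()
{
    LARGE_INTEGER freq, start, end;
    QueryPerformanceFrequency(&freq); // ticks per second; varies by machine

    QueryPerformanceCounter(&start);
    Sleep(100);                       // stand-in for the operation being timed
    QueryPerformanceCounter(&end);

    double ms = (end.QuadPart - start.QuadPart) * 1000.0 / freq.QuadPart;
    printf("elapsed: %.3f ms (counter frequency: %I64d Hz)\n",
           ms, freq.QuadPart);
    return 0;
}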
For functions that return data,
the contents of the output buffer are typically undefined
if the function fails.
Callers should assume nothing about the contents.
But that doesn't stop them from assuming it anyway.
I was reminded of this topic after reading
Michael Kaplan's story of one customer who wanted the output buffer
contents to be defined even on failure.
The reason the buffer is left untouched is because many
programs assume that the buffer is unchanged on failure,
even though there is no documentation supporting this behavior.
Here's one example of code I've seen (reconstructed) that relies
on the output buffer being left unchanged:
HKEY hk = hkFallback;
RegOpenKeyEx(hkRoot, pszBetterKey, 0, KEY_READ, &hk);
... use hk ...
if (hk != hkFallback) RegCloseKey(hk);
This code fragment starts out with a fallback key then tries
to open a "better" key,
assuming that if the open fails,
the contents of the hk variable will be left unchanged
and therefore will continue to have the original fallback value.
This behavior is not guaranteed by the specification for
the RegOpenKeyEx function, but that doesn't stop people
from relying on it anyway.
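The supported way to write that fragment, for what it's worth, is to
check the return value before trusting the output parameter. A sketch,
reusing the placeholder names from the fragment above:

HKEY hk;
if (RegOpenKeyEx(hkRoot, pszBetterKey, 0, KEY_READ, &hk) != ERROR_SUCCESS) {
    hk = hkFallback; // the open failed; fall back explicitly
}
... use hk ...
if (hk != hkFallback) RegCloseKey(hk);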
Here's another example
from actual shipping code.
Observe that the CRegistry::Restore method is documented
as "If the specified key does not exist, the value of 'Value' is unchanged."
(Let's ignore for now that the documentation uses registry
terminology incorrectly; the parameter specified is a value name,
not a key name.)
If you look at what the code actually does,
it loads the buffer with the original value of "Value",
then calls the RegQueryValueEx function twice
and ignores the return value both times!
The real work happens in the CRegistry::RestoreDWORD method.
At the first call, observe that it initializes
the type variable, then calls
the RegQueryValueEx function and assumes that
it does not modify the
&type parameter on failure.
Next, it calls
the RegQueryValueEx function a second time,
this time assuming that the output buffer
&Value remains unchanged in the event of failure,
because that's what CRegistry::Restore expects.
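I can't reproduce the shipping code here, but the pattern being
described looks something like this sketch, where hk, pszValueName,
and Value are my stand-ins for the method's actual variables:

DWORD type = REG_DWORD;  // initialized up front...
DWORD cb = sizeof(DWORD);
// ...and assumed to survive a failed call unmodified:
RegQueryValueEx(hk, pszValueName, NULL, &type, NULL, &cb);
// second call assumes &Value is untouched on failure:
RegQueryValueEx(hk, pszValueName, NULL, &type, (LPBYTE)&Value, &cb);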
I don't mean to pick on that code sample.
It was merely a convenient example
of the sorts of abuses that Win32 needs to sustain
on a regular basis for the sake of compatibility.
Because, after all, people buy computers in order to
run programs on them.
One significant exception to the "output buffers are undefined on failure"
rule is output buffers returned by COM interface methods.
COM rules are that output buffers are always initialized, even on failure.
This is necessary to ensure that the marshaller doesn't crash.
For example, the last parameter to the IUnknown::QueryInterface method
must be set to NULL on failure.
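A typical QueryInterface implementation therefore makes a point of
nulling out the output pointer on its failure path. A minimal sketch,
in which CWidget and IWidget are hypothetical:

HRESULT CWidget::QueryInterface(REFIID riid, void **ppv)
{
    if (riid == IID_IUnknown || riid == IID_IWidget) {
        *ppv = static_cast<IWidget*>(this);
        AddRef();
        return S_OK;
    }
    *ppv = NULL; // defined even on failure, so the marshaller stays happy
    return E_NOINTERFACE;
}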
A few years ago,
American RadioWorks ran
a story on
the consequences to New Orleans of a Category 5 hurricane
[NPR part 1]
[NPR part 2].
I had been hoping that the city would escape
the worst-case scenario of the water
topping the levees and submerging the city in twenty feet of water,
but recent events appear to have taken us one step closer...
As you probably know, I'm fascinated by language,
particularly the slang terms of various professions,
such as the rich acronym soup of the emergency medical field
(my sick favorite being "CTD").
In the hurricane story, we hear the director of emergency management
use an acronym which stands for "Kiss your..."
On more than one occasion, I've seen someone ask a question like this:
I have some procedure that generates strings dynamically,
and I want a formula that takes a string and produces a
small unique identifier for that string (a hash code),
such that two identical strings have the same identifier,
and that if two strings are different, then they will have different
identifiers.
I tried String.GetHashCode(), but there were occasional collisions.
Is there a way to generate a hash code that guarantees uniqueness?
If you can restrict the domain of the
strings you're hashing, you can sometimes squeak out uniqueness.
For example, if the domain is a finite set, you can develop
a perfect hash function, which guarantees no collisions among
the domain strings.
In the general case, where you do not know what strings you are
going to be hashing, this is not possible.
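As a toy illustration (my example, not the questioner's): if the only
strings you'll ever hash are "red", "green", and "blue", then string
length happens to be a perfect hash, since the three lengths are
distinct.

#include <string>

// A toy perfect hash for the fixed domain {"red", "green", "blue"}:
// the lengths 3, 5, and 4 are all distinct, so no two domain strings
// collide. Feed it a string outside the domain and all bets are off.
unsigned PerfectHash(const std::string& s)
{
    return static_cast<unsigned>(s.length());
}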
Suppose your hash code is a 32-bit value.
This means that there are 2³² possible hash values.
But there are more than 2³² possible strings.
Therefore, by the pigeonhole principle, there must exist at least
two strings with the same hash code.
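You don't even have to go hunting for pathological inputs. With a
32-bit hash, the birthday paradox says you can expect a collision after
hashing only around 2¹⁶ (roughly 65,000) random strings. Here's a quick
sketch using 32-bit FNV-1a, a hash I picked purely for illustration;
the same thing happens with any 32-bit hash:

#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

// 32-bit FNV-1a, standing in for any 32-bit string hash.
uint32_t Fnv1a(const std::string& s)
{
    uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

int main()
{
    std::unordered_map<uint32_t, std::string> seen;
    for (unsigned i = 0; ; i++) {
        std::string s = "string" + std::to_string(i);
        uint32_t h = Fnv1a(s);
        auto it = seen.find(h);
        if (it != seen.end()) {
            printf("\"%s\" and \"%s\" both hash to %08X\n",
                   it->second.c_str(), s.c_str(), (unsigned)h);
            return 0; // typically trips after tens of thousands of strings
        }
        seen.emplace(h, s);
    }
}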
One little-known fact about
the pigeonhole principle
is that it has nothing to do with pigeons.
The term "pigeonhole" refers to
a small compartment in a desk
into which items such as papers or letters are distributed.
(Hence the verb "to pigeonhole": To assign a category, often based on
superficial characteristics.)
The pigeonhole principle, then, refers to the process of sorting
papers into pigeonholes, and not the nesting habits of
members of the family Columbidae.
Only a Game
recently covered the rise of dodgeball.
At MSN's tenth birthday party last week,
there was a wide variety of entertainment options,
the highlight of which appeared to be an organized dodgeball tournament.
It was very well attended
and didn't have the ego-damaging overtones
you got from elementary school.
A good time was had by all.
The Senior Vice President of MSN
happens also to have been the
development manager of Windows 95,
so he made the generous gesture of inviting the members
of the Windows 95 team to his group's birthday party.
(Since the remaining members of the
Windows 95 team are outnumbered forty-to-one by the current members
of the MSN team, it gave the impression that Windows 95
was merely an afterthought to MSN!
"Ten years ago, MSN 1.0 went live!
And if I recall correctly,
some little operating system rode our coattails.")
Most people probably haven't noticed this,
but there was a change to the requirements for file type handlers
that arrived with Windows XP SP 2:
Paths to programs now must be fully-qualified
if they reside in a directory outside of
the Windows directory and the System directory.
The reason for this is security, with a touch of predictability.
Security, because one of the places that
the SearchPath function
searches is the current directory,
and it searches the current directory before searching standard
system directories or the PATH.
This means that somebody can attack you by creating a file named,
say, "Super secret information.txt" and
placing a hidden NOTEPAD.EXE file in the same directory.
The victim says, "Oh wow, look, super secret information, let me
see what it is," and when they double-click it, the trojan
NOTEPAD.EXE is run instead of the one in the Windows directory.
Requiring paths to be fully-qualified
removes the current directory attack.
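You can see the search order for yourself with a quick call to the
SearchPath function. Run this sketch from a directory containing a
rogue NOTEPAD.EXE and watch which copy wins (I've used the ANSI
variant for brevity):

#include <windows.h>
#include <stdio.h>

int main()
{
    char path[MAX_PATH];
    // A NULL search path means "use the default search order",
    // which consults the current directory before the Windows
    // and System directories.
    if (SearchPathA(NULL, "NOTEPAD.EXE", NULL, MAX_PATH, path, NULL)) {
        printf("Would have run: %s\n", path);
    }
    return 0;
}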
Predictability, because the contents of the PATH environment variable
can vary from process to process.
Consequently, the relative path could resolve to different programs
depending on who is asking.
This in turn results in having to troubleshoot problems like
"It works when I double-click it from Explorer, but not
if I run it from a batch file."
Not to mention trying to figure out what program the user actually ran,
or whether he got it out of the back of a pick-up truck.
Last year, we learned that
the ANSI code page isn't actually ANSI.
Indeed, the OEM code page isn't actually OEM either.
Back in the days of MS-DOS, there was only one code page,
namely, the code page that was provided by the
original equipment manufacturer
in the form of glyphs embedded in the character generator
on the video card.
When Windows came along,
the so-called ANSI code page was introduced
and the name "OEM" was used to refer to the MS-DOS code page.
Michael Kaplan went into more detail earlier this year
on the ANSI/OEM split.
Over the years, Windows has relied less and less on the character
generator embedded in the video card, to the point where the
term "OEM character set" no longer has anything to do with the
original equipment manufacturer.
It is just a convenient term to refer to "the character set used
by MS-DOS and console programs."
Indeed, if you take a machine running US-English Windows
(OEM code page 437) and install,
say, Japanese Windows,
then when you boot into Japanese Windows,
you'll find that you now have
an OEM code page of 932.
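You can ask the system which code pages are in effect via the GetACP
and GetOEMCP functions. On US-English Windows, this little sketch
prints 1252 and 437; on that Japanese installation, it would print
932 for both.

#include <windows.h>
#include <stdio.h>

int main()
{
    // GetACP reports the "ANSI" code page; GetOEMCP reports the
    // "OEM" code page used by MS-DOS and console programs.
    printf("ANSI code page: %u\n", GetACP());
    printf("OEM  code page: %u\n", GetOEMCP());
    return 0;
}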