Holy cow, I wrote a book!
I've just been informed
that the bonus chapters from the book are now
available for download.
Click on "Sample Chapters".
Sorry they're late.
The source code for the programs in the book can be downloaded
from the "Source Code" link.
And on a more embarrassing note,
there's that "Errata" link, too.
I promised to talk more about NMI, so here it is.
What generates an NMI? What does it mean?
The first question is easy to answer but doesn't actually shed much light:
Any device can pull the NMI line, and that will generate a non-maskable interrupt.
Back in the Windows 95 days, a few really cool people had taken
the ball-point pen trick one step further:
They had a special expansion card in their computer with a cord coming
out the back.
At the end of the cord was a momentary switch
like the one you might see on a quiz show.
If you pressed it, the card generated an NMI.
No fumbling around with ball-point pens for these folks, no-ho!
(To be honest, I had two of these.
One of them was a simple NMI card,
triggered by a foot pedal!
The other was really a card with a high-resolution real-time clock
that could be used for performance analysis.
I used the NMI button far more often than the timer...)
In practice, the only device that generates an NMI (on purpose)
is the memory controller,
which raises it when a
parity error is detected.
The non-geek explanation of a parity error:
Your memory chips are acting flakey.
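For the geeks, here is a minimal sketch of what a parity check amounts to. It assumes one parity bit stored alongside each byte and picks even parity arbitrarily; the function names are made up for illustration, and a real memory controller does this in hardware, not in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the bit that makes the total number of
   1 bits (data plus parity) come out even. */
unsigned parity_bit(uint8_t data)
{
    unsigned ones = 0;
    for (int i = 0; i < 8; i++) {
        ones += (data >> i) & 1u;
    }
    return ones & 1u;
}

/* Hypothetical helper: true if the byte read back no longer agrees with
   the parity bit that was stored when the byte was written. */
bool parity_error(uint8_t data_read, unsigned stored_parity)
{
    assert(stored_parity <= 1u);
    return parity_bit(data_read) != stored_parity;
}
```

Write 0x5A and the stored parity bit is 0, since the byte has four 1 bits. If the chip later hands back 0x5B, the recomputed parity is 1, the check fails, and in this model that is the moment the NMI would be raised. Flip two bits instead (0x59) and the error slips through, which is why parity tells you that something is wrong but not what.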
Here's what a parity error looks like.
It shows up as a mysterious "Hardware Malfunction" error.
Now, it's possible that a device may be generating an NMI by mistake.
For example, in Wendy's case, it may have been due to damage
caused by overheating.
If you suspect your memory chips, you can run a
memory diagnostic tool to see if it can find the bad memory.
My colleague Keith Moore reminded me that paradoxically,
on the IBM PC-AT,
you could mask the non-maskable interrupt!
This definitely falls into the category of "Unclear on the concept."
The masking was done in hardware
that could be configured via some magic port I/O.
It prevented the NMI from reaching the CPU in the first place.
(NMI is still not maskable in the CPU.)
A commenter asked,
"As an application programmer,
can I really ignore DDE if I need to interact with explorer/shell?"
The answer is, "Yes, please!"
While it was a reasonable solution
back in the cooperatively-multitasked world of 16-bit Windows
where it was invented,
the transition to 32-bit Windows was not a nice one for DDE.
In particular, the reliance on broadcasts
to establish the initial
DDE conversation means that unresponsive programs can jam up
the entire DDE initiation process.
The last shell interface to employ DDE was the communication
with Program Manager to create program groups and items inside those groups.
This was replaced with Explorer and the Start menu back in Windows 95.
DDE has been dead as a shell interface for over ten years.
Of course, for backwards compatibility,
the shell still supports DDE
for older programs that choose to use it.
You can still create icons on the Start menu via DDE and
you can still register your documents to launch via DDE
if you really want to,
but if you take a pass on DDE you won't be missing anything.
On the other hand, even though there is no technological reason
for you to use DDE, you still have to be mindful of whether your
actions will interfere with other people who choose to:
If you stop processing messages, you will clog up DDE initiation,
among other things.
It's like driving an automatic transmission instead of a manual one.
There is no requirement (in the United States, at least)
that you own a manual transmission or even know how to operate one.
But you still have to know enough to ensure that your actions do not interfere
with people who do have manual transmissions,
such as watching out for cars waiting for the traffic light to change
while pointed uphill.
If you try to set the current directory of a command prompt to a UNC path, you
get the error message
"CMD does not support UNC paths as current directories."
What's going on here?
It's MS-DOS backwards compatibility.
If the current directory were a UNC,
there wouldn't be anything to return to
MS-DOS programs when they call function 19h (Get current drive).
That function has no way to return an error code,
so you have to return a drive letter.
UNCs don't have a drive letter.
You can work around this behavior by using the pushd
command to create a temporary drive letter for the UNC.
Instead of passing script.cmd to the
CreateProcess function as the lpCommandLine,
you can pass
cmd.exe /c pushd \\server\share && script.cmd.
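A minimal sketch of assembling that command line in C. The helper name build_pushd_command is made up for illustration, and it does no quoting, so it assumes neither the share nor the script name contains spaces or cmd.exe metacharacters. On Windows the resulting buffer is what you would hand to CreateProcess as lpCommandLine; a writable buffer like this one is the right shape, because the Unicode version of CreateProcess may modify the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "cmd.exe /c pushd <unc> && <script>" into
   buf. Returns 0 on success, -1 if unc does not start with two
   backslashes or the result does not fit in the buffer. */
int build_pushd_command(char *buf, size_t size,
                        const char *unc, const char *script)
{
    assert(buf != NULL && unc != NULL && script != NULL);

    /* Only bother with pushd when the path really looks like a UNC. */
    if (strncmp(unc, "\\\\", 2) != 0) {
        return -1;
    }

    int n = snprintf(buf, size, "cmd.exe /c pushd %s && %s", unc, script);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The length check matters: snprintf truncates silently, and a truncated command line is worse than no command line.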
(Griping seems to happen any time I write about batch files,
so I'll gripe pre-emptively:
Yes, the batch "language" sucks because it wasn't designed;
it just evolved.
I write this not because I expect you to enjoy writing batch files
but because you might find yourself forced to deal with them.
If you would rather abandon batch files and use a different command
interpreter altogether, then more power to you.)
If you add me to an existing discussion, you have to say why.
Do you have a specific question for me?
Do you want my opinion on something?
Are you just sharing a funny joke?
Sometimes, I'll get a piece of mail that goes like this:
To: Aaaaa; Bbbbb; Ccccc; Raymond
--- Original Message ---
--- Original Message ---
Gee, that's very nice of you to add me, but you didn't say why.
Is this a FYI?
Is there a question you want answered?
Often, the discussion is just "Gosh, there's this bug,
person A proposes a theory,
person B proposes a counter-theory,
person C runs some tests and has some preliminary results."
It's like "Adding Raymond" is a ritual phrase people sprinkle
into a mail thread.
They don't know what'll happen when they say it,
they don't even have any expectations,
but it doesn't hurt to say it, right?
"When in doubt, add Raymond."
If you don't explain why you added me to a thread,
I'm just going to killfile it.
Okay, now that we know
what operations LockWindowUpdate is meant to be used with,
we can look at various ways people misuse the function for things
unrelated to dragging.
People see the "window you lock won't be able to redraw itself"
behavior of LockWindowUpdate and use it as a sort of
lazy version of the WM_SETREDRAW message.
Though sending the WM_SETREDRAW message really isn't
that much harder than calling LockWindowUpdate.
It's twenty more characters of typing, half that if you
use the SetWindowRedraw macro in <windowsx.h>.
SendMessage(hwnd, WM_SETREDRAW, FALSE, 0); // turn redraw off before the update
SendMessage(hwnd, WM_SETREDRAW, TRUE, 0);  // turn it back on afterward
As we noted earlier, only one window in the system can be
locked for update at a time.
If your intention for calling LockWindowUpdate is
merely to prevent a window from redrawing, say, because you're
updating it and don't want the window continuously refreshing
until your update is complete,
then just disable redraw on that window.
If you use LockWindowUpdate,
you create a whole slew of subtle problems.
First off, if some other program
is misusing LockWindowUpdate
in this same way, then one of you will lose.
Whoever tries LockWindowUpdate first will get it,
and the second program will fail.
Now what do you do?
Your window isn't locked any more.
Second, if you have locked your window for update and the
user switches to another program and tries to drag an item
(or even just tries to move the window!),
that attempt to LockWindowUpdate
will fail, and the user is now in the position where drag/drop
has stopped working for some mysterious reason.
And then, ten seconds later, it starts working again.
"Stupid buggy Windows," the user mutters.
Conversely, if you decide to call LockWindowUpdate
when a drag/drop or window-move operation is in progress,
then your call will fail.
This is just a specific example of the more general programming
mistake of using global state to manage a local condition.
When you want to disable redrawing in one of your windows,
you don't want this to affect other windows in the system;
it's a local condition.
But you're using a global state (the window locked for update)
to keep track of it.
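The difference can be modeled in a few lines of portable C. This is a toy sketch, not Win32 code: the Window struct and the two model_ functions are invented for illustration, and the only behavior borrowed from the real API is the rule that a second lock attempt fails while some window is already locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a window. */
typedef struct Window {
    bool redraw_enabled;   /* local state: one flag per window */
} Window;

/* Global state: at most one window in the whole "system" is locked. */
static Window *g_locked = NULL;

/* Models the LockWindowUpdate contract: NULL unlocks, and a lock
   attempt fails if any window is already locked. */
bool model_lock_window_update(Window *w)
{
    if (w == NULL) {
        g_locked = NULL;
        return true;
    }
    if (g_locked != NULL) {
        return false;      /* somebody else got there first */
    }
    g_locked = w;
    return true;
}

/* Models WM_SETREDRAW: touches only the window it is given. */
void model_set_redraw(Window *w, bool enable)
{
    assert(w != NULL);
    w->redraw_enabled = enable;
}
```

Two windows can both turn off their own redraw without ever hearing about each other. Two windows that both reach for the lock cannot: one of the calls fails, and the loser has no good recovery.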
I can already anticipate people saying,
"Well, the window manager shouldn't let somebody lock a window for update
if they're not doing a drag/drop operation."
But how does the window manager know?
It knows what is happening, but it doesn't know why.
Is that program calling LockWindowUpdate because
it's too lazy to use the WM_SETREDRAW message?
Or is it doing it in response to some user input that resulted in
a drag/drop operation?
Note that you can't just say, "Well, the mouse button has to be down,"
because the user might be performing a keyboard-based operation
(such as resizing a window with the arrow keys) that has the moral
equivalent of a drag/drop.
Morality is hard enough to resolve as it is;
expecting computers to be able to infer it is asking a bit much.
Next time, a final remark on LockWindowUpdate.
Poor misunderstood LockWindowUpdate.
This is the first in a series on
what it does, what it's for and (perhaps most important) what it's not for.
What LockWindowUpdate does is pretty simple.
When a window is locked,
all attempts to draw into it or its children fail.
Instead of drawing, the window manager remembers which parts of
the window the application tried to draw into, and when the
window is unlocked, those areas are invalidated so that the
application gets another WM_PAINT message,
thereby bringing the screen contents back in sync with what
the application believed to be on the screen.
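That bookkeeping can be sketched as a small model. Everything here is an assumption made for illustration: the names are invented, and the model remembers a single bounding rectangle where a real window manager would presumably track a more precise region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Rect { int left, top, right, bottom; } Rect;

typedef struct LockedWindow {
    bool locked;
    bool has_pending;   /* did anyone try to draw while locked? */
    Rect pending;       /* bounding box of the attempted drawing */
} LockedWindow;

/* Returns true if the drawing actually happened. While the window is
   locked, the drawing is dropped and its rectangle remembered instead. */
bool model_draw(LockedWindow *w, Rect r)
{
    assert(w != NULL);
    if (!w->locked) {
        return true;
    }
    if (!w->has_pending) {
        w->pending = r;
        w->has_pending = true;
    } else {
        if (r.left   < w->pending.left)   w->pending.left   = r.left;
        if (r.top    < w->pending.top)    w->pending.top    = r.top;
        if (r.right  > w->pending.right)  w->pending.right  = r.right;
        if (r.bottom > w->pending.bottom) w->pending.bottom = r.bottom;
    }
    return false;
}

/* Unlocks the window. If drawing was attempted while it was locked,
   stores the area to invalidate in *invalid and returns true. */
bool model_unlock(LockedWindow *w, Rect *invalid)
{
    assert(w != NULL && invalid != NULL);
    w->locked = false;
    if (!w->has_pending) {
        return false;
    }
    *invalid = w->pending;
    w->has_pending = false;
    return true;
}
```

In the model, the rectangle that model_unlock hands back plays the role of the area that gets invalidated, which is what produces the catch-up WM_PAINT.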
This "keep track of what the application tried to draw
while Condition X was in effect, and invalidate it when
Condition X no longer holds" behavior you've seen already
in another guise: CS_SAVEBITS.
In this sense, LockWindowUpdate does the same bookkeeping
that would occur if you had covered the locked window with a
CS_SAVEBITS window, except that it doesn't save any bits.
The documentation explicitly calls out that only one window
(per desktop, of course)
can be locked at a time, but this is implied by the function prototype.
If two windows could be locked at once, it would be impossible
to use LockWindowUpdate reliably.
What would happen if you did this:
LockWindowUpdate(hwndA); // locks window A
LockWindowUpdate(hwndB); // also locks window B
LockWindowUpdate(NULL); // ???
What does that third call to LockWindowUpdate do?
Does it unlock all the windows?
Or just window A?
Or just window B?
Whatever your answer, it would make it impossible for the following
code to use LockWindowUpdate reliably:
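A sketch of the kind of code in question, reconstructed from the description that follows and not the original listing. The HWND and BOOL types are stand-ins so that it compiles without windows.h, and the LockWindowUpdate here is a hypothetical multi-lock variant, not the real function: it lets any number of windows be locked and treats NULL as "unlock everything," which is one of the three candidate answers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without <windows.h>. */
typedef struct FakeWindow { bool locked; } *HWND;
typedef int BOOL;

struct FakeWindow g_windowA, g_windowB;
HWND hwndA = &g_windowA;
HWND hwndB = &g_windowB;

/* Hypothetical multi-lock variant: NULL unlocks all windows. */
BOOL LockWindowUpdate(HWND hwnd)
{
    if (hwnd != NULL) {
        hwnd->locked = true;
    } else {
        hwndA->locked = false;
        hwndB->locked = false;
    }
    return 1;
}

void BeginOperationA(void)
{
    LockWindowUpdate(hwndA);
    /* ... start operation A ... */
}

void EndOperationA(void)
{
    assert(hwndA->locked);   /* operation A expects its lock to still be there */
    /* ... finish operation A ... */
    LockWindowUpdate(NULL);
}

void BeginOperationB(void)
{
    LockWindowUpdate(hwndB);
    /* ... start operation B ... */
}

void EndOperationB(void)
{
    assert(hwndB->locked);
    /* ... finish operation B ... */
    LockWindowUpdate(NULL);
}
```

Run BeginOperationA, BeginOperationB, EndOperationB in that order and hwndA is no longer locked, so the assertion in EndOperationA would fire. Swapping in either of the other two unlock policies just moves the breakage somewhere else.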
Imagine that the BeginOperation functions started
some operation that was triggered by asynchronous activity.
For example, suppose operation A is drawing drag/drop
feedback, so it begins when the mouse goes down and ends when
the mouse is released.
Now suppose operation B finishes while a drag/drop is
still in progress.
Then EndOperationB will clean up operation B and call LockWindowUpdate(NULL).
If you propose that that should unlock all windows, then you've
just ruined operation A, which expects that hwndA
still be locked.
Similarly, if you argue that it should unlock
only hwndA, then not only is operation A ruined,
but so too is operation B (since hwndB is still
locked even though the operation is complete).
On the other hand, if you propose that LockWindowUpdate(NULL)
should unlock hwndB, then consider the case where
operation A completes first.
If LockWindowUpdate were able to lock more than one
window at a time, then the function prototype would have to have
been changed so that the unlock operation knows which window is being unlocked.
There are many ways this could have been done.
For example, a new parameter could have been added
or a separate function created.
// Method A - new parameter
// fLock = TRUE to lock, FALSE to unlock
BOOL LockWindowUpdate(HWND hwnd, BOOL fLock);
// Method B - separate function
BOOL LockWindowUpdate(HWND hwnd);
BOOL UnlockWindowUpdate(HWND hwnd);
But neither of these is the case.
The LockWindowUpdate function locks only one window at a time.
And the reason for this will become more clear as we learn
what LockWindowUpdate is for.
One of my colleagues recently posted the story of
the work he did to get laptops to resume quickly.
The fun part was implementing the optimizations in the kernel.
The not-fun part was finding all the drivers who did bad things
and harassing their owners into fixing the bugs.
On some laptops, he could get the resume time down to an impressive
And then entropy set in.
It's likely you've never seen a real off-the-shelf laptop resume this quickly.
And the reason is that as soon as you stop twisting the arms of
all the driver writers,
they stop worrying about how fast your laptop resumes
and go back to worrying about when they can get their
widget driver mostly working so they can get through WHQL
and sell their widget.
But now you have some tools to fight back, at least a little bit.
The second half of that article explains how to use
the event viewer to track down which drivers are ruining your
resume time and disable them.
I learned this from Yes, Minister.
They call it the politician's fallacy:
As befits its name, you see it most often in politics,
where poorly-thought-out solutions are proposed for
But be on the lookout for it in other places, too.
You might see somebody falling victim to the politician's
fallacy at a business meeting, say.
Something else I picked up is what I'm going to call
the politician's apology.
This is where you apologize for a misdeed not by apologizing
for what you did, but rather apologizing that other people were offended.
One blogger coined the word "fauxpology" to describe this sort of non-apology.
In other words, you're not apologizing at all!
It's like the childhood non-apology.
"Apologize to your sister for calling her ugly."
"I'm sorry you're ugly."
In the politician's apology, you apologize
not for the offense itself,
but for the fact that what you did offended someone.
"I'm sorry you're a hypersensitive crybaby."
regretted any hurt feelings
his statements may have caused.
Another form of non-apology is to state that bad things happened
without taking responsibility for causing them:
There should not have been any physical contact in this incident.
I am sorry that this misunderstanding happened at all,
I regret its escalation
and I apologize.
This particular non-apology even begins with the accusation that the
other party was at fault for starting the incident!
What bothers me is that these types of non-apologies are so common
that nobody is even offended by their inadequacy.
They are accepted as just
"the way people apologize in public".
(It's become so standard that Slate's William Saletan has
broken it down into steps for us.)
I post this entry with great reluctance, because I can feel the
heat from the pilot lights of the flame throwers all the way from here.
The struggle with the network interoperability problem continued
for several months after
I brought up the topic.
In that time, a
significant number of network attached storage devices
were found that did not implement "fast mode" queries correctly.
(Buried in this query are some of them; there are others.)
Some of them were Samba-based whose vendors did not have an upgrade
available that fixed the bug.
But many of them used custom implementations of CIFS;
consequently, any Samba-specific solutions would not have helped.
(Most of the auto-detection suggestions
people proposed addressed only the Samba scenario.
Those non-Samba devices would still not have worked.)
Even worse, most of the devices are low-cost solutions which
aren't firmware-upgradable and don't have any vendor support.
Some of the reports came from people running fully-patched well-known
So much for being in
all the new commercially supported offerings over the next couple months.
Furthermore, those buggy non-Samba implementations mishandled fast mode
queries in different ways.
For example, one of them I was asked to look at didn't return
any error codes at all.
It just returned garbage data (most noticeably,
corrupting the file name by deleting the first five characters).
How do you detect that this has happened?
If the server reports "I have a file called e.txt",
is Windows supposed to say, "Oh, I don't think so. I bet you're
one of those buggy servers that chops off the first five letters
of file names and that you really meant to say (scrunches forehead
in concentration) readme.txt"?
What if you really had a file called e.txt?
What if the server said, "This directory has two files, 1.txt and 2.txt"?
Is this a buggy server?
Maybe the files are really abcde1.txt and defgh2.txt,
or maybe the server wasn't lying and the files really are
1.txt and 2.txt.
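The ambiguity is easy to demonstrate with a toy model of that server. The function name is invented for illustration, and the model assumes the bug is exactly "drop the first five characters," as in the case described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the buggy server: it reports a file name with
   the first five characters missing (or an empty name if the real one
   is shorter than that). */
const char *buggy_reported_name(const char *real_name)
{
    assert(real_name != NULL);
    size_t len = strlen(real_name);
    return real_name + (len < 5 ? len : 5);
}
```

Feed it readme.txt and out comes e.txt, byte for byte what a well-behaved server would report for a file that really is named e.txt. Likewise abcde1.txt and a genuine 1.txt are indistinguishable on the client side. The mapping throws information away, so no amount of cleverness in the client can get it back.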
One device simply crashed if asked to perform a fast mode query.
Another wedged up and had to be reset.
"Oh, looks like somebody brought their Vista laptop from home
and plugged it into the corporate network.
Our document server crashed again."
Given the much broader ways that servers mishandled fast queries,
any attempt at auto-detecting them will necessarily be incomplete
and fail to detect broken servers.
This is fundamentally the case for servers which return perfectly
formed, but incorrect, data.
And even if the detection were perfect, if it left the server in
a crashed or hung state, that wouldn't be much consolation.
Given this new information, the solution that was settled on was
simply to stop using "fast mode" queries for anything other than local devices.
The most popular
file system drivers for local devices (NTFS, FAT, CDFS, UDF)
are all under Microsoft's control and they have already been tested
with fast mode queries.
Such is the sad but all-too-true
cost of interoperability and compatibility.
(To address other minor points:
It's not the case that the Vista developers
"knew the [fast mode query] would break Samba-based devices since
The fast mode query was added, and the incompatibility with Samba
wasn't discovered until March 2006.
"Why didn't you notify the Samba team?"
Because by the time we found the problem,
they had already fixed it.)