Some support people have asked me why the "About" dialog seems to be
kind of schizophrenic as to whether a machine has Windows XP SP1 or SP2:
Version 5.1 (Build 2600.xpsp2.040919-1003 : Service Pack 1)
Copyright© 1981-2001 Microsoft Corporation
Why does the version string say "xpsp2" and then "Service Pack 1"?
Is this machine running SP1 or SP2?
It's running Service Pack 1.
The build number string is a red herring.
Why does the build number string say "xpsp2" when the computer
is running SP1?
Because Windows XP Service Pack 2 was a victim of changing plans.
After Service Pack 1 shipped,
there was no indication that Service Pack 2
was going to be anything other than "just another service pack":
A cumulative update of the fixes that had been issued since
the release of Service Pack 1.
Therefore, the release team created a new project,
called it "xpsp2" and when a fix needed to be made to
Service Pack 1, they made it there.
It was called "xpsp2" because the assumption was that
when the time came to release Service Pack 2,
they would just take all the fixes they had been making
to Service Pack 1 and call that Service Pack 2.
In other words, "fixes to Service Pack 1"
and "working on Service Pack 2" were the same thing.
Of course, things changed, and a "new" Service Pack 2
project was created for the "real" Service Pack 2 changes,
leaving the old "xpsp2" project
to be merely the place where Service Pack 1 fixes were made.
Yes, it's confusing.
We're kind of embarrassed by the whole project naming fiasco.
That's what happens when plans take a radical change after work
has already started.
Anyway, there you have it,
the long and boring story of why fixes for Service Pack 1
have "xpsp2" in their build string.
Remember that there are typically two 8-bit code pages active,
the so-called "ANSI" code page and the so-called "OEM" code page.
GUI programs usually use the ANSI code page for 8-bit files
(though utf-8 is becoming more popular lately),
whereas console programs usually use the OEM code page.
This means, for example,
when you open an 8-bit text file in Notepad, it assumes the ANSI
code page. But if you use the TYPE command from the command prompt,
it will use the OEM code page.
This has interesting consequences if you switch between the GUI
and the command line frequently.
The two code pages typically agree on the first 128 characters,
but they nearly always disagree on the characters from 128 to 255
(so-called "extended characters").
For example, on a US-English machine, character 0x80 in the OEM
code page is Ç, whereas in the ANSI code page it is €.
Consider a directory which contains a file named Ç.
If you type "dir" at a command prompt, you see a happy Ç
on the screen.
On the other hand, if you do "dir >files.txt" and open files.txt
in a GUI editor like Notepad, you will find that the Ç has
changed to a €, because the 0x80 in the file is being interpreted
in the ANSI character set instead of the OEM character set.
Stranger yet, if you mark/select the file name from the console window
and paste it into Notepad, you get a Ç. That's because the
console window's mark/select code saves text on the clipboard as Unicode;
the character saved into the clipboard is not 0x80 but rather U+00C7,
the Unicode code point for "Latin Capital Letter C With Cedilla".
When this is pasted into Notepad, it gets converted from Unicode to
the ANSI code page, which on a US-English system encodes the Ç
character as 0xC7.
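Here is a small sketch of the two interpretations of the same byte. It assumes the OEM code page is 437 and the ANSI code page is 1252, which is the usual US-English configuration; other systems will use different code pages, and the variable names are just for illustration.

```python
# Assumption: OEM code page 437 and ANSI code page 1252 (typical US-English).
OEM = "cp437"
ANSI = "cp1252"

raw = b"\x80"  # the byte that "dir >files.txt" writes for the file name

as_oem = raw.decode(OEM)    # how the console and TYPE interpret the byte
as_ansi = raw.decode(ANSI)  # how an ANSI program interprets the same byte

print(f"OEM: U+{ord(as_oem):04X}  ANSI: U+{ord(as_ansi):04X}")
# prints "OEM: U+00C7  ANSI: U+20AC"

# Copying from the console goes through Unicode (U+00C7), so pasting into
# an ANSI program re-encodes the character as a different byte.
pasted = "\u00c7".encode(ANSI)
print(pasted.hex())  # prints "c7"
```

The same byte comes out as two different characters, while the same character comes out as two different bytes.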
But wait, there's more. The command processor has an option (/U)
to generate all piped and redirected output in Unicode rather than
the OEM code page.
(Note that the built-in documentation for the
command processor says that the /A switch produces ANSI output;
this is incorrect. /A produces OEM output.
This is one of those bugs that you recognize instantly
if you are familiar with what is going on. It's so obviously
OEM that when I see the documentation say "ANSI", my mind just
reads it as "OEM". In the same way native English speakers
often fail to notice misspellings or doubled words.)
If you run the command
cmd /U /C dir ^>files.txt
then the output will be in Unicode and therefore will record the
Ç character as U+00C7, which Notepad will then be able to read.
This has serious consequences for batch files.
Batch files are 8-bit files and are interpreted according to the
OEM character set. This means that if you write a batch file
with Notepad or some other program that uses the ANSI character
set for 8-bit files, and your batch file contains extended
characters, the results you get will not match what you see
in your editor.
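To see the mismatch in action, here is a rough simulation of a batch file line saved by an ANSI editor and then read back as OEM. As before, code pages 1252 and 437 are assumptions that match a US-English system, and the sample line is made up.

```python
# Assumption: ANSI code page 1252 and OEM code page 437 (typical US-English).
ANSI = "cp1252"
OEM = "cp437"

line = "echo caf\u00e9"  # "echo café" as typed in the editor

saved_by_editor = line.encode(ANSI)        # bytes an ANSI editor writes to disk
seen_by_cmd = saved_by_editor.decode(OEM)  # how those bytes read as OEM text

# The e-acute (0xE9 in ANSI) turns into a Greek capital theta in OEM.
print(f"U+{ord(seen_by_cmd[-1]):04X}")  # prints "U+0398"

# Saving the text in the OEM code page instead gives bytes that read back
# as intended when interpreted as OEM.
fixed = line.encode(OEM)
print(fixed.decode(OEM) == line)  # prints "True"
```

One way out, then, is to save the batch file in the OEM code page, assuming your editor lets you choose the encoding.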
Why the discrepancy between GUI programs and console programs
over how 8-bit characters should be interpreted?
The reason is, of course, historical.
Back in the days of MS-DOS, the code page was what today is called
the OEM code page. For US-English systems, this is the code page
with the box-drawing characters and the fragments of the integral
signs. It contained accented letters, but not a very big set of them,
just enough to cover the German, French, Spanish, and Italian languages.
And Swedish. (Why Swedish yet not Danish and Norwegian I don't know.)
When Windows came along, it decided that those box-drawing characters
were wasting valuable space that could be used for adding still more
accented characters, so
out went the box-drawing characters and
in went characters for Danish, Norwegian, Icelandic, and
Canadian French. (Yes, Canadian French uses characters that
European French does not.)
Thus began the schism between console programs (MS-DOS) and
GUI programs (Windows) over how 8-bit character data should be interpreted.
This is one of those poorly-worded options.
In the Start menu configuration dialog, you can choose to uncheck
"Enable dragging and dropping".
This setting disables drag/drop but also disables right-click context menus.
The connection between the two is explained in the Group Policy Editor,
but is unfortunately oversimplified in the general-public configuration dialog.
Why does disabling dragging and dropping also disable context menus?
History, of course.
Originally, the "Disable drag/drop on the Start menu" setting was
a system policy, intended to be set by corporate IT departments to
prevent their employees from damaging the Start menu.
With this setting, users could no longer drag items around to rearrange
or reorganize their Start menu items.
This is a good thing in corporate environments because it
reduces support calls.
But very quickly, the IT departments found a loophole in this policy:
You could right-click an item on the Start menu
and select Cut, Copy, Paste, Delete, or Sort by Name,
thereby giving you access to the operations that the policy was
trying to block.
Therefore, they requested that the scope of the policy be expanded
so that it also disabled the context menu.
In Windows XP, it was decided to expose
what used to be an advanced deployment setting
to the primary UI,
and so it was.
Since it's the same setting, it carried the loophole-closure with it.
In the operating systems group, we have to take a holistic view of
performance. The goal is to get the entire system running faster,
balancing applications against each other for the greater good.
Applications, on the other hand, tend to have a selfish view of performance:
"I will do everything possible to make myself run faster.
The impact on the rest of the system is not my concern."
Some applications will put themselves into the Startup group so
that they will load faster. This isn't really making the system
run any faster; it's just shifting the accounting.
By shoving some of the application startup cost into operating
system startup, the amount of time between the user double-clicking
the application icon and the application being ready to run has
been reduced. But the total amount of time hasn't changed.
For example, consider the following time diagram.
The "*" marks the point at which the user turns on the computer,
the "+" marks the point at which Explorer is ready and the
user double-clicks the application icon, and
the "!" marks the point at which the application is ready.
The application developers then say, "Gosh, that pink 'Application
Startup' section is awfully big. What can we do to make it
smaller? I know, let's break our application startup into
two pieces...
"... and put part of it in the Startup group.
"Wow, look, the size of the pink bar (which represents how long
it takes for our application to get ready after the user
double-clicks the icon) is much shorter now!"
The team then puts this new shorter value in their performance
status report, everybody gets raises all around, and maybe
they go for a nice dinner to celebrate.
Of course, if you look at the big picture, from the asterisk
all the way to the exclamation point, nothing has changed.
It still takes the same amount of time for the application
to be ready from a cold start. All this "performance"
improvement did was rob Peter to pay Paul.
The time spent doing "Application Startup 1" is now
charged against the operating system and not against the application.
You shuffled numbers around, but the end user gained nothing.
In fact, the user lost ground.
For the above diagrams assume that the user wants to
run your application at all! If the user didn't want
to run your application but instead just wanted to check
their email, they are paying for "Application Startup 1"
even though they will reap none of the benefits.
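If you prefer numbers to bars, here is the same accounting as a back-of-the-envelope calculation. The timings are invented purely for illustration; only the relationships between them matter.

```python
# Hypothetical timings in seconds; the specific values are made up.
os_startup = 30.0
app_startup = 10.0
moved_to_startup_group = 6.0  # the "Application Startup 1" portion

# Before: all of the application startup happens after the double-click.
before_boot = os_startup
before_click_to_ready = app_startup

# After: part of the work runs at logon, the rest after the double-click.
after_boot = os_startup + moved_to_startup_group
after_click_to_ready = app_startup - moved_to_startup_group

# Cold start: from power-on to application ready.
cold_start_before = before_boot + before_click_to_ready
cold_start_after = after_boot + after_click_to_ready

print(before_click_to_ready, after_click_to_ready)  # prints "10.0 4.0"
print(cold_start_before, cold_start_after)          # prints "40.0 40.0"

# A user who never launches the application still pays the moved portion.
extra_cost_for_everyone_else = after_boot - before_boot
print(extra_cost_for_everyone_else)  # prints "6.0"
```

The number in the status report goes down, the cold start total stays put, and everybody who just wanted to check email picks up the tab.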
Another example of applications having a selfish view
of performance came from a company developing an icon
overlay handler. The shell treats overlay computation
as a low-priority item, since it is more important
to get icons on the screen so the user can start
doing whatever it is they wanted to be doing.
The decorations can come later.
This company wanted to know if there was a way they
could improve their performance and get their overlay
onto the screen even before the icon shows up,
demonstrating a phenomenally selfish interpretation of performance.
Performance is about getting the user finished with their task
sooner. If that task does not involve running
your program, then your "performance improvement" is really
a performance impediment.
I'm sure your program is very nice, but it would
also be rather presumptuous to expect that every user who
installs your program thinks that it should take priority
over everything else they do.
Although Windows is centered around, well, windows,
a window itself is not a cheap object.
What's more, the tight memory constraints of systems
of 1985 forced various design decisions.
Let's take for example the design of the list box control.
In a modern design, you might design the list box control
as accepting a list of child windows, each of which represents
an entry in the list. A list box with 20,000 items would have
20,000 child windows.
That would have been completely laughable in 1985.
Recall that Windows was built around a 16-bit processor.
Window handles were 16-bit values and internally were
just near pointers into a 64K heap. A window object
was 88 bytes (I counted), which means that you could
squeeze in a maximum of 700 or so before you ran out
of memory. What's more, menus hung out in this same
64K heap, so the actual limit was much lower.
Even if the window manager internally used a heap
larger than 64K (which Windows 95 did), 20,000
windows comes out to over 1.5MB.
Since the 8086 had a maximum address space of 1MB,
even if you devoted every single byte of memory to
window objects, you'd still not have enough memory.
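The arithmetic is easy to check. The sketch below uses the 88-byte window size and the 64K and 1MB limits from the discussion above; it ignores heap overhead and everything else that had to share the memory, so the real limits were lower.

```python
WINDOW_SIZE = 88              # bytes per window object
HEAP_SIZE = 64 * 1024         # a single 64K heap
ADDRESS_SPACE = 1024 * 1024   # the 8086's 1MB address space

# Upper bound on windows in one 64K heap, ignoring all other overhead.
max_windows_in_heap = HEAP_SIZE // WINDOW_SIZE
print(max_windows_in_heap)  # prints "744"

# Memory needed for a list box with one window per item.
bytes_for_20000 = 20_000 * WINDOW_SIZE
print(bytes_for_20000)                   # prints "1760000"
print(bytes_for_20000 > ADDRESS_SPACE)   # prints "True"
```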
Furthermore, making each list box item a window
means that every list box would be a variable-height
list box, which carries with it the complexity of
managing a container with variable-height items.
This goes against two general principles of API design:
(1) simple things should be simple, and
(2) "pay-for-play", that if you are doing the
simple thing, you shouldn't have to pay the cost
of the complex thing.
Filling a list box with actual windows also would have
made the "virtual list box" design significantly
trickier. With the current design, you can say,
"There are a million items" without actually having
to create them.
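To illustrate the idea of claiming a million items without creating them, here is a toy model in Python. The VirtualList class and its callback are hypothetical stand-ins, not the actual list box interface; the point is only that item data is produced on demand for the handful of items that are visible.

```python
class VirtualList:
    """Hypothetical item source that produces entries only when asked."""

    def __init__(self, count, get_text):
        self.count = count        # how many items we claim to have
        self.get_text = get_text  # callback that produces item text
        self.requests = 0         # how many items were actually produced

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        self.requests += 1
        return self.get_text(index)


# "There are a million items" without materializing any of them.
items = VirtualList(1_000_000, lambda i: f"Item {i}")

# Only the rows currently scrolled into view are ever requested.
visible = [items[i] for i in range(500_000, 500_020)]
print(len(items), items.requests)  # prints "1000000 20"
```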
(This is also why the window space is divided into
"client" and "non-client" areas rather than making the
non-client area consist of little child windows.)
To maintain compatibility with 16-bit Windows programs (which
still run on Windows XP thanks to the WOW layer),
there cannot be more than 65536 window handles in the
system, because any more than that would prevent 16-bit
programs from being able to talk meaningfully about windows.
(Once you create your 65537'th
window, there will be two windows with the same 16-bit
handle value, thanks to the pigeonhole principle.)
(16/32-bit interoperability is still important even today.)
With a limit of 65536 window handles, your directory
with 100,000 files in it would be in serious trouble.
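The pigeonhole argument can be demonstrated with a few lines of Python. The handle values here are hypothetical sequential integers chosen for simplicity (real handle values are not allocated this way); what matters is that any 65537 distinct values must collide once they are squeezed into 16 bits.

```python
def to_16bit(handle):
    """Truncate a handle to the 16 bits a 16-bit program can see."""
    return handle & 0xFFFF


seen = {}         # 16-bit value -> full handle that produced it
collision = None

# 65537 distinct hypothetical handle values: 1 through 65537.
for handle in range(1, 65538):
    key = to_16bit(handle)
    if key in seen:
        collision = (seen[key], handle)
        break
    seen[key] = handle

print(collision)  # prints "(1, 65537)"
```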
The cost of a window object has grown over time,
as new features get added to the window manager.
Today it's even heftier than the svelte 88 bytes of 1985.
It is to your advantage not to create more windows than necessary.
If your application design has you creating thousands of
windows for sub-objects,
consider moving to a windowless model,
like Internet Explorer, Word, list boxes, treeview,
listview, and even
our scrollbar sample program.
By going windowless, you shed the system overhead of
a full window handle, with all the baggage that comes with it.
Since window handles are visible to all processes,
there is a lot of overhead associated with centrally
managing the window list.
If you go windowless, then the only program that can
access your content is you.
You don't have to worry about marshalling.
And you can use a gigabyte of memory to keep track of
your windowless data if that's what you want,
since your windowless controls don't affect any other process.
The fact that window handles are accessible to
other processes imposes a practical limit on how many of them
can be created without impacting the system as a whole.
I believe that WinFX uses the
"everything on the screen is an element" model.
It is my understanding that
they've built a windowless framework so you don't have to.
(I'm not sure about this, though, not being a WinFX person myself.)
The SystemParametersInfo function
gives you access to a whole slew of user interface settings,
and it is the only supported method for changing those settings.
I'm not going to list every single setting; go read the list yourself.
Here are some highlights:
Here are some control panel settings.
Notice that when using the SPI_SET* commands,
you also have to choose whether the setting changes are
temporary (lost at logoff) or persistent.
The historically-named SPIF_UPDATEINIFILE flag
causes the changes to be saved to the user profile; if you
leave it off, then the changes are not saved and are lost
when the user logs off.
You should also set the SPIF_SENDCHANGE flag so
that programs which want to refresh themselves in response to
changes in the settings can do so.
The fact that there exist both temporary and persistent changes
highlights the danger of accessing the registry directly
to read or write the current settings. If the current settings are
temporary, then they are not saved in the registry.
The SystemParametersInfo function retrieves the
actual current settings, including temporary ones.
For example, if you want to query whether menus are being animated,
and the user has temporarily disabled animation, reading the registry
will tell you that they are being animated when
in fact they are not.
Also, changes written to the registry don't take effect until the
next logon, because that is the only time the values are consulted.
To make a change take effect immediately, you must use
the SystemParametersInfo function.
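As a rough sketch, here is what querying and changing the menu animation setting might look like from Python via ctypes. The constant values are the ones in the Windows SDK headers; the helper names (spi_flags, get_menu_animation, set_menu_animation) are my own inventions for the example, and the calls do nothing when not running on Windows.

```python
import ctypes
import sys

# Constants as defined in the Windows SDK header winuser.h.
SPI_GETMENUANIMATION = 0x1002
SPI_SETMENUANIMATION = 0x1003
SPIF_UPDATEINIFILE = 0x0001  # save the change to the user profile
SPIF_SENDCHANGE = 0x0002     # notify programs that the setting changed


def spi_flags(persist):
    """Flags for a temporary (persist=False) or persistent change."""
    return SPIF_SENDCHANGE | (SPIF_UPDATEINIFILE if persist else 0)


def get_menu_animation():
    """Return the live setting, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    enabled = ctypes.c_int(0)  # receives a BOOL
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_GETMENUANIMATION, 0, ctypes.byref(enabled), 0)
    if not ok:
        raise OSError("SystemParametersInfo failed")
    return bool(enabled.value)


def set_menu_animation(enabled, persist=False):
    """Change the setting; returns False when not running on Windows."""
    if sys.platform != "win32":
        return False
    # For this setting, the new value is passed in pvParam itself.
    value = ctypes.c_void_p(1 if enabled else 0)
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_SETMENUANIMATION, 0, value, spi_flags(persist))
    if not ok:
        raise OSError("SystemParametersInfo failed")
    return True
```

Calling set_menu_animation(False) makes a temporary change that is lost at logoff; passing persist=True adds SPIF_UPDATEINIFILE so the change is saved. In both cases get_menu_animation() reports the live value, which is the whole point.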
It still puzzles me why people go to the undocumented registry keys
to change these settings
when there is a perfectly good documented function for doing it.
Especially when the documented function works and the
undocumented registry key is unreliable.
I remember one application that went straight for the undocumented
registry keys (to get the icon title font, I think).
Unfortunately for the application, the format of the registry key
is different between Windows 95 and Windows 2000,
and it ended up crashing. (It expected the Windows 95 format.)
If it had used the documented method
of retrieving the icon title font, it would have worked fine.
In other words, this program went out of its way to go around
the preferred way of doing something and got hoist by its own petard.
On one of our internal mailing lists, someone was wondering
why their expensive four-processor computer appeared to be using
only one of its processors.
From Task Manager's performance tab, the chart showed that
the first processor was doing all the work and the other three
processors were sitting idle.
Using Task Manager to set each process's processor affinity to use all four
processors made the computer run much faster, of course.
What happened that messed up all the processor affinities?
At this point,
I invoked my psychic powers.
Perhaps you can too.
First hint: My psychic powers successfully predicted
that Explorer also had its processor affinity set to use only
the first processor.
Second hint: Processor affinity is inherited by child processes.
Here was my psychic prediction:
My psychic powers tell me that
Explorer has had its thread affinity set to 1 proc....
because you previewed an MPG file...
whose decoder calls SetProcessAffinityMask in its DLL_PROCESS_ATTACH...
because the author of the decoder couldn't fix his multiproc bugs...
and therefore set the process thread affinity to 1 to "fix" the bugs.
Although my first psychic prediction was correct, the others
were wide of the mark, though they were on the right track
and successfully guided further investigation to uncover the culprit.
The real problem was that there was a third party shell extension
whose authors presumably weren't able to fix their multi-processor bugs,
so they decided to mask them by calling
the SetProcessAffinityMask function
to lock the current process (Explorer) to a single processor.
Woo-hoo, we fixed all our multi-processor bugs at one fell swoop!
Let's all go out and celebrate!
Since processor affinity is inherited, this caused every
program launched by Explorer
to use only one of the four available processors.
(Yes, the vendor of the offending shell extension has been
contacted, and they claim that the problem has been fixed
in more recent versions of the software.)
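For those who like to see the mechanics, here is a toy model of what happened. An affinity mask is a bitmask with one bit per processor; the Process class below is a hypothetical simplification that captures only the inheritance rule, not anything about how the real process manager works.

```python
def cpus_in_mask(mask):
    """List the processor numbers whose bits are set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


ALL_FOUR = 0b1111    # processors 0 through 3
FIRST_ONLY = 0b0001  # processor 0 only


class Process:
    """Hypothetical model: a child starts with its parent's affinity."""

    def __init__(self, affinity):
        self.affinity = affinity

    def spawn(self):
        return Process(self.affinity)


explorer = Process(ALL_FOUR)
explorer.affinity = FIRST_ONLY  # what the shell extension did to its host
launched_program = explorer.spawn()

print(cpus_in_mask(launched_program.affinity))  # prints "[0]"
```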
A common feature for many applications is to record their screen
location when they shut down and reopen at that location when relaunched.
If implemented naively,
a program merely restores from its previous position unconditionally.
You run into usability problems with this naive implementation.
If a user runs two copies of your program, the two windows end up in exactly
the same place on the screen. Unless the user paid close attention
to their taskbar, it looks like running the second copy had no effect.
Now things get interesting.
Depending on what the program does, the second copy may encounter
a sharing violation, or it may merely open a second copy of the
document for editing, or two copies of the song may start playing,
resulting in a strange echo effect since the two copies are out
of sync. Even more fun is if the user hits the Stop button and the
music keeps playing! Why? Because only the second
copy of the playback was stopped. The first copy is still running.
I know one user who not infrequently gets as many as
four copies of
a multimedia title running, resulting in a horrific cacophony
as they all play their attract music simultaneously, followed by
mass confusion as the user tries to fix the problem, which usually consists
of hammering the "Stop" button on the topmost copy. This stops the
topmost instance, but the other three are still running...
If a second copy of the document is opened, the user may switch
away from the editor, then switch back to the first
instance, and think that all the changes were lost. Or
the user may fail to notice this and make a conflicting set of changes
to the first instance. Then all sorts of fun things happen when the
two copies of the same document are saved.
Moral of the story: If your program saves and restores its screen
position, you may want to check if a copy of the program is already
running at that screen position. If so, then move your second
window somewhere else so that
it doesn't occupy exactly the same coordinates.
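One possible way to do this is sketched below. It assumes you already have some way of finding the positions of your program's other windows (that part is not shown), and the function name, step size, and retry limit are arbitrary choices for the example.

```python
def choose_position(saved, occupied, step=(24, 24), max_tries=20):
    """Pick a window position, nudging away from positions already in use.

    saved    -- the (x, y) position recorded at the last shutdown
    occupied -- set of (x, y) positions of windows already on screen
    """
    x, y = saved
    for _ in range(max_tries):
        if (x, y) not in occupied:
            return (x, y)
        # Somebody is already exactly here; cascade down and to the right.
        x += step[0]
        y += step[1]
    return (x, y)  # give up and accept whatever we ended up with


print(choose_position((100, 100), set()))                     # prints "(100, 100)"
print(choose_position((100, 100), {(100, 100)}))              # prints "(124, 124)"
print(choose_position((100, 100), {(100, 100), (124, 124)}))  # prints "(148, 148)"
```

A real program would also want to make sure the adjusted position is still on the screen.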
The window manager provides a pointer-sized chunk of storage
you can access via
the GWLP_USERDATA constant.
You pass it to
the GetWindowLongPtr function
or the SetWindowLongPtr function
to read or write that value.
Most of the time, all you need to attach to a window is a single
pointer value anyway, so the free memory in
GWLP_USERDATA is all you need.
Note that this value, like the other window extra bytes
and the messages in the WM_USER range,
belongs to the window class and not to the
code that creates the window.