There appears to be some confusion over whether the maximum
size of the environment is 32K or 64K.
Which is it?
The limit is 32,767 Unicode characters,
which equals 65,534 bytes.
Call it 32K or 64K as you wish, but make sure you include the units
in your statement if it isn't clear from context.
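The arithmetic behind the two numbers is short enough to check by hand, but here is a minimal sketch anyway. It assumes "Unicode character" means a two-byte UTF-16 code unit, which is how Windows stores these strings; the variable names are mine.

```python
# Each UTF-16 code unit occupies 2 bytes, so the character limit doubles in bytes.
MAX_ENV_CHARS = 32_767   # the limit stated above, in UTF-16 code units
BYTES_PER_CHAR = 2       # assumption: one UTF-16 code unit per character

max_env_bytes = MAX_ENV_CHARS * BYTES_PER_CHAR
print(max_env_bytes)

# Cross-check: an ASCII-only string of that length, encoded as UTF-16.
encoded = ("A" * MAX_ENV_CHARS).encode("utf-16-le")
print(len(encoded))
```

Both numbers describe the same limit; they just differ in units.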
Whenever I write an article explaining that programs should avoid
doing X, I can confidently expect
a comment saying,
"Microsoft Product Q does this!" as if that were a "gotcha".
But they're saying "gotcha" to the wrong person.
Because, and I'm sure it's a shock to many people to read this,
I did not personally write every line of software Microsoft ever produced.
(And even if I did write it, I may have written it as a younger
developer, before I learned about said rule.
Because, and I'm sure this is also a shock to many people,
I was once a beginner, too.)
If you find a Microsoft product breaking a rule,
then go complain to that product team.
Complaining to me won't accomplish anything.
I don't have access to their source code, and even if I did,
I don't have permission to go in and make changes to their code,
nor do I have the time to go in and learn how their product
works and figure out the right place to make the fix.
Because, and I don't know if you all can handle three shocking revelations in
a row,
product teams do not send me every line of code for review.
Indeed, one of the reasons I write here about things programs
should or shouldn't do is because I myself will see a Microsoft
product breaking a rule!
By discussing the problem here rather than in an internal
mailing list, the information gets out to everybody.
And maybe, just maybe, the product team will read the entry and
say, "Oops, I think we do that."
Because (shocking revelation number four) not all Microsoft
programmers are seasoned experts in Win32 user-interface programming.
(Articles where I was consciously tapping my colleagues on the head include
my discussion of CallMsgFilter,
the long and sad story of the Shell Folders key,
and reminding you to pass unhandled messages to DefWindowProc.)
In fact, for every "do/don't do this" article,
I'd say odds are good that
with enough searching, you can
find a Microsoft product that breaks the rule.
And when you do, complain to that product team.
(My discussion of the
difference between the tray and the notification area
was in part a response to all the other groups that perpetuate the
misuse of the terminology.)
So when I write something like,
"Applications shouldn't do this", go ahead and insert the phrase
"and this means all applications, including those published by Microsoft."
When I write, "some annoying programs", go ahead and
insert the phrase, "which might even include programs
published by Microsoft".
I'm not going to insert those phrases into every sentence I write.
I'm assuming you're smart enough to realize that general statements
apply to everyone regardless of who signs their paychecks.
Of course, if the consensus of my readership is that
I shouldn't tell you not to do things
until every last Microsoft product has been scoured to ensure that none
of them violate that rule either,
then I can abide by that decision.
I'll just stop posting those tips here and keep them on the internal
mailing list.
It's much less work for me.
Last time, we saw how
the way Win32 exports functions is
much the same as the way 16-bit Windows exports functions,
but with a change in emphasis
from ordinal-based exports to name-based exports.
This change in emphasis is not expressed anywhere in the file format;
both 16-bit and 32-bit DLLs can export either by name or by ordinal
(or by both), but the designers of Win32 were biased in spirit
in favor of name-only exports.
But there is a new type of exported function in Win32, known as a forwarder.
A forwarder looks just like a regular exported function, except that
the entry in the ordinal export table says, "Oh, I'm not really a
function in this DLL. I'm really a function in that DLL over there."
For example, if you do a
link /dump /exports kernel32.dll, you'll see a line like this:
151 EnterCriticalSection (forwarded to NTDLL.RtlEnterCriticalSection)
This means that if a program links to
KERNEL32.EnterCriticalSection, the loader silently
redirects it to NTDLL.RtlEnterCriticalSection.
Forwarders are a handy way to accommodate functionality moving from
one DLL to another.
The old DLL can continue to export the function but forward it to
the new DLL.
The forwarding trick is actually better than just having a stub
function in the old DLL that calls the function in the new DLL,
because the stub function creates a dependency between the old DLL and the
new DLL.
(After all, the old DLL needs to be linked to the new DLL in order
to call it!)
With a forwarder, however, the new DLL is not loaded unless somebody
actually asks for the forwarded function from the old DLL.
As a result, you don't pay for the new DLL until somebody actually wants it.
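To make the "you don't pay until somebody wants it" point concrete, here is a toy model. Everything in it is hypothetical (the DLLs are dictionaries, ToyLoader is not a real API); the only thing borrowed from reality is the "MODULE.FunctionName" shape of the forwarder string shown in the dump above.

```python
# Toy model of forwarder resolution; not how the real loader is written.

def rtl_enter_critical_section():
    return "entered via NTDLL"

# What is "on disk": module name -> export table.
# A str entry is a forwarder; a callable entry is a real function.
DISK = {
    "NTDLL": {"RtlEnterCriticalSection": rtl_enter_critical_section},
    "KERNEL32": {"EnterCriticalSection": "NTDLL.RtlEnterCriticalSection"},
}

class ToyLoader:
    def __init__(self):
        self.loaded = {}  # modules brought into memory so far

    def load(self, module):
        # Load a module on first use only.
        if module not in self.loaded:
            self.loaded[module] = DISK[module]
        return self.loaded[module]

    def get_proc(self, module, name):
        entry = self.load(module)[name]
        if isinstance(entry, str):
            # Forwarder: split "NTDLL.RtlEnterCriticalSection" and look there.
            target_module, target_name = entry.split(".", 1)
            return self.get_proc(target_module, target_name)
        return entry

loader = ToyLoader()
loader.load("KERNEL32")
ntdll_loaded_before = "NTDLL" in loader.loaded   # nobody has asked yet
func = loader.get_proc("KERNEL32", "EnterCriticalSection")
ntdll_loaded_after = "NTDLL" in loader.loaded    # the forwarder was followed
result = func()
```

(In a real process NTDLL is always present anyway, so this particular pair is a poor example of the savings. The mechanism is the same for any other pair of DLLs.)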
Okay, we saw that with forwarders, Win32 has diverged from 16-bit Windows,
but when it comes to imports, it's a whole new ball game.
We'll pick up the story next time.
The designers of 32-bit Windows didn't have to worry quite so much
about squeezing everything into 256KB of memory.
Since modules in Win32 are based on demand-paging,
all you have to do is map the entire image into memory and then
run around accessing the parts you need.
There is no distinction between resident and non-resident names;
the names of exported functions are just stored in the image,
with a pointer (well, a relative virtual address) to the name
stored in the export table.
Unlike the 16-bit ordinal export table,
the 32-bit ordinal export table is not sparse.
If your DLL exports two functions, one as ordinal 10 and one as ordinal 1000,
you will have a 991-entry table that consists of two actual function pointers
and a lot of zeros.
Thus, you should try not to have large gaps in your ordinal exports,
or you will be wasting space in your DLL's export table.
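Here is a small sketch of where the 991 comes from. The function and the addresses are made up for illustration; the only rule taken from the text is that the table covers every ordinal from the lowest to the highest, used or not.

```python
def build_ordinal_table(exports):
    """exports maps ordinal -> function address (any nonzero int)."""
    base = min(exports)                  # lowest ordinal in use
    size = max(exports) - base + 1       # every slot in between is allocated
    table = [0] * size
    for ordinal, address in exports.items():
        table[ordinal - base] = address
    return base, table

# Two exports, ordinals 10 and 1000, with hypothetical addresses.
base, table = build_ordinal_table({10: 0x1000, 1000: 0x2000})
entries = len(table)
used = sum(1 for address in table if address != 0)
print(entries, used)
```

Two useful slots, 989 wasted ones.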
As I noted above, there is only one exported names table,
so you don't have to distinguish between resident and non-resident names.
The exported names table functions in the same manner as the exported
names table in 16-bit Windows, mapping names to ordinals.
Unlike the 16-bit named export tables, where order is irrelevant,
the exported names table in 32-bit Windows is kept sorted so that
a more efficient binary search can be used to locate functions.
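A minimal sketch of what a sorted names table buys you. The names and ordinals below are invented for the example, and the parallel-list layout is only a stand-in for the real on-disk structure.

```python
import bisect

# Hypothetical exports: name -> ordinal.
exports = {"LocalFree": 3, "GetVersion": 1, "LocalAlloc": 2, "LoadLibraryA": 7}

# Keep the names sorted, with a parallel list of ordinals.
names = sorted(exports)
ordinals = [exports[name] for name in names]

def find_ordinal(name):
    """Binary-search the sorted names; return the ordinal or None."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return ordinals[i]
    return None

found = find_ordinal("LocalAlloc")
missing = find_ordinal("NoSuchFunction")
```

With a sorted table, a lookup among N names needs roughly log2(N) comparisons instead of up to N.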
As with 16-bit Windows, every named function is assigned an ordinal.
If the programmer didn't assign one in the module definition file,
the linker will make one up for you, and as with 16-bit Windows,
the value the linker makes up can vary from build to build.
However, there is a major difference between the two models:
Recall that named exports in 16-bit Windows were discouraged (on
efficiency grounds), and as a result, every exported function was
explicitly assigned an ordinal, which was the preferred way of
linking to the function.
On the other hand, named exports in 32-bit Windows are the norm,
with no explicit ordinal assignment.
This means that the ordinal for a named export is not fixed.
For example, let's look at the ordinal that got assigned to the
kernel32 function LocalAlloc in the early years:
Now, some people are in the habit of reverse-engineering
import libraries, probably because they can't be bothered to download
the Platform SDK and get the real import libraries.
The problem with generating the import library manually is
that you can't tell whether the ordinal that was assigned to,
say, the LoadLibrary function was assigned by the
module definition file (and therefore will not change from build to build)
or was just auto-generated by the linker (in which case the ordinal
may change from build to build).
The import library generation tools could just play it safe and
use the named export, since that will work in both cases,
but for some reason,
they use the ordinal instead.
(This is probably a leftover from 16-bit Windows,
where ordinals were preferred over names, as we saw earlier.)
This unfortunate choice on the part of the import library generation tools
to live dangerously has created compatibility
problems for the DirectX team.
(I don't know why DirectX got hit by this harder than other teams.
Perhaps because game developers don't have the time to learn the
fine details of Win32; they just want to write their game.)
Since they used one of these tools, they ended up linking to
DirectX functions like DirectDrawCreate by ordinal
rather than by name, and then when the next version of DirectX came
out and the name was assigned a different ordinal by the linker,
their programs crashed pretty badly.
The DirectX team had to go back to the old DLLs, write down all the
ordinals that the linker randomly assigned, and explicitly assign
those ordinals in the module definition files so they wouldn't
move around in the future.
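The failure is easy to reproduce in a toy model. The numbering policy below (ordinals handed out in sorted-name order) is an assumption made purely for illustration, and AddedInV2 is an invented export; the real linker's choices are not something you should rely on, which is rather the point.

```python
def assign_ordinals(names, first=1):
    """Hypothetical linker policy: number the exports in sorted-name order."""
    return {name: first + i for i, name in enumerate(sorted(names))}

# Version 1 of the DLL, and version 2 with one new (invented) export.
v1 = assign_ordinals(["DirectDrawCreate", "DirectDrawEnumerateA"])
v2 = assign_ordinals(["DirectDrawCreate", "DirectDrawEnumerateA", "AddedInV2"])

# A program built against v1 that recorded the ordinal instead of the name.
recorded = v1["DirectDrawCreate"]

# What that same ordinal refers to once v2 is installed.
by_ordinal_v2 = {ordinal: name for name, ordinal in v2.items()}
actually_called = by_ordinal_v2[recorded]
print(recorded, actually_called)
```

The program asked for DirectDrawCreate and got something else entirely. Linking by name would have kept working.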
There are other reasons why you cannot generate an import library
from a DLL; I'll pick up those topics later when I talk about
import libraries in more detail.
Next time, forwarders.
In the discussions following
why Windows setup lays down a new boot sector,
some commenters suggested that Windows setup could detect
the presence of a non-Windows partition as a sign that
the machine onto which the operating system is being
installed belongs to a geek.
In that way, the typical consumer would be spared
from having to deal with
a confusing geeky dialog box that they
don't know how to answer.
The problem with this plan is that not everybody with
a non-Windows partition type is necessarily a geek.
Many OEM machines ship with a hard drive split
into two partitions,
one formatted for Windows and the second a
small non-Windows partition to be used during
system diagnostics and recovery.
The presence of this small non-Windows partition
is typically not well-known, and it comes into
play only when you boot from the manufacturer's
"system recovery CD".
The upshot of this is that if Windows setup took
the "anybody with a non-Windows partition must be a geek" approach,
it would end up tagging an awful lot of people as geeks
who really aren't.
Now, you might say,
"Well, only geeks install the operating system anyway.
Normal people typically buy a computer with the operating system
preinstalled.
The fact that they are running Windows setup proves that they're
a geek in the first place.
Therefore, Windows setup should be optimized for geeks."
Indeed, the premise of this argument—that only geeks
run Windows setup—is true, but only
once you've reached steady state.
In the months immediately following the release of a new
version of Windows,
everybody is installing the operating system,
geeks and non-geeks alike.
(There is also an influx of non-geek people installing Windows
Magazine reviewers are writing boatloads of articles
on the new operating system, and the initial setup experience
is the very first thing they notice about Windows.
It had better be smooth and painless.
Now that we've learned what the dllimport
declaration specifier does, what if you get it wrong?
If you forget to declare a function as dllimport,
then you're basically making the compiler act like a naive compiler
that doesn't understand dllimport.
When the linker goes to resolve the external reference for
the function, it will use the stub from the import library,
and everything will work as before.
You do miss out on the optimization that dllimport
enables, but the code will still run.
You're just running in naive mode.
(There are still some header files in the Platform SDK that
neglect to use the dllimport declaration specifier.
As a result, anybody who uses those header files to import functions
from the corresponding DLL will be operating in "naive mode".
Hopefully the people responsible for those header files will
recognize themselves in this parenthetical and fix the problem for
a future release of the Platform SDK.)
Now, what about the reverse problem?
What if you declare a function as dllimport when it
isn't actually imported?
The linker detects this since it sees an attempt to import a
__imp__FunctionName symbol and can't find one, though it
can find the normal FunctionName symbol.
When this happens, the linker raises a warning.
It recovers from this error by simply manufacturing a fake
__imp__FunctionName variable and initializing it with
the address of the FunctionName function.
In effect, you've imported the function from yourself.
Your code now goes through all the gyrations associated with
calling an imported function unnecessarily; it could have just
called FunctionName directly.
(There are cases where the linker can be a little smarter.
For example, if it sees a call [__imp__FunctionName], it can change
it to call FunctionName + nop.
The nop is necessary because the
call [__imp__FunctionName] instruction is six bytes
long, whereas call FunctionName is only five.
The extra nop gets everything back in sync.)
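For the curious, here is a sketch of the byte counts. The opcodes are the standard x86 encodings (FF 15 for an indirect call through an absolute address, E8 for a relative call, 90 for nop); the addresses are made up, and the helper names are mine.

```python
import struct

def call_indirect(slot_address):
    # FF 15 disp32: call dword ptr [slot_address]
    return b"\xFF\x15" + struct.pack("<I", slot_address)

def call_direct_plus_nop(site, target):
    # E8 rel32: the displacement is relative to the next instruction (site + 5).
    rel = (target - (site + 5)) & 0xFFFFFFFF
    # A trailing 90 (nop) pads the result to the original six bytes.
    return b"\xE8" + struct.pack("<I", rel) + b"\x90"

# Hypothetical addresses for the call site, the function, and the import slot.
before = call_indirect(0x00403000)
after = call_direct_plus_nop(site=0x00401000, target=0x00401100)
print(before.hex(), after.hex())
```

Same length before and after, so nothing downstream of the call has to move.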
Thus, in both cases where you mess up the dllimport
declaration specifier, the linker manages to recover from your mistake,
and your program does run fine, though the patching up did cost you
in code size and efficiency.
(All this discussion is for x86, by the way.
Other architectures have different quirks.)
Next time, more on import libraries, and exposing some "little
white lies" I've been telling.
When I wrote that
the symbolic name for the imported function table entry for
a function is __imp__FunctionName,
the statement was "true enough" for the discussion at hand,
but the reality is messier, and
the reason for the messy reality is function name decoration.
When a naive compiler generates a reference to a function,
the reference is
decorated in a manner consistent with its architecture, language,
and calling convention.
(Some time ago,
I discussed some of the decorations you'll see on x86 systems.)
For example, a naive call to the GetVersion function
results in the compiler generating code equivalent to
call _GetVersion@0 (on an x86 system; other architectures
decorate differently).
The import library therefore must have an entry for the
symbol _GetVersion@0 in order for the external reference
to be resolved.
Corresponding to the stub function whose real name is
_GetVersion@0 is the import table entry whose name
is __imp___GetVersion@0.
In general, the import table entry name is __imp_
prefixed to the decorated function name.
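A minimal sketch of the naming rule, covering only the x86 __stdcall case used in the GetVersion example. The helper names are mine, and other calling conventions and architectures decorate differently.

```python
def decorate_stdcall_x86(name, arg_bytes):
    """x86 __stdcall: a leading underscore, then '@' and the argument byte count."""
    return f"_{name}@{arg_bytes}"

def import_table_symbol(decorated_name):
    """The import table entry is __imp_ prefixed to the decorated name."""
    return "__imp_" + decorated_name

stub_symbol = decorate_stdcall_x86("GetVersion", 0)
table_symbol = import_table_symbol(stub_symbol)
print(stub_symbol, table_symbol)
```

Note the three consecutive underscores after "imp" in the second symbol: one from the prefix and two... no wait, one from the prefix's tail and one from the decoration would only be two. Count again: the prefix is "__imp_" and the decorated name starts with "_", but the prefix itself ends in a single underscore, so you get "__imp_" + "_GetVersion@0". Run the code if you don't trust my counting.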
The fact that names in import libraries are decorated means
that it is doubly crucial that you use the official import library
for the DLL you wish to use rather than trying to manufacture
one with an import library generation tool.
As we noted earlier, the tool won't know whether the ordinal
assigned to a named function was by design or merely coincidental.
But what's more, the tool won't know what decorations to apply
to the function (if the name was exported under an undecorated name).
Consequently, your attempts to call the function
will fail to link since the decorations will most likely not match up.
In that parenthetical, I mentioned exporting under undecorated names.
Doesn't that mean that you can also export with a decorated name?
Yes you can, but as I described earlier,
you probably shouldn't.
For as I noted there, if you export a decorated name, then that
name cannot be located via GetProcAddress unless you
also pass the decorated name to GetProcAddress.
But the decoration scheme changes from language to language,
from architecture to architecture, and even from compiler vendor
to compiler vendor,
so even if you manage to pass a decorated name to the
GetProcAddress function, you'll have to wrap it
inside a huge number of #ifdefs so you pass the
correct name for the x86 or ia64 or x64, accordingly,
as well as changing the name depending on whether you're using
the Microsoft C compiler, the Borland C compiler, the Watcom
C compiler, or maybe you're using one of the C++ compilers.
And woe unto you if you hope to call the function from Visual Basic
or C# or some other language that provides interop facilities.
Just export those names undecorated.
Your future customers will thank you.
(Exercise: Why is it okay for the C runtime DLLs to use
decorated exports?)
The whole point of dynamic link libraries (DLLs) is that
the linkage is dynamic.
Whereas statically-linked libraries are built into the
final product, a module that uses a dynamically-linked library
merely says, "I would like function X from Y.DLL, please."
This technique has advantages and disadvantages.
One advantage is more efficient use of storage,
since there is only one copy of Y.DLL in memory rather than a
separate copy bound into each module.
Another advantage is that an update to Y.DLL can be made
without having to re-compile all the programs that used it.
On the other hand, the ability to swap in functionality automatically
is also one of the main disadvantages of dynamic link libraries,
because one program can change a DLL that has cascade effects on
other clients of that DLL.
Anyway, let's start with how 16-bit Windows managed imports and
exports. After that, we'll see how things changed during the
switch to 32-bit Windows, and then we'll take a look at
the compiler-specific dllimport declaration specifier.
(I already discussed dllexport earlier.)
A 16-bit DLL has not one but three export tables.
(Things are actually more complicated than I describe them here,
but I'm going to skip over the nitpicky details
just to keep everyone's heads from exploding.)
The most important table is a sparse array of
functions, indexed by a 1-based integer (the "ordinal").
It is this function table that is the master list of all
exported functions.
If you request a function by ordinal, the ordinal is looked up
in this table.
The table is physically rather complicated due to the sparseness,
but logically, it looks like this:
The first column in the table is the ordinal of the function,
and the second column describes where the function can be found.
(Notice that there is no function 3 or 4 in this DLL.)
Things get interesting when you want to export a function by name.
The exported names table is a list of function names with their
associated ordinal equivalents.
For example, a section of
the exported names table for the 16-bit window manager (USER)
went like this:
Wait, did I say the exported names table?
I'm sorry, that was an oversimplification.
There are actually two exported names tables,
the resident names table and the non-resident names table.
As their names suggest, the names in the resident names table
remain in memory as long as the DLL is loaded, whereas the names
in the non-resident names table are loaded into memory only
when somebody calls GetProcAddress (or one of its
equivalents).
This distinction is
a reflection of the extremely tight memory constraints that
Windows had to run within back in those days.
For example, the window manager (USER) has over six hundred
exported functions; if all the exported names were kept resident,
that would be over ten kilobytes of data.
You'd be wasting four percent of the memory of your 256KB machine
remembering things you don't need most of the time.
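The percentage is a one-line calculation. This sketch uses the round figures from the text (ten kilobytes, six hundred names); since both are stated as "over", the results are approximate.

```python
resident_bytes = 10 * 1024      # "over ten kilobytes" of exported names
machine_bytes = 256 * 1024      # total memory of a 256KB machine
name_count = 600                # "over six hundred" exported functions

share = resident_bytes / machine_bytes
bytes_per_name = resident_bytes / name_count
print(f"{share:.1%} of memory, about {bytes_per_name:.0f} bytes per name")
```

Roughly four percent of the machine, at around seventeen bytes per entry.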
The large size of the table for exported function names meant that
only functions that are passed to GetProcAddress with
high frequency deserve to be placed in the resident names table.
For most DLLs, no function falls into this category, and the
resident names table is empty.
(Head-exploding details deleted for sanity's sake.)
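Here is a toy model of the two names tables, just to show where the cost lands. The class, the function names, and the ordinals are all invented; a counter stands in for the trip to the disk.

```python
class ToyDll16:
    """Toy 16-bit DLL with a resident and a non-resident names table."""

    def __init__(self, resident, non_resident):
        self.resident = dict(resident)       # stays in memory while loaded
        self._on_disk = dict(non_resident)   # read from disk only on demand
        self.disk_reads = 0

    def ordinal_for(self, name):
        # Cheap path: the name is in the resident table.
        if name in self.resident:
            return self.resident[name]
        # Expensive path: go to disk for the non-resident table.
        self.disk_reads += 1
        return self._on_disk.get(name)

dll = ToyDll16(resident={"HotFunction": 5}, non_resident={"RarelyUsed": 12})

hot = dll.ordinal_for("HotFunction")
reads_after_hot = dll.disk_reads
rare = dll.ordinal_for("RarelyUsed")
reads_after_rare = dll.disk_reads
```

Only the second lookup pays for the disk access, which is the whole argument for keeping rarely requested names out of the resident table.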
Since obtaining a function by name is so expensive
(requiring the non-resident names table to be loaded from disk
so it can be searched),
all functions exported by operating system DLLs are exported
both by name and by ordinal,
with the ordinal taking precedence in the import library table.
Obtaining a procedure address by ordinal avoids the name tables
entirely.
Notice that every named function has a corresponding ordinal.
If you do not assign an ordinal to your named function in your
module definition file, the linker will make one up for you.
(However, the value that it makes up need not be the same from
build to build.)
This situation did not occur in practice, for as we noted above,
everybody explicitly assigned an ordinal to their exports and
put that ordinal in the import library in order to avoid the
huge cost of a name-based function lookup.
That's a quick look at how functions were exported in 16-bit Windows.
Next time, we'll look at how they are imported.
Back in 2003,
M&M offered a chance to win $5000 every summer for life,
but if you looked more carefully, the offer actually read,
"Win $5000 Every Summer For Life*",
and the asterisk at the bottom read, "Maximum 50 years".
That fine print was filled with strange stuff.
3. Sponsor responsible only for delivery of prize; not responsible
for prize utility, quality or otherwise.
10. Sponsor: M&M/Mars, High Street, Hackettstown, NJ 07840.
One of the prizes was approximately ten pounds of M&Ms.
The logical conclusion: "M&M is not responsible for the quality of
M&Ms."
I was reminded of this by the recent flap over
how hard it was for one person to cancel his AOL account.
There was an AOL contest some years back that offered a
chance to "Win free AOL dial-up service for life!",
and it too limited your life to 50 years in the fine print.
That one would actually be fun, though.
Imagine, in the year 2052, AOL
will still have to keep one modem up and running just for this contest
winner.
An import library resolves symbols for imported functions,
but it isn't consulted until the link phase.
Let's consider a naive implementation where the compiler
is blissfully unaware of the existence of imported functions.
In the 16-bit world, this caused no difficulty at all.
The compiler generated a far call instruction and left an
external record in the object file indicating that the
address of the function should be filled in by the linker.
At that time, the linker realizes that the external symbol
corresponds to an imported function, so it takes all the
calls to that function,
threads them together,
and creates an import record in the module's
import table.
At load time, those call entries are fixed up and everybody
is happy.
Let's look at how a naive 32-bit compiler would deal with the
same situation.
The compiler would generate a normal call instruction,
leaving the linker to resolve the external.
The linker then sees that the external is really an imported
function, and, uh-oh, the direct call needs to be converted
to an indirect call.
But the linker can't rewrite the code generated by the compiler.
What's a linker to do?
The solution is to insert another level of indirection.
(Warning: The information below is not literally true,
but it's "true enough".
We'll dig into the finer details later in this series.)
For each exported function in an import library,
two external symbols are generated.
The first is for the entry in the imported functions table,
which takes the name __imp__FunctionName.
Of course, the naive compiler doesn't know about this fancy
__imp__ prefix.
It merely generates the code for the instruction
call FunctionName and expects the linker to produce
a function by that name.
That's what the second symbol is for.
The second symbol is the longed-for FunctionName,
a one-line function that consists merely of a
jmp [__imp__FunctionName] instruction.
This tiny stub of a function satisfies the external reference
and in turn generates an external reference to
__imp__FunctionName,
which is resolved by the same import library to an entry
in the imported function table.
When the module is loaded, then, the import is resolved to
a function pointer and stored in __imp__FunctionName,
and when the compiler-generated code calls the FunctionName
function, it calls the stub which trampolines (via the indirect jump)
to the real function entry point in the destination DLL.
Note that with a naive compiler, if your code tries to take the address
of an imported function, it gets the address of the
FunctionName stub, since a naive compiler simply
asks for the address of the
FunctionName symbol, unaware that it's really coming
from an import library.
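The stub arrangement can be mimicked in a few lines. This is only a model: a dictionary plays the imported function table, a Python function plays the one-instruction stub, and load_module is a made-up stand-in for the loader.

```python
def real_function():
    # Stands in for the function's real entry point inside the DLL.
    return "running inside the DLL"

# The imported function table: one slot, empty until load time.
import_table = {"__imp__FunctionName": None}

def FunctionName():
    # The stub: the moral equivalent of jmp [__imp__FunctionName].
    return import_table["__imp__FunctionName"]()

def load_module():
    # Hypothetical loader step: resolve the import and fill in the slot.
    import_table["__imp__FunctionName"] = real_function

load_module()
result = FunctionName()          # naive call: goes through the stub

# A naive compiler taking the function's address gets the stub, not the target.
address_taken = FunctionName
same_as_real = address_taken is real_function
```

Calling through the stub reaches the right code, but the address you get back is the stub's own, which is exactly the quirk described above.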
Next time, we'll look at the dllimport declaration specifier
and how a less naive compiler generates code for an imported function.