Ahem. It's spelled w-e-i-r-d.
And on all of the MSN properties, like local city guides, you can see MSN's new motto: "More Useful Everyday".
Um, another spelling error. That should read "More Useful Every Day". When used as a single word, "everyday" is an adjective, not an adverb. Like "An everyday event".
(By the way, in case people didn't get it: I'm only talking in the context of calling conventions you're likely to encounter when doing Windows programming or which are used by Microsoft compilers. I do not intend to cover calling conventions for other operating systems or that are specific to a particular language or compiler vendor.)
Remember: If a calling convention is used for a C++ member function, then there is a hidden "this" parameter that is the implicit first parameter to the function.
The 32-bit x86 calling conventions all preserve the EDI, ESI, EBP, and EBX registers, using the EDX:EAX pair for return values.
The same constraints apply to the 32-bit world as in the 16-bit world. The parameters are pushed from right to left (so that the first parameter is nearest to top-of-stack), and the caller cleans the parameters. Function names are decorated by a leading underscore.
This is the calling convention used for Win32, with exceptions for variadic functions (which necessarily use __cdecl) and a very few functions that use __fastcall. Parameters are pushed from right to left [corrected 10:18am] and the callee cleans the stack. Function names are decorated by a leading underscore and a trailing @-sign followed by the number of bytes of parameters taken by the function.
The first two parameters are passed in ECX and EDX, with the remainder passed on the stack as in __stdcall. Again, the callee cleans the stack. Function names are decorated by a leading @-sign and a trailing @-sign followed by the number of bytes of parameters taken by the function (including the register parameters).
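The decoration rules for the three C conventions described so far are mechanical enough to write down. Here is a minimal sketch in Python; the helper name `decorate` and the sample function name `Example` are made up for illustration, and the sketch covers only the plain C decorations, not the C++ scheme.

```python
def decorate(name, convention, param_bytes=0):
    """Return the decorated name for a C function under a 32-bit x86 convention.

    `decorate` is a hypothetical helper for illustration only.
    """
    if convention == "cdecl":
        # Leading underscore, nothing else.
        return "_" + name
    if convention == "stdcall":
        # Leading underscore, trailing @ plus the byte count of the parameters.
        return "_{}@{}".format(name, param_bytes)
    if convention == "fastcall":
        # Leading @, trailing @ plus the byte count (register parameters included).
        return "@{}@{}".format(name, param_bytes)
    raise ValueError("unknown convention: " + convention)


# A hypothetical function taking three 4-byte parameters (12 bytes in total).
for convention in ("cdecl", "stdcall", "fastcall"):
    print(convention, decorate("Example", convention, 12))
```

Note that the byte count is ignored for __cdecl, which is what allows variadic functions to use it.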
The first parameter (which is the "this" parameter) is passed in ECX, with the remainder passed on the stack as in __stdcall. Once again, the callee cleans the stack. Function names are decorated by the C++ compiler in an extraordinarily complicated mechanism that encodes the types of each of the parameters, among other things. This is necessary because C++ permits function overloading, so a complex decoration scheme must be used so that the various overloads have different decorated names.
There are some nice diagrams on MSDN illustrating some of these calling conventions.
Remember that a calling convention is a contract between the caller and the callee. For those of you crazy enough to write in assembly language, this means that your callback functions need to preserve the registers mandated by the calling convention because the caller (the operating system) is relying on it. If you corrupt, say, the EBX register across a call, don't be surprised when things fall apart on you. More on this in a future entry.
I can make out perhaps a fifth of what's going on. If I really
concentrate (and they speak slowly enough), it might reach half.
But after the first two stories or so, my brain explodes and I have
to take a rest.
Embarrassingly, it took me weeks to figure out what they were saying
to introduce each show! "Programmet som förklarar nyheterna
på ett enklare sätt." I got stuck on the first word;
even today it sounds like the guy is saying what seems to be
the nonsense word "pörjammet".
The two types of stories I like most on Swedish radio are
(1) where they talk about the United States,
since it's enlightening to learn how others see us, and
(2) when they talk about slimy politicians.
The Swedes seem all upset that their politicians are selfish
money-grubbing sleazeballs. Hey, you idealistic Swedes,
they're politicians. Being selfish money-grubbing
sleazeballs is their job!
Exhibit A: Politicians are paid for sitting on committees
but don't actually show up for committee meetings.
When confronted, one politician explained, "I didn't realize
I was being paid." (Translation: "I don't do things unless
I get paid to do them.") Another used the excuse,
"I didn't know I was supposed to attend the meetings."
(Translation: "Sure, go ahead, pay me extra money, I'll
gladly take it, but if you expect me to do work, you have
to tell me!")
Exhibit B: Members of the Riksdag are permitted a free
rail pass to travel between their constituency and Stockholm.
Half of the MPs who avail themselves of this perk choose
the most expensive railway ticket, the so-called
Guld" (Annual Gold Card), which gets you a complimentary
three-course meal among other top-class amenities.
All these Gold Cards cost
the Swedish taxpayer over a million Kronor per year,
compared to the cost of buying them all coach tickets.
Meanwhile, here in the United States, we don't even bat an eye when
one representative sneaks a US$225,000 renovation of his
home town's swimming pool into the federal budget,
and another secures a US$50 million grant to build an indoor
rain forest in Iowa.
Curiously, it is only the 8086 and x86 platforms that have
multiple calling conventions. All the others have only one!
Now we're going deep into trivia that absolutely nobody remembers
or even cares about: The 32-bit calling conventions you don't
see any more.
All of the processors listed here are RISC-style,
which means there are lots of registers, none of which
have any particular meaning. Well, aside from the zero
register which is hard-wired to zero.
(It turns out zero is a very handy number to have readily available.)
Any meanings attached to the registers are those
imposed by the calling convention.
As a throwback to the processors of old,
the "call" instruction stores the return address in a
register instead of pushing it onto the stack.
A good thing, too, since the processor doesn't officially
know about a "stack", it being a construction of the calling convention.
As always, registers or stack space used to pass parameters
may be used as scratch by the called function, as can the
return value register.
You may notice that all of the RISC calling conventions
are basically the same. Once again, evidence that the 8086/x86
is the weirdo. A wildly popular weirdo, mind you.
The Alpha AXP ("AXP" being yet another of those
faux-acronyms that officially doesn't stand for anything)
has 32 integer registers,
one of which is hard-wired to zero.
By convention, one of the registers is the "stack pointer", one is the
"return address" register; and two others
have special meanings unrelated to parameter passing.
The first six parameters are passed in registers, with the
remaining parameters on the stack. If the function is
variadic, the parameters can be spilled onto the stack
so they can be accessed as an array.
Seven other registers are preserved across calls,
one is the return value,
and the remaining thirteen are scratch.
1 zero register +
1 stack pointer +
1 return address +
2 special +
6 parameters +
7 preserved +
1 return value +
13 scratch =
32 total integer registers.
Function names on the Alpha AXP are completely undecorated.
The first four parameters are passed in a0, a1, a2 and
a3; the remainder are spilled onto the stack.
What's more, there are four "dead spaces" on the stack
where the four register parameters "would have been"
if they had been passed on the stack. These are for
use by the callee to spill the register parameters
back onto the stack if desired. (Particularly handy
for variadic functions.)
Function names on the MIPS are completely undecorated.
The first eight parameters
are passed in registers (r3 through r10), and the return address
is managed manually.
I forget what happens to parameters nine and up...
Function names on the PowerPC are decorated by prepending two periods.
I haven't had personal experience with the MIPS or PPC
processors, so my discussion of those processors may be
a tad off, but the basic idea I think is sound.
LSSU's approach, so here's my
list of words I'd like to ban.
Thank goodness this has faded, but there are still some citations out
there. Please don't use it to describe my work. It makes me sound
like a dog in a show. (No offense to dogs in shows!)
Everybody is "the leading this" or "the leading that".
Here's my rule: If you say you're the leading XYZ or
(even dodgier) "among the leading XYZs", then you
have to list at least three companies that are not
leaders in the XYZ market. Because if nobody is following you,
then you're not really "leading", now, are you?
And the word I most would like to banish from the English language:
This has taken over Microsoft-speak in the past year or so
and it drives me batty. "What are our key asks here?",
you might hear in a meeting. Language tip:
The thing you are asking for is called a "request".
Plus, of course, the thing that is an "ask" is usually more of
a "demand" or "requirement".
But those are such unfriendly words, aren't they?
Why not use a warm, fuzzy word like "ask" to take the edge off?
Answer: Because it's not a word.
I have yet to find any dictionary which sanctions this usage.
Indeed, the only definition for "ask" as a noun is
A water newt [Scot. & North of Eng.], and that was from
Answer 2: Because it's passive-aggressive.
These "asks" are really "demands".
So don't guilt-trip me with "Oh, you didn't meet our ask.
We had to cut half our features. But that's okay. We'll just
suffer quietly, you go do your thing, don't mind us."
Here's an analogy:
Suppose somebody tells you,
"I am going to count to 100,
and you need to give continuous estimates as to when I will be done."
They start out, "one, two, three...".
You notice they are going at about one number per second,
so you estimate 100 seconds.
Uh-oh, now they're slowing down.
"Four... ... ... five... ... ..."
Now you have to change your estimate to maybe 200 seconds.
Now they speed up: "six-seven-eight-nine".
You have to update your estimate again.
Now somebody who is listening only to your estimates and
not to the person counting thinks you are off your rocker.
Your estimate went from 100 seconds to 200 seconds to 50 seconds;
what's your problem? Why can't you give a good estimate?
File copying is the same thing.
The shell knows
how many files and how many bytes are going to be copied,
but it doesn't know how fast the hard drive or network
or internet is going to be, so it just has to guess.
If the copy throughput changes, the estimate needs
to change to take the new transfer rate into account.
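To put numbers on the counting analogy, here is a minimal sketch in Python. The function name `seconds_remaining` is made up, and the sketch assumes the estimator is simply handed the most recently observed rate; how the shell actually measures throughput is not something I'm describing here.

```python
def seconds_remaining(total_units, units_done, recent_rate):
    """Estimate the time left, given the most recently observed rate.

    `recent_rate` is in units per second (numbers counted, bytes copied, ...).
    Returns None when there is no usable rate to extrapolate from.
    """
    if recent_rate <= 0:
        return None
    return (total_units - units_done) / recent_rate


# The counting analogy: (numbers counted so far, current numbers per second).
for done, rate in [(3, 1.0), (5, 0.5), (9, 2.0)]:
    print(done, seconds_remaining(100, done, rate))
```

The three estimates come out to 97, 190, and 45.5 seconds. Each one is perfectly reasonable given what was known at the time; it's the rate that keeps changing, not the arithmetic.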
The 8086 was a 16-bit version of the even older 8080 processor,
which had seven 8-bit registers, named
A, B, C, D, E, H, and L.
The registers could be used in pairs to produce three
16-bit pseudo-registers, BC, DE, and HL.
What's more, you could put a 16-bit address into the HL register
and use the pseudo-register "M" to dereference it.
So, for example, you could write "MOV B, M" and this meant to
load the 8-bit value pointed to by the HL register pair into the B register.
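If you've never seen an 8080, the register pairing and the "M" pseudo-register can be modeled in a few lines. This is only a toy model in Python, not an emulator; the names `regs`, `memory`, `pair` and `mov_from_m` are invented for the sketch.

```python
# Toy model of the 8080 register file and its 64 KB address space.
memory = bytearray(65536)
regs = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 0, "H": 0, "L": 0}


def pair(high, low):
    """Combine two 8-bit registers into one 16-bit value (high byte first)."""
    return (regs[high] << 8) | regs[low]


def mov_from_m(dest):
    """Model "MOV dest, M": load the byte that HL points to into dest."""
    regs[dest] = memory[pair("H", "L")]


# Put a byte at address 0x1234, point HL at it, then do "MOV B, M".
memory[0x1234] = 0x5A
regs["H"], regs["L"] = 0x12, 0x34
mov_from_m("B")
print(hex(pair("H", "L")), hex(regs["B"]))
```

H holds the high byte of the address and L the low byte, which is where the names come from.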
The 8086 took these 8080 registers and mapped them sort of like this:
A  -> AL
HL -> BX (H -> BH, L -> BL)
BC -> CX (B -> CH, C -> CL)
DE -> DX (D -> DH, E -> DL)
This is why the 8086 instruction set can only dereference
through the [BX] register and not the [CX] or [DX] registers:
On the original 8080, you could not dereference through [BC] or [DE],
only through M=[HL].
This much so far is pretty official. The instruction set
for the 8086 was chosen to be upwardly-compatible with the 8080,
so as to facilitate machine translation of existing 8-bit code
to this new 16-bit processor.
Even the MS-DOS function calls were designed so as to
facilitate machine translation.
What about the SI and DI registers? I suspect they were
inspired by the IX and IY registers available on the Z-80,
a competitor to the 8080 which took the 8080 instruction set
and extended it with more registers. The Z-80 allowed
you to dereference through [IX] and [IY], so the 8086 lets
you dereference through [SI] and [DI].
And what about the BP register? I suspect that was invented
on the fly in order to facilitate stack-based parameter
passing. Notice that the BP register is the only 8086 register
that defaults to the SS segment register and which can be used
to access memory directly.
Why not add even more registers, like today's processors with
their palette of 16 or even 128 registers? Why limit the 8086
to only eight registers (AX, BX, CX, DX, SI, DI, BP, SP)? Well, that was then
and this is now. At that time, processors did not have lots of
registers. The 68000 had a whopping sixteen registers, but if
you look more closely, only half of them were general purpose
arithmetic registers; the other half were used only for accessing memory.
In the 16-bit world, part of the calling convention was fixed
by the instruction set: The BP register defaults to the SS selector,
whereas the other registers default to the DS selector.
So the BP register was necessarily the register used for
accessing stack-based parameters.
The registers for return values were also chosen automatically
by the instruction set.
The AX register acted as the accumulator and therefore was the
obvious choice for passing the return value.
The 8086 instruction set also has special instructions
which treat the DX:AX pair as a single 32-bit value,
so that was the obvious choice to be the register pair
used to return 32-bit values.
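In case the DX:AX notation is unfamiliar: DX holds the upper 16 bits and AX the lower 16 bits. A minimal sketch in Python, with made-up helper names `split_dx_ax` and `join_dx_ax`:

```python
def split_dx_ax(value):
    """Split a 32-bit value into (DX, AX): the high and low 16-bit halves."""
    value &= 0xFFFFFFFF
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def join_dx_ax(dx, ax):
    """Reassemble the 32-bit value that the caller reads back from DX:AX."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)


dx, ax = split_dx_ax(0x12345678)
print(hex(dx), hex(ax), hex(join_dx_ax(dx, ax)))
```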
That left SI, DI, BX and CX.
(Terminology note: Registers that do not need to be preserved
across a function call are often called "scratch".)
When deciding which registers should be preserved by a calling
convention, you need to balance the needs of the caller against
the needs of the callee. The caller would prefer that all
registers be preserved, since that removes the need for the caller
to worry about saving/restoring the value across a call.
The callee would prefer that no registers be preserved, since
that removes the need to save the value on entry and restore it on exit.
If you require too few registers to be preserved, then callers
become filled with register save/restore code. But if you
require too many registers to be preserved, then callees become
obligated to save and restore registers that the caller might
not have really cared about. This is particularly important for
leaf functions (functions that do not call any other functions).
The non-uniformity of the x86 instruction set was also a contributing
factor. The CX register could not be used to access memory, so you
wanted to have some register other than CX be scratch, so that a leaf
function can at least access memory without having to preserve any
registers. So BX was chosen to be scratch, leaving SI and DI as preserved.
So here's the rundown of 16-bit calling conventions:
In summary: Caller cleans the stack, parameters pushed right to left.
Function name decoration consists of a leading underscore.
My guess is that the leading underscore prevented a function
name from accidentally colliding with an assembler reserved word.
(Imagine, for example, if you had a function called "call".)
Nearly all Win16 functions are exported as Pascal calling convention.
The callee-clean convention saves three bytes at each call point,
with a fixed overhead of two bytes per function. So if a function
is called ten times, you save 3*10 = 30 bytes for the call points,
and pay 2 bytes in the function itself, for a net savings of 28 bytes.
It was also fractionally faster. On Win16, saving a few hundred bytes
and a few cycles was a big deal.
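The byte accounting generalizes in the obvious way. Here is a small sketch in Python using only the two figures quoted above (three bytes saved per call point, two bytes paid per function); the function name `net_bytes_saved` is made up.

```python
BYTES_SAVED_PER_CALL = 3     # from the text: saved at each call point
BYTES_COST_PER_FUNCTION = 2  # from the text: fixed overhead in the callee


def net_bytes_saved(call_sites):
    """Net code-size savings of callee-clean for a function with N call points."""
    return BYTES_SAVED_PER_CALL * call_sites - BYTES_COST_PER_FUNCTION


for calls in (1, 10, 100):
    print(calls, net_bytes_saved(calls))
```

With these figures the convention comes out ahead as soon as a function has a single call point, and the gap widens with every additional one.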
Consequently, __fastcall was typically faster only for short leaf functions,
and even then it might not be.
Okay, those are the 16-bit calling conventions I remember.
Part 2 will discuss 32-bit calling conventions, if I ever get around
to writing it.
Even if you figure out which DLL the return address belongs to,
that doesn't mean that it is actually the DLL that called you.
A common trick is to search through a "trusted" DLL for some code
bytes that coincidentally match ones you (the attacker) want to execute.
This can be something as simple as a "retd" instruction, of which
there are plenty. The attacker then builds a stack frame that
looks like this, for, say, a function that takes two parameters.
trusted_retd
hacked parameter 1
hacked parameter 2
hacker_code_addr
After building this stack frame, the attacker then jumps to
the start of the function being attacked.
The function being attacked looks
at the return address and sees trusted_retd,
which resides in a trusted DLL. It then foolishly trusts the
caller and allows some unsafe operation to occur, using
hacked parameters 1 and 2. The function being attacked then
does a "retd 8" to return and clean the parameters.
This transfers control to the trusted_retd,
which performs a simple retd, which now gives
control to the hacker_code_addr, and the hacker
can use the result to continue his nefarious work.
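If the sequence of returns is hard to follow, here is a toy model in Python that treats the stack as a list (top of stack at the end) and addresses as plain strings. Nothing here touches real memory; the functions `victim` and `trusted_retd` and the set `TRUSTED_ADDRESSES` are stand-ins invented for the sketch.

```python
TRUSTED_RETD = "trusted_retd"      # a bare return that happens to live in a trusted DLL
HACKER_CODE = "hacker_code_addr"   # where the attacker actually wants control to go
TRUSTED_ADDRESSES = {TRUSTED_RETD}


def victim(stack):
    """The function under attack: it "verifies" its caller via the return address."""
    return_address = stack.pop()   # what it believes is its caller
    stack.pop()                    # hacked parameter 1
    stack.pop()                    # hacked parameter 2 ("retd 8" cleans both)
    caller_looks_trusted = return_address in TRUSTED_ADDRESSES
    return caller_looks_trusted, return_address


def trusted_retd(stack):
    """The borrowed return instruction: pop the next address and go there."""
    return stack.pop()


# The attacker's hand-built frame, listed bottom first so the top is at the end.
stack = [HACKER_CODE, "hacked parameter 2", "hacked parameter 1", TRUSTED_RETD]
fooled, first_hop = victim(stack)
final_destination = trusted_retd(stack)
print(fooled, first_hop, final_destination)
```

The check passes even though no code in the trusted DLL ever decided to make the call, and control ends up at the attacker's address two returns later.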
This is why you should be concerned if somebody says,
"This code verifies that its caller is trusted..."
How do they know who the caller really is?