<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>FreiK's WebLog : X64 ABI Info</title><link>http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx</link><description>Tags: X64 ABI Info</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>AMD64 unwind info gotchas</title><link>http://blogs.msdn.com/freik/archive/2007/03/12/amd64-unwind-info-gotchas.aspx</link><pubDate>Mon, 12 Mar 2007 20:34:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1866832</guid><dc:creator>Kevin Frei</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/freik/comments/1866832.aspx</comments><wfw:commentRss>http://blogs.msdn.com/freik/commentrss.aspx?PostID=1866832</wfw:commentRss><wfw:comment>http://blogs.msdn.com/freik/rsscomments.aspx?PostID=1866832</wfw:comment><description>&lt;P&gt;I had a brief e-mail&amp;nbsp;exchange with one of the devs on&amp;nbsp;the optimizer team about a checkin he put up for review.&amp;nbsp; He modified the compiler so that it only aligns the stack for functions that call other functions; a function that doesn't call anything is the typical definition, in compiler lingo, of a 'leaf function'.&amp;nbsp; My first response was "don't do that - you may still have to align for reasons A, B, &amp;amp; C".&amp;nbsp; To which he responded with a quote from the ABI doc that explicitly says you only have to align the stack if you're calling a function.&amp;nbsp; So I started reading.&amp;nbsp; Turns out, he's right (and so is the doc), but there are some really nasty gotchas involved.&amp;nbsp; When we initially did this stuff, we just lived with the scenario where you might be wasting a bit more stack space...&lt;/P&gt;
&lt;P&gt;The theme of the problems revolves around restrictions on encoding information in the unwind data. If you are saving XMM registers and have an unaligned stack pointer, you must use the SAVE_XMM128_FAR opcode because the SAVE_XMM128 opcode will only accept offsets in multiples of 16 bytes.&amp;nbsp; Or I guess you could use MOVUPS instead of MOVAPS to save &amp;amp; restore your XMM register, but I wouldn't recommend that on current hardware. Similarly, you have to use the ALLOC_LARGE descriptor for stack allocation of sizes that aren’t [n *&amp;nbsp;8 + 8].&amp;nbsp;ALLOC_LARGE actually has 2 variants - one that multiplies by 8, and the other that doesn't.&amp;nbsp; You can use the latter to allocate arbitrary amounts of data from the stack.&amp;nbsp; But then if you want to use a frame pointer, you're going to be in a pretty weird spot, because it will have to be unaligned, as well. Unwind data dictates that you can only have a frame pointer that equals RSP + 16 * [1-15].&lt;/P&gt;
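&lt;P&gt;To make those encoding limits a bit more concrete, here's a rough Python sketch of how a code generator might choose between the opcodes. It assumes the published unwind-code layout, where ALLOC_SMALL covers 8 to 128 bytes in steps of 8 and the scaled forms of ALLOC_LARGE and SAVE_XMM128 keep their operand in a single 16-bit slot. The function names are invented for illustration; this is a sketch, not what the compiler actually does.&lt;/P&gt;
&lt;PRE&gt;
```python
def pick_alloc_opcode(size):
    """Pick an unwind opcode able to describe a stack allocation of `size` bytes."""
    # ALLOC_SMALL encodes n * 8 + 8 with a 4-bit n: 8, 16, ... 128 bytes.
    if size % 8 == 0 and size in range(8, 129):
        return "UWOP_ALLOC_SMALL"
    # The scaled ALLOC_LARGE variant stores size / 8 in one 16-bit slot (assumption).
    if size % 8 == 0 and size // 8 in range(1, 65536):
        return "UWOP_ALLOC_LARGE (scaled by 8)"
    # Anything else needs the variant that stores the raw byte count.
    return "UWOP_ALLOC_LARGE (unscaled)"


def pick_xmm_save_opcode(offset):
    """Pick an unwind opcode able to describe an XMM save at RSP + `offset`."""
    # SAVE_XMM128 stores offset / 16, so only multiples of 16 are expressible.
    if offset % 16 == 0 and offset // 16 in range(0, 65536):
        return "UWOP_SAVE_XMM128"
    return "UWOP_SAVE_XMM128_FAR"
```
&lt;/PRE&gt;
&lt;P&gt;With these rules, a 40 byte allocation still fits ALLOC_SMALL, a 44 byte allocation is forced into the unscaled ALLOC_LARGE form, and an XMM save at offset 40 has to use SAVE_XMM128_FAR.&lt;/P&gt;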
&lt;P&gt;I'll probably come back to this article &amp;amp; add some nice hyperlinks to ABI details in the doc itself, but it's been a while since I blogged anything useful, so I figured I'd just get this out there quickly.&lt;/P&gt;</description><category domain="http://blogs.msdn.com/freik/archive/tags/Developer+Info/default.aspx">Developer Info</category><category domain="http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx">X64 ABI Info</category></item><item><title>Speaking in 'public'...</title><link>http://blogs.msdn.com/freik/archive/2006/10/17/speaking-in-public.aspx</link><pubDate>Wed, 18 Oct 2006 06:35:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:837679</guid><dc:creator>Kevin Frei</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/freik/comments/837679.aspx</comments><wfw:commentRss>http://blogs.msdn.com/freik/commentrss.aspx?PostID=837679</wfw:commentRss><wfw:comment>http://blogs.msdn.com/freik/rsscomments.aspx?PostID=837679</wfw:comment><description>&lt;P&gt;I'll be hanging out with the &lt;EM&gt;cool &lt;/EM&gt;kids at the &lt;A class="" title="Northwest C++ Users Group" href="http://www.nwcpp.org/" target=_blank mce_href="http://www.nwcpp.org"&gt;Northwest C++ Users Group&lt;/A&gt;&amp;nbsp;tomorrow night.&amp;nbsp; If you're in the area, and want to heckle me, swing by.&amp;nbsp; We'll be in building 40 at 6:30 PM.&amp;nbsp; My talk starts at 7:00 PM.&amp;nbsp; I'm talking about the actual runtime cost of exception handling for x64 and x86 on Windows.&lt;/P&gt;
&lt;P&gt;Update:&lt;/P&gt;
&lt;P&gt;Check out &lt;A href="http://www.nwcpp.org/Meetings/2006/10.html"&gt;http://www.nwcpp.org/Meetings/2006/10.html&lt;/A&gt;&amp;nbsp;for the slides, and a video of the talk, if you're interested.&amp;nbsp; You can see my thinning hair in back, my marvelous posture, and my always entertaining speaking style.&lt;/P&gt;</description><category domain="http://blogs.msdn.com/freik/archive/tags/Developer+Info/default.aspx">Developer Info</category><category domain="http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx">X64 ABI Info</category></item><item><title>What does "Hot Patchability" mean and what is it for?</title><link>http://blogs.msdn.com/freik/archive/2006/03/07/x64-Hotpatchability.aspx</link><pubDate>Tue, 07 Mar 2006 23:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:545603</guid><dc:creator>Kevin Frei</dc:creator><slash:comments>6</slash:comments><comments>http://blogs.msdn.com/freik/comments/545603.aspx</comments><wfw:commentRss>http://blogs.msdn.com/freik/commentrss.aspx?PostID=545603</wfw:commentRss><wfw:comment>http://blogs.msdn.com/freik/rsscomments.aspx?PostID=545603</wfw:comment><description>&lt;p&gt;I got a question on my earlier ABI post about Hot Patchability, so I thought I'd
go into excruciating detail on that one, since it's not quite as complicated as
exception handling.&lt;/p&gt;
&lt;h2&gt;What the heck does Hot Patchable mean?&lt;/h2&gt;
&lt;p&gt;Hot patchable means that, primarily, you're able to take a running application and,
given sufficient privileges, atomically change all calls to function A so that they go
to function B instead. At a high level, this allows you to patch a running process without
requiring that the process be stopped [2 steps better than a reboot!]. This is really
important in the "Server Up-Time" scenarios that are becoming more and more common.
It allows a patch to be deployed without even stopping a process [for more than
the length of a standard context switch, anyway].&lt;/p&gt;
&lt;h2&gt;Why should I care?&lt;/h2&gt;
&lt;p&gt;Maybe you won't, but if you're deploying server code, you probably don't want to
have to terminate the process to deploy a security patch, right? Microsoft's customers
really don't like it when we make 'em restart their server processes.&lt;/p&gt;
&lt;h2&gt;Okay, I care, how does it work?&lt;/h2&gt;
&lt;p&gt;First, remember that I'm a compiler guy, not a debugger guy, or a kernel guy, or
anything else, but here's my understanding: First, you build your patched function.
This is a &lt;em&gt;non-trivial&lt;/em&gt; amount of work. It must have a compatible function
signature with the function it's replacing. It also needs to have special code to
access any globals that are needed, since it's generally loaded as a separate DLL
[though I imagine a tricky debugger guy could do some kind of nutty code injection
where that isn't necessary]. Once you have the code authored, you have to deploy
the patch. This is where the ABI restrictions come in.&lt;/p&gt;
&lt;p&gt;First, all functions must start with a 2 byte (or larger) instruction - if you start
with a push, stick a size prefix on it. Second, all functions must be preceded by
6 bytes of padding. Finally, any image that needs to be hotpatchable should also
have some amount of 'scratch space' within 2GB of its image location. Why, you
ask? Well, here's how hot-patching actually works:&lt;/p&gt;
&lt;p&gt;Pause the process and load your hot patch dll into the address space [again, I don't
know all the mechanics for this, but I know it's not &lt;em&gt;too&lt;/em&gt; difficult]. Next,
write to that 'scratch space' the address of your hot patch function. Now, write
the 6 bytes &lt;code&gt;JMP [PC-relative scratch space]&lt;/code&gt; into the 6 bytes of padding
before the function you're trying to replace. Finally, write the 2 bytes &lt;code&gt;jmp PC-6&lt;/code&gt;
into the first two bytes of your function. Resume the process, and your hot patch
function is merrily running instead of the old one.&lt;/p&gt;
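&lt;p&gt;For the curious, here's a rough Python sketch of the bytes involved. It assumes the standard x86-64 encodings &lt;code&gt;EB rel8&lt;/code&gt; (short jump) and &lt;code&gt;FF 25 disp32&lt;/code&gt; (indirect jump through a RIP-relative slot). The function names and addresses are made up, and the code only builds the bytes; it doesn't patch anything.&lt;/p&gt;
&lt;pre&gt;
```python
def short_jump_into_padding():
    # EB rel8 is relative to the end of the 2-byte jump itself, so landing
    # 6 bytes before the function start needs a displacement of -8.
    return bytes([0xEB, (-6 - 2) % 256])


def indirect_jump_via_slot(padding_address, slot_address):
    # FF 25 disp32 jumps to the 8-byte address stored at RIP + disp32, where
    # RIP points just past this 6-byte instruction.
    displacement = slot_address - (padding_address + 6)
    # to_bytes raises OverflowError when the slot is out of signed 32-bit
    # range, which is why the scratch space has to sit within 2GB.
    return bytes([0xFF, 0x25]) + displacement.to_bytes(4, "little", signed=True)


def build_hot_patch(function_address, slot_address, new_function_address):
    # Hypothetical helper: the three writes, in the order described above,
    # as (address, bytes) pairs.
    padding_address = function_address - 6
    return [
        (slot_address, new_function_address.to_bytes(8, "little")),
        (padding_address, indirect_jump_via_slot(padding_address, slot_address)),
        (function_address, short_jump_into_padding()),
    ]
```
&lt;/pre&gt;
&lt;p&gt;The last write is the interesting one: it's just the two bytes &lt;code&gt;EB F8&lt;/code&gt;, which is why the first instruction of the function has to be at least 2 bytes long.&lt;/p&gt;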
&lt;h2&gt;How does that work again?&lt;/h2&gt;
&lt;p&gt;The point of the 2 byte instruction at the start of each function is so that you
don't ever have to worry about pausing your process in the middle of the two bytes
you're going to change with the &lt;code&gt;jmp PC-6&lt;/code&gt; instruction. Nothing else
is really interesting. You set up a launch pad to your scratch space, which is where
the target address lives. No rocket science, here, nosiree.&lt;/p&gt;
&lt;h2&gt;What if I don't want hot-patchability?&lt;/h2&gt;
&lt;p&gt;Honestly, I don't think there's anything that prevents you from breaking these particular
rules, except the cost is so minimal, there's really just no good reason not to
do it. X86 has something like a 30% hot-patchable kernel. And x64 has a 100% hot-patchable
kernel. You tell me which one is better.&lt;/p&gt;
</description><category domain="http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx">X64 ABI Info</category></item><item><title>x64 ABI vs. x86 ABI (aka Calling Conventions for AMD64 &amp; EM64T)</title><link>http://blogs.msdn.com/freik/archive/2006/03/06/X64-calling-conventions-summary.aspx</link><pubDate>Tue, 07 Mar 2006 02:00:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:398200</guid><dc:creator>Kevin Frei</dc:creator><slash:comments>14</slash:comments><comments>http://blogs.msdn.com/freik/comments/398200.aspx</comments><wfw:commentRss>http://blogs.msdn.com/freik/commentrss.aspx?PostID=398200</wfw:commentRss><wfw:comment>http://blogs.msdn.com/freik/rsscomments.aspx?PostID=398200</wfw:comment><description>&lt;P&gt;(This is an older post, with some mild cleanup, and fixed links)&lt;/P&gt;
&lt;P&gt;Before I start: ABI = Application Binary Interface – this is the spec that describes how to call functions, pass parameters, unwind the stack, handle exceptions, etc... It’s also sometimes called the ‘Calling Convention’.&lt;/P&gt;
&lt;P&gt;There is a persistent misconception among people who are implementing x64 compilers/code generators, and folks that write ASM code for x64, who have a functioning solution for x86. People too frequently assume they can just keep most of their code, while changing ESP to RSP, and things should 'just work'. This is fundamentally not true. When initially working on the x64 ABI, it was decided that we wanted to clean up the way that exception handling &amp;amp; general function invocation worked. We had a brand new architecture, we wanted to cut out all of the legacy junk that continually prevents, or at least overcomplicates, achieving great performance on the x86 platform. With this in mind, x64 was given a single calling convention – no __cdecl / __stdcall / __fastcall / __thiscall mess. There was also a dramatic change in the way that x64 unwinds the stack, compared to how x86 does it.&lt;/P&gt;
&lt;P&gt;Unwinding the stack is used in a wide variety of places, including handling exceptions, garbage collection, and displaying the call stack from a debugger. On x86, every function that needs some sort of attention due to an exception must add an element to a thread-global linked list upon entry, and remove it upon exit. For non-exception unwinds, the thing doing the unwind must grok through some nasty meta-data that tries to describe what &amp;amp; when the compiler is setting up the stack frame. This meta-data was only implemented after about 1999, and is primarily supported in debuggers. I’ll be ignoring this junk, and instead focus on how exception handling works, since this is the primary way your code will break if you don’t get this right (undebuggable code is still broken, but you won’t notice until you try to get a stack trace).&lt;/P&gt;
&lt;P&gt;So, the x86 thread-global linked list is a list of structures, each of which contains a function pointer to call in the event of an exception, and then some data that said function will consume. Thus, you’ll see fs:[0] references scattered throughout C++ code: every function that contains a destructor that must be invoked if an exception occurs must have one of these things. When your code creates a new object, there is a small bit of code executed to update the data on this thread-global list element. When that object is destroyed, more code is executed. Because of this, x86 can catch hundreds of thousands of exceptions per second, but if you don’t actually take any exceptions, your CPU executes lots of blobs of code that have no purpose except to be sure that the rare exception is handled properly. Finding the first function that needs to handle an exception, or destroy an object, is an O(1) operation: it’s a single lookup of FS:[0]. VERY fast. Unfortunately, that should &lt;I&gt;not&lt;/I&gt; be a common scenario. Exceptions should be exceptional, not common! In addition, this linked list really resides on the stack, thus there is a function pointer sitting right below the return address on your stack! Buffer overruns, anyone? There is now an extra blob of information that indicates what functions are valid to be invoked as exception handlers (link /SAFESEH, if you’re curious), but this is only used if every contribution to your .exe or .dll has this information.&lt;/P&gt;
&lt;P&gt;With this in mind, on x64 every function has a very strict structure that must be properly described in static data. The prolog is the only place in which you can adjust your stack frame pointer. The prolog can only be up to 255 bytes long. All modifications of your stack frame pointer, as well as all saves of nonvolatile registers (RBX, RBP, RSI, RDI, R12-R15, XMM6-XMM15) must be described in this static data, so that they can be restored correctly if an exception occurs. If your function is missing this static data, and an exception is raised, the thread will be terminated by the OS.&lt;/P&gt;
&lt;P&gt;In an effort to get stuff into people's hands, here’s an excerpt that I’ve prepended to the ABI document that has not yet seen the light of day. I’ve updated the links to point you at the sections on MSDN. Rather than posting the whole thing, I've just put in my description, with links back to the slightly older version on MSDN. The current version has some more detail, but nothing you couldn't figure out through, ahem, trial &amp;amp; error...&lt;/P&gt;
&lt;P&gt;&amp;lt;snip&amp;gt;&lt;/P&gt;
&lt;H2&gt;Overview&lt;/H2&gt;
&lt;P&gt;The in-depth nature of an ABI document doesn’t lend itself to ‘easy reading’. However, a detailed knowledge of the entire ABI is rarely necessary to accomplish most programming tasks. This section is simply a quick overview of the ABI, with pointers to the sections that describe the various aspects in more detail. It also tries to point out particular ‘gotchas’ that must be strictly adhered to, in an effort to minimize the problems encountered.&lt;/P&gt;
&lt;H3&gt;Calling convention&lt;/H3&gt;
&lt;P&gt;The x64 Windows ABI is a 4 register ‘fast-call’ calling convention, with stack-backing for those registers. There is a strict one-to-one correspondence between arguments in a function, and the registers for those arguments. &lt;I&gt;Any argument that doesn’t fit in 8 bytes, or is not 1, 2, 4, or 8 bytes, must be passed by reference. &lt;/I&gt;There is no attempt to spread a single argument across multiple registers. The x87 register stack is unused. It may be used, but must be considered volatile across function calls. All floating point operations are done using the 16 XMM registers. The arguments are passed in registers RCX, RDX, R8, and R9. If the arguments are float/double, they are passed in XMM0L, XMM1L, XMM2L, and XMM3L. &lt;I&gt;16 byte arguments are passed by reference&lt;/I&gt;. Parameter passing is described in detail at &lt;A href="http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx" mce_href="http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Calling Convention: Parameter Passing&lt;/A&gt;. In addition to these registers, RAX, R10, R11, XMM4, and XMM5 are volatile. All other registers are non-volatile. Register usage is documented in detail at &lt;A href="http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx" mce_href="http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Register Usage&lt;/A&gt; and &lt;A href="http://msdn.microsoft.com/en-us/library/6t169e9c.aspx" mce_href="http://msdn.microsoft.com/en-us/library/6t169e9c.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Calling Convention: Caller/Callee Saved Registers&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;The caller is responsible for allocating space for parameters to the callee, and &lt;I&gt;must always allocate sufficient space for the 4 register parameters, even if the callee doesn’t have that many parameters. &lt;/I&gt;This aids in the simplicity of supporting K&amp;amp;R C’s unprototyped functions, and ‘vararg’ C/C++ functions. For vararg/unprototyped functions any float values must be duplicated in the corresponding general-purpose register. Any parameters above the first 4 must be stored on the stack, above the backing-store for the first 4, prior to the call. Vararg function details can be found at &lt;A href="http://msdn.microsoft.com/en-us/library/dd2wa36c.aspx" mce_href="http://msdn.microsoft.com/en-us/library/dd2wa36c.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Calling Convention: Varargs&lt;/A&gt;. Unprototyped function information is detailed at &lt;A href="http://msdn.microsoft.com/en-us/library/6yy8aw4d.aspx" mce_href="http://msdn.microsoft.com/en-us/library/6yy8aw4d.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Calling Convention: Unprototyped Functions&lt;/A&gt;.&lt;/P&gt;
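&lt;P&gt;As a rough illustration of the one-to-one mapping between argument positions and registers, here's a small Python sketch. It assumes every argument has already been reduced to an 8-byte integer/pointer or a float/double (anything else being passed by reference), and it writes stack locations relative to RSP just before the call. The function and its string output are invented for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
INT_REGS = ["RCX", "RDX", "R8", "R9"]
FLOAT_REGS = ["XMM0", "XMM1", "XMM2", "XMM3"]


def assign_arguments(kinds, vararg=False):
    """Map each argument ("int" or "float") to where the caller places it."""
    locations = []
    for index, kind in enumerate(kinds):
        if index in range(4):
            # The register is chosen by position, not by counting per type.
            if kind == "float":
                registers = [FLOAT_REGS[index]]
                if vararg:
                    # Vararg/unprototyped calls duplicate floats in the GPR.
                    registers.append(INT_REGS[index])
                locations.append(" + ".join(registers))
            else:
                locations.append(INT_REGS[index])
        else:
            # Stack arguments sit above the 32 bytes of register backing store.
            locations.append("[RSP+{}]".format(32 + 8 * (index - 4)))
    return locations
```
&lt;/PRE&gt;
&lt;P&gt;For a call with (int, float, int, float, int) arguments, this gives RCX, XMM1, R8, XMM3, and [RSP+32]. Note that XMM0 and RDX simply go unused in that call.&lt;/P&gt;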
&lt;H3&gt;Alignment&lt;/H3&gt;
&lt;P&gt;&lt;I&gt;Most structures are aligned to their natural alignment.&lt;/I&gt; The primary exceptions are the stack pointer and malloc/alloca memory, which are aligned to 16 bytes, in order to aid performance. Alignment above 16 bytes must be done manually, but since 16 bytes is a common alignment size for XMM operations, this should suffice for most code. For more information about structure layout and alignment see &lt;A href="http://msdn.microsoft.com/en-us/library/02c56cw3.aspx" mce_href="http://msdn.microsoft.com/en-us/library/02c56cw3.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Types and Storage&lt;/A&gt;. For information about the stack layout, see &lt;A href="http://msdn.microsoft.com/en-us/library/x4ea06t0.aspx" mce_href="http://msdn.microsoft.com/en-us/library/x4ea06t0.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Stack Usage&lt;/A&gt;.&lt;/P&gt;
&lt;H3&gt;Unwindability&lt;/H3&gt;
&lt;P&gt;All non-leaf functions [a leaf function is one that neither calls a function, nor allocates any stack space itself] must be annotated with data [referred to as xdata or ehdata, which is pointed to from pdata] that describes to the operating system how to properly unwind them, to recover non-volatile registers. Prologs &amp;amp; epilogs are highly restricted, so that they can be properly described in xdata. The stack pointer must be aligned to 16 bytes, except for leaf functions, in any region of code that isn’t part of an epilog or prolog. For details about the proper structure of function prolog &amp;amp; epilogs, see &lt;A href="http://msdn.microsoft.com/en-us/library/tawsa7cb.aspx" mce_href="http://msdn.microsoft.com/en-us/library/tawsa7cb.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Prolog and Epilog&lt;/A&gt;. For more information about exception handling, and the exception handling/unwinding pdata &amp;amp; xdata see &lt;A href="http://msdn.microsoft.com/en-us/library/1eyas8tf.aspx" mce_href="http://msdn.microsoft.com/en-us/library/1eyas8tf.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions: Exception Handling (x64)&lt;/A&gt;.&lt;/P&gt;
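&lt;P&gt;The alignment rule boils down to a little arithmetic, sketched here in Python. It assumes the usual situation where RSP was 16-byte aligned at the call, so the pushed return address leaves it at 8 mod 16 on entry, and that each pushed register takes 8 bytes. The function name and parameters are invented for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
def stack_allocation(pushed_registers, needed_bytes):
    """Smallest allocation of at least `needed_bytes` that leaves RSP 16-byte aligned."""
    # Bytes already on the stack below the aligned point: the return
    # address plus one 8-byte slot per pushed nonvolatile register.
    already_used = 8 + 8 * pushed_registers
    # Pad so that everything below the caller's aligned RSP is a multiple of 16.
    padding = (-(already_used + needed_bytes)) % 16
    return needed_bytes + padding
```
&lt;/PRE&gt;
&lt;P&gt;A function that pushes nothing and only needs the 32 bytes of register backing store for its callees ends up allocating 40 bytes, not 32. Push one register first and 32 is enough.&lt;/P&gt;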
&lt;P&gt;&amp;lt;/snip&amp;gt;&lt;/P&gt;
&lt;P&gt;Anyway, I hope that helps. I could not &lt;I&gt;believe&lt;/I&gt; how long it took to find this stuff on MSDN. For some (lame) reason, blogs are better indexed than MSDN in both MSN search, and Google. Hopefully this entry will make finding this information easier.&lt;/P&gt;
&lt;P&gt;Here’s a link to the official x64 ABI documentation, which goes into excruciating detail about this stuff. Sorry if it’s not very readable – I wrote a few parts, along with a few other people, and we’re primarily engineers, not writers. We have had a great UE writer take over on this document, and it should be slightly improved when it sees the light of day as part of the VS 2005 documentation, but until then, this is it:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/7kcdt6fy.aspx" mce_href="http://msdn.microsoft.com/en-us/library/7kcdt6fy.aspx"&gt;MSDN Library: Dev Tools and Languages: VS2008: VS: VC++: 64-bit Programming: x64 Software Conventions&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Hopefully, that link will continue to work for a while. MSDN likes to occasionally restructure the way it references information, just to keep us all on our toes. &lt;EM&gt;I've now updated the links 3 times. If they're broken, please ping me and I'll update them!&lt;/EM&gt;&lt;/P&gt;</description><category domain="http://blogs.msdn.com/freik/archive/tags/Developer+Info/default.aspx">Developer Info</category><category domain="http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx">X64 ABI Info</category></item><item><title>X64 Unwind Information</title><link>http://blogs.msdn.com/freik/archive/2006/01/04/509372.aspx</link><pubDate>Thu, 05 Jan 2006 00:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:509372</guid><dc:creator>Kevin Frei</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/freik/comments/509372.aspx</comments><wfw:commentRss>http://blogs.msdn.com/freik/commentrss.aspx?PostID=509372</wfw:commentRss><wfw:comment>http://blogs.msdn.com/freik/rsscomments.aspx?PostID=509372</wfw:comment><description>&lt;FONT size=2&gt;
&lt;P&gt;I've had a fairly large number of e-mails with various people both inside &amp;amp; outside Microsoft explaining the AMD64 unwind data. I generally push them at the ABI documentation (which I've linked to in &lt;A href="/freik/archive/2005/03/17/398200.aspx"&gt;this&lt;/A&gt; entry). But the ABI documentation really requires a complete reading before you can understand how the unwind information all fits together. So, in an attempt to make dealing with unwind information easier, I'm going to try a more 'chatty' explanation of unwind data.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Background&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size=3&gt;&lt;STRONG&gt;The 2 phase exception model&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Both Win32 &amp;amp; Win64 have a 2 phase exception model. The first phase walks the list of functions that _might_ handle an exception, asking them if they will _actually_ handle the particular exception that occurred (this is done by calling filter functions). During this phase, the filter functions return either EXCEPTION_CONTINUE_SEARCH (0), which indicates that this handler will not handle the exception, or EXCEPTION_EXECUTE_HANDLER (1), which indicates that this handler will handle the exception. One other option is EXCEPTION_CONTINUE_EXECUTION (-1) that I won't mention (look it up in MSDN if you're curious). Once a handler has accepted the exception, the second phase unwinds the stack up to that handler's frame, running any termination handlers and destructors along the way.&lt;/P&gt;
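&lt;P&gt;If it helps, here's a toy Python model of the two phases. It's only a sketch: the frame dictionaries, the &lt;code&gt;dispatch&lt;/code&gt; function, and the constant names are stand-ins invented for illustration (the constants just mirror the filter return values 0 and 1), and the real OS dispatcher does a great deal more.&lt;/P&gt;
&lt;PRE&gt;
```python
CONTINUE_SEARCH = 0  # the filter declines the exception
HANDLE = 1           # the filter accepts the exception


def dispatch(frames, exception):
    """frames are ordered innermost first; returns the handling frame's name."""
    # Phase 1: ask each filter, innermost first, without touching the stack.
    handler_index = None
    for index, frame in enumerate(frames):
        filter_function = frame.get("filter")
        if filter_function is not None and filter_function(exception) == HANDLE:
            handler_index = index
            break
    if handler_index is None:
        return None  # nobody wants it; no unwinding happens in this model

    # Phase 2: unwind every frame below the handler, running its cleanup.
    for frame in frames[:handler_index]:
        cleanup = frame.get("cleanup")
        if cleanup is not None:
            cleanup()
    return frames[handler_index]["name"]
```
&lt;/PRE&gt;
&lt;P&gt;The point to notice is that no cleanup runs until some filter has agreed to handle the exception.&lt;/P&gt;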
&lt;P&gt;&lt;FONT size=3&gt;&lt;STRONG&gt;Why do you need unwind data?&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;First, you should probably understand WHY the Windows AMD64 ABI requires unwind data. In Win32 x86 land, when an exception occurs, the OS just looks at a pointer at fs:0 which is the head of a linked list of information regarding what to do in the event of an exception. Each function that requires any sort of attention when an exception occurs needs to create a node on this linked list. This node must contain all information necessary to handle the two phase exception mechanism of Win32. On Windows for AMD64, the functions that need to be invoked during exception handling are discovered not by walking a linked list, but by crawling the stack. Thus, the stack must _always_ be in a state that can be statically walked. To accomplish this, there are 2 fundamental issues: how to discover the stack frame size of a function, and how to recover non-volatile (aka callee saved) register values. This is exactly what AMD64 unwind data describes. The trouble most people run into, though, is that even if your function doesn't need to do anything if an exception occurs, the function may be called by a function that does, and it may then call a function that will throw an exception. If this is the case, when the exception is thrown, the function's stack frame must be fully described. As an added bonus, because all stack frames are accurately described, there's never any reason to use a frame pointer unless absolutely necessary due to something like _alloca or __declspec(align(&amp;gt;16)).&lt;/P&gt;
&lt;P&gt;So, even if you just have a tiny little function that only calls another function, you still need unwind data, or when an exception occurs, your process will simply be terminated.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;FONT size=3&gt;OK, What is unwind data?&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Simply put, it's a meta-language that describes what, when, and how a function's frame is built. There are opcodes that indicate when &amp;amp; by how much the stack pointer has been adjusted, when &amp;amp; where a non-volatile register has been saved, or when, where, and to what offset a frame pointer has been set. When you're writing assembly code for ML64, there are predefined macros to help describe this stuff:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;PROC FRAME [optional handler address]&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.ALLOCSTACK size&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.PUSHFRAME [code]&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.PUSHREG reg&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.SAVEREG reg, offset&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.SAVEXMM128 reg, offset&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.SETFRAME reg, offset&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;.ENDPROLOG&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Note that all those offsets are actually restricted to be properly aligned. The .SAVEREG offset must be a multiple of 8, while the .SAVEXMM128 offset must be a multiple of 16. The .SETFRAME offset must be a multiple of 16, and must be between 16 and 240. In addition, all frame manipulation must be completed in the first 254 bytes of the function. If you want to push register saves &amp;amp; restores further into the function, you must use chained unwind info, which will have to be the subject of another blog entry...&lt;/P&gt;
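&lt;P&gt;Those offset rules are easy to check mechanically. Here's a small Python sketch that takes the limits stated above at face value; &lt;code&gt;check_directive&lt;/code&gt; is a made-up helper, not anything ML64 provides.&lt;/P&gt;
&lt;PRE&gt;
```python
def check_directive(name, offset):
    """Return True when `offset` is legal for the given ML64 unwind directive."""
    name = name.upper()
    if offset != abs(offset):
        return False  # negative offsets are never valid in this sketch
    if name == ".SAVEREG":
        return offset % 8 == 0
    if name == ".SAVEXMM128":
        return offset % 16 == 0
    if name == ".SETFRAME":
        # Multiple of 16, within the 16 to 240 range given in the text above.
        return offset % 16 == 0 and offset in range(16, 241)
    raise ValueError("no offset rule in this sketch for " + name)
```
&lt;/PRE&gt;
&lt;P&gt;So &lt;code&gt;check_directive(".SAVEXMM128", 40)&lt;/code&gt; fails even though 40 is a perfectly good .SAVEREG offset.&lt;/P&gt;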
&lt;P&gt;Looking up the directives on MSDN will give you examples of how they're all used. If you feel like authoring code that conforms to the prologue unwind descriptors is restrictive, you'll love the epilogue requirements. All function epilogues must look like this:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;(optional) lea rsp, [frame ptr + frame size] or add rsp, frame size&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;pop reg (zero or more)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;ret (or jmp)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;No other instructions may occur between the first lea/add and the final jmp or ret. At first glance, this may seem like you can't restore any XMM registers. The trick is that all non-volatile registers except those that you want to restore using a pop must be restored prior to entry to the epilogue. The reason this works is that if the OS has to unwind an epilogue, it already has the correct values in all the registers except the ones that are restored via pop, so things really do work out well.&lt;/P&gt;
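&lt;P&gt;To show how rigid that shape is, here's a toy Python checker that works on a list of instruction strings. It's a sketch under simplifying assumptions: it only looks at mnemonics (plus the destination of the leading lea/add), and &lt;code&gt;is_legal_epilogue&lt;/code&gt; is an invented name, not something the OS or the tools expose.&lt;/P&gt;
&lt;PRE&gt;
```python
def is_legal_epilogue(instructions):
    """Check for: optional lea/add on rsp, zero or more pops, then ret or jmp."""
    mnemonics = [text.split()[0].lower() for text in instructions]
    if not mnemonics:
        return False
    position = 0
    if mnemonics[0] in ("lea", "add"):
        # The stack adjustment has to write to rsp.
        operands = instructions[0].lower().replace(",", " ").split()
        if operands[1] != "rsp":
            return False
        position = 1
    # Skip over the run of pops.
    while position != len(mnemonics) and mnemonics[position] == "pop":
        position += 1
    # Exactly one instruction may remain, and it must leave the function.
    return position == len(mnemonics) - 1 and mnemonics[position] in ("ret", "jmp")
```
&lt;/PRE&gt;
&lt;P&gt;An XMM restore sitting in front of the add rsp is fine in real code, since it is simply outside the epilogue. Feed it to this checker as part of the epilogue, though, and it gets rejected.&lt;/P&gt;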
&lt;P&gt;One other note: if the final jmp isn't an ip-relative jmp, but an indirect jmp, it must be preceded by the REX prefix, to indicate to the OS unwind routines that the jump is headed outside of the function, otherwise, the OS assumes it's a jump to a different location inside the same function.&lt;/P&gt;
&lt;P&gt;From here, I think I might see what kind of questions pop up, and go from there. I'll also add a future entry to describe how to use chained unwind information to allow you to save registers later in the function than the first 254 bytes (although I _think_ this requires that you author your own .pdata &amp;amp; .xdata which is a whole lot more complicated...)&lt;/P&gt;
&lt;P&gt;-Kev&lt;/P&gt;&lt;/FONT&gt;</description><category domain="http://blogs.msdn.com/freik/archive/tags/Developer+Info/default.aspx">Developer Info</category><category domain="http://blogs.msdn.com/freik/archive/tags/X64+ABI+Info/default.aspx">X64 ABI Info</category></item></channel></rss>