They're not marvelously better, but they include the introduction that I wrote a couple of years ago (and have mirrored in this blog twice now). So, check out http://msdn2.microsoft.com/en-us/library/7kcdt6fy(VS.80).aspx to see the x64 ABI stuff in all its MSDN-formatted glory.
In other news, I presented a talk at the Developer Division 'brown bag' performance series (a lunchtime lecture for MS developers) about the performance cost of exception handling. I had never really sat down and looked at how various EH constructs impact performance. I actually came to a rather interesting conclusion: for x86, EH is costly, and really does hurt performance, even if you're primarily using C++ objects with destructors. For x64, however, the performance impact of using C++ objects with destructors and throwing exceptions for errors is far outweighed by the much less complex flowgraph of a function that doesn't have to deal with both return values & error conditions. I'll be giving a pretty similar talk at the Longhorn SDR next week (http://www.longhornsdr.com/agenda.htm). I'll be talking in the CF07 slot (which was apparently vacated by someone else with a different talk). I'll be focusing on x64 much more than x86. Perhaps I'll stick my slide deck up here after the talk...
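To make the flowgraph point concrete, here is a minimal C++ sketch of the same routine written both ways. The names (Buffer, sum_digits_rc, sum_digits_eh) are made up for illustration and are not from the talk; the sketch only shows the difference in shape and makes no claim about measured cost.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical object with a non-trivial destructor (it owns a std::string),
// so the compiler has to arrange for cleanup on every exit path.
struct Buffer {
    std::string data;
    explicit Buffer(std::string s) : data(std::move(s)) {}
};

// Return-code style: the callee reports failure through its return value.
bool parse_digit_rc(char c, int& out) {
    if (c < '0' || c > '9') return false;
    out = c - '0';
    return true;
}

// Every call site needs its own branch, which adds edges to the flowgraph.
bool sum_digits_rc(const std::string& text, int& total) {
    Buffer scratch(text);
    total = 0;
    for (char c : scratch.data) {
        int d = 0;
        if (!parse_digit_rc(c, d)) return false;  // explicit error edge
        assert(d >= 0 && d <= 9);
        total += d;
    }
    return true;
}

// Exception style: the callee throws, so the caller has no error branch.
int parse_digit_eh(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

// The main path is straight-line code; if parse_digit_eh throws,
// stack unwinding runs the destructor of scratch.
int sum_digits_eh(const std::string& text) {
    Buffer scratch(text);
    int total = 0;
    for (char c : scratch.data) total += parse_digit_eh(c);
    return total;
}
```

Both versions compute the same sum for valid input; the only difference is where the error path lives.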
And finally, I've been approached by about four different ISVs over the past few months with various questions about x64 & unwinding. I'm going to spend some of my free time writing a little [useless] JITter to figure out how to do this stuff right. If only I had a 64-bit laptop, I could do this while sitting watching my 2 girls do Tae Kwon Do twice a week...
-Kev
Kevin,
By the way, do you know anything about version #2 of the unwind data - I noticed that VS2005 uses it instead of #1, and there are some changes not only in the unwind data, but in the CEH tables as well.
For example, a different C++-specific exception handler is used by VS2005: __CxxFrameHandler3, while it used to be __CxxFrameHandler.
Looking forward to any information from you!
Thanks!
Slava
I'm not sure which version number you're referring to - VS2005 is using version #1 (1-based indexing :-) There is not a 2nd version of this stuff for AMD64 [yet].
The C++ exception handler, however, is a big deal. Version 3 only invokes destructors and calls catch blocks when a non-C++ exception is thrown if you compiled your code -EHa, which basically makes it much more obvious when people really should have used -EHa, but instead used -GX or -EHs/-EHsc.
Thank you for the answers. But I have two more :)
1. Back to chained unwind info:
Suppose that we have two unwind regions:
B1: (starts unwind region #1)
...
B2: (starts unwind region #2)
<shrink-wrap saving of a non-volatile register>
B3:
We can chain region #2 with region #1, so the unwind info table for region #2 will have a pointer to the unwind info table for region #1. This results in the following static memory layout:
PDATA section
RUNTIME_FUNCTION table for region #1:
{
<B1>
<B2>
<pointer to UNWIND_INFO table for region #1>
}
RUNTIME_FUNCTION table for region #2:
<B3>
<pointer to UNWIND_INFO table for region #2>
XDATA section
UNWIND_INFO table for region #1:
<address of exception handler>
<language-specific handler data>
UNWIND_INFO table for region #2:
<chained flag set>
Everything above looks nice according to the ABI. But what if for some purpose we reorder the blocks B1 and B2? Then we will have the following memory allocation:
PDATA section:
XDATA section:
<pointer to UNWIND_INFO table for region #1> <---
Is it OK to have the link marked "<---"? I ran some experiments and got different run-time behaviour for these two cases. Do you have any ideas why this could happen?
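The chained layout in question 1 can be modelled roughly as follows. This is a simplified sketch, not the real on-disk format: the struct names and the integer index used for the chain link are hypothetical, and the real tables store image-relative offsets plus unwind codes. Only the two flag values are taken from the documented UNWIND_INFO flags. In this toy model the lookup follows the link by reference, so it gives the same answer whichever region comes first; whether the real unwinder behaves the same way is exactly the open question and is not settled by this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flag values documented for the x64 UNWIND_INFO Flags field.
constexpr std::uint8_t kFlagEHandler  = 0x1;  // UNW_FLAG_EHANDLER
constexpr std::uint8_t kFlagChainInfo = 0x4;  // UNW_FLAG_CHAININFO

// Hypothetical, simplified stand-in for an UNWIND_INFO record.
struct UnwindInfoSketch {
    std::uint8_t  flags;
    int           chained_to;   // index of the parent record, or -1 (assumption)
    std::uint32_t handler_rva;  // meaningful only when kFlagEHandler is set
};

// Hypothetical, simplified stand-in for a RUNTIME_FUNCTION entry.
struct RuntimeFunctionSketch {
    std::uint32_t begin_rva;
    std::uint32_t end_rva;      // exclusive
    int           unwind_index; // index into the xdata vector (assumption)
};

// Follow chain links until a record with a handler is found.
std::uint32_t find_handler(const std::vector<UnwindInfoSketch>& xdata, int index) {
    for (std::size_t hops = 0; index >= 0; ++hops) {
        assert(hops < xdata.size());  // a well-formed chain never loops
        const UnwindInfoSketch& info = xdata[static_cast<std::size_t>(index)];
        if (info.flags & kFlagEHandler) return info.handler_rva;
        if (!(info.flags & kFlagChainInfo)) return 0;
        index = info.chained_to;
    }
    return 0;
}

// Map an address to its region, then resolve the handler through the chain.
std::uint32_t handler_for_address(const std::vector<RuntimeFunctionSketch>& pdata,
                                  const std::vector<UnwindInfoSketch>& xdata,
                                  std::uint32_t rva) {
    for (const RuntimeFunctionSketch& fn : pdata) {
        if (rva >= fn.begin_rva && rva < fn.end_rva) {
            return find_handler(xdata, fn.unwind_index);
        }
    }
    return 0;  // no region covers this address
}
```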
2. I know that a NOP should be inserted after a call instruction if it is the last instruction in an unwind region. But I am not sure about the situation when a call is followed by a jump and both instructions are in the same unwind region:
call <foo>
jmp QWORD PTR [eax]
The jump instruction has ModRM byte with mod field value 00, so it is acceptable in a routine's epilog (http://msdn2.microsoft.com/en-us/library/tawsa7cb(VS.80).aspx). Is it possible that the run-time will consider it as an epilog and will not check if the unwind region has a handler associated with it? Overall, could you specify all the cases where an extra NOP is needed?
Thank you,
Slava,
Before answering questions, I thought I'd suggest that you try my e-mail for faster response: kfrei@microsoft.com. Also, I checked your Bio, and discovered you're in Novosibirsk. A coworker of mine is also from there, and I'm interviewing someone else who's originally from Novosibirsk in a couple weeks. Small world (at least when it comes to compilers).
Now to the questions.
#1 regarding runtime differences - I'd really have to see the actual code. I've never seen code that's just straightforward have different behavior based on block layout, but then I've also seen very little code that uses shrink-wrapping be straightforward :-)
#2 The NOP after a call when the call is on a region boundary is not always necessary, but falls into the 'better safe than sorry' category in my book. The NOP is only needed when the unwinder needs to be aware of the difference between unwinding through a 'live' stack frame where the instruction after the call is the return address, and when you're in the function itself, and the call has returned. For unwinding (non-volatile register restoring), this is almost never necessary. I can't actually come up with a single example where it is necessary :-( If you start talking about EH regions, then I can provide examples...
Regarding JMP encoding, that MSDN documentation is actually wrong: any non-direct jmp, be it [rax], rax, or any other encoding of an indirect JMP, can be used in an epilogue, but only if it includes the REX prefix. The corollary to that is if you EVER use the REX prefix on a jmp, the unwinder assumes you're in an epilogue, so don't use it unless you're doing a tail-call.
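As a rough illustration of that rule, the check below tests whether a byte sequence starts with a REX-prefixed indirect near JMP (opcode FF with ModRM.reg equal to 4). It is a sketch of the rule as stated here, not the unwinder's actual logic; the function name is made up, and treating any prefix in the 0x40..0x4F range as "the REX prefix" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the bytes begin with REX + FF /4 (indirect near JMP).
// Hypothetical helper: a real unwinder examines more context than this.
bool looks_like_epilogue_jmp(const std::uint8_t* code, std::size_t size) {
    assert(code != nullptr || size == 0);
    if (size < 3) return false;
    const bool has_rex     = (code[0] & 0xF0) == 0x40;       // REX prefixes are 0x40..0x4F
    const bool is_group5   = code[1] == 0xFF;                // FF opcode group (inc/dec/call/jmp/push)
    const bool is_near_jmp = ((code[2] >> 3) & 0x7) == 4;    // ModRM.reg == 4 selects JMP r/m64
    return has_rex && is_group5 && is_near_jmp;
}
```

For example, the bytes 48 FF 20 (rex.w jmp qword ptr [rax]) match, while FF 20 without the prefix does not.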
Hope that helps!