Updated ABI docs available + Exception Handling info

They're not marvelously better, but they include the introduction that I wrote a couple of years ago (and have mirrored in this blog twice now). So, check out http://msdn2.microsoft.com/en-us/library/7kcdt6fy(VS.80).aspx to see the x64 ABI stuff in all its MSDN-formatted glory.

In other news, I presented a talk at the Developer Division 'brown bag' performance series (a lunchtime lecture for MS developers) about the performance cost of exception handling. I had never really sat down and looked at how various EH constructs impact performance. I actually came to a rather interesting conclusion: for x86, EH is costly, and really does hurt performance, even if you're primarily using C++ objects with destructors. For x64, however, the performance impact of using C++ objects with destructors and throwing exceptions for errors is far outweighed by the much less complex flowgraph of a function that doesn't have to deal with both return values & error conditions. I'll be giving a pretty similar talk at the Longhorn SDR next week (http://www.longhornsdr.com/agenda.htm). I'll be talking in the CF07 slot (which was apparently vacated by someone else with a different talk). I'll be focusing on x64 much more than x86. Perhaps I'll stick my slide deck up here after the talk...
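To make the flowgraph point concrete, here is a minimal sketch (all names are hypothetical and not taken from the talk). It writes the same two-step operation twice: once with return codes, where every call is followed by a branch, and once with exceptions plus a destructor-based cleanup object, where the normal path is straight-line code. It shows only the shape of the code, not any measurement.

```cpp
#include <cassert>
#include <stdexcept>

// Return-code style: each call needs its own error check and early exit.
bool parse_rc(int input, int& out) {
    if (input < 0) return false;   // error reported through the return value
    out = input * 2;
    return true;
}

bool step_rc(int a, int b, int& sum) {
    int x = 0, y = 0;
    if (!parse_rc(a, x)) return false;   // branch #1
    if (!parse_rc(b, y)) return false;   // branch #2
    sum = x + y;
    return true;
}

// Hypothetical cleanup object: the destructor runs on both normal and
// exceptional exit, so no explicit error path is needed in the caller.
struct Cleanup {
    int& counter;
    explicit Cleanup(int& c) : counter(c) {}
    ~Cleanup() { ++counter; }
};

// Exception style: errors leave through throw, results through return.
int parse_ex(int input) {
    if (input < 0) throw std::invalid_argument("negative input");
    return input * 2;
}

int step_ex(int a, int b, int& cleanups) {
    Cleanup guard(cleanups);             // released by the destructor
    return parse_ex(a) + parse_ex(b);    // no error branches here
}
```

Calling step_ex(1, -2, n) throws std::invalid_argument and still increments n, because the destructor of the guard runs during unwinding.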

And finally, I've been approached by about 4 different ISVs over the past few months with various questions about x64 & unwinding. I'm going to spend some of my free time writing a little [useless] JITter to figure out how to do this stuff right. If only I had a 64-bit laptop, I could do this while sitting watching my 2 girls do Tae Kwon Do twice a week...

-Kev

  • Kevin,

    Could you explain one thing in unwind procedure? There is said:
    3. If a function table entry is found, RIP can lie within the following three regions:
    ...
    In the prolog
    If the RIP lies within the prolog, then control has not entered the function. There can be no exception handler associated with this exception for this function and the effects of the prolog must be undone to compute the context of the caller function. The RIP is within the prolog if the distance from the function start to the RIP is less than or equal to the prolog size that is encoded in the UNWIND_INFO Structure. The effects of the prolog are unwound by scanning forward through the unwind codes array for the first entry with an offset less than or equal to the offset of the RIP from the function start, then undoing the effect of all remaining items in the unwind code array. Step 1 is then repeated.
    ...

    My question is: after the effect of prolog is undone, what RIP is used to repeat step 1?
    Are there any differences for chained unwind codes?
  • After the prolog is unwound, the return address sitting on the stack is used as a new RIP, and the process resumes with the calling function.

    Chained unwind just impacts how unwind occurs.  We still unwind each function, but each function may require multiple xdata pieces to completely unwind.
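The in-prolog case discussed above can be sketched as a small simulation. This is not the OS unwinder: the types, the two operation kinds, and the map-based stack are hypothetical simplifications. It assumes, as the ABI documentation describes, that each unwind code records the offset of the end of its prolog instruction and that the array is stored in reverse prolog order.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

enum class Op { PushNonvol, Alloc };   // simplified subset of unwind operations

struct UnwindCode {
    std::uint8_t prolog_offset;  // offset of the END of the prolog instruction
    Op op;
    std::uint32_t info;          // register number (PushNonvol) or byte count (Alloc)
};

struct Context {
    std::uint64_t rip;
    std::uint64_t rsp;
    std::map<std::uint32_t, std::uint64_t> regs;  // restored non-volatile registers
};

using Stack = std::map<std::uint64_t, std::uint64_t>;  // address -> 8-byte value

// Undo the part of the prolog that has already executed, then pop the
// return address, which becomes the RIP for the next lookup (step 1).
void unwind_from_prolog(Context& ctx, std::uint64_t function_start,
                        const std::vector<UnwindCode>& codes,
                        const Stack& stack) {
    const std::uint64_t offset = ctx.rip - function_start;
    for (const UnwindCode& code : codes) {
        if (code.prolog_offset > offset) continue;  // instruction not executed yet
        if (code.op == Op::PushNonvol) {
            ctx.regs[code.info] = stack.at(ctx.rsp);  // reload the saved register
            ctx.rsp += 8;
        } else {
            ctx.rsp += code.info;                     // release the stack allocation
        }
    }
    ctx.rip = stack.at(ctx.rsp);  // return address into the caller
    ctx.rsp += 8;
}
```

For a prolog of push rbx; push rsi; sub rsp, 0x20 interrupted after the second push, only the two pushes are undone, and the resulting RIP is the return address into the caller.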
  • Kevin,

    Assume that we have the following test:
    void
    foo(int arg)
    {
       try {
           <try code>
       }
       catch (...) {
           <catch code>
       }
    }

    For which the following code is generated:
    L0:
       <prolog that saves some non-vol registers, but r12>
    L1:
       #start of try block
       <save of r12>
       <try code>
       <reload of r12>
       #end of try block
    L2:
       <epilogue>
    L3:
       <prolog of catch handler>
       <catch code>
       <epilogue of catch handler>
    L4:

    I'm interested in what unwind info should be generated for the try block.

    I mean, we should have xdata for the following regions: <L0, L1>, <L1, L2> and <L2, L3> (let's leave out unwind info for the catch handler's code). Next, region <L1, L2> (the try block) must be chained to region <L0, L1> to properly unwind the effect of the prolog. So, the xdata entry for region <L1, L2> will have the UNW_FLAG_CHAININFO flag set and the UNW_FLAG_EHANDLER flag unset. Then, if an exception occurs in the try block, according to the unwinding procedure, the system will assume that the try block has no associated exception handler and will just process the unwind codes of <L1, L2> and <L0, L1>.

    Am I right?

    Does this mean that we cannot use chained info for the regions that have handlers associated with them?

    If I'm wrong, please, describe all unwinding steps that will be taken by the system.

    Thank you!
  • Chained regions cannot have handlers associated with them.  Only parent PData entries can have handlers.  The unwinder looks up the pdata chain to find out if you have a handler, though, so you can have chained regions & EH regions together.  This is a little beyond the scope of a comment - perhaps I'll elaborate on this in a future post...
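The lookup described in the reply above can be sketched as follows. The struct is a hypothetical stand-in for the real UNWIND_INFO layout, and the handler is modeled as a plain number rather than an address: a chained entry carries no handler of its own, so the search walks up to the primary (unchained) entry and reads the handler there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of an unwind info record.
struct UnwindInfo {
    bool chained;               // models the chain-info flag
    const UnwindInfo* parent;   // meaningful only when chained is true
    std::uint32_t handler_rva;  // 0 means "no handler"; meaningful only on the primary
};

// Follow the chain until the primary (unchained) entry is reached.
const UnwindInfo* find_primary(const UnwindInfo* info) {
    while (info->chained) {
        info = info->parent;
    }
    return info;
}

// The handler for any region is whatever its primary entry declares.
std::uint32_t handler_for(const UnwindInfo* info) {
    return find_primary(info)->handler_rva;
}
```

With this model, a chained try-block region still resolves to the handler of its parent, even though its own record has no handler.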
  • Then, I suppose that step 3.b of unwind procedure should be changed from:
    • Case b) If the RIP lies within the prologue, then control has not entered the function, there can be no exception handler associated with this exception for this function, and the effects of the prolog must be undone to compute the context of the caller function. The RIP is within the prolog if the distance from the function start to the RIP is less than or equal to the prolog size encoded in the unwind info. The effects of the prolog are unwound by scanning forward through the unwind codes array for the first entry with an offset less than or equal to the offset of the RIP from the function start, then undoing the effect of all remaining items in the unwind code array. Step 1 is then repeated.

    To something like that:
    ... The RIP is within the prolog if the distance from the function start to the RIP is less than or equal to the prolog size encoded in the MAIN PARENT unwind info...

    Where MAIN PARENT unwind info is the one in the chain of parent unwind info structures with the UNW_FLAG_CHAININFO flag unset.

    Slava
  • Kevin,

    By the way, do you know anything about version #2 of the unwind data - I noticed that VS2005 uses it instead of #1, and there are some changes not only in the unwind data, but in the CEH tables as well.

    For example, different C++ specific exception handler is used by VS2005: __CxxFrameHandler3, while it used to be __CxxFrameHandler.

    Looking forward to any information from you!

    Thanks!

    Slava

  • I'm not sure which version number you're referring to - VS2005 is using version #1 (1-based indexing :-)  There is not a 2nd version of this stuff for AMD64 [yet].

    The C++ exception handler, however, is a big deal.  Version 3 only invokes destructors and calls catch blocks when a non-C++ exception is thrown if you compiled your code -EHa, which basically makes it much more obvious when people really should have used -EHa, but instead used -GX or -EHs/-EHsc.
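As a point of reference for the -EHa discussion, here is a minimal sketch (hypothetical names) of the case that behaves the same under -EHsc and -EHa: a C++ exception runs the destructor and reaches catch (...). The difference described above concerns non-C++ (structured) exceptions such as an access violation; that case is compiler- and platform-specific and is deliberately not exercised here.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard whose destructor records that cleanup happened.
struct Guard {
    bool& released;
    explicit Guard(bool& r) : released(r) {}
    ~Guard() { released = true; }
};

// Throws a C++ exception past a local object with a destructor.
// Returns true if the catch (...) block ran.
bool run_with_cpp_exception(bool& released) {
    bool caught = false;
    try {
        Guard g(released);
        throw std::runtime_error("C++ exception");
    } catch (...) {
        caught = true;   // reached for C++ exceptions under either switch
    }
    return caught;
}
```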

  • Kevin,

    Thank you for the answers. But I have two more :)

    1. Back to chained unwind info:

       Suppose that we have two unwind regions:

       B1: (starts unwind region #1)
           ...
       B2: (starts unwind region #2)
           <shrink-wrap saving of a non-volatile register>
           ...
       B3:
           ...

       We can chain region #2 with region #1, so unwind info table for the region #2 will have a pointer to the unwind info table for the region #1. It will result in the following static memory allocation:

       PDATA section

       RUNTIME_FUNCTION table for region #1:
       {
           <B1>
           <B2>
           <pointer to UNWIND_INFO table for region #1>
       }

       RUNTIME_FUNCTION table for region #2:
       {
           <B2>
           <B3>
           <pointer to UNWIND_INFO table for region #2>
       }

       XDATA section

       UNWIND_INFO table for region #1:
       {
           ...
           <address of exception handler>
           <language-specific handler data>
       }

       UNWIND_INFO table for region #2:
       {
           <chained flag set>
           ...
           <B1>
           <B2>
           <pointer to UNWIND_INFO table for region #1>
       }

       Everything above looks nice according to the ABI. But what if for some purpose we reorder the blocks B1 and B2? Then we will have the following memory allocation:

       PDATA section:

       RUNTIME_FUNCTION table for region #2:
       {
           <B2>
           <B3>
           <pointer to UNWIND_INFO table for region #2>
       }

       RUNTIME_FUNCTION table for region #1:
       {
           <B1>
           <B2>
           <pointer to UNWIND_INFO table for region #1>
       }

       XDATA section:

       UNWIND_INFO table for region #2:
       {
           <chained flag set>
           ...
           <B1>
           <B2>
           <pointer to UNWIND_INFO table for region #1>    <---
       }

       UNWIND_INFO table for region #1:
       {
           ...
           <address of exception handler>
           <language-specific handler data>
       }

       Is it OK to have the link marked "<---"? I ran some experiments and got different run-time behaviour for these two cases. Do you have any idea why this could happen?

    2. I know that a NOP should be inserted after a call instruction if it is the last instruction in an unwind region. But I am not sure about the situation when a call is followed by a jump and both instructions are in the same unwind region:

       call <foo>

       jmp QWORD PTR [eax]

    The jump instruction has ModRM byte with mod field value 00, so it is acceptable in a routine's epilog (http://msdn2.microsoft.com/en-us/library/tawsa7cb(VS.80).aspx). Is it possible that the run-time will consider it as an epilog and will not check if the unwind region has a handler associated with it? Overall, could you specify all the cases where an extra NOP is needed?

    Thank you,

    Slava
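The boundary case behind question 2 can be illustrated with a toy region lookup. This is a simplified model with hypothetical names and addresses, and it assumes the lookup uses the return address directly. When a call is the last instruction in a region, its return address is the first byte of the next region, so the lookup lands there; a trailing NOP keeps the return address inside the calling region.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open code region [begin, end) with an identifier.
struct Region {
    std::uint32_t begin;
    std::uint32_t end;
    int id;
};

// Return the id of the region containing the address, or -1 if none does.
int lookup(const std::vector<Region>& table, std::uint32_t address) {
    for (const Region& r : table) {
        if (address >= r.begin && address < r.end) {
            return r.id;
        }
    }
    return -1;
}
```

For example, with a 5-byte call at 0x10FB ending region 1 at 0x1100, the return address 0x1100 resolves to region 2. Extending region 1 by a one-byte NOP at 0x1100 makes the same return address resolve to region 1.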

  • Slava,

    Before answering questions, I thought I'd suggest that you try my e-mail for faster response:  kfrei@microsoft.com.  Also, I checked your Bio, and discovered you're in Novosibirsk.  A coworker of mine is also from there, and I'm interviewing someone else who's originally from Novosibirsk in a couple weeks.  Small world (at least when it comes to compilers).

    Now to the questions.  

    #1 regarding runtime differences - I'd really have to see the actual code.  I've never seen code that's just straightforward have different behavior based on block layout, but then I've also seen very little code that uses shrink-wrapping be straightforward :-)

    #2 The NOP after a call when the call is on a region boundary is not always necessary, but falls into the 'better safe than sorry' category in my book. The NOP is only needed when the unwinder needs to be aware of the difference between unwinding through a 'live' stack frame, where the instruction after the call is the return address, and being in the function itself after the call has returned. For unwinding (non-volatile register restoring), this is almost never necessary. I can't actually come up with a single example where it is necessary :-(  If you start talking about EH regions, then I can provide examples...

    Regarding JMP encoding, that MSDN documentation is actually wrong: any non-direct jmp, be it [rax], rax, or any other encoding of an indirect JMP, can be encoded in an epilogue, but only if it includes the REX prefix. The corollary to that is if you EVER use the REX prefix on a jmp, the unwinder assumes you're in an epilogue, so don't use it unless you're doing a tail-call.

    Hope that helps!
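The REX rule for indirect jumps described above can be sketched as a byte-pattern check. This is an illustration, not the actual unwinder code. It assumes the marker is a REX.W prefix (0x48 through 0x4F) followed by opcode 0xFF with a ModRM reg field of 4 (indirect near jmp); the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the bytes look like "REX.W + FF /4", i.e. an indirect jmp
// carrying a REX prefix, which (per the rule above) reads as a tail call
// inside an epilogue. Assumption: only REX.W prefixes (0x48-0x4F) count.
bool looks_like_tail_call_jmp(const std::uint8_t* code, std::size_t size) {
    if (size < 3) return false;
    const bool rex_w = (code[0] & 0xF8) == 0x48;       // 0100 1xxx
    const bool opcode_ff = code[1] == 0xFF;            // group 5 opcode
    const bool reg_is_jmp = ((code[2] >> 3) & 0x7) == 4;  // ModRM.reg == /4
    return rex_w && opcode_ff && reg_is_jmp;
}
```

Under this model, 48 FF 20 (jmp [rax] with REX.W) and 48 FF E0 (jmp rax with REX.W) both match, while FF 20 (the same jmp without the prefix) does not.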
