Subtleties of C# IL codegen


It must be CLR week over at The Old New Thing because it's been non-stop posts about C# lately. Raymond's last two technical posts have been about null checks and no-op instructions generated by the jitter when translating IL into machine code.  

I'll comment on both posts here, but I want to get the no-op discussion done first, because there are some subtleties here. I believe that Raymond's statement that the jitter does not generate no-ops when not debugging is not entirely correct. This is not a mere nitpick -- as we'll see, whether it does so or not actually has semantic relevance in rare stress cases.

Now, I'll slap a disclaimer of my own on here: I know way more about the compiler than the jitter/debugger interaction. This is my understanding of how it works. If someone who actually works on the jitter would like to confirm my and Raymond's interpretation of what we see going on here, I'd welcome that.

Before I get into the details, let me point out that in the C# compiler, "debug info emitting on/off" and "IL optimizations on/off" are orthogonal settings. One controls whether debug info is emitted, the other controls what IL the code generator spits out. It is sensible to set them as opposites but you certainly do not have to.

With optimizations off, the C# compiler emits no-op IL instructions all over the place.  With debug info on and optimizations off, some of those no-ops will be there to be targets of breakpoints for statements or fragments of expressions which would otherwise be hard to put a breakpoint on.

The jitter then cheerfully turns IL no-ops into x86 no-ops. I suspect that it does so whether there is a debugger attached or not.

Furthermore, I have not heard that the jitter ever manufactures no-ops out of whole cloth for debugging purposes, as Raymond implies. I suspect -- but I have not verified -- that if you compile your C# program with debug info on AND optimizations on, then you'll see a lot fewer no-ops in the jitted code (and your debugging experience will be correspondingly worse). The jitter may of course generate no-ops for other purposes -- padding code out to word boundaries, etc.

Now we come to the important point: It is emphatically NOT the case that a no-op cannot affect the behaviour of a program, as many people incorrectly believe.

In C#, lock(expression) statement is syntactic sugar for something like

temp = expression;
System.Threading.Monitor.Enter(temp);
try { statement } finally { System.Threading.Monitor.Exit(temp); }
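
To make the expansion concrete, here is a minimal sketch that puts the lock statement next to a hand-written equivalent. The names `LockExpansionSketch`, `Gate` and `count` are hypothetical and exist only for illustration; `Monitor.Enter` and `Monitor.Exit` are the calls named above. It is a sketch of the shape of the code, not the compiler's exact output.

```csharp
using System.Threading;

public static class LockExpansionSketch
{
    // Hypothetical lock object and shared state, for illustration only.
    private static readonly object Gate = new object();
    private static int count;

    // Written with the lock statement.
    public static int IncrementWithLock()
    {
        lock (Gate)
        {
            return ++count;
        }
    }

    // Hand-written version of the expansion sketched above.
    public static int IncrementExpanded()
    {
        object temp = Gate;
        Monitor.Enter(temp);
        // Anything that executes here, after Enter but before the try,
        // is outside the protected region.
        try
        {
            return ++count;
        }
        finally
        {
            Monitor.Exit(temp);
        }
    }
}
```

Both methods take and release the same lock, so they can be mixed freely by callers.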

The x86 jitter has the nice property that the code it generates guarantees that an exception is never thrown between the Enter and the try. This means that the finally always executes if the lock has been taken, which means that the locked resource is always unlocked.

That is, unless the C# compiler generates a no-op IL instruction between the Enter and the try! The jitter turns that into a no-op x86 instruction, and it is possible for another thread to cause a thread abort exception while the thread that just took the lock is in the no-op. This is a long-standing bug in C# which we will unfortunately not be fixing for C# 3.0.

If the scenario I've described happens then the finally will never be run, the lock will never be released and hey, now we're just begging for a deadlock.

That's the only situation I know of in which emitting a no-op can cause a serious semantic change in a program -- turning a working program into a deadlocking one. And that sucks.

I've been talking with some of the CLR jitter and threading guys about ways we can fix this more robustly than merely removing the no-op. I'm hoping we'll figure something out for some future version of the C# language.
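
One possible shape for a more robust fix is to move the Enter inside the try and have it report whether the lock was actually taken. The sketch below uses the `Monitor.Enter(object, ref bool)` overload found in later .NET releases (.NET Framework 4 onward). That this matches whatever fix is eventually chosen is an assumption, and `LockTakenSketch`, `Gate` and `count` are hypothetical names.

```csharp
using System.Threading;

public static class LockTakenSketch
{
    // Hypothetical lock object and shared state, for illustration only.
    private static readonly object Gate = new object();
    private static int count;

    public static int Increment()
    {
        object temp = Gate;
        bool lockTaken = false;
        try
        {
            // Enter now sits inside the protected region. lockTaken is set
            // to true only if the lock was acquired.
            Monitor.Enter(temp, ref lockTaken);
            return ++count;
        }
        finally
        {
            // Release only what was actually taken.
            if (lockTaken) Monitor.Exit(temp);
        }
    }
}
```

With this shape, an instruction between the Enter and the body no longer matters, because both are already inside the try.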

As for the bit about emitting null checks: indeed, at the time of a call to an instance method, whether virtual or not, we guarantee that the object of the call is not null by throwing an exception if it is. The way this is implemented in IL is a little odd. There are two instructions we can emit: call, and callvirt. call does NOT do a null check and does a non-virtual call. callvirt does do a null check and does a virtual call if it is a virtual method, or a non-virtual call if it is not.

If you look at the IL generated for a non-virtual call on an instance method, you'll see that sometimes we generate a call, sometimes we generate a callvirt. Why? We generate the callvirt when we want to force the jitter to generate a null check. We generate a call when we know that no null check is necessary, thereby allowing the jitter to skip the null check and generate slightly faster and smaller code.

When do we know that the null check can be skipped? If you have something like (new Foo()).FooNonVirtualMethod() we know that the allocator never returns null, so we can skip the check. It's a nice, straightforward optimization, but the realization in the IL is a bit subtle.
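
The null check is easy to observe from C#. In this small sketch, `Foo` and `FooNonVirtualMethod` follow the names used above, while `NullCheckSketch` and its helpers are hypothetical. The method never touches `this`, yet calling it through a null reference still throws.

```csharp
using System;

public class Foo
{
    // Non-virtual, and the body never uses 'this'.
    public string FooNonVirtualMethod()
    {
        return "called";
    }
}

public static class NullCheckSketch
{
    // The receiver might be null, so the call must be null-checked.
    public static string CallOn(Foo receiver)
    {
        try
        {
            return receiver.FooNonVirtualMethod();
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    // The receiver is a freshly allocated object, the case described
    // above where the check can be skipped.
    public static string CallOnFresh()
    {
        return (new Foo()).FooNonVirtualMethod();
    }
}
```

`NullCheckSketch.CallOn(null)` reports the exception, while `CallOn(new Foo())` and `CallOnFresh()` both return "called". To see which instruction was chosen for each call site, inspect the compiled IL with a disassembler.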

  • The JIT does emit different code when a debugger is attached.  I don't know if that specifically has an effect on no-ops; but it wouldn't be surprising.  It would be fairly easy to see if the assembly was NGENed or attached to by the debugger after the code had been run outside the debugger.

  • >>This means that the finally always executes if the lock has been taken,

    >>which means that the locked resource is always unlocked.

    Not always: if it's on a background thread, the finally might not be called if the process ends (all foreground threads have ended). I found that out the hard way when I had a finally that sometimes never completed in some code I wrote :-)

    Great blog by the way!

    Regards

    Lee

  • i'm sure there is a legitimate reason that the Monitor.Enter(temp) can't go in the try. but i don't know what it is. so ... why not? it would seemingly make sense.

  • Mikey: because if the Monitor.Enter were in the try block and it caused an exception, the finally block would always get executed.  You can only assume that if Monitor.Enter threw an exception that it didn't lock.  If it didn't lock then there's no reason to enter the finally block to ensure that Monitor.Exit is called.  So, it's outside the try block.

  • peter: true. oops.

    but surely there is a reasonably sensible way to handle that in the try? (i.e. wrap it in an inner try, perhaps, [not pretty i suppose]) or check if you received the lock before exiting.

  • Mikey: sure, you can skip the whole C# lock keyword and do whatever you want with Monitor.Enter, Monitor.Exit, try and finally, adding any number of checks you want. But what would be the point?  Likely you'd want it so it's debug-only; so you've got that complexity. Add that to all the other complexities, and can you guarantee that all instances of that code will be fault free?

    Keep in mind, the scenario that Eric discusses will only occur on a debug build on a multi-processor computer and two threads have to be executing the same instruction (essentially) at the same time.  That's an extremely rare occurrence.  Yes, it might happen; but I wouldn't suggest changing your code to compensate for it.  Debugging multithreaded code has many other problems.

  • > or check if you received the lock before exiting.

    As I said, we're talking with the CLR guys to try to do something like that. For example, we could have a version of Enter which reports whether the lock was taken, and then put the Enter in the try. Then "lock(x) statement" would translate to

    bool mustExit = false;
    object temp = x;
    try {
      Enter(temp, out mustExit);
      statement;
    }
    finally {
      if (mustExit) Exit(temp);
    }

    Enter would have to set the out parameter atomically with taking out the lock.

    However, there are drawbacks to that approach as well.  I'll probably write a blog article about that at some point.

  • Even if you patched up the block to guarantee that the finally block would be entered, isn't there still the potential problem that the same thing could happen in the finally block?

    e.g.:  try { ... } finally { NOP; cleanup; }

    Is it guaranteed that an exception cannot be thrown between the start of the finally block and the first statement of the finally block?  If not, it seems like that would need to be patched as well.

  • Peter: "Keep in mind, the scenario that Eric discusses will only occur on a debug built on a multi-processor computer and two threads have to be executing the same instruction (essentially) at the same time."

    That's not how I read it.  It seemed that Eric was saying that the NOP can occur even when not in debug mode.  It also seems that the problem wasn't from two threads executing the same code, but instead, that one thread is attempting to take the lock when another thread kills it.  This is probably (hopefully) still rare, but not as rare as what you're describing.

  • Derek: to be clear, yes you can get NOPs to be emitted in release mode; but you'd have to disable optimizations.  Optimizations by default are only disabled for debug mode.

    To correct myself: it's not that the two threads would be executing the same instruction at the same time, it's that one thread would need to be executing the NOP after Monitor.Enter (the try block begins at the instruction after that, which may also be a nop, i.e. the next instruction is not a "try" instruction) and the other thread would have to call that thread's Abort method while the NOP instruction was being executed.  I would think that would be even more rare.

  • Eric: could the .try directive simply not include the NOP following the Monitor.Enter to solve the problem?  

    This is what I'm seeing in IL:

       L_0011: call void [mscorlib]System.Threading.Monitor::Enter(object)
       L_0016: nop
       L_0017: nop
       L_0018: ldc.i4.1
       L_0019: stloc.1
       L_001a: nop
       L_001b: leave.s L_0025
       L_001d: ldloc.2
       L_001e: call void [mscorlib]System.Threading.Monitor::Exit(object)
       L_0023: nop
       L_0024: endfinally
       L_0025: nop

      .try L_0017 to L_001d finally handler L_001d to L_0025

    What would be the debugging consequences of changing it to the following?

      .try L_0016 to L_001d finally handler L_001d to L_0025

  • peter:

    do you mean trying it to L_0018 instead of L_0016? otherwise you're including two nops. also, wouldn't the fact that you aren't including the nops mean you can no longer put a break point on the start of the

    try {

    statement?

    i think a nop before a try is valid and fine; it just seems that the generated Monitor.Enter should be within the try, with a 'achievedLock' boolean result from .Enter.

  • Mikey: Yes, L_0016, not L_0018--which would include both the NOPs in the protected region (aka try block).  Yes, I suppose that would not allow you to put a break point at the start of a try block, but Visual C# 2005 doesn't let me do that anyway.  I can put a break point on the open brace, not the try.  Which puts it at the second NOP at the start of the protected region (I'm assuming, this is x86 instructions, there's two x86 NOP instructions just like the IL; I'm assuming there's a one-to-one relationship).  If including both NOPs in the protected region gets around this issue, let's do it, or just get rid of the first NOP.

    lock(someObject)

    { ... }

    is a different issue.  I need to be able to break on the lock statement because I need to break before the call to Monitor.Enter (I can get that with the try because Monitor.Enter is on its own line).  Also with lock I can break on the open brace; and as with the try, the breakpoint on the open brace puts it at the second NOP, at the start of the protected region.

    Now, you might be saying, "well, having both NOPs allows me to fine-tune my breakpoints in the disassembly"; but it doesn't. You already have a much finer-grained ability to break on disassembly instructions, so adding a couple of NOPs adds nothing.  In disassembly I can put a breakpoint on the call to Monitor.Enter, or the first instruction in the protected region just as I can with the C# debugger--I see no need to be able to put a breakpoint after the call to Enter but before the protected region (the first NOP), and I don't see a need to put a breakpoint before the first instruction in the protected region but within the protected region (the second NOP).

    Now, as far as I can tell (Eric can correct me or validate this) this is strictly a C# compiler thing (the two NOPs).  I believe he's said he really doesn't know why the NOPs are there and it's his interpretation they're there to provide an instruction to put a breakpoint on, which explains one NOP but not two, since the C# debugger only allows you to put a breakpoint on one of them, the one in the protected region--so the first NOP that's causing all the fuss isn't even being used by the debugger.

  • Eric perhaps you can help out with this? Any explanation?

    http://11011.net/archives/000714.html

  • Continuing the theme of Thead.Sleep is a sign of a poorly designed program , I've been meaning to
