Why are base class calls from anonymous delegates nonverifiable?

I'm still learning my way around the C# codebase – heck, I'm still learning my way around the JScript codebase and I've been working on it for nine years, not nine weeks. Here's something I stumbled across while refactoring the anonymous method binding code last week that I thought might be interesting to you folks.

Consider this simple program:

using System;
public delegate void D();
public class Alpha {
  public virtual void Blah() {
    Console.WriteLine("Alpha.Blah");
  }
}
public class Bravo : Alpha {
  public override void Blah() {
    Console.WriteLine("Bravo.Blah");
    base.Blah();
  }
}
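
For example, a hypothetical caller (not part of the program above) makes the dispatch concrete:

Alpha a = new Bravo();
a.Blah();  // virtual dispatch selects Bravo.Blah, which then calls
           // Alpha.Blah non-virtually, printing "Bravo.Blah"
           // followed by "Alpha.Blah"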

Pretty straightforward so far. Any virtual call to Blah on an instance of Bravo does a non-virtual call to the base class implementation. Now let's do a similar thing from an anonymous delegate in Bravo:

  public void Charlie() {
    int x = 123;
    D d = delegate {
      this.Blah();
      base.Blah();
      Console.WriteLine(x);
    };
    d();
  }

When you compile this thing up you get a crazy-sounding warning:

warning CS1911: Access to member 'Alpha.Blah()' through a 'base' keyword from an anonymous method or iterator results in unverifiable code. Consider moving the access into a helper method on the containing type.

And indeed, if you compile this up and run it through PEVERIFY.EXE, sure enough you'll get an unverifiable code warning. Unverifiable code requires full trust and is generally to be avoided – what is going on here?

This is an artefact of the way that C# realizes anonymous delegates. The anonymous delegate above captures both this and x, and therefore we actually generate code that would look something like this if you decompiled it:

public class Bravo : Alpha {
  public override void Blah() {
    Console.WriteLine("Bravo.Blah");
    base.Blah();
  }
  private class __locals {
    public int __x;
    public Bravo __this;
    public void __method() {
      this.__this.Blah();
      __nonvirtual__ ((Alpha)this.__this).Blah();
      Console.WriteLine(this.__x);
    }
  }
  public void Charlie() {
    __locals locals = new __locals();
    locals.__x = 123;
    locals.__this = this;
    D d = new D(locals.__method);
    d();
  }
}

Of course there is no such thing in C# as __nonvirtual__, so really there is no way to decompile this thing back into real C#.  Which is exactly the problem! What's happening here is that we are doing a non-virtual call on a this reference which is not the "real" this reference of the call.

That's a pretty suspicious programming practice.  People who build object hierarchies that manipulate sensitive information have a reasonable expectation that the only way to call a base class implementation of a virtual method is from the derived class itself, not from some third-party method.  Therefore the CLR treats this as unverifiable code, and therefore we have to issue a warning. (I suppose the CLR team could have made an exception for nonvirtual references from classes scoped to within the class in question, or for that matter, classes from the same assembly, but they didn't.)
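
For instance, consider a hypothetical hierarchy in which the override adds a security check; a nonvirtual call from outside the derived class would skip that check entirely:

public class Account {
  public virtual void Withdraw(int amount) {
    Console.WriteLine("withdrew " + amount);
  }
}
public class AuditedAccount : Account {
  public override void Withdraw(int amount) {
    Audit(amount);          // hypothetical security check
    base.Withdraw(amount);  // fine: made from the derived class itself
  }
  private void Audit(int amount) { /* demand permission, log, etc. */ }
}
// If some third-party method could do the moral equivalent of
//   __nonvirtual__ ((Account)account).Withdraw(100);
// on an AuditedAccount, the audit would be silently bypassed.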

Now, I was refactoring the code that generates the captured variable class, and so of course I wanted to write a few additional unit tests to make sure that I wasn't breaking anything. When I ran into this warning the first unit test that I wrote was actually a somewhat simplified version of the above:

  public void Charlie() {
    D d = delegate {
      this.Blah();
      base.Blah();
    };
    d();
  }

This issues the warning above, but when I ran the generated executable through PEVERIFY I was surprised to discover that it verified just fine.

As it turns out, the compiler is clever about this one. It sees that the only captured variable is the this reference, and therefore it can optimize away the locals class.  For this case it simply generates the anonymous method as just another method on the Bravo class. Since this is a method of Bravo, not a special locals class, the nonvirtual call is on the real this reference, so it is verifiable.
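
In decompiled-equivalent terms (the generated method name below is again something I've made up), the optimized output would look something like this:

public class Bravo : Alpha {
  public override void Blah() {
    Console.WriteLine("Bravo.Blah");
    base.Blah();
  }
  // generated directly on Bravo, so base.Blah() here is an ordinary,
  // verifiable base call on the real this reference
  private void __anonymous() {
    this.Blah();
    base.Blah();
  }
  public void Charlie() {
    D d = new D(this.__anonymous);
    d();
  }
}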

We decided to issue the warning even in this case because we thought that it would make C# seem really weird and brittle to suddenly start issuing the warning when you add an additional outer local variable reference to the anonymous delegate. Even when this doesn't actually generate nonverifiable code, it's a good idea to get in the habit of creating a helper method on the real class that does the nonvirtual access.
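
Concretely, here is a minimal sketch of that habit applied to Charlie; the helper name BaseBlah is my own invention, and any name will do:

public class Bravo : Alpha {
  public override void Blah() {
    Console.WriteLine("Bravo.Blah");
    base.Blah();
  }
  // the nonvirtual access lives on the real class...
  private void BaseBlah() {
    base.Blah();
  }
  public void Charlie() {
    int x = 123;
    D d = delegate {
      this.Blah();
      this.BaseBlah();  // ...so the delegate makes only an ordinary call
                        // on the captured reference: no warning, and the
                        // code verifies whether or not locals are captured
      Console.WriteLine(x);
    };
    d();
  }
}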

Of course, all of the stuff above is implementation details. You cannot rely upon future versions of C# to continue to generate anonymous methods in this manner. We probably will, but who knows what new features will be added to the CLR that might make it possible to not generate a bunch of hidden classes behind the scenes to do this work? Please do not attempt to do anything silly like reflecting upon the class to discover the hidden nested classes and use them; you're just asking for future pain if you do.

  • But if you know the solution (create a helper method) and the compiler's going to so much work to magically generate invisible classes and suchlike to make everything just work the way you'd expect, why on *earth* doesn't it just generate the helper method for you? The security concerns would be mitigated by making that helper method internal so that only code in the same assembly could call it, and since you (csc.exe) have control over the entire assembly you could make sure no code in the assembly tries it. Somebody with full trust could reflect into it, but somebody with full trust can do much more harmful things already.

    So why doesn't the compiler "just fix it"?
  • Seems to me that the guideline should be: don't have ANY real code in any anonymous delegates; they exist merely to give you a closure that carries the local scope through. Doing anything else is a code smell, as it'll lead to code duplication and odd coupling.
    So yeah, keep the warning. If I had my way, it would gripe for anything other than a direct call to a method passing the scoped values along, but that's just me.
  • Good idea Stuart -- I'll run that by the rest of the compiler team and perhaps we can fix this for a future release.
  • I just followed up with the compiler lead. We _do_ already have this as a potential work item for the next release.

    The reason that we did not do this in the 2.0 release is because the change to the verifier was made very late, and we considered the risk of changing the code generator to be too large so close to ship. Creating a new warning is much lower risk than changing codegen late in the game.
  • Marc, I disagree that anonymous delegates should be limited to a single method call. The right unit of code to put in an anonymous delegate is no different than the right unit of code to put in an if statement or a for loop; it only needs to be broken out into a separate method if it's going to be called from more than one place in the code or if it's a reasonable standalone chunk of code that makes more sense that way. Forcing a once-off chunk of code to be extracted into a method just because you're using an anonymous delegate hurts readability and maintainability far more than it helps.
  • It makes me really afraid. It makes me crazy. It makes me sad.

    Don't you think non-verifiable code should be available with the /unsafe option only?

    IMHO, it should be treated as an 'error', not a 'warning'. And it would have been reasonable for the C# Compiler Team to do it before November 7, not after.
  • I do think that in an ideal world it would be impossible to generate unverifiable code accidentally. However, we do not live in that world. We live in a world where we had to choose one of the following:

    1) roll back the verifier change, thereby allowing potentially unsafe code to be blessed by the verifier.

    2) do nothing, thereby silently generating unverifiable code

    3) generate an error, thereby making the C# 2.0 implementation incompatible with the published specification, which certainly does not call out making a nonvirtual call from a delegate as illegal.

    4) generate an error and change the specification.

    5) change the code generator, risking destabilizing all of Whidbey and putting the ship date into jeopardy.

    6) generate a warning and fix it in the next version.

    None of these are great choices, but these are the choices you have to make when you ship software at this level -- thousands of people working on millions of lines of code that ships to millions of people. I strongly believe that (6) was the right choice for the C# team to make for our customers.


  • Eric, option (6) is no more compatible with the C# specification than option (3).

    With the current behavior Microsoft breaks the 'unsafe' part of the specification. As you can see in the specification, unsafe features are available in an unsafe context only.

    Do you agree? I hope you do.

    So in the current implementation, the C# Compiler Team just quietly breaks the specification without clear notice.

    I understand that sometimes there is no time to complete a product by a predefined date. But it would be much better to report an error with a clear notice: 'Right now we cannot be compatible with the specification because of lack of time'. Instead they push an unstable version to market.

    Unstable really is the correct word. How can customers debug unsafe-related errors? What do you think about partially trusted scenarios? In partially-trusted code UNSAFE IS STRONGLY PROHIBITED! This means ClickOnce may sometimes fail with a hard-to-discover cause: some acrobatic anonymous method usage. What percentage of C# programmers have enough expertise to correctly understand this 'behavior'?

    I am a Developer Security MVP, so the Microsoft platform is something I care about. That is why I pay so much attention to this behavior. This flame is not because I am a Linux guy who likes to criticize Windows. I'd like to make it better :-)
  • Themes == Oleg Mihailik. I don't know why it calls me 'Themes' :-)
  • > So in the current implementation, the C# Compiler Team just quietly breaks the specification without clear notice.

    A warning that says "this is unverifiable, here is how to work around it" is hardly what I would call "without clear notice"! Developers like yourself who care about verifiability should now have enough information to diagnose and fix the problem.


  • Eric, I don't think verifiability is something that is important only for developers like myself. It is important for most developers even if they don't care about it explicitly.

    About 'clear notice'. There are some details that make that notice not so clear.

    The first detail is that it is not an error but a warning. As you know, sometimes developers have no time to clean their product of warnings. Especially warnings that are not clearly understood.

    The other detail is the term 'unverifiable'. 'Unverifiable' sounds safer to C# programmers than the well-known 'unsafe'. Let's ask some common C# programmer two questions:

    "Is it important for your program to be safe?" and
    "Is it important for your program to be verifiable?"

    I predict it will be about 90% / 30%. So this term gives the programmer a false sense of safety.

    And another detail. That warning says nothing about the C# specification break. Don't you think that makes the notice unclear? Even you did not clearly see this break until I explicitly claimed it in the comments.
  • Offtopic; sorry.

    Eric, you used to work on WSH, didn't you?

    It's a bit late in the game to ask, but why was the decision made to make simple, common scripting tasks so horribly painful? Say, opening a text file and writing to it, or running a child process? Was it an NIH thing, or was there nobody on the team who'd ever used an interpreted language, or did somebody decide that it was more important to be orthogonal than usable?

    It's a real misery, just writing to a text file: First you create a FileSystemObject. Then you call GetFile() on it. If the file doesn't exist, you create it. Once the file exists, you open a TextStream on it (or try, anyway: All it gives me is "Object doesn't support this property or method", though the file on the disk has certainly been created). You waste fifteen minutes digging around in MSDN and fighting with the debugger. It's crazy.

    I mean, it's much easier to do the same job in K&R C. The usual thing is that in a higher-level language, everyday tasks get easier.

    The obvious solution is just to use Perl, but that's not widely installed.

    So what happened?
  • I agree that the scripting object models are in general a bit of a mess.

    There are many, many reasons why the scripting object models ended up far from the way I would have liked them to be. The two big ones are:

    * simply not enough manpower. The scripting dev team was I think three devs at the time and we had a lot on our plates. We made an explicit decision to leave building kick-ass object models to the WMI team and concentrate on language tools.

    * bad coordination -- WSH was actually developed by MSN Japan! We took it over very late in the game, after most of the code had been written and shipped to customers. We got stuck with what I consider to be a not particularly well-factored and very ad-hoc object model. You go to war with the army you've got -- MSN Japan had already shipped bits and customers were using them, so we inherited a backwards-compat problem.

    A third reason:

    * scripting has many masters. Should we have invested in making the languages themselves suitable for spawning processes and writing files, or should that have been left to object models? If the former, we have to justify the expense to the IE team, our #1 customer. The expense isn't that the IE team is spending their budget, the expense is that we are spending time developing features that actively destroy the IE security model and provide zero benefit, when we could be adding new security features to the browser.

    Could the object models have been better designed? Yes, but ultimately you run into some pretty strong limitations of object models. Microsoft took a long time to discover this, and that's why I'm so excited about LINQ -- finally we get to build data stuff in a highly general way into the language itself, not into some clunky object model.

    But could we have done something like LINQ in scripting? I don't think so, not without abandoning other important goals.

    Should we have had more staff? yeah, probably, but that's certainly not my decision to make!



  • Eric,

    Point well taken about IE security, but doesn't IE now simply refuse to instantiate arbitrary COM objects?

    Aside from that, I guess the development model was a problem: The "APIs" I'm comparing it to were all "designed" in a much more leisurely and ad-hoc way, with a lot more time to fiddle around and discard ideas that didn't wear well, with no real worries about backwards compatibility or any of the other crap MS has to deal with.

    Warts and all, JS in WSH is still really cool. And WSH itself is really cool, too: Windows Script Components, for example.

    But anyhow, thank you for a very thoughtful, informative, and polite response to a needlessly obnoxious question.
  • This is a bit pointless. Anonymous methods are supposed to help us write less code, but in the end I simply duplicated a lot of redundant code. Imagine a class that inherits a typed dataset's table adapter, where you would like to take over its update methods so that they call other methods within a transaction before updating through the base commands. I was able to pass an anonymous method for each update signature to a central method that accepted my delegate. It did all the other necessary work within a transaction and then invoked the anonymous method. Simple, but now that I need these helper classes, I am right back where I started, with loads of redundant code to work with. Helper classes don't help me, they just help the compiler. What do I care about them?
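
    A minimal sketch of the pattern being described (all the names here are hypothetical, and TransactionScope is just one way to get the transaction):

    using System;
    using System.Transactions;

    public delegate void UpdateAction();

    // stand-in for a typed dataset's generated table adapter
    public class GeneratedAdapter {
      public virtual int Update(string row) {
        Console.WriteLine("updating " + row);
        return 1;
      }
    }

    public class MyAdapter : GeneratedAdapter {
      // the central method: shared transactional work, then the
      // per-signature anonymous method passed in by the caller
      private void RunInTransaction(UpdateAction update) {
        using (TransactionScope scope = new TransactionScope()) {
          // ... other necessary work here ...
          update();
          scope.Complete();
        }
      }
      public override int Update(string row) {
        int result = 0;
        // the base call inside the delegate is exactly what triggers
        // warning CS1911 and forces a helper method per update signature
        RunInTransaction(delegate { result = base.Update(row); });
        return result;
      }
    }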
