The void is invariant

The void is invariant

Rate This
  • Comments 21

[UPDATES below] 

A while back I described a kind of variance that we’ve supported since C# 2.0. When assigning a method group to a delegate type, such that both the selected method and the delegate target agree that their return type is a reference type, then the conversion is allowed to be covariant. That is, you can say:

Giraffe GetGiraffe() { … }

Func<Animal> f = GetGiraffe;

This works logically because anyone who calls f must be able to handle any animal that comes back. The actual method claims to only return animals, and in fact, makes the stronger claim to only return giraffes.

This works out in the CLR because the bits that make up a reference to an instance of Giraffe are exactly the same bits that make up a reference to that Giraffe interpreted as an instance of Animal. We can allow this magical conversion to happen because the CLR guarantees that it will all just work out without going in there and having to futz around with the bits.

This is why this trick only works with reference types. A method that returns, say, a double cannot be converted via a covariant conversion to a delegate type that expects the method to return an object. Somewhere there would have to be code emitted that takes the returned double and boxes it to object; the bits of a double and the bits of a reference to an object boxing a double are completely different.

But why doesn’t this trick work with void types? Here we have a method that returns some sort of success or failure code. Maybe we don’t care what it returns.

static bool DoSomething(bool b)
{
  if (b) return DoTheThing();
  else return DoTheOtherThing();
}

Action<bool> action = DoSomething;

This doesn’t work. Why not? The caller of the action is not even going to use the returned value, so it doesn’t matter one bit what it is! Shouldn’t “void” be considered a supertype of all possible types for the purposes of covariant return type conversions from method groups to delegate types?

No, and I’ll tell you why.

Consider what happens when you do this:

bool x = DoSomething(true);

We spit out IL that does the following:

(1) put true on the IL stack – the stack gets one deeper
(2) call DoSomething – the argument is removed from the stack and the return value is placed on the stack.  Net, the stack stays the same size as before
(3) stuff whatever on top of the stack into local variable x – the stack now returns to its original depth.

Now consider what happens when you do this:

DoSomething(true);

We spit out IL that does the first two steps as before. But we cannot stop there! There is now a bool on the IL stack which needs to be removed. We generate a pop instruction to represent the fact that the returned bool has been discarded.

Now consider what happens when you do this:

action(true);

The compiler believes that action is a void-returning method, so it does not generate a pop instruction. If we allowed you to stuff DoSomething into the action, then we would be allowing you to misalign the IL stack!

But didn’t I say “the stack is an implementation detail?” Yes, but that’s a different stack. The CLI specification describes a “virtual machine” which passes around arguments and returned values on a stack. An implementation of the CLI is required to make something that behaves like the specified machine, but it is not required to do so in any particular manner. It is not required to use the million-bytes-per-thread stack supplied to each thread by the operating system as its implementation of the IL stack; that’s a convenient structure to use, of course, but it’s an implementation detail that it does so.

(As an aside: when we implemented the script engines, we also first specified our own private stack-based virtual machine. When we implemented it, we decided to put the information about “return addresses” – that is, “what code do I run next?” on the system stack, but we put arguments and return values of script functions in a stack-shaped block of memory that we allocated on our own. This made building the JScript garbage collector easier.)

In practice, the jitter uses the system stack for some things and registers for other things. Return values are actually often sent back in a register, not on the stack. But that implementation detail doesn’t help us out when deciding what the conversion rules are; we have to assume that the implementation can do no more than what the CLI specification says. Had the CLI specification said “the returned value of any function is passed back in a ‘virtual register’” rather than having it pushed onto the stack, then we could have made void-returning delegates compatible with functions that returned anything. You can always just ignore the value in the register. But that’s not what the CLI specified, so that’s not what we can do.

[UPDATE]

A number of people have asked in the comments why we do not simply generate a helper method that does what you want. That is, when you say

Action<bool> action = DoSomething;

realize that as

static void DoSomethingHelper(bool b)
{
   bool result = DoSomething(b); // result is ignored
}
...
Action<bool> action = DoSomethingHelper;

We could do that. But where would you like the line to be drawn? Should you be able to assign a reference to a method that returns an int to a Func<Nullable<int>>? We could spit a helper method that converts the int to a nullable int. What about Func<double>? We could spit a helper method that converts the int to a double. What about Func<object>? We could spit a helper method that boxes the int, unexpectedly allocating memory off the heap every time you call it. What about a Func<Foo> where there is a user-defined implicit conversion from int to Foo?

We could be spitting arbitrarily complex fixer-upper methods that would seamlessly "do what you meant to say", and we have to stop somewhere. The exact semantics of what we do and do not fix up would have to be designed, specified, implemented, tested, documented, shipped to customers and maintained forever. Those are costs. Plus, every time we add a new conversion rule to the language we add breaking changes. The costs of those breaking changes to our customers have to be factored in.

But more fundamentally, one of the design principles of C# is "if you say something wrong then we tell you rather than trying to guess what you meant". JScript is deliberately a "muddle on through and do the best you can" language; C# is not. If what you want to do is make a delegate to a helper method then you express that intention by going right ahead and making that method.

  • that's one deep post about the type system. I have never analyzed it in so much depth although i run into the problem before with action<int> (btw i allways wandered if int is the most commonly used type ever and i think it is) trying to do the same thing... The compiler yelled at me so I considered aaah void is not a supertype of int me bad, never even thinking about the stack

    now i am one thing smarter:)

    thanks for that

    luke

  • This seems like an implementation detail. In the end, the actual stack that is used when executing code is native stack, not IL stack - and it seems like CLR itself could insert cleanup calls for cases where a non-void method would be called via a void delegate.

    In fact, it seems that it wouldn't even need any cleanup on x86 for most cases, given that in native code the return value is just EAX for all primitives <= sizeof(int) and all reference types, which can be safely ignored. So it would only have to do additional setup for returning value types and such.

    Of course, when looked that way, it's not a C# limitation - it's a CLR limitation.

  • why not to translate it as

    Action<bool> action = x => DoSomething(x);

  • Another reason why the unit approach from ML used in F# is superior to the strange special-case void keyword =)

    It would allows syntax like C++'s "void return" too:

       void foo() { return; }

       void bar() { return foo(); }

    Whereas trying the same thing in C# gives error CS0127:

           private void Foo()

           {

               return;

           }

           private void Bar()

           {

               return Foo();

           }

  • Not to nitpick,

    Really? -- Eric

    but shouldn't "member group" be "method group" in the sentence "When assigning a member group to a delegate type"? Also, in the same sentence you say "selected member" and I think that should be "selected method"

    Yes. I make this mistake all the time; if you read back in the blog carefully you'll see that I call out this mistake here and there. The reason is because the specification says "method group" but the structures in the compiler say "member group", and I read the compiler sources more often than I read the spec. The data structures in the compiler can represent both methods and propery accessors, so "member group" is a reasonable name. I also have a bad habit of saying "type variable" when I mean "type parameter" for the same reason. The compiler and the spec were written at the same time; the compiler authors had to guess what the spec authors were going to choose as the jargon, and sometimes they guessed wrong. One of these days I'll clean all those up. -- Eric

  • It looks like you meant

    Action<void> action = DoSomething;

    where you wrote

    Action<bool> action = DoSomething;

    ... as written, the code compiles just fine.

  • CPDaniel: I don't have Visual Studio on me to check, but your comment puzzles me. He didn't mean Action<void> because that would mean a method that accepts a void parameter - which is invalid any way. Eric was talking about return value covariance, where the Func<bool, bool> becomes a Action<bool>, returning void instead of a bool.

  • Fascinating as always :) Do you personally know of any good textbooks on this kind of thing - type systems, lambda calculus-y goodness? (The obvious SICP notwithstanding.) I'd love to flesh out my knowledge more on the theory behind languages and compilers.

    "The Little Schemer" uses a fun question-and-answer approach to learning about CS theory via lambdas. But I recommend asking this question on stackoverflow; you'll get a lot higher quality of responses. -- Eric

  • "Shouldn’t “void” be considered a supertype of all possible types for the purposes of covariant return type conversions from method groups to delegate types?"

    I see no reason why it should. It doesn't work for normal delegates, so I can't imagine why it should for Func<T> or Action<T>.

  • configurator:

    Yes, you're right - I wasn't paying attention!  The void return type is implied by the Action<T> generic delegate type.

  • An interesting insight into the internals of how the compiler treats delegates and method group conversions to delegates. One question though, couldn't the compiler still allow this by automatically wrapping the method in a dynamically generated lambda, like so:

     Action<bool> action = a => DoSomething(b);

    Aside from the potential performance implications of the double indirection, is there a non-obvious reason why this could be problematic?

  • Leopold,

    As previously pointed out, it is not just a simple "double indirection". Right in the middle code must exist which cleans up the return value of "DoSomething". When using the technique explicitly, the developer should be aware of this. If the compiler implemented this "automatically" (appearing to be co-variant) then a "secret" overhead would exist.

    At least that is why I am glad that the compiler does not do this automatically....

  • A secret overhead already exists in many places in the JIT - for example, covariance and contravariance of delegate types when interfaces are involved cannot be free - it has to do some pointer shifting and such.

    Even C++ has that - compile a code sample with covariant return types of virtual functions with virtual base classes with /FA and see what dances the compiler has to do to make it all work...

  • Eric, yes, really. How will I learn without asking these annoying questions all the time? Maybe I misunderstood something and a member group is actually something else? Now you've assured me that I understood your post correctly - and gave me a little insight into the compiler stuff.

    Do this member groups represent all properties (as in, a setter and a getter is the same group), or overloaded indexers, or something else that I haven't thought of? (I realize there can be a group with exactly one member and there would always be a group. I'm asking when is there more than one member in the group?)

  • Very interesting, but I'd second freeborn's question. If a very simple boilerplate code shape can solve the problem, then the technical problem doesn't seem to explain the language design decision.

    But then, if a very simple boilerplate code shape can solve the problem, that's obviously going to push this problem down the list of priorities.

    At one point I was dead keen on the possibility that C# (or the CLR) would one day unify Func<void> and Action, making them just synonyms for exactly the same thing. I even tried to work the whole thing out, for my own amusement... but in fact it makes more sense for them to be separate. Action<...> is for things where order of execution is important because there may be side effects. Func<T, ...> is ideally for side-effect free, order-ignorant computations. So it may be better to keep them separate.

    Similar to the popular debate over a "ForEach" extension method, which also appealed to me until I read what Eric had to say about it. There is probably some value in making actions and functions "look" different in our code.

Page 1 of 2 (21 items) 12