Lambda Expressions vs. Anonymous Methods, Part Three

Last time I said that I would describe a sneaky trick whereby you can get variable type inference out of a lambda by using method type inference.

First off, I want to emphasize again that the reason we are adding so many type inference features to C# 3.0 is not just that it is convenient to reduce the redundancy currently required. That's a nice-to-have feature, but certainly not a must-have feature. No, the reason we are adding so many type inference features is that we are adding anonymous types. (We are adding anonymous types because they make writing queries so much more pleasant.) Since an anonymous type is, by definition, nameless, you need to be able to infer the type of anything which would otherwise have to be declared with a type name.

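But if the type of a lambda expression comes from its target type, and the target type has to be named in the declaration, how can you create a lambda expression which, say, returns an anonymous type? What we would like to write is something like this, but var gives the lambda no target type to convert to: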

var f = (Customer c)=>new {c.Name, c.Age, c.Address};

We have no way of saying in any type declaration what the type of the return is.  The best we can do is the incredibly weak

Func<Customer, object> f = (Customer c)=>new {c.Name, c.Age, c.Address};

Yuck, all the type information about the tuple has been lost. We can do better than this.

The trick here is that we have extended the type inferencing algorithm on methods so that generic method type variables can be inferred from the return types of lambdas. Suppose we have this little helper identity function:

Func<A, R> MakeFunction<A, R>(Func<A, R> f) { return f; }

Now look what happens when we say

var f = MakeFunction((Customer c)=>new {c.Name, c.Age, c.Address});

The method type inference engine says: OK, we have an actual argument which is a lambda from Customer to some anonymous tuple type. We have a formal parameter which is a delegate from A to R. Therefore A is Customer and R is the anonymous tuple type. Now we know the return type of this generic function, and therefore, abracadabra, the right hand side of the declaration has a type, so it is legal! (And it's not even particularly unperformant, since the jitter will likely optimize away the identity function. Even if it doesn't, the call is tiny compared to the cost of creating the delegate object.)
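To see the inference described above end to end, here is a minimal sketch. The Customer class, the InferenceDemo wrapper, and the Describe method are hypothetical names invented for illustration; only MakeFunction and the shape of the lambda come from the text.

```csharp
using System;

// Hypothetical Customer type; the article only assumes it has these members.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Address { get; set; }
}

public static class InferenceDemo
{
    // The identity helper: its only job is to give method type
    // inference a place to work out A and R.
    public static Func<A, R> MakeFunction<A, R>(Func<A, R> f) { return f; }

    // Illustrative caller (an assumption, not part of the article).
    public static string Describe(Customer c)
    {
        // A is inferred as Customer from the explicitly typed parameter;
        // R is inferred as the anonymous type from the lambda body.
        var f = MakeFunction((Customer x) => new { x.Name, x.Age, x.Address });

        // The result is statically typed, so its members are available
        // without casting.
        var projected = f(c);
        return projected.Name + ", " + projected.Age + ", " + projected.Address;
    }
}
```

Compare this with the Func<Customer, object> version: there, projected.Name would not compile, because all the compiler would know about the result is that it is an object.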

What if the lambda has an anonymous type for its parameter? Suppose we want a function from the anonymous tuple type above which returns a string. This is a little kludgier but we can still do it:

Func<A, R> MakeFunction<A, R>(Func<A, R> f, A a) { return f; }

var f = MakeFunction(c=>c.Name, new {Name="", Age=0, Address=""} );

Now the type inference engine can't infer anything from the lambda parameter, since it is untyped. But it can infer the parameter type from the second actual argument. Once it knows the type of A, it can infer R from the return type of the lambda.
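The same idea can be sketched as a small self-contained example. The CastByExampleDemo class, the NameOf method, and the sample values are assumptions made for illustration. The sketch relies on the compiler reusing one anonymous type for two initializers in the same assembly that have the same property names and types in the same order.

```csharp
using System;

public static class CastByExampleDemo
{
    // The second parameter is never used at run time; it exists only so
    // that A can be inferred from the type of the example object.
    public static Func<A, R> MakeFunction<A, R>(Func<A, R> f, A example) { return f; }

    // Illustrative caller (an assumption, not part of the article).
    public static string NameOf()
    {
        // A is inferred from the example object; once A is known,
        // R is inferred as string from the body c.Name.
        var getName = MakeFunction(c => c.Name, new { Name = "", Age = 0, Address = "" });

        // Same property names, types, and order, so this is the same
        // anonymous type as the example passed above.
        var customer = new { Name = "Ada", Age = 36, Address = "London" };
        return getName(customer);
    }
}
```

Note that the example object passed to MakeFunction is allocated and then thrown away; it serves only as a carrier for the type.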

Pretty neat, eh?

  • So just to clarify things. This wouldn't be a problem if C# supported polymorphic functions?

  • Maybe I've missed something, but this seems bonkers. If I pass a closure to a function:

    void f<A, R>(Func<A, R> q) {}

    f( (int i) => i );

    it deduces A and R. And presumably this works too:

    void f<R>(Func<int, R> q) {}

    f( i => i );

    This means that what you said last time about being unable to deduce the return type is clearly false. So why don't you allow:

     var x = (int i) => i;

    All I can imagine is that you're not prepared to deduce that that lambda is a delegate R T<A,R>(A) but you are prepared to deduce the rest. Why not? Is it the case that two identically declared delegate types aren't equivalent? That is, is my T<A,R> considered the same type as Func<A,R>? Or is there some other reason you're not prepared to infer a conversion from lambda to delegate?

    Also, this highlights a weakness of C#'s var, compared to the potential of C++'s auto:

     std::pair<auto, auto> x = std::make_pair(a, b);

    may well be legal one day, whereas:

     Func<var, var> x = (int i) => i;

    will never scan well due to us reading 'var' as 'variable'.

    PS Your blog software doesn't allow comment submission from Firefox.

  • Richard:

    From the last post: "Delegate types are not even structurally equivalent; you can't even assign a variable of type D1 to a variable of type D2. We would always potentially choose wrong."

    So delegates carry more meaning than just signature.  They encode something along the lines of intent as well.

  • kfarmer: Thanks -- I noticed that a moment after posting.

    One solution to this problem would be to invent a type T<A, R> to which a lambda taking A as an argument and returning R can be converted (as is deduced by MakeFunction), and then to allow T<A,R> to convert to any delegate for which the argument type converts to A and R converts to the result type. This type could then be used as both the type of "delegate(int x) { return x; }" and of "(int x)=>x", and would be the type deduced for y in "var y = (int x)=>x;".

    The current situation essentially seems to be:

     var y = WorkAroundCSharpDeficiency((int x) => x);

    (for some value of WorkAroundCSharpDeficiency), and that seems like a poor language choice to me. But maybe introducing a new (presumably anonymous) type to remove this wart complicates matters too much?

  • Even though delegates aren't structurally equivalent (and that's a CLR decision that C# can't change) it would still be possible to make equivalent delegate types castable to each other by explicit conversion in C#, by just making the compiler introduce an identity delegate of the target type in those cases.

    And if that were done, the compiler could synthesize a type Lambda<T1, R> (and if necessary Lambda<T1, T2, R> etc.), let that be the type of the lambda, and then special-case lambdas so that they're treated as implicitly convertible rather than explicitly. And have the compiler optimize out the Lambda<> class when it's unnecessary.

    That seems conceptually much simpler than saying the expression HAS no type.

    By the way, I thought of something related to expression trees that I think really ought to be in the language. Although I'm sure it introduces lots of complexity.

    Suppose I have:

    public void Process<T, U>(Expression<Func<T, U>> expr)

    In the body of Process I examine the expression and realize it uses some feature that I can't interpret. I should be able to do:

    U u = expr.Call(t);

    In other words, I shouldn't have to choose up front between Expression<Func<...>> and Func<...>, I should be able to get the best of both worlds.

    Perhaps this would be a lot of overhead to introduce to every single user of expressions, in which case perhaps introduce a CallableExpression class derived from Expression which enables this. When the compiler notices a call to a method that takes a CallableExpression<T> parameter, it should compile the argument BOTH as a delegate (as if the method took a T) AND as an expression tree (as if the method took an Expression<T>), and construct a CallableExpression out of both.

    Furthermore, the subexpressions of a CallableExpression should also be Callable...

    Thoughts?

  • @Stuart,

    This is possible today:

    var originalDelegate = expression.Compile();

    var three = originalDelegate.Invoke(2);

  • Richard Smith: delegate types aren't compatible or convertible. See the previous post :) I'd really like to know why that is, since it's trivially possible to write an adaptor from any delegate type to another (especially with C#'s closure support) and it's difficult to imagine where the constraint prevents accidents. Probably just the CLI architects being overly cautious.

    Eric Lippert:

    Let me suppose that C# was using an alternate model for type deduction, one that's similar but not quite identical to what you describe. Suppose that "a => yadda" could be rewritten as a static method in some unknown class:

     public static R anon<R, A>(A arg) { return yadda; }

    and that all that really needs to happen here is for the types of the generic to be deduced against the target type (and, ugh... it's horrible to stick R all the way out there before you see what it is... I think language designers should pretend they don't have fancy-pants parsers when they start to do this sort of thing, for the benefit of puny humans whose abilities to maintain multiple parse-trees at once are lacking)

    In other words, suppose the C# model didn't differ much from C++98's:

     template<typename R, typename A> R anon1(A a) { return a; } // a=> a

     int (*fn)(int) = anon1; // deduces anon1<int, int>

    So this post introduces deduction of the return type from the expression itself, similar to C++0x's provisional:

     template<typename A> auto anon2(A arg) -> decltype(expr) { return expr; }

    This is where my simple, naive translation of what you've said breaks down: I'm having trouble unifying this (if you'll pardon the pun) with what you described above. Can you explain the actual deduction algorithm for MakeFunction? Don't misunderstand me: I see how the types fit together there, I just want to know how far it goes.

    In C++, the following code is broken:

     template<class A> A identity(A a) { return a; }

     template<class A> void test(A (*f)(A), A a) { assert(f(a) == a); }

     int main() { test(identity, 0); } // error. A is ambiguous

    How does C# resolve A to int in the call to test? How easy would I find it to construct an example that is unambiguous but which C# can't solve? How easy would you find it to describe what those cases are?

    Also, I'm still bothered by something you said in the first post. Why is the error in f6 different from that in f5? Why aren't the errors both in trying to convert the argument i to M1's parameter type? This distinction seems to be the entire point of your first post, so it must be pretty important. Is that because C# victims aren't clued into instantiation errors the same way that C++ victims are?

    Really, I guess I'm asking for more clarity in how the C# 3.0 model differs from the C++98 model.

    Thanks,

    James.

    (PS: If this appears, it was successfully posted in Firefox)

    (PPS: Isn't it a shame that C# has all that seldom-used 'unsafe' stuff? The arrow operator would be a much better fit for => otherwise...)

  • Interesting stuff, Eric. Thanks for the post, looking forward to C# 3!

  • Hi Eric

    While I think this trick is quite smart, I'm afraid it actually affirms a point dynamic language advocates have always been making: the type checking just gets in your way. One of the reasons for type inference in C# was to reduce the boilerplate code programmers would have to write to satisfy the compiler, right?

    Then again, I'm still a static language guy, so I could not resist playing with it. I don't like cast-by-example for several reasons:

    - It's weak because it relies on the fact that anonymous types are reused. Make one little deviation, and I'm sure the compiler messages will be pretty misleading. Names have to be rewritten, and "=0" is quite a weak type declaration.

    - Even worse, what if you want a non-trivial member type? E.g. new { Name="", Address = new Address() }

    Even more objects would have to be instantiated, just to derive member types. What if you have no default ctors? What if ctors have side effects? This could get dirty.

    - You actually create instances you're never going to use, which will keep the GC quite busy if done too often, especially if you have to new up objects to specify member types...

    I tried to improve your sample by replacing the A in MakeFunction with a delegate:

    Func<A, R> MakeFunction<A, R, T> (Func<A, R> f, Func<T, A> a) { return f; }

    so that I could invoke it like that:

    var f = MakeFunction (c => c.Name, (Person p) => new {p.Name, p.Age, p.Address} );

    (overloads with more than one "T" argument would be useful for join results)

    This would have two advantages:

    - By using a source object, we could receive member types and names from other (named) types, therefore reducing the probability of errors. Changing members in the source type (Person in my example) would actually be done right using refactoring tools, or would produce useful error messages otherwise.

    - No new objects have to be created. This prevents both problems (GC and non-default ctors).

    There's a disadvantage too: it does not work, because the compiler (May CTP) cannot infer the types (ironically, it suggests I specify the type arguments for MakeFunction explicitly ;-))

    However, I don't see a reason why it should not be able to infer this. Is this going to work in RTM?
