A Face Made For Email, Part Three


Yes, it has happened again. This time, our fabulous C# Community Program Manager Charlie Calvert was good enough to put together a little half-hour-long video of me talking about the scenarios which justify changes to the type inference algorithm for C# 3.0. We've already made some interesting changes which will make it into the beta version, and there likely will be further refinements to the algorithm as we see how it works.

The video is here.

I shall also likely be blogging about these algorithms over the next few weeks when I find time to turn my developer notes into blog posts.

  • Very interesting indeed. It looks like the compiler will have to be very liberal in what it accepts while parsing expressions and statements inside lambda expressions, since it won't have the type information during the recursive descent (thinking of the SSCLI C# compiler).

    I wonder if there's a whole duplicate parser for this less common case of parsing lambdas, or if the parser is now (relatively) late-typed when parsing all expressions.

  • The body of an expression lambda must be an expression, period, and the body of a statement lambda must be a block of statements.  This is the same grammar that the parser uses any other time it needs to produce an expression or statement block.

    I don't understand what you mean by "late typed".  The grammar of the language is entirely independent from its type system.  What is an example of a language where the _parse_ of the language changes based on type information?  I am not familiar with any such language.

    I think perhaps you are confusing parsing with "expression binding", whereby we analyze the existing parse tree for type annotations.  Yes, the lambda _expression binding_ code is insanely complicated.  It defers type annotating the body of the lambda until information about what delegate type the lambda is being converted to has been determined.  Once that happens then we can bind the body and check it for type errors.

    Type inference complicates this matter tremendously, as we must do speculative binds to determine whether particular possible inferences will produce type errors or not.  
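    A small illustration of that deferral (my own sketch, not the compiler's internals): the same lambda text binds differently, or fails to bind at all, once the target delegate type fixes the parameter's type.

    ```csharp
    using System;

    class LambdaBindingDemo
    {
        static void Main()
        {
            // The body "x * 2" cannot be analyzed in isolation; the type
            // of x comes from the delegate type the lambda converts to.
            Func<int, int> f = x => x * 2;        // x is int here
            Func<double, double> g = x => x * 2;  // same text; x is double here

            // Func<string, string> h = x => x * 2;  // would fail to bind:
            // there is no * operator for string, which is discovered only
            // after the conversion target fixes x's type.

            Console.WriteLine(f(21));   // 42
            Console.WriteLine(g(1.5));  // 3
        }
    }
    ```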

  • Sorry - didn't make myself clear. That's the problem with making off-the-cuff comments on a blog.

    By late typed, I meant that, when constructing the AST (during parsing), the compiler doesn't have access to the type info of the symbols represented by identifier tokens, since the argument types are inferred later. I wasn't referring to dynamic typing or anything like that.

    Re language where the parse of the language changes based on type info: the Delphi language does, for one, when parsing structured constants, e.g.:

    ---8<---
    type
      TRecord = record
        x: Integer;
      end;

    const
      x = 12;
      a: TRecord = (x : 42);
      b: Integer = (x + 42);
    --->8---

    Of course, one could construct a sufficiently vague grammar so that this could be accommodated, but the parser is LL(2) and uses type information from the declared type of the constant to parse the constant expression on the right of the '='. There are other situations where similar things happen, esp. with parsing arguments to funcs/procs/methods - largely for backward compatibility reasons after language features have been added down the years.

  • Interesting!  Next time I see Anders I'll ask him about that.  

  • The notion of using rounds of inference seems a bit kludgy.

    The inferencing abilities of C# 2.0 are very weak, and C# 3.0 seems to be headed towards another complicated approach which is still not general.

    Have you considered a backtracking, unification-based algorithm like the Hindley-Milner type inference algorithm? This is the approach that I used which has a very simple implementation but which is also fully general.

  • Wesner, I passed your concerns on to the language design team. Their reply was basically that:

    The problems with the Hindley-Milner algorithm are (a) the classic H-M algorithm handles subtyping and variance poorly, and (b) backtracking algorithms have potentially bad worst-case running times.  

    We already have some contravariance, covariance and invariance in the type system and we would like to have an algorithm which handles those well.  We also have come up with an algorithm which is guaranteed to be linear in terms of number of formal parameters + number of method type arguments.  (Every round we either eliminate a formal parameter from future consideration or fix a type argument, and we'll eventually run out of one of them, at which point inference is known to either succeed or fail.)
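    A toy model of that termination argument (my own sketch of the bound only, not the real compiler algorithm; the "can we fix one this round" test is a stand-in for the actual bounds-gathering work):

    ```csharp
    using System;
    using System.Collections.Generic;

    class InferenceRoundsSketch
    {
        // Every round must either fix one type argument or eliminate one
        // formal parameter from consideration, so the loop runs at most
        // (parameters + type arguments) times before inference is known
        // to succeed or fail -- a linear bound.
        static void Main()
        {
            var unconsideredParameters = new Queue<string>(new[] { "p1", "p2", "p3" });
            var unfixedTypeArguments = new Queue<string>(new[] { "T", "U" });
            int rounds = 0;

            while (unfixedTypeArguments.Count > 0)
            {
                rounds++;
                // Stand-in for "did the bounds gathered this round let us
                // fix a type argument?" (alternating here purely for demo)
                bool canFixOne = rounds % 2 == 0;

                if (canFixOne)
                    Console.WriteLine("fixed " + unfixedTypeArguments.Dequeue());
                else if (unconsideredParameters.Count > 0)
                    Console.WriteLine("eliminated " + unconsideredParameters.Dequeue());
                else
                {
                    Console.WriteLine("no progress possible: inference fails");
                    break;
                }
            }
            Console.WriteLine("done after " + rounds + " rounds (bound: 5)");
        }
    }
    ```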

  • Hi, thanks for the great video :-). It seems that I discovered it a bit late, but here is what comes to my mind after watching it.

    FIRST THING>>

    It looks to me that the third (probably the most controversial) step of the algorithm was added because this behavior is needed in LINQ joins (though there may be other problems not mentioned in the video...). When I write the query "by hand" using extension methods, I wouldn't mind having to write the type cast explicitly. Something like:

    ... = Join(customers, orders, c => (int?)c.ID, o => (int?)o.ID, (c, o) => new { ... });

    In this case your inference algorithm doesn't need the third step (if I understand it correctly :-)). However, I understand that this isn't acceptable syntax for writing LINQ queries. For the queries I would suggest adding something like an "outer join" clause that would be translated to the previous code (i.e. it would add the cast to T? from the T inferred from the lambda). You could then write the following:

    from c in customers
    outer join o in orders on o.ID equals c.ID
    select ..
    // in this case nullable types can be used

    from c in customers
    join o in orders on o.ID equals c.ID
    select ..
    // no nullable types allowed here

    This could solve the problem without adding too much complexity to the algorithm, while still allowing the most important scenario. Did you discuss a solution like this? Are there any additional issues?

    SECOND THING>>

    Thanks for adding the second algorithm step! This scenario is very important (IMHO) because it makes it possible to declare a lambda/expression tree with an anonymous type as a local variable:

    var lambda = DeclareLambda(c => new { c.Something });
    var tree = DeclareTree(c => new { c.Something });

    // where DeclareLambda is:
    Func<A, T> DeclareLambda<A, T>(Func<A, T> foo) { return foo; }
    // DeclareTree is similar...
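    For what it's worth, a compilable version of that sketch (the Helpers class, the string parameter type, and the Length member are my invented stand-ins for the commenter's c.Something; note the parameter type must be explicit so A can be inferred):

    ```csharp
    using System;
    using System.Linq.Expressions;

    static class Helpers
    {
        // Identity helpers: inference on T lets the caller bind an
        // anonymous type to var, which cannot be written out explicitly.
        public static Func<A, T> DeclareLambda<A, T>(Func<A, T> foo) { return foo; }
        public static Expression<Func<A, T>> DeclareTree<A, T>(Expression<Func<A, T>> foo) { return foo; }
    }

    class Demo
    {
        static void Main()
        {
            var lambda = Helpers.DeclareLambda((string c) => new { c.Length });
            var tree = Helpers.DeclareTree((string c) => new { c.Length });

            Console.WriteLine(lambda("hello").Length);  // 5
            Console.WriteLine(tree);  // prints the expression tree's ToString
        }
    }
    ```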

  • The link is broken.  Is this video still available somewhere?  

