Type inference woes, part three

Type inference woes, part three

  • Comments 10

There were a lot of good comments and questions posted in the last two entries which deserve answers. However, I am once more off to the beaver-shark infested shores of the great Canadian inland sea for my annual summer vacation. I'll get to them when I'm back, in late June.

Before I go I should mention what algorithms we're considering to solve the "infer the best type for this set of expressions" problem that is going to crop up all over the show in C# 3.0.

We've got four candidates right now. Suppose you've got a set of expressions, E. The set of their types is T.

Algorithm One: pick the unique type in T to which every type in T is implicitly convertible.

This has the advantage of being simple and already in the specification. It has the disadvantages of being incompatible with existing behaviour and not handling typeless expressions well.

Algorithm Two: pick the unique type in T to which every expression in E is implicitly convertible.

This has the advantage of being what we actually implement. However it has the disadvantages of turning some corner cases that should work according to the spec (such as literal-zero-vs-short) into error cases.

We could combine the two algorithms. Consider the subset of T consisting of all types in T to which every expression in E is implicitly convertible. Call this subset S.

Algorithm three: pick the unique type in S to which every type in S is implicitly convertible.

This algorithm has considerable charm. As far as I've been able to determine, it is 100% backwards compatible with existing non-error behaviour, while at the same time turning those cases which ought to be successful according to Algorithm One back into success cases which agree with the spec. Literal-zero-vs-short would go to int. However, there is a down side, which is addressed by Algorithm Four.

Algorithm Four: pick the unique type in S from which every type in S is implicitly convertible. That is, pick the smallest type in S, not the largest.

This algorithm also maintains backwards compatibility but would make literal-zero-vs-short go to short. And isn't that better? If you have an implicitly typed array of ten thousand shorts, and you throw a single literal zero in there, you probably want it to stay as an array of shorts. Given the choice of several types to which every expression goes, surely picking the smallest is the best choice.

Algorithm Four is currently the front-runner in our thinking but of course I am blogging about this in the first place because there is still time for feedback. Any thoughts on these choices, or ideas for better algorithms, would be appreciated.

Many thanks to my colleauges Wes and Peter for their analyses of these algorithms.

See you when I'm back!

  • I'm definately a fan of option 4, and it's not that hard to explain to me... and the value of picking the "from which" pattern is that it seems logical given the nature of an assignment expression.  I want the type of the "var" --which I still would rather be called "auto" or "let" incidentally-- to be driven by what it is being assigned FROM.
  • I think option three has the advantage of being most similar to the language's type coercion; and would be the most intuitve of the bunch (at least to Cx programmers, which is indeed the audience).
  • I'm mostly a fan of 4, but it has an issue. Sometimes it's best to stick with what's most common and not with what's shortest. For example lets say I have a bunch of arrays that I'm going to do something with (say join) and one is short, but the rest are int's. It is far more expenseive to constantly convert the sets to short when only a single set is short while the others are int.

    Also look at it this way. Let's say I have an array of ints and I throw a short in there. I do not want them all converted to shorts.

    It's almost as if you want to somhow convert to the most common type and not the smallest one.
  • Your point about joining is well taken. Note that type inference on equijoins where the two things being equated are of different types is one of the applications of this algorithm.

    However, I think you may be misunderstanding the algorithms I'm proposing. If you have an array of ints and you throw a short in, then short won't even be in the set S, so it's not a candidate.  None of these algorithms _ever_ does a conversion which loses magnitude.  We will never be turning ints into shorts, only shorts into ints.  The only ints we would turn into shorts would be integer literals which are known at compile time to fit in a short, and we would simply codegen them as shorts in the first place.

  • Have fun in Canada, Eric. Tell your mom and Mike that I said hello (and that I healed well!).
  • Have you given any consideration to the possibility of picking types that *aren't* in T?

    I understand how big a can of worms that is, but I really really want "expr ? 1 : null" to go to Nullable<int>. The number of places in my code that have an ugly cast of null to "int?" for the benefit of the ternary is painful (and default(int?) would be uglier still of course).

    A "?." operator (where x?.y is equivalent to "x == null ? null : x.y" and automatically going to Nullable when y is a value-typed property) would help, but making the ternary do the right thing would be better.

    Have you completely ruled out the option of going this route?
  • Hey Stuart,

    We have given it consideration, but one of the design principles of C# is that no type gets inferred that is not right in front of you already.  That is, we do not _synthesize_ the best common type, we _identify_ the best common type.  I wouldn't say that it's _completely_ ruled out, but getting the language design committee to abandon that principle would be an uphill battle to say the least.

  • Last time in this series I discussed how we are probably going to identify a &quot;best&quot; type from a set...
  • PingBack from http://microsoft.wagalulu.com/2006/07/18/type-inference-woes-part-four/
  • Last time in this series I discussed how we are probably going to identify a "best" type from a set of

Page 1 of 1 (10 items)