Covariance and Contravariance in C#, Part Eight: Syntax Options

As I discussed last time, were we to introduce interface and delegate variance in a hypothetical future version of C# we would need a syntax for it. Here are some possibilities that immediately come to mind.

Option 1:

interface IFoo<+T, -U> { T Foo(U u); }

The CLR uses the convention I have been using so far in this series of “+ means covariant, - means contravariant”. Though this does have some mnemonic value (because + means “is compatible with a bigger type”), most people (including members of the C# design committee!) have a hard time remembering exactly which is which.

This convention is also used by the Scala programming language.

Option 2:

interface IFoo<T:*, *:U> { …

This more graphically indicates “something which is extended by T” and “something which extends U”.  This is similar to Java’s “wildcard types”, where they say “? extends U” or “? super T”.

Though this isn’t terrible, I think it’s a bit of a conflation of the notions of extension and assignment compatibility. I do not want to imply that IEnumerable<Animal> is a base of IEnumerable<Giraffe>, even if Animal is a base of Giraffe. Rather, I want to say that IEnumerable<Giraffe> is convertible to IEnumerable<Animal>, or assignment compatible, or some such thing. I don’t want to conceptually overwork the inheritance mechanism. It's bad enough IMO that we conflate base classes with base interfaces.
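To make the difference between "is a base of" and "is convertible to" concrete, here is a minimal sketch. It assumes a compiler and runtime in which IEnumerable<T> is declared covariant (C# 4.0 and .NET 4 onward, so not the C# this post was written against). Animal, Giraffe and the ConversionVersusInheritance helper are placeholder names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Animal { }
class Giraffe : Animal { }

static class ConversionVersusInheritance
{
    // Asks the runtime whether an IEnumerable<Giraffe> may be used where an
    // IEnumerable<Animal> is expected (assignment compatibility).
    public static bool IsConvertible() =>
        typeof(IEnumerable<Animal>).IsAssignableFrom(typeof(IEnumerable<Giraffe>));

    // Asks whether IEnumerable<Animal> appears among the interfaces that
    // IEnumerable<Giraffe> actually inherits from (the "base" relationship).
    public static bool IsBaseInterface() =>
        typeof(IEnumerable<Giraffe>).GetInterfaces().Contains(typeof(IEnumerable<Animal>));
}
```

Under those assumptions the first query should answer yes and the second no: the conversion exists, but no inheritance relationship between the two constructed interfaces is implied.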

Option 3:

interface IFoo<T, U> where T: covariant, U: contravariant { …

Again, not too bad. The danger here is similar to that of the plus and minus: that no one remembers what “contravariant” and “covariant” mean. This has the benefit at least that you can do a web search on the keywords and get a reasonable explanation.

Option 4:

interface IFoo<[Covariant] T, [Contravariant] U>  { …

Similar to option 3.

Option 5:

interface IFoo<out T, in U> { …

We are taking a different tack with this syntax. In all the options so far we have been describing how the user of the interface may treat the interface with respect to the type system rules for implicit conversions – that is, what are the legal variances on the type parameters. Here we are instead describing this in the language of how the implementer of the interface intends to use the type parameters.
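As a rough sketch of how a declaration in this style reads from both sides, the following compiles with a C# compiler that accepts the in/out annotations (C# 4.0 or later; it also uses a few later conveniences such as expression-bodied members). GiraffeMaker, Option5Demo and the Name property are hypothetical names added only for the example.

```csharp
using System;

class Animal { public virtual string Name => "animal"; }
class Giraffe : Animal { public override string Name => "giraffe"; }

// Implementer's view: T is only ever handed out (return position),
// U is only ever taken in (parameter position).
interface IFoo<out T, in U>
{
    T Foo(U u);
}

// Hypothetical implementer: accepts any Animal, always produces a Giraffe.
class GiraffeMaker : IFoo<Giraffe, Animal>
{
    public Giraffe Foo(Animal u) => new Giraffe();
}

static class Option5Demo
{
    public static string Run()
    {
        IFoo<Giraffe, Animal> specific = new GiraffeMaker();
        // User's view: covariant in T (Giraffe to Animal) and
        // contravariant in U (Animal to Giraffe).
        IFoo<Animal, Giraffe> general = specific;
        return general.Foo(new Giraffe()).Name;
    }
}
```

The declaration talks only about how the type parameters are used; the conversion from IFoo<Giraffe, Animal> to IFoo<Animal, Giraffe> is the consequence the user of the interface sees.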

I like this one a lot; the downside, of course, is that, as I described a few posts ago, you end up with situations like

delegate void Meta<out T>(Action<T> action);

where the "out" T is clearly used in an input position.
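A small sketch of why that declaration is nevertheless covariant, assuming a compiler with in/out variance (C# 4.0 or later) in which Action<T> is itself declared contravariant. Animal, Giraffe and MetaDemo are placeholder names.

```csharp
using System;

class Animal { }
class Giraffe : Animal { }

// T appears only inside Action<T>, which itself consumes T, so the two
// reversals cancel out and T may be marked "out" even though it sits in
// a parameter position.
delegate void Meta<out T>(Action<T> action);

static class MetaDemo
{
    public static bool Run()
    {
        bool sawGiraffe = false;

        // Feeds a Giraffe to whatever action it is given.
        Meta<Giraffe> giraffeMeta = action => action(new Giraffe());

        // Covariant conversion: a Meta<Giraffe> is usable as a Meta<Animal>.
        Meta<Animal> animalMeta = giraffeMeta;

        // The Action<Animal> passed here ends up receiving the Giraffe.
        animalMeta(animal => { sawGiraffe = animal is Giraffe; });
        return sawGiraffe;
    }
}
```

The conversion is safe because anything that can handle an arbitrary Animal can certainly handle the Giraffe that giraffeMeta supplies.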

Option 6:

Do something else I haven’t thought of. Anyone who has bright ideas, please leave comments.

Next time: what problems are introduced by adding this kind of variance?

  • interface IFoo<T, U, V> is IFoo<base(T), U, derived(V)>

  • Another variant on the notation I proposed earlier:

    delegate R Func<R, A>(A a)

     where Func<R, base(A)> is Func<base(R), A>;

    or

    delegate R Func<R, A>(A a)

     where Func<R, A> is Func<base(R), A>

     where Func<R, base(A)> is Func<R, A>;

  • stuart:

    but this re-introduces the double-meaning of where (constraints/variance), plus I don't think that base() and derived() make it any clearer who is the base and who derives than the T:* notation

    I'm still not entirely happy with the "is" constraint syntax. it's clear, but is it really useful to have a syntax that spells out the meaning of variance? I don't have a problem with the verbosity, but for me the "covariant on" and "contravariant on" clauses on the interface (not on the type arguments) would be the most intuitive ones when I'm working with it. I have to grasp co/contravariance once, the only notation that takes this away is the in/out notation, which is either wrong (Eric's option #5) or overly verbose (my proposal of applying the in/out constraints not only to T, but maybe also to IDoer<T> etc.). plus, while in/out syntax is easier to write and understand, it makes it much harder to actually understand what assignment compatibility it really achieves.

    I also ask everyone to challenge my assumption that automatic variance is less of a problem for delegates than for interfaces. the more I think about it, I believe that it would be a Good Thing, probably even as opt-out (on by default). Am I wrong here?

    to summarize, for interfaces I'd prefer

    interface IFoo<T,U>

       covariant on T

       contravariant on U

    clear in what it does (google-able keywords) to which party (interface, not type parameters). once you grasped, easy to understand which assignment results from it (co: same as type-parameter, contra: opposite)

    for delegates I'd prefer automatic co/contravariance, maybe with the possibility to opt out ("invariant on T")

    everything else I've posted are mere thoughts that I would not like to see in the language unless somebody comes up with better variations.

  • Stefan, I don't need to challenge your proposal of automatic variance on delegates, because Eric already ruled it out as actually impossible.

    delegate void Circular1<T>(Circular2<T> param);

    delegate void Circular2<T>(Circular1<T> param);

    If Circular1 is covariant on T then Circular2 is contravariant on it, and vice versa. So simply stating that variance is intended is not sufficient, you really do have to spell out which way round.
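    To illustrate the ambiguity, here is a sketch written in the in/out notation of option 5, assuming a compiler that accepts it (C# 4.0 or later). The A and B suffixes and the CircularDemo class are invented so that both consistent assignments can sit side by side.

    ```csharp
    using System;

    // First consistent assignment: Circular1 covariant, Circular2 contravariant.
    delegate void CircularA1<out T>(CircularA2<T> param);
    delegate void CircularA2<in T>(CircularA1<T> param);

    // Second, equally consistent assignment: the annotations are swapped.
    delegate void CircularB1<in T>(CircularB2<T> param);
    delegate void CircularB2<out T>(CircularB1<T> param);

    static class CircularDemo
    {
        public static bool Run()
        {
            CircularA1<string> a = p => { };
            CircularA1<object> widened = a;   // allowed because T is "out" on CircularA1

            CircularB1<object> b = p => { };
            CircularB1<string> narrowed = b;  // allowed because T is "in" on CircularB1

            return widened != null && narrowed != null;
        }
    }
    ```

    Both pairs are type-safe, so nothing in the signatures alone tells a compiler which of the two the author meant.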

    I could live with the "covariant on" / "contravariant on" syntax but I actually do think it's helpful to spell out the meaning of it. Even after all eight of Eric's blog posts on the subject, I still didn't really fully grok it until I started thinking about it in terms of how IFoo<T> relates to IFoo<T's base class> and IFoo<T's derived classes>. Describing the meaning *concretely*, in terms of how it relates to the actual type you're declaring, made a huge difference to my ability to understand it. I can actually write statements about Action and Meta and get their variance the right way round with this kind of syntax. Which isn't true for any of the others, including "covariant on" and "contravariant on" - at least until I spent some time writing code using them. And I don't think that *users* of the interfaces will ever spend enough time to get that understanding.

  • Thank you all for this fascinating discussion.  I have not yet absorbed nearly all of it. There is a wealth of possibilities here which I shall summarize and take to the design team at some point over the next few weeks.

    To answer the earlier question -- the proposed feature is guaranteed-typesafe interface and delegate variance on reference types, no more, no less.  No variance on classes, at least not this go-round.  No unsafe variance.  No call-site variance, no virtual overload return type variance, etc.  Those are all features that we will consider, of course, but interface and delegate variance is the one I'm interested in today.

  • sure, but do circular type references in delegates make any sense? do they even exist in the wild? how bad would it be if automatic co/contravariance would not work in this case, and you'd have to spell it out in those few cases (given there are any, and variance matters)?

    circular type references are quite common in interfaces, but in delegates?

    if that's really a problem, an alternative solution would be automatic variance detection for delegates with opt-in, so that the compiler could produce an error message if you want variance, but it cannot find out which one.

    ok, back to the topic of spelling out the meaning. my problem with that is that, once I've found out the meaning, I need a way to remember it in an intuitive way, or I wouldn't be able to use it. the difference between T:* and *:T, or worse, IFoo<T> assignable to IFoo<X> where T:X (as opposed to X:T), is something that I'll have forgotten when I'm reading the next line of code. I'd have to grasp the concept that assignability of IFoo goes WITH assignability of T in the class hierarchy, or against it. co/contravariance spell this concept out. I can remember it, communicate it in whiteboard sessions, etc.

    T:* or *:T are just obvious while I'm looking at exactly this line of code. I guess it would be the same for base(T) and derived(T).

    + and - are better, but still worse than covariance/contravariance IMO.

    ok, we have to separate our concerns here.

    1) which one is easiest to write?

    2) which one is easiest to grasp when you're not familiar with co/contravariance?

    3) how important is it to grasp the intention of the author when you're not familiar with variance? would you typically not just try it and fix it if your assignments fail? (after all, who knows assignment compatibility rules of arrays? still, they're being used every day)

    4) which one is easiest to work with once you're familiar with variance?

    we need to set priorities on these problems.

    I think priority #1 should be that programmers familiar with variance get a syntax that is logical to read AND write. everyone else is probably best served with trial and error.

    in/out (as well as automatic detection) would be easier to write, but you'd still have to think about how this translates to variance when using those types. in fact, any constraints-based syntax would have this disadvantage.

    so besides being overly verbose, specifying that I'm only using T by using Action<T> as an input might be easier to write, but I'd have to think much harder to understand how this affects assignability of the Meta delegate type. this is probably a shot in the foot.

    (note: if I'd have automatic detection for delegates, decompilers/help generators could still spell it out for me using the "normal" covariant on/contravariant on syntax.)

  • Eric, you're welcome, but I'd really love to see you take part in that discussion here, and not just take it to the Holy Halls, disappear for a few years just to announce something way later that might or might not include what we thought up here. I love your blog, but it's really sometimes a bit frustrating that you leave a lot of (sometimes thoughtful, I like to think) comments unanswered.

    I'm fully aware that you don't want to democratize language design, but that's a different story.

    Back to the topic. I'm no longer sure that we need to make it clear that the co/contravariance clause describes a quality of the interface (as opposed to the type parameter), because I believe this is pretty much commutative and ultimately doesn't matter.

    Still, I think you should separate this information out of the angle brackets, because it's just too much information in a single place. We don't write IFoo<class T, struct U> either. How about that?

    interface IFoo<T,U>

     where T: class

     where U: struct

     covariant T

    (I could live with unmodified options 3 and 4 too, these are just minor flaws IMO, and ultimately matters of taste)

    quote:

    C# Language Specification

    12.5 Array covariance

    For any two reference-types A and B, if an implicit reference conversion (Section 6.1.4) or explicit reference conversion (Section 6.2.3) exists from A to B, then the same reference conversion also exists from the array type A[R] to the array type B[R], where R is any given rank-specifier (but the same for both array types). This relationship is known as array covariance. Array covariance in particular means that a value of an array type A[R] may actually be a reference to an instance of an array type B[R], provided an implicit reference conversion exists from B to A.

    I believe this way of explaining it is much easier to understand than anything that involves bigger and smaller types.
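    The array rule quoted above can be tried directly. A minimal sketch, with Animal, Giraffe, Tiger and ArrayCovarianceDemo as placeholder names:

    ```csharp
    using System;

    class Animal { }
    class Giraffe : Animal { }
    class Tiger : Animal { }

    static class ArrayCovarianceDemo
    {
        public static bool Run()
        {
            Giraffe[] giraffes = new Giraffe[1];

            // An implicit reference conversion exists from Giraffe to Animal,
            // so the same conversion exists from Giraffe[] to Animal[].
            Animal[] animals = giraffes;

            try
            {
                // Compiles, but the value is really a reference to a Giraffe[],
                // so the runtime rejects the store.
                animals[0] = new Tiger();
                return false;
            }
            catch (ArrayTypeMismatchException)
            {
                return true;
            }
        }
    }
    ```

    The conversion itself is what the spec text describes; the runtime check on the store is the price of allowing it on a type that can also be written to.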

    Covariance: If you can assign A to B, you can assign IFoo<A> to IFoo<B>

    Contravariance: If you can assign A to B, you can assign IFoo<B> to IFoo<A>
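    Both rules can be checked against library types, assuming a framework in which IEnumerable<T> is declared covariant and IComparer<T> contravariant (.NET 4 onward; Comparer<T>.Create needs .NET 4.5). The Weight field and VarianceRulesDemo are made up for the sketch.

    ```csharp
    using System;
    using System.Collections.Generic;

    class Animal { public int Weight; }
    class Giraffe : Animal { }

    static class VarianceRulesDemo
    {
        public static bool Run()
        {
            // Covariance: Giraffe is assignable to Animal, so
            // IEnumerable<Giraffe> is assignable to IEnumerable<Animal>.
            IEnumerable<Giraffe> giraffes = new List<Giraffe> { new Giraffe { Weight = 800 } };
            IEnumerable<Animal> animals = giraffes;

            // Contravariance: Giraffe is assignable to Animal, so
            // IComparer<Animal> is assignable to IComparer<Giraffe>.
            IComparer<Animal> byWeight =
                Comparer<Animal>.Create((x, y) => x.Weight.CompareTo(y.Weight));
            IComparer<Giraffe> giraffeComparer = byWeight;

            int count = 0;
            foreach (Animal animal in animals) count++;

            int order = giraffeComparer.Compare(
                new Giraffe { Weight = 700 }, new Giraffe { Weight = 900 });

            return count == 1 && order < 0;
        }
    }
    ```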

    I believe that apart from automatic detection, it doesn't get any easier than this. It's relatively easy to understand, and once you've grasped it, very easy to remember. (That would rule out option 1 btw.)

    Option 2 and all suggestions involving "T:*", "X where T:X", "X where T is X", "base(T)" etc. are just an attempt to put a formal specification of co/contravariance into the declaration syntax. although I came up with a few of them myself, those would hurt MY brain when I'd be busy _using_ co/contravariance in the course of solving another problem instead of thinking about it on its own.

    Please let's not let the discussion about automatic detection in delegates die. The more I think about it, the more I like it.

  • What about introducing an accept clause:

    interface Foo<T>

    accept base for[/on/of] T

    Or:

    interface Foo<T>

    accept derived for[/on/of] T

  • Thomas,

    I liked that at the first look. However, I think you have to know what co/contravariance is in order to understand this. Otherwise, it would be misleading. Just reading your syntax and not knowing about variance, I'd assume that this somehow indicates that an IFoo<Mammal> can take an Animal object (for accept base/contravariance), which is nonsense, or a Giraffe object (for accept derived/covariance), which goes without saying.

  • Stefan's idea of automatic variance for delegates is growing on me. If you automatically assign variance in the cases where it can be unambiguously determined (which is 99% of cases) but also allow it to be specified manually using the same syntax that interfaces use, that seems to fit well with the "simple things should be simple; complex things should be possible" principle.

    I'm not sure you even need a syntax to explicitly specify that a delegate type should NOT be variant when it could be. You don't want it to be automatic on interfaces because of the fact that adding members to the interface could inadvertently change the variance - and adding members to an interface is not normally a breaking change for consumers of the interface (although it is for implementers). I'm not sure there's any equivalent to that for delegates - ANY change to a delegate is a breaking change for consumers of the delegate. So if the variance changes too in that case it's not a big deal as all code consuming the delegate is already broken.
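    A sketch of the interface scenario, written in the in/out notation of option 5 and assuming a compiler that accepts it (C# 4.0 or later). ISource, GiraffeSource, Put and InterfaceEvolutionDemo are hypothetical names.

    ```csharp
    using System;

    class Animal { }
    class Giraffe : Animal { }

    // Version 1 of a hypothetical interface: T is only returned, so "out" is legal.
    interface ISource<out T>
    {
        T Get();

        // Adding the member below in version 2 would put T in an input position.
        // With the explicit "out" that is a compile-time error at the declaration;
        // under automatic inference the interface would silently become invariant.
        // void Put(T item);
    }

    class GiraffeSource : ISource<Giraffe>
    {
        public Giraffe Get() => new Giraffe();
    }

    static class InterfaceEvolutionDemo
    {
        public static bool Run()
        {
            // Consumer code that relies on covariance; it would stop compiling
            // if ISource<T> lost its variance.
            ISource<Animal> source = new GiraffeSource();
            return source.Get() is Giraffe;
        }
    }
    ```

    With an explicit annotation the author of the interface gets the error; with inferred variance the consumers would.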

  • Stuart, exactly. The chance of a delegate signature breaking assignability and the chance of a modification to an interface's body causing this are two very different things. Plus, for the remaining 1% of delegates (if it's really that much), figuring out sensible variance declarations even manually would probably make my brain explode. delegate Circular1<T> (Circular2<T>) ... - I wasn't even sure this would compile until I tried it ;-)

    I guess there are languages out there that provide some sort of variance on higher-order functions with type parameters. I doubt that they have explicit syntax for this, but in any case looking at them (and talking to users of these languages) could be helpful.

    Does anybody know any such language?

  • While scanning the examples, something like T:*,  *:U really draws my attention to the * in the font my browser uses. I'm wondering if something like T:var, var:U might be better. Does var have the right connotation linking the generic's type to its variance?

  • You could combine Luke's idea with the F# way and use underscores instead of asterisks:

    delegate R Func<_ is A, R is _> (A a)

  • hey, if * looks bad in murman's font, shouldn't we make this symbol configurable? ;-)
