Covariance and Contravariance in C#, Part Nine: Breaking Changes


Today, in the last entry in my ongoing saga of covariance and contravariance, I’ll discuss the breaking changes that adding this feature might cause.

Simply adding variance awareness to the conversion rules should never cause any breaking change. However, the combination of adding variance to the conversion rules and making some types have variant parameters causes potential breaking changes.

People are generally smart enough to not write:

if (x is Animal)
  DoSomething();
else if (x is Giraffe)
  DoSomethingElse(); // never runs

because the second condition is entirely subsumed by the first. But today in C# 3.0 it is entirely sensible to write

if (x is IEnumerable<Animal>)
  DoSomething();
else if (x is IEnumerable<Giraffe>)
  DoSomethingElse();

because there used to be no conversion between IEnumerable<Animal> and IEnumerable<Giraffe>. If we turn on covariance in IEnumerable<T>, and a compiled program containing this fragment uses the new library, then its behaviour when given an IEnumerable<Giraffe> will change. The object will be assignable to IEnumerable<Animal>, and therefore the “is” will report “true”.
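
To make the change concrete, here is a sketch (assuming IEnumerable<T> has been made covariant, and that List<Giraffe> implements IEnumerable<Giraffe>):

object x = new List<Giraffe>();

// Compiled against the old library: false, because there is no
// conversion from IEnumerable<Giraffe> to IEnumerable<Animal>,
// so the DoSomethingElse branch runs.
// Running against the covariant library: true, so the DoSomething
// branch runs instead -- a silent behaviour change.
bool result = x is IEnumerable<Animal>;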

There is also the issue of existing source code changing semantics or turning compiling programs into erroneous programs. For example, overload resolution may now fail where it used to succeed. If we have:

interface IBar<T>{} // From some other assembly
...
void M(IBar<Tiger> x){}
void M(IBar<Giraffe> x){}
void M(object x) {}
...
IBar<Animal> y = whatever;
M(y);

Then overload resolution picks the object version today because it is the sole applicable choice. If we change the definition of IBar to

interface IBar<-T>{}

and recompile then we get an ambiguity error because now all three are applicable and there is no unique best choice.
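
Put together, the break looks like this (a sketch, with IBar, Tiger and Giraffe as above):

IBar<Animal> y = whatever;

// With invariant IBar<T>: only M(object) is applicable, so it is called.
// With contravariant IBar<-T>: IBar<Animal> converts to both
// IBar<Tiger> and IBar<Giraffe>, neither overload is better than
// the other, and the call no longer compiles.
M(y);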

We always want to avoid breaking changes if possible, but sometimes new features are sufficiently compelling and the breaks are sufficiently rare that it’s worth it. My intuition is that by turning on interface and delegate variance we would enable many more interesting scenarios than we would break.
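
For instance, here is a sketch of the kind of scenario covariance enables (FeedAll and GetGiraffes are hypothetical, and IEnumerable<T> is assumed covariant):

void FeedAll(IEnumerable<Animal> animals) { /* feed each animal */ }

// Today this call requires copying or casting every element;
// with a covariant IEnumerable<T> the List<Giraffe> converts directly.
FeedAll(GetGiraffes()); // GetGiraffes returns List<Giraffe>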

What are your thoughts? Keep in mind that we expect that the vast majority of developers will never have to define the variance of a given type argument, but they may take advantage of variance frequently. Is it worth our while to invest time and energy in this sort of thing for a hypothetical future version of the language?

  • I think breaking scenarios would be rare.  And in the scenarios where it does break, the break probably makes a bug visible at compile time, and arguably the code could have been written differently so that it won't be affected by these changes.

    I agree that a feature like this is sufficiently compelling to be worth the breaking changes.

    ...otherwise, you could never introduce co-/contra-variance...

  • I'm really conflicted. I think the feature is worthwhile to have, but I'm not sure that by itself it meets the -100 bar in terms of value versus added conceptual complexity. In combination with other variance-related features it becomes more of a no brainer, but I've played that broken record enough by now ;)

    Having said that, I don't think the breaking changes by themselves are enough of a reason to not include this feature.

  • I think that covariance of interfaces is one of the biggest missing features currently in C# -- when trying to design elegant type-safe libraries, we currently have to fall back to having our interfaces derive from a non-generic version simply to put them in a collection...  This leads to lots of kludgy code that could be eliminated with variance.

  • I can imagine several cases where overload resolution might pick a different method or become ambiguous.

    E.g.

    class BaseClass {}

    class DerivedClass : BaseClass {}

    void Test(IEnumerable<BaseClass> a) {}

    void Test(IEnumerable b) {} // non-generic "fallback" method

    // call with:

    Test(new List<DerivedClass>());

    However, in the places where I've seen such things, it was always done to work around C# not supporting variance. The implementation of the non-generic method would often be

    void Test(IEnumerable b) { Test(b.Cast<BaseClass>()); }

  • This is a fascinating look into language design, but I'm having a hard time coming up with real-world uses for variance.  Does anyone have a brief, real-world example of code that would be improved by variance?

  • Do it. Most people won't even see it's there because it will just work as they consume classes and interfaces that used the feature. But the point is, today people see it's *not* there when they hit problems.

    I think it's the mark of a truly great feature when it simplifies your life without you even noticing.

    And if somehow you can spot the broken old code at compile-time, bonus points. Of course, if existing assemblies suddenly start breaking that could be quite puzzling to debug.

  • Could the IDE perhaps spot this code during migration of the project to C#4? "This code will behave differently from prior versions because an IEnumerable<Giraffe> is now an IEnumerable<Animal>".

    Catching all the conceivable outcomes of an "is" check is probably halting-problem complete (and worse, because you may not even have access to the callers of a public API to know what types might be passed in) but it should certainly be possible to catch cases where overload resolution will have different results - just evaluate the overload resolution with and without taking variance into account, and if the results differ, flag the code for the user to examine.

  • I think the feature should make it into a future version of the language. C++ has had covariant return types for ages, and that feature is useful for library designers most of the time.

    Your plan is much more ambitious. There will be huge hurdles even if it does make it past the -100 point mark. But on the other hand, I have seen code written by inexperienced people who have never heard of virtual functions and love 'as' and 'is'; in the end their code is littered with type casts.

  • I, personally, would love to see variance implemented for generic delegates. It just seems wrong that I can't assign a Func<object,string> (taking object and returning string) instance into a Func<string,object> (taking string and returning object) when I'd clearly be allowed to make that call were the delegates not involved. To get round this problem I've written some delegate-casting extension methods which wrap one delegate inside another, but this is both clunky to use and has the overhead of one additional and seemingly unnecessary delegate invocation.

    Conversely, when trying to create a "once and for all" implementation of the VISITOR design pattern based around a dictionary of delegates (returning void and taking T and indexed by Type T) the easiest way for me to capture these was to wrap each delegate as a Delegate<object> so they all had the same type. If I could have created my dictionary to allow any Delegate<T> where T:object then this step wouldn't have been necessary. I'd be happy to take the risk of an invalid parameter at runtime as I would have carefully controlled which delegate would be called under which circumstances in my code.

    Incidentally, my syntax preference was Luke's delegate R Func<* is A, R is *> (A a). That just seems more intuitive to me and makes my brain hurt less! In the instance above that'd give me a Dictionary<Type,Proc<T is object>>. The question is, given that T will always be object would I need to explicitly specify it - possibly I would in order to 'turn on' the variance?
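
    The wrapping workaround mentioned above might look something like this sketch (the method name is hypothetical):

    // Wraps f so that a Func<object, string> can stand in for a
    // Func<string, object>, at the cost of one extra invocation per call.
    static Func<string, object> AsVariant(Func<object, string> f)
    {
        return s => f(s);
    }

    Func<object, string> f = o => o.ToString();
    Func<string, object> g = AsVariant(f); // legal today, via wrapping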

  • I'd like to see it done. Imagine we would not have variance in arrays, how much casting and copying would we have to do? Now if I get the chance to have the same for IEnumerable (only without array covariance's problems), that alone would be worth the trouble in the context of LINQ. I've missed variance before. (And I'll probably keep missing it until you have it implemented for generic classes too, instead of just interfaces)

    Breaking changes might occur, I'd take the risk. Flagging the places in the source code for a one-time review, as Stuart suggested, is a good idea.

    Now here's another suggestion: I thought about how you could handle the problem of array covariance and came to think about the "const" keyword (which C# doesn't have, and which is probably too hard to add now, but stay with me for a moment).

    Another option I was thinking about is an IReadOnlyList<T> interface. I'd have liked that before, because I'd find it more elegant than testing some IsReadOnly property. But with covariance it would make so much more sense: I could get the (hypothetical) covariance of IEnumerable and the simpler/faster list access using an indexer, Count etc.

    How could those work together? Most functions that take arrays as parameters treat them as read-only constructs. Now changing their parameters from T[] to IEnumerable<T> would make everything slower, plus you'd have to rewrite your code. But changing them to IReadOnlyList<T> probably would do nothing bad to performance. We could even have IArray<T>, which could provide a "Length" property instead of Count. (Additional thinking required for n-dimensional arrays, but maybe interfaces for ranks 2 and 3 would be sufficient.)

    OK, I change all my "constant" T[] input parameters to IArray<T>. I might even use a tool that supports this. What next?

    I could turn on a compiler warning that jumps in my face every time I assign a Giraffe[] to an Animal[] variable or parameter! Assuming that arrays are either used as read-only references (in which case they should be declared as IArray<T>) or as modifiable arrays (in which case they should be considered invariant), this might fly.

    Admittedly, I haven't thought that through. Anyway, this would be a nice opt-in for people who care about the problems of array covariance, pretty easy to implement in the compiler, and it should not affect anybody who just wants to compile their old code. On the cost side, you'd probably have to modify a lot of BCL method signatures to make this useful.

    When you promote those warnings to errors and mark fully checked assemblies, the JIT might be able to skip type verification on assignments too. Although I worry less about this.

    (I'd also consider an invariant syntax for declaring array parameters and variables, like "invariant Animal[]", or alternatively, a modifiable interface like IModifiableArray; or call the covariant, constant version IReadOnlyArray and the invariant modifiable version IArray. Whatever.)
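
    The covariant read-only view described above might be sketched like this (names hypothetical, using the +T variance syntax from earlier parts of this series):

    // Covariant because T occurs only in output positions:
    // an IArray<Giraffe> could safely be used as an IArray<Animal>,
    // since no writes are possible through this interface.
    interface IArray<+T>
    {
        int Length { get; }
        T this[int index] { get; }
    }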


  • no, wait, the part about performance is wrong. arrays are treated natively by the JIT, which should be much faster than calling an indexer via a vtable. probably costs more than type checking on assignments...

    so we'd need "real" C++ like "const" support in order to make this usable in a general sense. which is unlikely.

    Have there been any discussions about introducing const in the CLR/C# lately? I know that many people think that const doesn't pull its own weight in C++, and I tend to agree. But the CLR could enforce it, prevent casting-away of const in sandboxed scenarios via CAS policies, so this might be a really interesting security feature. Also, considering how functional programming favors immutable objects, this might make things easier for parallel processing (PLINQ).

  • I'm tentatively for, though I'm worried that if the barrier to implementing your own reusable types grows too high, it'll actually reduce reuse as an unfortunate side-effect.  Extremely successful languages such as C don't have such a huge difference in learning curve between consumers and authors of "modules" (whether those are libraries, assemblies, interfaces...).  Still, co-/contravariance seems to promote reuse, and that's a good thing, right?

  • I just got through reading this 9-part article series.  I haven't read many MSDN articles or blogs, so I have to ask: what is the -100 test?  Aside from that, I like that each article is short - as this can be a very complex subject.

    I feel that the syntax would be far simpler, and far more familiar to simply use IFoo<+R, -A>.  It's already like that in the CLR, and everyone who knows what variance is knows the +/- syntax.  However, I'd bet that everyone has to puzzle over the alternatives as they are all new.

    For anyone who's been annoyed by C#'s lack of variance, adding it will be a major improvement.  For those who have no idea what variance is, I suspect they won't be the least bit bothered by it.  I always thought IAction<Animal> could be assigned to IAction<Mammal> when the type parameter is an argument.  I was very confused the first time I tried compiling such code and C# stated the types were incompatible.  I had to stare at the error for a long time, then go ask someone why my code wouldn't compile!  In short - the answer was: "While obvious, C# does not support variance.  You probably have no idea what variance is, but all you have to know is you can't do that even though it seems like you should be able to.  I've been harping on Microsoft for years to fix this problem."

    Meaning, including variance should be the logical *default*, and not supporting variance seems more like the exception.  Experts hate the lack of variance, beginners are confused by the lack of variance (they have to "learn" that it is unsupported, as opposed to "learning" about how to use variance).

    All this discussion, and the current design of C#, seems to imply that variance is some "fancy new feature".  I argue that the lack of variance support is an anti-feature -- as though the language designers went out of their way to annoy you!

    And finally, I think the +/- syntax is simple, intuitive for those who will write such code, and won't get in the way.  It seems like something that was supposed to be there all along, (as proven by the CLR support), and C# just took it out to "baby" you.  Although really it's just plain annoying, and this discussion wouldn't even be happening if it was included to begin with.

  • "-100 points" refers to this article by former C# team member Eric Gunnerson:

    http://blogs.msdn.com/ericgu/archive/2004/01/12/57985.aspx
