Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team.
I have been wanting for a long time to do a series of articles about covariance and contravariance (which I will shorten to “variance” for the rest of this series.)
I’ll start by defining some terms, then describe what variance features C# 2.0 and 3.0 already support today, and then discuss some ideas we are thinking about for hypothetical nonexistent future versions of C#.
As always, keep in mind that we have not even shipped C# 3.0 yet. Any of my musings on possible future additions to the language should be treated as playful hypotheses, rather than announcements of a commitment to ship any product with any feature whatsoever.
Today: what do we mean by “covariance” and “contravariance”?
The first thing to understand is that for any two types T and U, exactly one of the following statements is true: T is bigger than U; T is smaller than U; T is equal to U; or T is not related to U.
For example, consider a type hierarchy consisting of Animal, Mammal, Reptile, Giraffe, Tiger, Snake and Turtle, with the obvious relationships. (Mammal is a subclass of Animal, etc.) Mammal is a bigger type than Giraffe and smaller than Animal, and obviously equal to Mammal. But Mammal is neither bigger than, smaller than, nor equal to Reptile; it’s just different.
Why is this relevant? Suppose you have a variable, that is, a storage location. Storage locations in C# all have a type associated with them. At runtime you can store an object which is an instance of an equal or smaller type in that storage location. That is, a variable of type Mammal can have an instance of Giraffe stored in it, but not a Turtle.
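A small sketch may make the storage rule concrete. The class declarations below are assumptions that simply mirror the names in the text, and the `StorageDemo.CanStore` helper is hypothetical; it leans on the reflection method `Type.IsAssignableFrom` to ask whether an instance of one type may be stored in a location of another.

```csharp
using System;

// Hypothetical hierarchy mirroring the names used in the text.
public class Animal { }
public class Mammal : Animal { }
public class Reptile : Animal { }
public class Giraffe : Mammal { }
public class Tiger : Mammal { }
public class Snake : Reptile { }
public class Turtle : Reptile { }

public static class StorageDemo
{
    // True when an instance of 'source' may be stored in a location of
    // type 'target', that is, when 'source' is equal to or smaller than 'target'.
    public static bool CanStore(Type target, Type source)
    {
        return target.IsAssignableFrom(source);
    }

    public static void Run()
    {
        Mammal m = new Giraffe();      // fine: Giraffe is smaller than Mammal
        Animal a = m;                  // fine: Mammal is smaller than Animal
        // Mammal bad = new Turtle();  // compile-time error: Turtle is unrelated to Mammal

        Console.WriteLine(CanStore(typeof(Mammal), typeof(Giraffe))); // True
        Console.WriteLine(CanStore(typeof(Mammal), typeof(Mammal)));  // True
        Console.WriteLine(CanStore(typeof(Mammal), typeof(Animal)));  // False
        Console.WriteLine(CanStore(typeof(Mammal), typeof(Turtle)));  // False
    }
}
```

Calling `StorageDemo.Run()` exercises all four cases: smaller, equal, bigger and unrelated.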
This idea of storing an object in a typed location is a specific example of a more general principle called the “substitution principle”. That is, in many contexts you can often substitute an instance of a “smaller” type for a “larger” type.
Now we can talk about variance. Consider an “operation” which manipulates types. If the operation applied to any T and U always results in two types T’ and U’ with the same relationship as T and U, then the operation is said to be “covariant”. If the operation reverses bigness and smallness on its results but keeps equality and unrelatedness the same then the operation is said to be “contravariant”.
That’s totally highfalutin and probably not very clear. Next time we’ll look at how C# 3 implements variance at present.
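As a rough illustration of the two definitions, here is a sketch using features that C# 2.0 already has. It takes "make an array of T" as the covariant operation and "convert a method taking T to a delegate" as the contravariant one. The class names, `VarianceSketch` and its members are all assumptions made up for this example.

```csharp
using System;

// Hypothetical types for illustration only.
public class Animal { }
public class Mammal : Animal { }
public class Giraffe : Mammal { }

public static class VarianceSketch
{
    public static int Fed;   // counts calls to Feed, so the effect is observable

    // A method that accepts any Animal.
    static void Feed(Animal animal) { Fed++; }

    public static int Count(Animal[] animals) { return animals.Length; }

    public static void Run()
    {
        // Covariant: Giraffe is smaller than Animal, and the operation
        // "make an array of" preserves that: Giraffe[] is smaller than Animal[].
        Giraffe[] giraffes = { new Giraffe(), new Giraffe() };
        Animal[] animals = giraffes;
        Console.WriteLine(Count(animals));

        // Contravariant: Animal is bigger than Giraffe, yet a method taking an
        // Animal can be stored in a delegate that takes a Giraffe. The direction
        // of the relationship is reversed.
        Action<Giraffe> feedGiraffe = Feed;
        feedGiraffe(new Giraffe());
        Console.WriteLine(Fed);
    }
}
```

The reverse assignments (an `Animal[]` into a `Giraffe[]` variable, or a method taking a `Giraffe` into an `Action<Animal>`) are rejected by the compiler.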
I'd love to hear that there are plans for covariant return types in future versions of C#. That's something I've sorely missed from C++ (specifically for virtual methods in abstract class factories).
Oh, and kudos for use of "highfalutin"; would have been more applicable to your series on regular expressions though I think ;-)
Sorry to disappoint, but as I will discuss in part four or five of this series, odds are good that we are not going to get to covariant return types on virtual overrides in the hypothetical future next version of C#. We may, however, make certain variant reference type conversions legal. Hypothetically.
@AdamM : it is possible to return a subclass from a virtual method override in C++. Isn't this what you were referring to?
I think it's highly misleading to refer to types as being 'bigger' or 'smaller' than each other. They're not, in any canonical sense. They're supertypes and subtypes, and the subtype relation just happens to be a partial order -- and there are many other partial orders over types. It's like saying that sets are bigger, smaller, equal or not related to other sets simply because the relation of is-a-subset-of happens to be a partial order. Replacing 'bigger' and 'smaller' with 'supertype' and 'subtype' in the above article makes it an order of magnitude clearer and more obvious.
No, it is not highly misleading. Your characterization of bigger/smaller is completely wrong in a world with variance in the type system.
Let me be absolutely crystal clear on this: SMALLER does NOT mean "is a subtype of". Absolutely not. The whole _point_ of my introducing a new term rather than using an existing term is to call attention to the fact that they are different.
Yes, a subtype is always smaller than its supertype, but introducing variance into the type system causes types which are NOT supertypes or subtypes of each other to have "smaller than" relationships. Conflating the two is wrong and misleading, which is why I was careful to not do so.
Read the article more carefully. I never said that "smaller than" means "is a subtype of". Rather, I said that "smaller than" means "is assignment compatible with", which is very different in a world with variant types.
If you still don't believe me, think about this. Is string[] smaller than object[]? Since C# supports covariant array types, yes it is. You can store an object of type string[] in a variable of type object[]. Now, answer me this: is string[] a subtype of object[]? Absolutely not! The base type of string[] is System.Array, not object[].
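The array case can be checked with reflection. This is only a sketch: `ArrayTypeFacts` and its members are hypothetical names, and "subtype" is read here as "reachable by repeatedly following the immediate base type".

```csharp
using System;

public static class ArrayTypeFacts
{
    // Assignment compatibility: can a string[] be stored in an object[] location?
    public static bool StringArrayFitsObjectArray()
    {
        return typeof(object[]).IsAssignableFrom(typeof(string[]));
    }

    // Walks the chain of immediate base types, starting above 'candidate'.
    public static bool IsOnBaseTypeChain(Type candidate, Type ancestor)
    {
        for (Type t = candidate.BaseType; t != null; t = t.BaseType)
        {
            if (t == ancestor) return true;
        }
        return false;
    }

    public static void Run()
    {
        Console.WriteLine(StringArrayFitsObjectArray());                          // True
        Console.WriteLine(typeof(string[]).BaseType == typeof(Array));            // True
        Console.WriteLine(IsOnBaseTypeChain(typeof(string[]), typeof(object[]))); // False
    }
}
```

So the two array types are assignment compatible even though neither appears in the other's base type chain.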
No, string[] is not smaller than object[]... because arrays have bidirectional data transfer they are neither covariant nor contravariant. The fact that C# treats array types as covariant breaks type safety and forces runtime type checks on array store operations, which would not otherwise be necessary.
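The runtime check on array stores is easy to provoke. In this minimal sketch (the `ArrayStoreCheck` class is a hypothetical name), the assignment compiles because of array covariance, and the store is then rejected at run time with an `ArrayTypeMismatchException`.

```csharp
using System;

public static class ArrayStoreCheck
{
    // Returns true if storing a non-string into a string[] viewed
    // as object[] is rejected at run time.
    public static bool StoreThrows()
    {
        object[] objects = new string[1];   // allowed by array covariance
        try
        {
            objects[0] = 42;                // a boxed int is not a string
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                    // the runtime checked the store
        }
    }
}
```

Storing a string through the same `object[]` reference succeeds, which is why the check has to happen on every store rather than at compile time.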
Now IEnumerable<string> is, using your terminology, smaller than IEnumerable<object>... But there is also a subtype relationship present. There is not "derivation" in the .NET sense, but it IS a subtype. The fact that base type/derived type is no longer the only relationship that creates supertype/subtype status is why variance with generics is so important.
Ben, apparently you and I are using the same terms in different ways.
I am doing my best to be precise here. Let me state this again so as to be sure I am clear.
1) I am defining "covariant"/"contravariant" as "preserving/reversing a smaller-than relationship between types".
You are defining them to mean something else -- I think you are defining them to mean "preserving/reversing a smaller-than relationship between types in a guaranteed-at-compile-time-typesafe manner"
Now, you are free to define those words any way you like, but that is not the standard way of defining "covariant" and "contravariant". For the rest of this series, I will stick with my definition.
2) I am defining "smaller" as "assignment-compatible in the CLR type system".
You are defining "smaller" to mean something else, since you are claiming that string[] is not smaller than object[] but IE<string> is smaller than IE<object>. Neither of those claims is true in the CLR today for my definition of "smaller", so therefore you must either be mistaken or you have some different definition of "smaller".
I'm not sure what your definition of "smaller" is exactly, though again, it appears to have something to do with the ability to determine type safety at compile time.
Again, you are free to use any definition of "smaller" you want, but this is the definition I have chosen for the purpose of this series of articles, so for the rest of this series, we'll stick with that.
And finally, you and I are using "subtype" to mean different things. By "subtype" I mean "is on the transitive closure of immediate base type". I am not sure what your definition of "subtype" is.
Your argument that array covariance in CLR/C# leads to runtime checks is entirely correct, as I pointed out in today's article. But that there are cases where this is broken does not mean that it is not covariant.
Again, you are free to define "covariant" as "guaranteed type safe covariant" if you want to. That is not how I am defining "covariant" for the purposes of this series of articles.
> I am defining "smaller" as "assignment-compatible in the CLR type system".
Thank you for finally defining this term. That helps tremendously.
You're welcome. I thought though that I had defined it already, in boldface. "At runtime you can store an object which is an instance of an equal or smaller type in that storage location."
I had the same misunderstanding tonight as other readers. In the literature I've read on type theory for programming languages, writers often take a type to be a set of values (that is, a subset of the set of all possible values), so when you started saying things like "T is bigger than U" I assumed you were talking about set containment, especially since you gave it the properties of a partial order (so we knew you weren't talking about cardinality.) When I came to your comment "At runtime you can store an object..." I took it as an important (hence the boldface) comment about the C# runtime system, but not a definition.
you can take a look at my post here:
I've written a small example where covariance problems are clear. I didn't know about all that theory before a kind contributor sent me here :-)