Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team. Learn more about Eric.
I've written a lot about this already, but I think one particular point bears repeating.
As we're getting closer to shipping C# 4.0, I'm seeing a lot of documents, blogs, and so on, attempting to explain what "covariant" means. This is a tricky word to define in a way that is actually meaningful to people who haven't already got degrees in category theory, but it can be done. And I think it's important to avoid defining a word to mean something other than its actual meaning.
A number of those documents have led with something like:
"Covariance is the ability to assign an expression of a more specific type to a variable of a less specific type. For example, consider a method M that returns a Giraffe. You can assign the result of M to a variable of type Animal, because Animal is a less specific type that is compatible with Giraffe. Methods in C# are 'covariant' in their return types, which is why when you create a covariant interface, it is indicated with the keyword 'out' -- the returned value comes 'out' of the method."
But that's not at all what covariance means. That's describing "assignment compatibility" -- the ability to assign a value of a more specific type to a storage of a compatible, less specific type is called "assignment compatibility" because the two types are compatible for the purposes of verifying legality of assignments.
So what does covariance mean then?
First off, we need to work out precisely what the adjective "covariant" applies to. I'm going to get more formal for a bit here, but try to keep it understandable.
Let's start by not even considering types. Let's think about integers. (And here I am speaking of actual mathematical integers, not of the weird behaviour of 32-bit integers in unchecked contexts.) Specifically, we're going to think about the ≤ relation on integers, the "less than or equal to" relation. (Recall that of course a "relation" is a function which takes two things and returns a bool which indicates whether the given relationship holds or does not hold.)
Now let's think about a projection on integers. What is a projection? A projection is a function which takes a single integer and returns a new integer. So, for example, z → z + z is a projection; call it D for "double". So are z → 0 - z, N for "negate" and z → z * z, S for "square".
Now, here's an interesting question. Is it always the case that (x ≤ y) = (D(x) ≤ D(y))? Yes, it is. If x is less than y, then twice x is less than twice y. If x is equal to y then twice x is equal to twice y. And if x is greater than y, then twice x is greater than twice y. The projection D preserves the direction of size.
What about N? Is it always the case that (x ≤ y) = (N(x) ≤ N(y))? Clearly not. 1 ≤ 2 is true, but -1 ≤ -2 is false. But we notice that the reverse is always true! (x ≤ y) = (N(y) ≤ N(x)). The projection N reverses the direction of size.
What about S? Is it always the case that (x ≤ y) = (S(x) ≤ S(y))? No. -1 ≤ 0 is true, but S(-1) ≤ S(0) is false. What about the opposite? Is it always the case that (x ≤ y) = (S(y) ≤ S(x)) ? Again, no. 1 ≤ 2 is true, but S(2) ≤ S(1) is false. The projection S does not preserve the direction of size, and nor does it reverse it.
The projection D is "covariant" -- it preserves the ordering relationship on integers. The projection N is "contravariant". It reverses the ordering relationship on integers. The projection S does neither; it is "invariant".
Now I hope it is more clear exactly what is covariant or contravariant. The integers themselves are not variant, and the "less than" relationship is not variant. It's the projection that is covariant or contravariant -- the rule for taking an old integer and making a new one out of it.
So now lets abandon integers and think about reference types. Instead of the ≤ relation on integers, we have the ≤ relation on reference types. A reference type X is smaller than (or equal to) a reference type Y if a value of type X can be stored in a variable of type Y. That is, if X is "assignment compatible" with Y.
Now consider a projection from types to types. Say, the projection "T goes to IEnumerable<T>". That is, we have a projection that takes a type, say, Giraffe, and gives you back a new type, IEnumerable<Giraffe>. Is that projection covariant in C# 4? Yes. It preserves the direction of ordering. A Giraffe may be assigned to a variable of type Animal, and therefore an sequence of Giraffes may be assigned to a variable that can hold a sequence of Animals.
We can think of generic types as "blueprints" that produce constructed types. Let's take the projection that takes a type T and produces IEnumerable<T> and simply call that projection "IEnumerable<T>". We can understand from context when we say "IEnumerable<T> is covariant" what we mean is "the projection which takes a reference type T and produces a reference type IEnumerable<T> is a covariant projection". And since IEnumerable<T> only has one type parameter, it is clear from the context that we mean that the parameter to the projection is T. After all, it is a lot shorter to say "IEnumerable<T> is covariant" than that other mouthful.
So now we can define covariance, contravariance and invariance. A generic type I<T> is covariant (in T) if construction with reference type arguments preserves the direction of assignment compatibility. It is contravariant (in T) if it reverses the direction of assignment compatibility. And it is invariant if it does neither. And by that, we simply are saying in a concise way that the projection which takes a T and produces I<T> is a covariant/contravariant/invariant projection.
UPDATE: My close personal friend (and fellow computer geek) Jen notes that in the Twilight series of novels, the so-called "werewolves" (who are not transformed by the full moon and therefore not actually werewolves) maintain their rigid social ordering in both wolf and human form; the projection from human to wolf is covariant in the social-ordering relation. She also notes that in high school, programming language geeks are at the bottom of the social order, but the projection to adulthood catapults them to the top of the social order, and therefore, growing up is contravariant. I am somewhat skeptical of the latter claim; the former, I'll take your word for it. I suppose the question of how social order works amongst teenage werewolves who are computer geeks is a subject for additional research. Thanks Jen!
Great explanation. A small suggestion: when giving the integers example at first, you write
"It's the projection that is covariant or contravariant." You might want to add "with respect to the <= ordering" just to make that explicit.
Thank you, thank you, thank you! I finally understand the concept - this is the first explanation I have read that *makes sense*!
Thanks a lot Eric, for such neat explanation. I was always confused about these until now.
Nicely done. Wikipedia has an article on it here: http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)
On Jen's note, in Teen Wolf, the wolf catapulted Scott Howard up the social-ordering relation, transforming from a relative nobody into the most popular kid in school (and an excellent basketball player). In this case, would the projection from human to wolf be contravariant? If so, would that render the overall projection invariant, since Twilight apparently goes against those rules?
That was the best explanation of those two terms that I have yet com across. Thanks!
This is another excellently communicated post. If you are looking for more information on covariance/contravariance, I found this link helpful:
It is more abstract than this post, but it helps to explain why the in/out keywords were selected, although they use + and - in the paper.
That is an amazingly un-intuitive use of the word "invariant". "Invariant" means "unchanging" in every other context, but an "invariant generic type" is apparently one which changes subtyping relationships (in a way other than simply reversing them)! That would break my brain if I had to deal with that term on a regular basis.
(For the record, vectors and tensors can have "mixed variance", and I'm not sure if there's any term at all in category theory for something which is neither covariant or contravariant; category theory applies those terms to "functors" and a functor has to be either covariant or contravariant by definition.)
It's only going to get worse. On Thursday I'm going to give a deeply counter-intuitive definition of "invariantly valid".
A small saving grace is that an invariant generic type simply destroys subtyping relationships. A List<X> is in no way related to List<Y> no matter how (different) X and Y are related. -- Eric
I am with @Erik - couldn't have said it better. Thanks a lot!
As to the teenage werewolves who are computer geeks, if we assume that the New Age Techno-werewolfs are transformed not by the full moon, but by the prospect of better salary rate and/or career advance, then "growing up" and "transition from human to wolf" are pretty much the same ptojection. So one of them cannot be covarian, and the other contravariant! :-D
Thanks for (at last) explaining co and contravariance without mentioning arrays and programming language consturucts. I didn't know the roots of this terms and have troubles with differentiating them when reading in pure programming context (especially when comparing Java to .NET).
Great analogy, I'll definitely use it sometime..
I have always wondered why those particular words were used to describe that principle (covariance, contravariance, invariance). Does anyone have an idea on their etymology?
If everybody who calls themselves programmers had picked up a copy of Object-Oriented Software Construction (2nd ed) and actually read it, this wouldn't be an issue :-)
As an "old" Eiffel programmer it is nice to see that most of Eiffel's main ideas are slowly making their way into the world of curly braces ;-)
Btw, nice explanation, Eric.
"A generic type I<T> is covariant (in T)"
It would be nice if you mentioned in footnotes that I<T1, T2, T3> can be covariant (in T1), contravariant (in T2) and invariant (in T3) at the same time, just for completeness.
Thanks. 2 weeks ago I spend 6 hours to get a similar understanding. Next time I will wait for your blog article ;)
Now I understand and can even explain to someone else )