Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team. Learn more about Eric.
C# implements variance in two ways. Today, the broken way.
Ever since C# 1.0, arrays where the element type is a reference type are covariant. This is perfectly legal:
Animal[] animals = new Giraffe[10];
Since Giraffe is smaller than Animal, and “make an array of” is a covariant operation on types, Giraffe[] is smaller than Animal[], so an instance of Giraffe[] fits into that variable.
Unfortunately, this particular kind of covariance is broken. It was added to the CLR because Java requires it and the CLR designers wanted to be able to support Java-like languages. We then up and added it to C# because it was in the CLR. This decision was quite controversial at the time and I am not very happy about it, but there’s nothing we can do about it now.
Why is this broken? Because it should always be legal to put a Turtle into an array of Animals. With array covariance in the language and runtime you cannot guarantee that an array of Animals can accept a Turtle because the backing store might actually be an array of Giraffes.
This means that we have turned a bug which could be caught by the compiler into one that can only be caught at runtime. This also means that every time you put an object into an array we have to do a run-time check to ensure that the type works out and throw an exception if it doesn’t. That’s potentially expensive if you’re putting a zillion of these things into the array.
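To make the failure mode concrete, here is a minimal sketch using the Animal, Giraffe and Turtle types from the discussion (the class definitions are mine; any empty classes with this inheritance relationship would do):

```csharp
using System;

class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

class CovarianceDemo
{
    static void Main()
    {
        Animal[] animals = new Giraffe[10]; // legal: array covariance
        animals[0] = new Giraffe();         // fine; the backing store really does hold Giraffes

        try
        {
            animals[1] = new Turtle();      // compiles, but fails the per-element runtime check
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("cannot put a Turtle into a Giraffe[]");
        }
    }
}
```

The write of the Turtle compiles without complaint; the type error surfaces only as an ArrayTypeMismatchException at runtime, which is exactly the bug-deferral the post is complaining about.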
Unfortunately, we’re stuck with this now. Giraffe[] is smaller than Animal[], and that’s just the way it goes.
I would like to take this opportunity to clarify some points brought up in comments to Part One.
First, by "subtype" and "supertype" I mean "is on the chain of base classes" for classes and "is on the tree of base interfaces" for interfaces. I do not mean the more general notion of "is substitutable for". And by “bigger than” and “smaller than” I explicitly do NOT mean “is a supertype of” and “is a subtype of”. It is the case that every subclass is smaller than its superclass, yes, but not vice versa. That is, it is not the case that every smaller type is a subtype of its larger type. Giraffe is smaller than both Animal and System.Array. Clearly Giraffe is a subtype of System.Array, but it is emphatically not a subtype of Animal. Therefore the “is smaller than” relationship I am defining is more general than the “is a kind of” relationship. I want to draw a distinction between assignment compatibility (smaller than) and inheritance (subtype of).
Next time we’ll discuss a kind of variance that we added to C# 2.0 which is not broken.
I have just stumbled onto code like that above. After reading this article I am under the impression that I now need to add validation to it.
I am curious: is there a way to stop/prevent coding like that, or to check right up front that you are adding, say, parrots to an array of toucans (the size may be
If you have an array of a sealed type, does it still do the runtime type checking?
Probably not, but I am not an expert on the JIT. Why not try it both ways and examine the jitted code in the debugger? Then you'll know.
Keep in mind that you will want to attach the debugger to the jitted code AFTER it jits. The jitter is allowed to generate more debugger-friendly code if it detects that the process is being debugged while jitting is going on. That might include turning off certain optimizations.
How should the following case be handled?
Giraffe[] giraffes = new Giraffe[10];
// somewhere else
Animal[] animals = giraffes;
That's perfectly legal, since giraffes is of a type that is smaller than Animal[].
Interesting that this is actually not the case for Generics. I came across this a few months ago and, after banging my head on the desk several times, finally gave up and threw together an EnumerableWrapper<S, T>. I never did understand why I couldn't pass an IEnumerable<B> for an IEnumerable<A> parameter (where B : A), but this explains it - that's actually the non-broken way of doing it. :-)
(Actually, in my case, B is a class and A is an interface that it implements, but I'm sure the logic is the same)
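For what it's worth, a wrapper along the lines the commenter describes might look like this. The name EnumerableWrapper&lt;S, T&gt; is the commenter's; the body is a guess at one reasonable implementation, and it needs no variance support from the language:

```csharp
using System.Collections;
using System.Collections.Generic;

// Adapts an IEnumerable<S> to an IEnumerable<T>, where S derives from
// (or implements) T. Each element conversion is an ordinary implicit
// reference conversion, so this is always type safe.
class EnumerableWrapper<S, T> : IEnumerable<T> where S : T
{
    private readonly IEnumerable<S> inner;

    public EnumerableWrapper(IEnumerable<S> inner) { this.inner = inner; }

    public IEnumerator<T> GetEnumerator()
    {
        foreach (S item in inner)
            yield return item; // implicit S-to-T conversion
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```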
Of course, if these were all immutable types, then it wouldn't be an issue, right?
You are somewhat anticipating where we're going here. In a few more posts I will propose that we add exactly that kind of variance to interfaces in a future version of C#.
In general, immutability certainly helps make things more variant. As we will see in several more posts, since immutable objects are by definition read-only, and parameters may be covariant when they appear only on the "output" side, it is easy to create covariant immutable data structures if you have covariance in the type system.
My debugging skills are not particularly l33t, but the compiled version of the program (.Net 2.0) appears to be doing exactly the same assignment regardless whether the array's element type is sealed or not. I fear we are stuck with runtime type checking on arrays with a reference type element type.
Hi Eric :-)
A link to this is posted on programming.reddit.com right now, with the title "It Should Always be Legal to Put a Turtle into an Array of Animals". Right when I read the title and saw blogs.msdn.com, I thought to myself "this has to be a post by Eric Lippert". I was right :-)
How's it going? (Are you going to try windsurfing tomorrow, with the storm coming in? 50+ MPH winds they say...) Did my project end up being useful to anyone? :-X
This is not actually covariance.
The type of consumers that can accept Animals is "smaller" than the type of consumers that can accept Giraffes: if a consumer can accept an Animal, then it can accept a Giraffe, but not vice versa.
For example, the type of write-only arrays of Animals should be a subtype of the type of write-only arrays of Giraffes.
The type of producers that can make Giraffes is "smaller" than the type of producers that can make Animals: if a producer makes Giraffes, then it thereby makes Animals, but not vice versa.
For example, the type of read-only arrays of Giraffes should be a subtype of the type of read-only arrays of Animals.
Since an array can both produce (e.g. animals[0]) and consume (e.g. animals[0] = myanimal), Animal[] should be neither a subtype nor a supertype of Giraffe[].
So to address the question of how the following should be handled:
I say it should be handled with a compile-time error.
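As a side note, the commenter's read-only/write-only split can be expressed directly with the variance annotations that eventually shipped in C# 4.0. IProducer and IConsumer are illustrative names, not framework types:

```csharp
interface IProducer<out T> { T Produce(); }         // read-only side: covariant
interface IConsumer<in T> { void Consume(T item); } // write-only side: contravariant

class Animal { }
class Giraffe : Animal { }

class GiraffeFarm : IProducer<Giraffe>
{
    public Giraffe Produce() { return new Giraffe(); }
}

class AnimalShelter : IConsumer<Animal>
{
    public void Consume(Animal item) { /* accepts any animal */ }
}

class VarianceDemo
{
    static void Main()
    {
        IProducer<Animal> source = new GiraffeFarm();   // legal: every Giraffe produced is an Animal
        IConsumer<Giraffe> sink = new AnimalShelter();  // legal: a sink for Animals can take Giraffes

        // The reverse assignments would each be a compile-time error:
        // IProducer<Giraffe> g = source;
        // IConsumer<Animal> a = sink;
    }
}
```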
I agree with you that this situation is extremely irksome. However, it is one that we are stuck with.
However, it _is_ covariance. I have a partial ordering on types (CLR assignment compatibility) and an operator from type to type ("make an array type") which preserves the ordering. That's the _definition_ of "covariant operator".
Perhaps you prefer "always-typesafe Liskov-style substitution" as the basis for a partial ordering, rather than the sometimes-not-typesafe ordering that I've chosen. If that's your choice of partial ordering, then I agree with you, making arrays is not covariant. But that's not the partial ordering I'm choosing for the purposes of this series of articles.
You theory guys seem bound and determined to reject my choice of partial ordering today. That is your right, of course, but since I'm the guy who is going to have to implement this in a compiler that targets the CLR, choosing "assignment compatible in the CLR" is a highly pragmatic choice for me.
Without the type of covariance as implemented in C#, how would I write a function like this:
static bool ArrayEquals(object[] a, object[] b) {
    if (a.Length != b.Length) return false;
    for (int i = 0; i < a.Length; i++)
        if (!a[i].Equals(b[i])) return false;
    return true;
}
A good question.
In this case the covariance is perfectly safe because you are treating arrays as read-only for the purposes of this method. It's only when you try to write to the array that you have a potential problem, and you don't do that here.
The way to get safe covariance is to separate out the read-only operations into interfaces and make them covariant. If you wrote your program to take two IEnumerable&lt;object&gt; objects and we could somehow make IEnumerable&lt;T&gt; covariant in T then that would be perfectly type safe. IEnumerable does not allow you to write to the object, only read from it, so it is safe for covariance. You could then pass your IEnumerable&lt;Giraffe&gt; and IEnumerable&lt;Animal&gt; to the function and it would be just fine.
Of course, I have just spoiled the surprise; that is precisely where this series of posts is heading.
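To sketch where that is heading (remembering that a covariant IEnumerable&lt;T&gt; is hypothetical at the time of this post), an enumerable-based version of the comparison might look like this; it only ever reads from the two sequences, which is why covariance would be type safe here:

```csharp
using System.Collections.Generic;

static class SequenceHelper
{
    // Compares two sequences element by element. Reads only; never writes.
    public static bool SequenceEquals(IEnumerable<object> a, IEnumerable<object> b)
    {
        using (IEnumerator<object> ea = a.GetEnumerator())
        using (IEnumerator<object> eb = b.GetEnumerator())
        {
            while (true)
            {
                bool hasA = ea.MoveNext();
                bool hasB = eb.MoveNext();
                if (hasA != hasB) return false;  // different lengths
                if (!hasA) return true;          // both exhausted; all elements matched
                if (!object.Equals(ea.Current, eb.Current)) return false;
            }
        }
    }
}
```

With a covariant IEnumerable&lt;T&gt;, an IEnumerable&lt;Giraffe&gt; could be passed where the IEnumerable&lt;object&gt; parameter is expected, with no runtime check needed.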
To elaborate on my original example, should the following code throw a compile-time error, a run-time error, or do copy-on-write? I really am curious to see what you think the right way is. I've been goofing around a lot with algebraic data types lately and contemplating the differences between ADTs and objects, and when each is appropriate. So far, the practical rule of thumb I've come up with is: If you find yourself doing RTTI, that object hierarchy should've been represented with ADTs.
animals.append(new Turtle()); /* not sure if it's really valid C#, but you get the point */