Covariance and Contravariance, Part Eleven: To infinity, but not beyond

UPDATE: Andrew Kennedy, author of the paper I reference below, was good enough to point out some corrections and omissions, which I have addressed. Thanks Andrew!

As I've discussed at length in this space, we are considering adding covariance and contravariance on delegate and interface types parameterized with reference types to a hypothetical future version of C#. (See my earlier articles in this series for an explanation of the proposed feature.)

Variance is highly useful in mainstream cases, but exposes a really irksome problem in certain bizarre corner cases. Here's just one example.

Consider a "normal" interface contravariant in its sole generic type parameter, and a "crazy" invariant interface which extends the normal interface in a strange way:

public interface IN<in U> {}
public interface IC<X> : IN<IN<IC<IC<X>>>> {}

This is a bit weird, but certainly legal.

Before I go on, I want to digress and talk a bit about why this is legal. Most people when they first see such a beast immediately say "but an interface cannot inherit from itself, that's an illegal circularity in the inheritance chain!"

First off, no, that is not correct. Nowhere does the C# specification make this kind of inheritance illegal, and in fact, a weak form of it must be legal. You want to be able to say:

interface INumber<X> : IComparable<INumber<X>> { ... }

that is, you want to be able to express that one of the guarantees of the INumber<X> contract is that you can always compare one number with another one. Therefore, it must be legal to use a type's name in a type argument of a generic base type.
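As a minimal sketch of that weak, legal form of self-reference: the Value property and the IntNumber class below are hypothetical additions for illustration only, and INumber<X> here is the example interface from the text, not a framework type.

```csharp
using System;

// The interface names itself inside a type argument of its base interface.
// This is legal in both C# and the CLR.
public interface INumber<X> : IComparable<INumber<X>>
{
    X Value { get; }   // hypothetical member, added so there is something to compare
}

// Hypothetical implementation for int.
public sealed class IntNumber : INumber<int>
{
    public IntNumber(int value) { Value = value; }

    public int Value { get; }

    // Required by IComparable<INumber<int>>: compare one number with another.
    public int CompareTo(INumber<int> other)
    {
        // By the usual IComparable convention, any instance sorts after null.
        return other == null ? 1 : Value.CompareTo(other.Value);
    }
}
```

Any INumber<int> can then be compared with any other INumber<int>, which is exactly the guarantee the contract is meant to express.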

However, all is not rosy. This particularly gross kind of inheritance that I give as an example is in fact illegal in the CLR, even though it is not illegal in C#. This means that it is possible to have the C# compiler generate an interface type which then cannot be loaded by the CLR. This unfortunate mismatch is troubling, and I hope in a future version of C# to make the type definition rules of C# as strict or stricter than those of the CLR. Until then, if it hurts when you do that, don't do it.

Second, unfortunately, the C# compiler presently has numerous bugs in its cycle detector such that sometimes things which kinda look like cycles but are in fact not cycles are flagged as cycle errors. This just makes it all the more difficult for people to understand what is a legal cycle and what isn't. For example, the compiler today will incorrectly report that this is an illegal base class cycle, even though it clearly is not:

public class November<T> {}
public class Romeo : November<Romeo.Sierra.Tango> {
   public class Sierra {
       public class Tango {}
    }
}

I have devised a new (and I believe correct!) cycle detection algorithm implementation, but unfortunately it will not make it into the service release of the C# 3 compiler. It will have to wait for a hypothetical future release. I hope to address the problem of bringing the legal type checker into line with the CLR at the same time.

Anyway, back to the subject at hand: crazy variance. We have the interfaces defined as above, and then give the compiler a little puzzle to solve:

IC<double> bar = whatever;
IN<IC<string>> foo = bar;  // Is this assignment legal?

I am about to get into a morass of impossible-to-read generic names, so to make it easier on all of us, from now on I am going to abbreviate IN<IC<string>> as NCS and IC<double> as CD. That is, N stands for IN<...>, C for IC<...>, S for string and D for double. You get the idea, I'm sure.

Similarly, I will notate "is convertible to by implicit reference conversion" by a right-pointing arrow. So the question at hand is: true or false, CD→NCS?

Well, let’s see. Clearly CD does not go to NCS directly. But (the compiler reasons) maybe CD’s base type does.

CD’s base type is NNCCD. Does NNCCD→NCS? Well, N is contravariant in its parameter, so NA→NB holds exactly when B→A. Stripping the outer N from each side and swapping them, this boils down to the question: does CS→NCCD?

Clearly not directly. But perhaps CS has a base type which goes to NCCD. The base type of CS is NNCCS. So now we have the question: does NNCCS→NCCD?

Well, N is contravariant in its parameter, so by the same rule this boils down to the question: does CCD→NCCS?

Let’s pause and reflect a moment here.

The compiler has “reduced” the problem of determining the truth of CD→NCS to the problem of determining the truth of CCD→NCCS! If we keep on “reducing” like this then we’ll get to CCCD→NCCCS, CCCCD→NCCCCS, and so on.

I have a prototype C# compiler which implements variance – if you try this, it says “fatal error, an expression is too complex to compile”.
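The runaway reduction can be reproduced with a toy checker. This is a hypothetical sketch, not the prototype compiler's code: the NaiveVariance class, its string encoding of types and the maxSteps cutoff are all assumptions made for illustration. Types are written in the shorthand used above (N, C, D, S), and every input is assumed to be a well-formed string ending in D or S.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not how the real compiler is written.
public static class NaiveVariance
{
    // Answers "from -> to ?" for the IN/IC hierarchy in the article.
    // Returns true or false when the naive search terminates, or null when
    // it gives up after maxSteps questions (the analogue of the fatal
    // "too complex to compile" error). Each question asked is appended to trace.
    public static bool? IsConvertible(string from, string to, int maxSteps, List<string> trace)
    {
        // Every question has at most one follow-up question in this tiny
        // hierarchy, so a loop stands in for the compiler's recursion.
        for (int step = 0; step < maxSteps; step++)
        {
            trace.Add(from + " -> " + to);

            if (from == to) return true;          // identity conversion

            if (from[0] == 'N' && to[0] == 'N')
            {
                // IN is contravariant: NA -> NB holds exactly when B -> A.
                string newFrom = to.Substring(1);
                to = from.Substring(1);
                from = newFrom;
            }
            else if (from[0] == 'C' && to[0] == 'N')
            {
                // IC<X> is invariant, so try its base interface IN<IN<IC<IC<X>>>>.
                from = "NNCC" + from.Substring(1);
            }
            else
            {
                return false;                     // no rule applies
            }
        }
        return null;                              // gave up
    }
}
```

Calling IsConvertible("CD", "NCS", 8, trace) records the questions CD -> NCS, NNCCD -> NCS, CS -> NCCD, NNCCS -> NCCD, CCD -> NCCS and so on, and returns null because the names keep growing. An easy case such as IsConvertible("CD", "NNCCD", 8, trace) terminates with true after one step up to the base interface.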

I considered implementing an algorithm that is smarter about determining convertibility; the paper I reference below has such an algorithm. (Fortunately, the C# type system is weak enough that determining convertibility of complex types is NOT equivalent to the halting problem; we can find these bogus situations both in principle and in practice. Interestingly, there are type systems in which this problem is equivalent to the halting problem, and type systems for which the computability of convertibility is still an open question.) However, given that we have many other higher priorities, it’s easier to just let the compiler run out of stack space and have a fatal error. These are not realistic scenarios from which we must sensibly recover.

This is just a taste of some of the ways that the type system gets weird. To get a far more in-depth treatment of this subject, you should read this excellent Microsoft Research paper

  • "expression is too complex to compile" is actually what my brain said, too. Indeed, the whole subject is fascinating and your posts on the issue are on the "toread" list.

    For now, I just can't manage to assign null to a struct - a feat that Nullable<T> seems to be perfectly capable of. How can I implicitly convert a null to a struct? the closest I get gives me CS0553...sorry I ranted here, but it was bugging me and your post right from the compiler front hit me on my Feed :)

    Thanks for your valuable and brain-crunching posts!

  • Nullable<T> does not assign a null to a struct. Nullable<T> is nothing more nor less than simply a struct something like:

    struct Nullable<T> where T : struct
    {
        readonly T t;
        readonly bool hasValue;
        ... constructors, accessors, and so on
    }

    That's all folks. The rest is just compiler and runtime magic.  The compiler turns "if (n == null)" into "if (n.HasValue == false)", turns "int? n = null" into "int? n = new Nullable<int>();", etc.

    Perhaps what you want to do is not "convert a null to a struct" but rather "get the default value for a struct"?  In that case you can just say:

    T t = default(T);

    If T is numeric, default(T) is zero.  If T is a reference type, default(T) is null.  bool -- false.  Nullable type -- nullable type with hasValue=false. struct -- struct with all its fields set to THEIR default values.

  • My head is about to spin clean off my neck. "expression is too complex to compile" is a dramatic understatement.

  • I think Church showed that this kind of problem, i.e. deciding which expressions are equivalent, is unsolvable.

  • Hi Eric,

    thank you for your answers. I think I expressed myself too bland yesterday, I was more or less aware of the issues...

    I was looking for a way as a .NET developer to get an analogous code statement to compile (In the wake of me having a lot of fun with the implicit operator). Anyway, most importantly for me was the info that the compiler does some magic, so there is no way to get the effect via developer-made .NET code. That's fine by me, but good to know.

    Cheers and thanks :)

  • Is it not so that since double and string have no hierarchical relation, co- and contravariance are not relevant for determining A<double> -> B<string>, whatever A and B may be? Covariance preserves ordering, contravariance reverses ordering, but neither _creates_ ordering, as far as I see.

  • Joren, it is in fact so, but that is not exactly the question at hand.

    The question is not whether or not the cast is valid. The question is how to write the compiler to know that.

    On the same note, think of this as an example of the halting problem (even though it is not equivalent, as Eric said):

    write a program that can tell if the following method stops or not:

    int thing(int x) { return thing(x + 1); }

    Surely, both you and I can see that this method will never stop (disregarding the overflow exception). But for a computer program, it would not be that obvious.

  • Re: The question is how to write the compiler to know whether or not the cast is valid.

    Yes, I understand that. Eric said the problem is not solvable in general, so returning the error 'too complex to compile' is a fine approach.

    But I thought: is it not at least possible to write a compiler that can follow the reasoning from the string-double relation (or other simple relations) to the interface relation, instead of only trying to reduce the generic interface relation to a simple relation (which, as my previous post implied, can fail while the other method works)?

    It seems that trying the problem A<X> -> B<Y> from both sides (both from the relation between X and Y, and the relation between A and B) will lead to more classes of cases that the compiler can handle without giving up.

    Of course the question remains whether anything whose validity can't be ascertained from the relation between A and B alone will ever be practically useful. I'd say 'no' is most likely, but I've hardly investigated the problem any further.

  • I wonder if some future CSC could allow implicit interface implementation in a variant way?

    Like this:

    class MyPOD : ICloneable
    {
        public MyPOD Clone() { ... } // this could implicitly implement ICloneable.Clone
    }

    Eric, can such variance be specified on the ICloneable interface?

  • That kind of variance is called "return type covariance". As I mentioned early on in this series, (a) this series is not about that kind of variance, and (b) we have no plans to implement that kind of variance in C#.  

  • Hello Eric,

    After my last re-reading of your co[ntra]variance series, I believe I came to a new point.

    After all, there are only 3 possibilities:

    1. Clear covariance (IEnumerator<T>).

    2. Clear contravariance (IComparer<T>).

    3. Undecidable (IList<T>).

    All three possibilities are easily detectable by the compiler, and I believe that for the first two we would like the compiler to decide the variance automatically. We need a solution for the third case only.

    To see the solution, I would first like to analyze how we selected which case a given interface belongs to. We analyzed the method signatures in which T is involved:

    1. If in all of them T is 'out' parameter, the interface is covariant.

    2. If in all of them T is 'in' parameter, the interface is contravariant.

    3. If there is a mix - undecidable.

    My solution is to tell the compiler how I'm going to use my variable and enable either covariant or contravariant invocations:

    IList<Animal> aa = whatever;

    IList<in Giraffe> ag = aa;

    IList<out Giraffe> agX = aa; //fails to compile

    ag user sees: 'int ag.IndexOf(Giraffe)' and 'object ag[int]'.

    IList<Giraffe> ag = whatever;

    IList<out Animal> aa = ag;

    IList<in Animal> aaX = ag; //fails to compile

    aa user sees: 'int aa.IndexOf(<only null can be passed>)' and 'Animal aa[int]'.

    Now I can say in a type safe manner:

    class X<T> {
        T _t;

        void Test(IList<out T> at) {
            T t = at[0]; //clearly t is 'out' parameter
        }

        void Test(IList<in T> at) { //overloaded method!
            int i = at.IndexOf(_t); //clearly _t is 'in' parameter
        }
    }

    ...

    new X<Animal>().Test(new List<Giraffe>()); //first method is called
    new X<Giraffe>().Test(new List<Animal>()); //second method is called

    Kosta

  • I found a very interesting presentation, which shows different options, pros and cons for the implementation: http://research.microsoft.com/~akenn/generics/ECOOP06.ppt.

    It's clear that my approach is a variation on the Java theme.

    Interestingly enough, in order to create a covariant/contravariant projection in C#, one currently needs to define a special read/write interface. While this is somewhat acceptable with one parameter, what is suggested if I have 2 or more, e.g.:

    MyClass<T1,T2,T3>{}

    To cover all the possibilities I need to define 2^3 interfaces. Yes, maybe some of them are not useful, but I may easily fall into a situation where I need to declare 4-5 interfaces. How is this addressed?

  • does this cover situations like ObservableCollection<T1<T2>> ?

  • I'm curious, on what basis does the CLR decide that

    public interface IN<in U> {}

    public interface IC<X> : IN<IN<IC<IC<X>>>> {}

    is illegal, and does it become legal if U is made invariant?

    OT: why the heck isn't return type covariance on the table? In terms of usefulness, it's in the same ballpark as variance in generics but a vastly simpler concept and dead easy to implement (except for compatibility issues). I've wanted return type covariance for years :(
