Covariance and Contravariance in C#, Part Ten: Dealing With Ambiguity

OK, I wasn’t quite done. One more variance post!

Smart people: suppose we made IEnumerable<T> covariant in T. What should this program fragment do?

class C : IEnumerable<Giraffe>, IEnumerable<Turtle> {
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator() {
        yield return new Giraffe();
    }
    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator() {
        yield return new Turtle();
    }
// [etc.]
}
 
class Program {
    static void Main()  {
        IEnumerable<Animal> animals = new C();
        Console.WriteLine(animals.First().GetType().ToString());
    }
}

Options:

1) Compile-time error.
2) Run-time error.
3) Always enumerate Giraffes.
4) Always enumerate Turtles.
5) Choose Giraffes vs Turtles at runtime.
6) Other, please specify.

And if you like any options other than (1), should this be a compile-time warning?
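
For anyone who wants to try the question out, here is a minimal self-contained version of the fragment. It is only a sketch: the empty Animal, Giraffe and Turtle classes and the non-generic GetEnumerator (which arbitrarily forwards to the giraffe sequence) are assumptions standing in for the elided parts, and it compiles only against a framework in which IEnumerable<T> is actually declared covariant (IEnumerable<out T>, as in .NET 4 and later). No output is claimed, since which element type gets printed is precisely the question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type hierarchy assumed by the fragment.
class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

class C : IEnumerable<Giraffe>, IEnumerable<Turtle>
{
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
    {
        yield return new Giraffe();
    }

    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator()
    {
        yield return new Turtle();
    }

    // Assumption: the non-generic interface has to be implemented as well;
    // forwarding to the giraffe sequence is an arbitrary choice for this sketch.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<Giraffe>)this).GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        // Relies on the covariant conversion from C to IEnumerable<Animal>.
        IEnumerable<Animal> animals = new C();
        Console.WriteLine(animals.First().GetType().ToString());
    }
}
```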

  • > There is no IEnumerable<Animal>.GetEnumerator() that returns an IEnumerator<Animal>. It can only return an IEnumerator<Giraffe> or IEnumerator<Turtle>, both of which may be treated as an IEnumerator<Animal> and thus the ambiguity: CS1640.

    OK, but how does the _compiler_ know that?  Suppose I split the code into three assemblies:

    assembly #1:

    public class D { public static void M(IEnumerable<Animal> animals) { foreach(Animal a in ....

    assembly #2:

    public class E { public static void N(object x) { D.M((IEnumerable<Animal>)x); } }

    assembly #3:

    public class C : IEnumerable<Giraffe>, IEnumerable<Turtle> { ... }

    public class F { public static void P() { E.N(new C()); } }

    I compile assembly one, then assembly two, then assembly three.  In which one do I get a compilation error?
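
The three-assembly scenario above can be sketched in a single file, with comments marking where the assembly boundaries would fall. This is an illustration under assumptions: the Animal hierarchy, the bodies of M and C, and the Program class are filled in hypothetically, and the code assumes a framework where IEnumerable<out T> is covariant. The point it shows is that the code standing in for assembly #2 sees only an object, so nothing at that cast site mentions C or its two interface implementations.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical shared types.
public class Animal { }
public class Giraffe : Animal { }
public class Turtle : Animal { }

// "Assembly #1": knows only about IEnumerable<Animal>.
public class D
{
    public static void M(IEnumerable<Animal> animals)
    {
        foreach (Animal a in animals)
            Console.WriteLine(a.GetType().Name);
    }
}

// "Assembly #2": sees only an object; the cast is checked at run time.
public class E
{
    public static void N(object x)
    {
        D.M((IEnumerable<Animal>)x);
    }
}

// "Assembly #3": the only place where C's two implementations are visible.
public class C : IEnumerable<Giraffe>, IEnumerable<Turtle>
{
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
    {
        yield return new Giraffe();
    }

    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator()
    {
        yield return new Turtle();
    }

    // Assumption: arbitrary forwarding for the non-generic interface.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<Giraffe>)this).GetEnumerator();
    }
}

public class F
{
    public static void P() { E.N(new C()); }
}

public class Program
{
    public static void Main() { F.P(); }
}
```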

  • Peter Ritchie:

    I was thinking that notationally what you would get would be a set of operations common to every instance in the type return set.  So if we constrain the output to be of type either Turtle or Giraffe and there is a method called reproduce that returns something common to both turtles and giraffes (perhaps something in the <Giraffe | Turtle> set), then it would be OK.  I was thinking of "weak duck typing", to put a made-up conceptual phrase on it.  I.e.:

    IEnumerable<Giraffe | Turtle> c = new C();

    Giraffe g = c.reproduce();  // Failure.  Could be a Turtle.

    Giraffe g2 = c.reproduce() as Giraffe;  // Success.  Turtle objects will be cast to null.

    Turtle t = c.reproduce(); // Failure.  Could be a Giraffe.

    Lion l = c.reproduce();  // Failure.  Not in the set of Turtle | Giraffe.

    <Giraffe | Turtle> gt = c.reproduce();  // Success.

    Animal a = c.reproduce();  // Success.

    You'd have to check later operations on that derived value to tell whether it was asked to perform any operation that is invalid on the combined set.  I.e.:

    gt.LayEggs();  // Failure.  Giraffe does not implement a "Lay Eggs" method.

    gt.RunAwayFromPredator(Predator p);  // Success.  Both Turtle and Giraffe implement the run away method.

    Another way to think of it would be automatic base-classing, where you'd get a duck-typed base class from the compiler that contained the set of operations common to both giraffes and turtles.

    And remember, the underlying object would still be either a giraffe or a turtle, with the associated memory layout.  All you're doing notationally in my proposal is allowing the compiler to accept either one where it makes sense.  The compiler would still have to keep the set of types around and resolve whether any operations on the "multi-typed" type are valid.

  • > OK, but how does the _compiler_ know that?

    Is there no way for the compiler to determine the exact type that something points at?

    If not, then it may need to be a run-time error.

    However, before that, in that case, it would fail when it looks for and cannot find IEnumerator<Animal> IEnumerable<Animal>.GetEnumerator() in animals (of type IEnumerable<Animal>, but really pointing to C). The fact that C can also provide an IEnumerator<Giraffe> and an IEnumerator<Turtle>, either of which could pass as an IEnumerator<Animal>, is immaterial if the compiler cannot determine that animals is really an instance of C. Again, this would be a compile-time error, not due to variance but because the compiler cannot determine that animals (IEnumerable<Animal>) is really a C instance.

  • It would be a terrible idea for C# to depend on the order of interfaces in the "inheritance list".  This would be like depending on the order attributes are applied.  I would be shocked if it weren't a rule for C# design to avoid making ordering in such cases matter.  There's just too much room for error.

    The type-casting logic needs to take into account the diamond problem and raise a compile-time error when possible, otherwise a run-time exception.  I see no reasonable way for the compiler to know how to choose IEnumerable<Turtle> over IEnumerable<Giraffe>.  If the author of the class wants to support IEnumerable<Animal>, he/she should implement it (I would say explicitly, but that has another meaning in C#).

    Now, would it be reasonable to produce a compilation warning about possible diamond hierarchies?

    Eric: your blog comments desperately need a way to render code in <code>/<pre> tags (for inline/multiline code).  :-p

  • The problem here is not in covariance

    With present compiler, this code produces this compile error:

    'Test.C' does not implement interface member 'System.Collections.IEnumerable.GetEnumerator()'

    If I have to implement GetEnumerator anyway, I would expect all three versions of GetEnumerator to enumerate the same values.

    So these values must be of both type Giraffe and Turtle:

           System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()

           {

               yield break ;

           }

    It makes no sense to implement this interface more than once.

    With foreach (var x in new C()), how would you infer the element type?  The 3.0 compiler produces:

    foreach statement cannot operate on variables of type 'Test.C' because it implements multiple instantiations of 'System.Collections.Generic.IEnumerable<T>'; try casting to a specific interface instantiation

    So I expect that it should not be allowed to implement a generic interface more than once.

    It would be a breaking change, but it would only impact very badly written programs.

  • Yes, the problem here IS in covariance. I picked IEnumerable<T> as just one example, but you are concentrating solely on the semantics of IEnumerable.  How about if I pick some completely different interface:

    interface IFoo<+T> { T Get { get; } }

    class Bar : IFoo<Giraffe>, IFoo<Turtle> {

     Giraffe IFoo<Giraffe>.Get { get { return new Giraffe(); } }

     Turtle IFoo<Turtle>.Get { get { return new Turtle(); } }

    }

    ...

    IFoo<Animal> f = new Bar();

    Console.WriteLine(f.Get().GetType().ToString());

    No IEnumerables at all. What should this do?

  • Why is implementing the same interface instantiated with two different sets of type arguments a bad programming practice?  If a Frob can be compared to a Frib or a Frub then why shouldn't class Frob : IComparable<Frib>, IComparable<Frub> be perfectly legal?
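
A minimal sketch of the Frob case, to make the point concrete. Everything beyond the three type names is assumed for illustration: the Size, Weight and Value properties are hypothetical, and null arguments are not handled. Frib and Frub are unrelated types here, so the two IComparable<T> instantiations cannot be converted into one another and the ambiguity discussed in the post does not arise.

```csharp
using System;

// Hypothetical types; the properties exist only to give the comparisons
// something to compare.
class Frib { public int Size { get; set; } }
class Frub { public int Weight { get; set; } }

class Frob : IComparable<Frib>, IComparable<Frub>
{
    public int Value { get; set; }

    // Compare a Frob to a Frib by Value vs. Size (null not handled).
    public int CompareTo(Frib other)
    {
        return Value.CompareTo(other.Size);
    }

    // Compare a Frob to a Frub by Value vs. Weight (null not handled).
    public int CompareTo(Frub other)
    {
        return Value.CompareTo(other.Weight);
    }
}

class Program
{
    static void Main()
    {
        Frob frob = new Frob { Value = 5 };

        // Overload resolution picks the implementation from the argument type.
        Console.WriteLine(frob.CompareTo(new Frib { Size = 3 }) > 0);
        Console.WriteLine(frob.CompareTo(new Frub { Weight = 8 }) < 0);
    }
}
```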

  • Speaking as a novice...

    The real question is about the nature of covariant interfaces: does the object bear the burden of providing the full spectrum, or are an object's interface definitions pulling double duty?  If the covariance is little more than shorthand for additional interface implementation requirements, then the class has to disambiguate IFoo<Animal> just to compile.  If the interface itself is making the guarantees, then Bar has two distinct IFoo<Animal> implementations, but neither should be directly accessible.

    In the former case, it's Bar's job to provide an unambiguous IFoo<Animal> implementation.  Unless the covariance guarantee is simply weak and translates to a runtime exception.

    In the latter case, it's the runtime's job to find the best match and detect ambiguities:

    IFoo<Animal> f = new Bar(); // runtime error

    IFoo<Mammal> m = new Bar();  // Get returns an Animal which is a Giraffe

    IFoo<Animal> g = new Bar() as IFoo<Giraffe>; // Get returns an Animal which is a Giraffe

    IFoo<Animal> t = new Bar() as IFoo<Turtle>;  // Get returns an Animal which is a Turtle

  • "IFoo<Animal> f = new Bar();

    Console.WriteLine(f.Get().GetType().ToString());

    No IEnumerables at all. What should this do?

    "

    The first line is statically ambiguous, with two legal-but-different conversions to IFoo<Animal> (you can go via IFoo<Giraffe> and via IFoo<Turtle>, with no way to choose).  Static ambiguity should (obviously) yield a compile-time error.  To resolve it you should have to choose a path, i.e. IFoo<Animal> f = (IFoo<Giraffe>)(new Bar()); to enumerate in the Giraffe style or IFoo<Animal> f = (IFoo<Turtle>)(new Bar()); to enumerate in the Turtle style.

    For the similar but different situation:

    "

    object o = new Bar();

    IFoo<Animal> f = (IFoo<Animal>)o;

    Console.WriteLine(f.Get().GetType().ToString());"

    Unfortunately we can't make the first line ambiguous (there's only one path back to object because there's no multiple inheritance; if we had multiply inherited ABCs (which would extend object) instead of interfaces, then the static ambiguity would return and be detectable).  So instead we make the second line ambiguous, at runtime.  It should throw an exception (AmbiguousCastException, say) that is extended from (but different to) InvalidCastException.

  • Eric,

    It appears that the very first line is itself inherently ambiguous. (That's your point - right?) IEnumerable<Giraffe> implies one implicit conversion to IEnumerable<Animal>, and IEnumerable<Turtle> implies another (very different) implicit conversion to IEnumerable<Animal>. Thus, the definition of class C itself should result in a compile-time error, unless the developer provides some form of disambiguation.

    So, why not simply require class C to also implement IEnumerable<Animal> directly?
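
What "implement IEnumerable<Animal> directly" might look like, as a sketch only. The choice to enumerate the giraffes followed by the turtles is an assumption made for illustration; the class author could pick any answer. The expectation, not verified here, is that a conversion to IEnumerable<Animal> then binds to this exact implementation rather than to one of the two variant ones.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type hierarchy.
class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

class C : IEnumerable<Giraffe>, IEnumerable<Turtle>, IEnumerable<Animal>
{
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
    {
        yield return new Giraffe();
    }

    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator()
    {
        yield return new Turtle();
    }

    // The author's explicit answer to "which animals?": assumed here to be
    // every giraffe followed by every turtle.
    IEnumerator<Animal> IEnumerable<Animal>.GetEnumerator()
    {
        foreach (Giraffe g in (IEnumerable<Giraffe>)this) yield return g;
        foreach (Turtle t in (IEnumerable<Turtle>)this) yield return t;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<Animal>)this).GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        IEnumerable<Animal> animals = new C();
        foreach (Animal a in animals)
            Console.WriteLine(a.GetType().Name);
    }
}
```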

  • I agree with the first answer by Stuart Ballard, and I would add that

    IEnumerable<Animal> a1 = (IEnumerable<Giraffe>) new C();

    IEnumerable<Animal> a2 = (IEnumerable<Turtle>) new C();

    Console.WriteLine(a1.First().GetType().ToString()); // Giraffe

    Console.WriteLine(a2.First().GetType().ToString()); // Turtle

    is workable, and appears to provide a sufficient means of disambiguation. Therefore, it shouldn't be mandatory that C implement IEnumerable<Animal>.

    It appears to me that making generic IEnumerable covariant is not really a breaking change because code that tried to convert C to IEnumerable<Animal> was already illegal. OTOH, once IEnumerable becomes covariant, adding a new implementation of IEnumerable<T> (for a specific T) to almost any class will typically break existing code that performs a conversion to IEnumerable<object>.

    I wonder how C# and the CLR would handle code like the following--now and after adding variance:

    class Foo<T> : IEnumerable<string>, IEnumerable<T> where ... { ... } // um, T could be set to string

    class Foo<S,T> : IEnumerable<S>, IEnumerable<T> where ... { ... } // can this even be legal?

  • My gut says that it is a compile time error, not (as some above seem to feel?) in converting C to IEnumerable<Animal>, but in defining C to implement two interfaces related by a covariance conversion.
