Covariance and Contravariance in C#, Part Ten: Dealing With Ambiguity

OK, I wasn’t quite done. One more variance post!

Smart people: suppose we made IEnumerable<T> covariant in T. What should this program fragment do?

class C : IEnumerable<Giraffe>, IEnumerable<Turtle> {
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator() {
        yield return new Giraffe();
    }
    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator() {
        yield return new Turtle();
    }
// [etc.]
}
 
class Program {
    static void Main()  {
        IEnumerable<Animal> animals = new C();
        Console.WriteLine(animals.First().GetType().ToString());
    }
}

Options:

1) Compile-time error.
2) Run-time error.
3) Always enumerate Giraffes.
4) Always enumerate Turtles.
5) Choose Giraffes vs Turtles at runtime.
6) Other, please specify.

And if you like any options other than (1), should this be a compile-time warning?

  • Another vote for compile-time error. It's pretty rare to see so much agreement on a language topic - I think the consensus is clear on this one :)

    Jon

  • It depends on your First method, which you omitted.

    I am away from the VS2008 beta at the moment.

    Is it an extension method on IEnumerable<T>?

    Is it supposed to return the first _implemented_ type? That seems a little ridiculous, but that is up to the definition of the method.

    Besides the assignability of C to IEnumerable<Animal> I do not think that this example has any relevance on variance. The fact that C simultaneously implements IEnumerable<Giraffe> and IEnumerable<Turtle> seems to be a poor design, but from the language point of view it is certainly legal regardless of variance. There may even be appropriate uses for it (I do not know).

    Thus, I will have to choose "6) Other, please specify": clean compile, no error, warning, or run-time exception. Depending on the definition of First, either Giraffe or Turtle would be acceptable. If you can explain the definition of First, I will try to give a better answer.

  • First extends IEnumerable<T> and returns a T - the first one in the sequence.

    So, which would it return? A Turtle or a Giraffe? I can't see how it could do anything other than be a compilation error. Picking one sequence over the other arbitrarily would be disastrous, IMO.

    Jon

  • What Peter Ritchie said. You're providing two different IEnumerable<Animal> implementations, so the user has to specify which one they want (by casting the C to IEnumerable<T> for T in {Giraffe, Turtle}).
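The explicit-cast approach described in this comment can be sketched as follows. This is a minimal illustration, not code from the post: the empty Animal, Giraffe, and Turtle classes and the non-generic IEnumerable implementation (which arbitrarily delegates to the Giraffe sequence) are assumptions filled in so the fragment compiles. Because each variable names an interface that C implements exactly, no variance conversion is involved and the choice of GetEnumerator is unambiguous.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal hierarchy; the post does not show these classes.
class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

class C : IEnumerable<Giraffe>, IEnumerable<Turtle>
{
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
    {
        yield return new Giraffe();
    }

    IEnumerator<Turtle> IEnumerable<Turtle>.GetEnumerator()
    {
        yield return new Turtle();
    }

    // Assumption: the non-generic enumerator delegates to the Giraffe sequence.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<Giraffe>)this).GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        C c = new C();

        // The caller states which sequence is wanted by picking the interface.
        IEnumerable<Giraffe> giraffes = c;
        IEnumerable<Turtle> turtles = c;

        Console.WriteLine(giraffes.First().GetType()); // prints "Giraffe"
        Console.WriteLine(turtles.First().GetType());  // prints "Turtle"
    }
}
```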

  • I wonder if it would be possible to return a "multi-type" variable? That is, one that is either a Giraffe or a Turtle, but never a "Lion" or an "Alligator".

    To me, at least, this code is trying to tell you that the container can hold only Giraffes and Turtles and nothing else from the set of Animals (perhaps because storing Giraffes and Lions in the same container leads to bad results).  Enumerating through the container should get a single aggregate set of both Turtles and Giraffes, depending upon what was stored in there.  Trying to store a Lion in that code should fail at compile time but adding either Turtles or Giraffes should be just fine.

    In that conceptual model, the "First().GetType()" call should return a hybrid type of "Giraffe" | "Turtle" in the generic sense, but a specific "Giraffe" or "Turtle" in the specific case depending upon the type stored in the first element.

    Is there a better way to express this programmatic desire?  Currently, the only way to put N siblings together is to implement a separate container for each, to use the base class as the element type, or to implement some middle "GiraffeAndTurtle" class which inherits from "Animal" and from which both Turtle and Giraffe are derived.  It would be nice to have something clearer.

    Maybe something notationally like:

    Class C: IEnumerable<Turtle | Giraffe> {

    IEnumerator<Giraffe | Turtle> IEnumerable<Giraffe | Turtle>.GetEnumerator() {

          if (current_obj is Giraffe)

           yield return new Giraffe(current_obj);

          if (current_obj is Turtle)

            yield return new Turtle(current_obj);

          else throw new IncompatibleAnimalException();

       }

    }

    class Program {

       static void Main()  {

           #success

           IEnumerable<Animal+> animals = new C();

           #All the "giraffes"' stored in the set

           IEnumerable<Giraffe> giraffes = new C();

           #All the "Turtles" stored in the set

           IEnumerable<Turtle> turtles = new C();

           # Breaks at compile time

           IEnumerable<Lion> lions = new C();

           Console.WriteLine(animals.First().GetType().ToString());

       }

    }

  • James: if there were a "multi-type" return, what would it mean to call its "Reproduce" method, considering that the Giraffe is a mammal that births its young and the Turtle is a reptile that lays eggs?  At some point you need to tell the compiler which of the two types you want to deal with.

  • I also do not understand First.

    I assume it is a (pseudo) extension method, but exactly what is it supposed to do?

    I briefly looked for some documentation but to no avail.

    Jon Skeet wrote:

    > First extends IEnumerable<T> and returns a T - the first one in the sequence.

    I still do not understand. How would First, a method, extend the type IEnumerable<T>? And if First is a type, then how is it going to return a T? In particular, please expand on "it returns a T - the first one in the sequence". What sequence? The only sequence here is the order of implemented interfaces. Are you suggesting First uses reflection to look at the order of implemented interfaces and returns the first one? That seems rather foolish, but it is certainly quite possible and should not be a compiler error. Or would it call GetEnumerator() and then get the first element? In which case, animals would need to be upcast to the desired interface to bind to the appropriate GetEnumerator(). This has nothing to do with variance; the current rules apply.

    Like Ben, without more information, it is difficult to judge. However, depending on the definition of First, either Giraffe or Turtle would be OK. I would not expect a warning, error, or exception.

  • First is an extension method which takes an IEnumerable<T>, calls GetEnumerator, and returns the first T in the sequence.  If there is no first T then it throws an exception.
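That description of First can be sketched roughly as below. This is an illustrative stand-in, not the library's actual implementation: the names SequenceExtensions and FirstSketch are hypothetical, and the exception type and message are assumptions chosen to mirror what Enumerable.First throws for an empty sequence.

```csharp
using System;
using System.Collections.Generic;

static class SequenceExtensions
{
    // Hypothetical stand-in for First: binds to IEnumerable<T>.GetEnumerator
    // and returns the first element the enumerator produces.
    public static T FirstSketch<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (e.MoveNext()) return e.Current;
        }

        // No first element: the sequence was empty.
        throw new InvalidOperationException("Sequence contains no elements");
    }
}

class Program
{
    static void Main()
    {
        int[] numbers = { 3, 1, 2 };
        Console.WriteLine(numbers.FirstSketch()); // prints 3
    }
}
```

The point relevant to the thread is that the method only ever sees an IEnumerable<T> for one particular T, so whatever GetEnumerator that interface reference binds to determines the result.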

  • The "First" is irrelevant. If it helps, suppose I had written

    foreach(Animal animal in animals) Console.WriteLine(animal.GetType().ToString());

    Should this code compile? Should it give a warning?  What output should it produce?

    Now suppose instead of

           IEnumerable<Animal> animals = new C();

    I'd said

           object x = new C();

           IEnumerable<Animal> animals = (IEnumerable<Animal>)x;

    foreach(Animal animal in animals) Console.WriteLine(animal.GetType().ToString());

    Should this code compile? Should it give a warning?  What output should it produce?

  • Considering Eric's clarification and new example, error for both cases. More specifically, CS1640 seems appropriate. Why? Because it is the same issue as in existing implementations and has nothing to do with variance. Don't overthink it. The only variance issue here is the acceptability of assigning C to IEnumerable<Animal>.

  • How would the compiler know to give an error for the second case? The conversion from C to object is unambiguous, as is the conversion from object to IE<Animal>.

  • Eric,

    1: object x = new C();

    2: IEnumerable<Animal> animals = (IEnumerable<Animal>)x;

    3: foreach(Animal animal in animals) Console.WriteLine(animal.GetType().ToString());

    Lines 1 and 2 are not an error.

    However, upon reaching line 3 (foreach), animals refers to a C instance, which implements both IEnumerable<Giraffe> and IEnumerable<Turtle>, that is, the same generic interface multiple times, and should thus produce CS1640.

  • Ben, because you're converting to the IEnumerable<Animal> type first, the compiler will know which GetEnumerator to use.  The same happens in C# now:

    public class C : System.Collections.IEnumerable, IEnumerable<int>, IEnumerable<string>

    {

       IEnumerator<int> IEnumerable<int>.GetEnumerator()

       {

           yield break;

       }

       IEnumerator<string> IEnumerable<string>.GetEnumerator()

       {

           yield break;

       }

       System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()

       {

           return (System.Collections.IEnumerator)((IEnumerable<string>)this).GetEnumerator();

       }

    }

    public class Test

    {

       public static int Main()

       {

           IEnumerable<int> e = new C();

           foreach (int i in e){}    // no error CS1640

           return 1;

       }

    }

  • Peter Ritchie,

    The difference though is that in your example, C does in fact implement IEnumerable<int>. Thus, it knows exactly which one to bind to. Changing e to C e = new C() will of course produce CS1640.

    > because you're converting to the IEnumerable<Animal> type first, the compiler will know which

    > GetEnumerator to use.

    It does? And which one would that be? There is no IEnumerable<Animal>.GetEnumerator() that returns an IEnumerator<Animal>. It can only return an IEnumerator<Giraffe> or IEnumerator<Turtle>, both of which may be treated as an IEnumerator<Animal>, and thus the ambiguity: CS1640.

  • Ben, the example code was pulled from the documentation for CS1640 and slightly modified.  This is the original code:

    public class C : IEnumerable, IEnumerable<int>, IEnumerable<string>

    {

       IEnumerator<int> IEnumerable<int>.GetEnumerator()

       {

           yield break;

       }

       IEnumerator<string> IEnumerable<string>.GetEnumerator()

       {

           yield break;

       }

       IEnumerator IEnumerable.GetEnumerator()

       {

           return (IEnumerator)((IEnumerable<string>)this).GetEnumerator();

       }

    }

    public class Test

    {

       public static int Main()

       {

           foreach (int i in new C()){}    // CS1640

           // Try specifying the type of IEnumerable<T>

           // foreach (int i in (IEnumerable<int>)new C()){}

           return 1;

       }

    }

    In Eric's example, if he first converts to an IEnumerable<Animal> object, that object must have an IEnumerable<Animal> GetEnumerator() method, so foreach knows exactly which GetEnumerator() to use, just as it knows in the IEnumerable<int>/IEnumerable<string> example because I forced it to use the IEnumerable<int> interface.  If you don't tell it what type of IEnumerable<T> to use, yes, the compiler thankfully won't make the decision for you and spits out CS1640; but that's not the case that Eric presented.
