Odious ambiguous overloads, part one

As you might have gathered, a lot of the decisions we have to make day-to-day here involve potential breaking changes on odd edge cases. Here's another one.

Consider the following terrible but legal code:

public interface I1<U> {
    void M(U i);
    void M(int i);
}

My intense pain begins when the user writes:

public class C1 : I1<int> {

In the early version of the C# specification it was actually illegal to have an ambiguity like this, but the spec was changed so that when doing overload resolution on a call to M we can choose one, according to section 14.4.2.2:

If one [argument type] is non-generic, but the other is generic, then the non-generic is better.

But still, as you will soon see, we're in a world of hurt for another reason, namely that this class must now implement both M(int i) and, uh, M(int i). Fortunately, a nonexplicit implementation binds to both methods, so this works just fine:

public class C1 : I1<int> {
    public void M(int i) {
        Console.WriteLine("class " + i);
    }
}

The method implements both versions of M and the contract is satisfied.  But we have problems if we try to do an explicit interface implementation:

public class C2 : I1<int> {
    void I1<int>.M(int i) {
        Console.WriteLine("explicit " + i);
    }
}

Does this explicitly implement both members of I1?  Or just one?  If so, which one?

In the current compiler this code produces a terrible, terrible error:

error CS0535: 'C2' does not implement interface member 'I1<int>.M(int)'

Is that so?  It sure looks like it implements it!

What happens when we have both an explicit implementation and a class implementation?  The spec does not actually say what to do. It turns out that we end up in a situation where runtime behaviour depends on source code order of the interface! Check this out:

public interface I1<U> {
    void M(U i); // generic first
    void M(int i);
}

public interface I2<U> {
    void M(int i);
    void M(U i); // generic second
}

public class C3: I1<int>, I2<int> {
    void I1<int>.M(int i) {
        Console.WriteLine("c3 explicit I1 " + i);
    }
    void I2<int>.M(int i) {
        Console.WriteLine("c3 explicit I2 " + i);
    }
    public void M(int i) {
        Console.WriteLine("c3 class " + i);
    }
}

class Test {
    static void Main() {
        C3 c3 = new C3();
        I1<int> i1_c3 = c3;
        I2<int> i2_c3 = c3;
        i1_c3.M(101);
        i2_c3.M(102);
    }
}

What happens here is that the explicit interface implementation mappings in the class match the methods in the interfaces in a first-come-first-served manner:

void I1<int>.M(U) maps to explicit implementation void I1<int>.M(int i)
void I1<int>.M(int) maps to implicit implementation public void M(int i)
void I2<int>.M(int) maps to explicit implementation void I2<int>.M(int i)
void I2<int>.M(U) maps to implicit implementation public void M(int i)

Then (because of the aforementioned section 14.4.2.2) when we see

        i1_c3.M(101);
        i2_c3.M(102);

we prefer the typed-as-int versions to the generic substitution versions, so this program calls the two non-generic versions and produces the output:

c3 class 101
c3 explicit I2 102
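The slot-to-method mapping listed above can also be inspected at run time with reflection. The following is a minimal sketch, not part of the original program: the MappingDump class and its Dump helper are hypothetical names, and it assumes your compiler accepts C3 at all (it may warn about the ambiguous explicit implementations). No expected output is shown, because the mapping is exactly the thing that depends on the compiler.

```csharp
using System;
using System.Linq;
using System.Reflection;

public interface I1<U> {
    void M(U i); // generic first
    void M(int i);
}

public interface I2<U> {
    void M(int i);
    void M(U i); // generic second
}

public class C3 : I1<int>, I2<int> {
    void I1<int>.M(int i) { Console.WriteLine("c3 explicit I1 " + i); }
    void I2<int>.M(int i) { Console.WriteLine("c3 explicit I2 " + i); }
    public void M(int i) { Console.WriteLine("c3 class " + i); }
}

public static class MappingDump {
    // Hypothetical helper: prints which C3 method backs each interface slot.
    public static void Dump(Type constructed) {
        InterfaceMapping map = typeof(C3).GetInterfaceMap(constructed);
        MethodInfo[] open = constructed.GetGenericTypeDefinition().GetMethods();
        for (int k = 0; k < map.InterfaceMethods.Length; k++) {
            MethodInfo slot = map.InterfaceMethods[k];
            // Assumption: a method of I1<int> reports the same metadata token
            // as its declaration in I1<U>, which tells M(U) apart from M(int).
            MethodInfo declared = open.First(m => m.MetadataToken == slot.MetadataToken);
            Console.WriteLine(constructed.Name + ": " + declared + " -> " + map.TargetMethods[k]);
        }
    }

    public static void Main() {
        Dump(typeof(I1<int>));
        Dump(typeof(I2<int>));
    }
}
```

If the first-come-first-served description holds for your compiler, the explicit implementation should show up against whichever member is declared first in each interface.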

And as you'd expect, if we force the compiler to pick the generic versions then we get similar behaviour:

static void Main() {
    C3 c3 = new C3();
    Thunk1<int>(c3, 103);
    Thunk2<int>(c3, 104);
}
static void Thunk1<U>(I1<U> i1, U u) {
    i1.M(u);
}
static void Thunk2<U>(I2<U> i2, U u) {
    i2.M(u);
}

The binding of the overload resolution in the thunk bodies happens before the substitution of the type parameters, so these always bind to the generic versions of the methods. As you would expect from the mappings above, this outputs

c3 explicit I1 103
c3 class 104

Again, this shows that source code order has an unfortunate semantic import.
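The fact that the thunks bind before type arguments are substituted can be seen in isolation, without any interfaces involved. This is a small sketch; Box<U> and BindingDemo are hypothetical names that only reuse the overload shape of I1<U>.

```csharp
using System;

// Hypothetical class (not from the post) with the same overload shape as I1<U>.
public class Box<U> {
    public string M(U u) { return "M(U)"; }
    public string M(int i) { return "M(int)"; }
}

public static class BindingDemo {
    // U is still an open type parameter when this call is bound, so M(int)
    // is not applicable (no conversion from U to int) and M(U) is chosen.
    public static string Thunk<U>(Box<U> box, U u) {
        return box.M(u);
    }

    public static void Main() {
        Box<int> box = new Box<int>();
        // Both overloads accept an int here; the tie-break prefers the one
        // whose declared parameter type is not generic.
        Console.WriteLine(box.M(1));      // prints "M(int)"
        Console.WriteLine(Thunk(box, 1)); // prints "M(U)"
    }
}
```

The same receiver and the same argument value reach different overloads depending only on where the call was bound, which is why the thunks reach the other half of the mapping.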

Given this unfortunate situation -- no spec guidance and an existing implementation that behaves strangely -- what would you do? (Of course "do nothing" is an option.) I'm interested to hear your ideas, and I'll describe what we actually did next time.

  • "Does this explicitly implement both members of I1?  Or just one?  If so, which one?"

    The explicit int one.  Because it explicitly says int.

    Invent a syntax such as
    public class C2 : I1<int> {
       void I1<int>.M(int i) {
           Console.WriteLine("explicit " + i);
       }
       void I1<int>.M(U i) {
           Console.WriteLine("generic " + i);
       }
    }
    for the generic one.
  • @DrPizza,
    how do you bind U to the generic parameter?

    public class C2 : I1<int> {
      void I1.M(int i) {
          Console.WriteLine("explicit " + i);
      }
      void I1.M(int i) {
          Console.WriteLine("generic " + i);
      }
    }

    Special-casing this particular instance? So if you have this scenario, the non-generic one doesn't get to specify generic parameters (I don't think this would be a problem when implementing several generic types of the same interface, I1<int>, I1<string>).
    The generic version forces it. It can be done with a clear error message.

    Another option is a compiler attribute, which is ugly as well.
  • Why wouldn’t the compiler just collapse the interface down and remove the duplicate for the case of U = int?

    public interface I1<int>
    {
         void M(int i);
    }

    public interface I1<string>
    {
         void M(int i);
         void M(string s);
    }

    There is no implementation allowed in the interface, so I do not see why doing something like this would cause a problem.
  • I would prefer to see templates as parametric code specifications, and let the logical consequences of code expansion follow.  

    In this example the code expansion I1<int> leads to non-compilable code and should throw an error.  If we let all template expansion errors have an “Upon expansion of <classname>: ” prefix, then the error thrown should look something like:

    ERROR (line 11): Upon expansion of I1<int>: M(U i) has same signature as M(int i)

    It would even be nicer to have the template expanded with the errors indicated!

  • IIRC, one of the design goals of C# was to produce a clean language without nasty gotchas.  So doing nothing is not an option; the code order dependency is very nasty.

    Also, there's no obvious reason why implementing the interface with "public void M(int i)" should implement different interface methods compared with using "void I1<int>.M(int i)".

    So for the sake of consistency and comprehension I'd make the behaviour of "void I1<int>.M(int i)" the same as "public void M(int i)".  i.e. They both implement both interface methods.  Of course, this would break any existing code that uses this "feature", but it's a timebomb anyway, so it's probably better that the developers find out.
  • In the case of an ambiguity, I would have the compiler choose a random outcome. That way somebody can still write code which doesn't care, but nobody would be able to rely on a specific thing happening.

  • There were a number of ideas in the comments for what we should do about the unfortunate situation...

  • First, regarding the interface declaration itself. If someone writes an interface such as I1:

    interface I1<U>
    {
       void M(int i);
       void M(U i);
    }

    There should be a compiler warning of possible ambiguous use of the interface, since using I1<int> is valid. At least then, the designer of the interface and its consumers would recognize that there is a potential problem, which is often quite difficult to realize until the unfortunate time when an ambiguous usage comes into play. It would then be possible to change the interface up front (before any usage), to something like:

    interface I1<U>
    {
       void N(int i);
       void M(U i);
    }

    Ambiguity isn't possible anymore, assuming an overload of the M method was not absolutely necessary (and I would guess this is always an option; overloads are never "necessary", only convenient).

    Second, if an attempt were to be made to resolve the ambiguity of I1<int>'s M method when implemented, I think the only way would be to use some kind of explicit notation. I am not a fan of using something like this (mentioned once before), as it moves the ambiguity from the compiler to the code itself, making it less understandable:

    public class C: I1<int>
    {
       void I1<int>.M(int i) { /* ... */ }
       void I1<int>.M(U i) { /* ... */ }
    }

    Perhaps a better solution would be something like this:

    public class C: I1<int>
    {
       void I1<int>.M(int i) { /* ... */ }
       void I1<int>.M(<int> i) { /* ... */ }
    }

    With such a notation, your code is understandable (you know which method is the generic method and which one is the explicit, without question). The implementation ambiguity would then be gone, and the compiler could compile this code without issue. The question then becomes, how do you use class C? When you call the implementation of I1.M, how do you call it? If you don't extend the explicit requirement beyond just implementation to consumption, you again have an ambiguity problem:

    ((I1)myCInstance).M(10); // Ambiguous

    You could do something like the following:

    ((I1)myCInstance).M(10);

    ((I1)myCInstance).M(<int>10);

    You could call it a generic call cast or something, casting the function call rather than a type... but it's kind of an odd notation. Maybe something simpler:

    ((I1)myCInstance).M(<>10);

    But that looks even odder...
