Color Color

Pop quiz: What does the following code do when compiled and run?

class C
{
    public static void M(string x)
    {
        System.Console.WriteLine("static M(string)");
    }
    public void M(object s)
    {
        System.Console.WriteLine("M(object)");
    }
}
class Program
{
    static void Main()
    {
        C c = new C();
        c.M("hello");
    }
}

(1) writes static M(string)
(2) writes M(object)
(3) uh, dude, this code doesn’t even compile much less run
(4) something else

Think about that for a bit and then try it and see if you were right.

.

.

.

.

.

.

.

In option (1), the compiler could decide that the best match is the static one and let you call it through the instance, ignoring the instance value. In option (2), the compiler could decide that the object version is the best match and ignore the static one, even though the argument match is better. But neither of those actually happens; the rules of C# say that the compiler should pick the best match based on the arguments, and then disallow static calls that are through an instance! The actual result here is therefore (3):

error CS0176: Member 'C.M(string)' cannot be accessed with an instance reference; qualify it with a type name instead
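A minimal sketch of two ways to make the quiz program compile, assuming you want to be able to reach both overloads: name the type to get the static method, or cast the argument to object so that only the instance method is applicable. To keep the result easy to check, the methods here return their message instead of printing it, and the QuizFix class is a made-up name for this illustration.

```csharp
using System;

class C
{
    // Same overloads as in the quiz, but returning the message instead of printing it.
    public static string M(string x) { return "static M(string)"; }
    public string M(object s) { return "M(object)"; }
}

static class QuizFix // hypothetical helper, not part of the original example
{
    public static string CallStatic()
    {
        // The receiver is the type name, so choosing the static method is legal.
        return C.M("hello");
    }

    public static string CallInstance()
    {
        C c = new C();
        // With an argument of type object, M(string) is not applicable,
        // so overload resolution chooses the instance method.
        return c.M((object)"hello");
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(QuizFix.CallStatic());   // static M(string)
        Console.WriteLine(QuizFix.CallInstance()); // M(object)
    }
}
```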

What is up with this craziness? If you believe that the rule “static methods should never be accessed through instances” is a good rule – and it seems reasonable – then why doesn’t overload resolution remove the static methods from the candidate set when called through an instance? Why does it even allow the “string” version to be an applicable candidate in the first place, if the compiler knows that it can never possibly work?

I agree that this seems really goofy at first.

To explain why this is not quite as dumb as it seems, consider the following variation on the problem. Class C stays the same.

class B
{
  public C C = new C();
  public void N()
  {
      C.M("hello");
  }
}

What does a call to N on an instance of B do?

(1) writes static M(string)
(2) writes M(object)
(3) compilation error
(4) something else

.

.

.

.

.

.

A bit trickier now, isn’t it? Does C.M mean “call instance method M on the instance stored in this.C?” or does it mean “call static method M on type C”? Both are applicable!

Because of our goofy rule we do the right thing in this case. We first resolve the call based solely on the arguments and determine that the static method is the best match. Then we check to see if the “receiver” at the call site can be interpreted as being through the type. It can. So we make the static call and write “static M(string)”. If the instance version had been the best match then we would successfully call the instance method through the field.

So the reason that the compiler does not remove static methods when calling through an instance is that the compiler does not necessarily know that you are calling through an instance. Because there are situations where it is ambiguous whether you’re calling through an instance or a type, we defer deciding which you meant until the best method has been selected.
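Here is a small sketch of both outcomes in the B example. It assumes the same overloads on C, changed to return their message rather than print it; the method names CallWithString and CallWithObject are invented for the illustration.

```csharp
using System;

class C
{
    public static string M(string x) { return "static M(string)"; }
    public string M(object s) { return "M(object)"; }
}

class B
{
    // A field named C of type C: the simple name C can mean either one.
    public C C = new C();

    public string CallWithString()
    {
        // Best match on arguments is the static M(string).
        // C can be read as the type, so the call is allowed.
        return C.M("hello");
    }

    public string CallWithObject()
    {
        // Best match on arguments is the instance M(object).
        // C can be read as the field this.C, so the call is allowed.
        return C.M((object)"hello");
    }
}

class Program
{
    static void Main()
    {
        B b = new B();
        Console.WriteLine(b.CallWithString()); // static M(string)
        Console.WriteLine(b.CallWithObject()); // M(object)
    }
}
```

In both calls the receiver is written identically; only the argument type decides whether C ends up meaning the type or the field.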

Now, one could make the argument that in our first case, the receiver cannot possibly be a type. We could have further complicated the language semantics by having three overload resolution rules, one for when the receiver is known to be a type, one for when the receiver is known to not be a type, and one for when the receiver might or might not be a type. But rather than complicate the language further, we made the pragmatic choice of simply deferring the staticness check until after the bestness check. That’s not perfect but it is good enough for most cases and it solves the common problem.

The common problem is the situation that arises when you have a property of a type with the same name, and then try to call either a static method on the type, or an instance method on the property. (This can also happen with locals and fields, but local and field names typically differ in case from the names of their types.) The canonical motivating example is a public property named Color of type Color, so we call this “the Color Color problem”.
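As a sketch of that typical scenario, here is a property named Color of type Color. The Color struct below is a stand-in written for this example (it is not System.Drawing.Color), and FromRgb, Brightness and Shape are likewise hypothetical names.

```csharp
using System;

struct Color // stand-in type for illustration only
{
    public readonly int R, G, B;

    public Color(int r, int g, int b) { R = r; G = g; B = b; }

    // Static factory: reached through the type name.
    public static Color FromRgb(int r, int g, int b) { return new Color(r, g, b); }

    // Instance method: reached through a value of the type.
    public int Brightness() { return (R + G + B) / 3; }
}

class Shape
{
    public Color Color { get; set; }

    public Shape()
    {
        // On the right-hand side, Color.FromRgb resolves to a static method,
        // so Color is treated as the type.
        Color = Color.FromRgb(255, 0, 0);
    }

    public int Brightness()
    {
        // Color.Brightness resolves to an instance method,
        // so Color is treated as the property this.Color.
        return Color.Brightness();
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(new Shape().Brightness()); // 85
    }
}
```

Because the static and instance methods have different names, there is never any doubt about which one is meant, which is why the deferred staticness check works well in practice.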

In real situations the instance and static methods typically have different names, so in the typical scenario you never end up calling an instance method when you expect to call a static method, or vice versa. The case I presented today with a name collision between a static and an instance method is relatively rare.

If you’re interested in reading the exact rules for how the compiler deals with the Color Color problem, see section 7.5.4.1 of the C# 3.0 specification.

  • I remember being quite surprised when I first found out that "Color Color" is a grammar corner case that is specifically covered by the spec in quite a lot of detail and extra special wording. Of course, in retrospect, it makes perfect sense, and I appreciate the fact that it's there (it does make life easier when coming up with fitting member names).

    It would be interesting to know the design history behind this. Is it something that was considered from the very beginning of C#? Or rather something that only came up when frameworks (I would guess WinForms) ran into the problem?

    This issue first appears in the design notes on June 22nd, 1999. Of course, by that point, people had already started writing the FCL in pre-beta versions of the language, so it is entirely possible that this was driven by someone running into the problem in the framework. The notes do not say. -- Eric

  • Eric,

    Can you explain why in the following code:

    static void M(object[] array) {...}
    static void M(object obj) { ... }
    ...
    M(null);

    M(object[] array) is called instead of M(object obj)?

    Well, suppose it had been

    static void M(Animal x ) {...}
    static void M(object x) { ... }
    ...
    M(new Giraffe());

    Which would you expect to be called? When given the choice between "Animal" and "object", clearly "Animal" is more specific. We assume you want to call the more specific match when there are two possible matches. All Animals are objects, but not all objects are Animals, so Animal is more specific.

    In your example, null matches both. All arrays are objects but not all objects are arrays, therefore the array one is more specific. The more specific one wins. -- Eric

  • I've gotta confess that I got this wrong (thought option 2 was correct).  I made the erroneous inference from disallowed behavior (can't access a static member through an instance reference).  Thinking like a human so often leads one (me?) astray!

    Thanks for illuminating the distinction and giving us a little insight into the order in which these operations (method resolution vs static access) are examined.

  • Unfortunately this also means that in order to call the instance method (in either example), the argument has to be cast to object. Upcasting always leaves a bad taste in my mouth; I would have preferred a solution that required the developer to explicitly remove the ambiguity in either the static case or the instance case by explicitly qualifying the receiver (i.e. global::C or this.C), rather than implicitly selecting an overload by upcasting an argument; the resulting code would be cleaner IMHO.

  • On second thought, there is nothing stopping you from explicitly qualifying the receiver; for some reason the first "solution" I thought of was upcasting, despite that being my least preferred solution.

  • commongenius: And what about locals?

    void N() {
       C C = new C();
       C.M("?");
    }

    How would you explicitly qualify the receiver?

  • Will the same resolution rules be retained when using dynamic types in C# 4.0?

    In other words:

      dynamic C = new C();

      C.M(...);  // what gets called here and why?

    I've often wondered whether typing something as dynamic will cause overload resolution to behave differently than it would if the type were known at compile time. There's already one potential case you can read about here ... http://stackoverflow.com/questions/987176/overload-resolution-in-c-4-0-using-dynamic-types

  • Having static and instance methods with the same name doesn't seem terribly useful. It's not obvious what the compiler will do with them, and even when it's explained, people don't necessarily agree that the compiler is doing the right thing. If you were starting over, would you retain this feature?

  • That's why it's better if you live in Australia. We have:

    class MyClass
    {
      public Color Colour { get; set; }
    }

    and all is well :-)

  • @Dean: As a New Zealander, I believe I can safely say AAARRRGH! :)

  • Interesting, and nice to see the thought that went into allowing Color Color to work - although, as a side note, that particular class name doesn't cause me problems as being British I always write Color Colour!! :)

    When starting out with C#, I got caught out with a different problem that's kind of related (from the point of view of thinking of names for things): you might call it "Color.Color" ... in other words, class C in namespace C.  That seems to work less well.

    Indeed, naming a class and namespace the same thing is a bad idea that violates our naming guidelines. I have been meaning to blog about that for a while now. -- Eric

  • Carlos: I assume this rule was not put in the C# spec to encourage people to write eponymous static & instance methods (which is indeed terrible design), but rather to allow for inheritance across assemblies that are maintained by different companies, and where such name clashes might occur accidentally.

    Say you get assembly A version 1 with class C that has no static methods, and you derive your own class from C in your own assembly that adds instance method Foo.  Now you get assembly A version 2 that adds a static method C.Foo.  Now what should the C# compiler do -- break compilation against the new version?

  • @Luke:

    It was pointed out to me by a colleague that namespaces which are named after the sorts of things they contain could often be better named with a plural. E.g.:

    namespace Animals {
        class Animal {}
        class Giraffe : Animal {}
    }

    This is mentioned here in the Microsoft Design Guidelines for Developing Class Libraries:

    http://msdn.microsoft.com/en-us/library/ms229026.aspx

    "Consider using plural namespace names where appropriate. For example, use System.Collections instead of System.Collection."

    As an example, there's System.Net.Sockets.Socket.

  • I think "Color Color" should have been disallowed in the runtime; types should have been named differently than members in the guidelines.

    Of course, much of my experience is writing tools for C#, and this would have made my life easier.  Perhaps allowing this really is better for users of the language, but I am doubtful.

    I think we'd be better off with a 'C' prefix on classes (and an 'S' prefix on structs, protecting us from a set of bugs when we handle structs in a way that only works for reference types).  You may cry "Hungarian", but I point out that we already name interfaces with an 'I' prefix, so this change would improve consistency.

  • The compiler error should be on the second declaration of method C.M with an error "Cannot define second function with the same name as a static function."

    The error on the call to the function is misdirection and should be avoided.

    Improving the .NET compiler to produce much better diagnosis of these conditions would be great.  Putting significant effort into improved static analysis of C# code would greatly help (e.g., much improved FxCop).   Tools for embedded C++/C have been around doing this in great detail for 20+ years.
