Method Hiding Apologia

Here's some back-and-forth from an email conversation I had with a user a while back.

Why should one avoid method hiding? 

If there were no advantages and only disadvantages then we would not have added it to the language in the first place. C# implements hiding because hiding is frequently useful. 
 
I therefore deny the premise of the question. One should not avoid method hiding if it is the right thing to do.

But I find method hiding confusing. It lets derived types appear to break the contracts of base types. If a derived type D hides a method M on base class B because D.M does something different than B.M, shouldn't it have a different name?

That sounds remarkably like a good answer to your original question.

However, method hiding is for exactly those times when you need two things to have the same name but different behaviour.
 
Obviously that is not a pleasant situation to be in. I agree with you that it is almost always preferable to have different things have different names. However, I can think of a few situations in which it's desirable. Consider this real-world example:
 
interface IEnumerable<T> : IEnumerable {
  new IEnumerator<T> GetEnumerator();
}
 
Is there a better name than GetEnumerator? That's what it does: it gets an enumerator. And in this case, you want to suppress the usage of the base type's non-generic GetEnumerator as much as possible; this version is intended to fully replace the non-generic version.
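To make that concrete, here is a minimal sketch of what the hiding looks like from the point of view of a class that implements the interface. The Countdown type and the variable names are hypothetical and exist only for illustration; the usual pattern is to expose the generic method publicly and implement the hidden non-generic one explicitly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type, used only for illustration.
class Countdown : IEnumerable<int>
{
    private readonly int start;
    public Countdown(int start) { this.start = start; }

    // The generic version is the one callers normally see.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i > 0; i--)
            yield return i;
    }

    // The hidden non-generic version is implemented explicitly
    // and simply forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

static class Program
{
    static void Main()
    {
        IEnumerable<int> generic = new Countdown(3);
        IEnumerable legacy = generic;

        // Binds to IEnumerable<int>.GetEnumerator, which hides the base method.
        IEnumerator<int> typed = generic.GetEnumerator();

        // Binds to the non-generic IEnumerable.GetEnumerator.
        IEnumerator untyped = legacy.GetEnumerator();

        typed.MoveNext();
        untyped.MoveNext();
        Console.WriteLine(typed.Current);         // an int with value 3
        Console.WriteLine((int)untyped.Current);  // an object, unboxed to 3
    }
}
```

Which method you get depends only on the compile-time type of the reference, which is exactly what hiding means.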

Of course, if C# had method return type covariance, this would not have to be a new method. That gives us a larger good reason for method hiding; it allows for something like return type covariance in a language that does not have such a feature.
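Here is a rough sketch of that covariance-like pattern applied to classes. Shape, Circle, Clone and CloneCore are hypothetical names of my own choosing, not anything from the framework; the idea is that a hiding method offers the more specific return type while a virtual helper does the real work.

```csharp
using System;

// Hypothetical types, for illustration only.
class Shape
{
    public Shape Clone() { return CloneCore(); }

    // The virtual helper does the actual copying.
    protected virtual Shape CloneCore() { return new Shape(); }
}

class Circle : Shape
{
    public double Radius { get; set; }

    // Hides Shape.Clone so that callers holding a Circle get a Circle back.
    public new Circle Clone() { return (Circle)CloneCore(); }

    protected override Shape CloneCore()
    {
        return new Circle { Radius = this.Radius };
    }
}

static class Program
{
    static void Main()
    {
        Circle circle = new Circle { Radius = 2.0 };
        Circle copy = circle.Clone();          // no cast needed at the call site

        Shape shape = circle;
        Shape viaBase = shape.Clone();         // base signature, still a Circle at runtime

        Console.WriteLine(copy.Radius == 2.0); // True
        Console.WriteLine(viaBase is Circle);  // True
    }
}
```

Because both Clone methods route through the same virtual helper, the hidden and hiding methods cannot drift apart in behaviour; only the static return type differs.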

I suppose that makes sense. Could you give some other examples of valid usage?

Sure, but I will have to digress in a prolix manner first.

We want to write computer programs which solve real-world problems, and therefore we want to design languages which enable developers to model their real-world problems in natural, flexible and intuitive ways. We want to take problem solving techniques which work well in the non-computer realm and enable developers to use the real-world intuitions and skills they’ve learned in their programs.

One of the techniques that we developed millennia ago to solve problems is organization of objects into hierarchies. Hierarchies are often imperfect – there are the occasional platypuses which crop up and resist easy classification. But hierarchies are such a powerful and useful tool that we use them, despite their flaws, to help organize the world’s data and solve real problems.

Thus, it should be clear that class-based inheritance exists in programming languages because we wish to enable developers to naturally and easily write programs which model problems solved by hierarchies.

So here’s the rub: in a world where objects fit into a hierarchy, who gets to decide how to manipulate that object based on its position in the hierarchy?  Does the object get to decide (at runtime), or does the code doing the manipulation get to decide (at compile time)? 

The former is called “virtual dispatch”, the latter “non-virtual dispatch”.  Which you choose to use depends entirely upon the problem being modeled. It’s not like one of them is absolutely morally better than the other. The better one is the one which models the real-world problem better.

Many problems – probably the majority of problems in this space – are best modeled by letting the object decide how it is to be treated.  When you call Feed on an instance of Animal, let the instance decide what happens based on whether the instance is a Squid or a Zebra. Do a virtual dispatch at runtime. But it would be an error to say that, because most problems are modeled this way, the language ought not to allow modeling the problem any other way. Sometimes it is best for the caller to decide how the object is treated, because the caller has more information.
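A small sketch may help to show the two kinds of dispatch side by side. Animal, Squid, Zebra and Feed come from the paragraph above; the Describe method and the strings are hypothetical additions for illustration.

```csharp
using System;

abstract class Animal
{
    // Virtual dispatch: the object decides, at runtime.
    public abstract string Feed();

    // Non-virtual dispatch: the compile-time type of the reference decides.
    public string Describe() { return "some animal"; }
}

class Squid : Animal
{
    public override string Feed() { return "The squid eats fish."; }
    public new string Describe() { return "a squid"; }
}

class Zebra : Animal
{
    public override string Feed() { return "The zebra eats grass."; }
    public new string Describe() { return "a zebra"; }
}

static class Program
{
    static void Main()
    {
        Animal[] animals = { new Squid(), new Zebra() };
        foreach (Animal animal in animals)
        {
            Console.WriteLine(animal.Feed());      // the runtime type picks the method
            Console.WriteLine(animal.Describe());  // always "some animal"
        }

        Squid squid = new Squid();
        Console.WriteLine(squid.Describe());       // "a squid"
    }
}
```

In the loop the caller only knows it has an Animal, so the non-virtual call treats every element the same way; once the caller knows it has a Squid, it gets the Squid-specific answer.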

For example, when I was a teenager my father owned a restaurant. This was at the time that the Conservative government in Canada introduced a highly unpopular value-added sales tax on goods and services called, unimaginatively enough, the Goods and Services Tax.  The GST rules were roundly criticized as being insanely complicated. No government wants to be known as “the government that increased the price of food for poor people”, so grocery items were exempted from the GST. But what qualified as a “grocery item”? That had a whole other complex set of rules.

A cake sold in a grocery store, no GST. The exact same cake, sliced and plated in my father’s restaurant, GST. The exact same cake, NOT sliced, sold whole in the restaurant, no GST. A muffin in a grocery store, no GST. The same muffin in my father’s restaurant, GST. A box of six muffins sold in the restaurant, no GST.

I gather that in the last twenty years the tax has been rationalized somewhat. That’s not my point. My point is that this is a case where what the object is (virtual dispatch) is only one factor in how it behaves with respect to taxes. An equally important factor is how the object is being classified right now (non-virtual dispatch).

How might we model this in a programming language? There are lots of ways, and we could certainly argue about which is best. C# allows you the flexibility to come up with a variety of different designs and decide for yourself which models your problem best. One way in which we could reasonably model this problem would be to have a hierarchy:

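abstract class Food {
    // Default: food is taxed.
    public decimal TaxRate { get { return 7.0m; } }
}
abstract class Grocery : Food {
    // Hides Food.TaxRate: groceries are exempt.
    new public decimal TaxRate { get { return 0.0m; } }
}
class Cake : Grocery {
    // Hides Grocery.TaxRate: a cake treated as a cake is taxed.
    new public decimal TaxRate { get { return 7.0m; } }
}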

And so on. We could continue to complexify this to model the tax rules better. Now when I have a variable of type Cake, its tax rate is 7%. When I have the same cake stuck into a variable of type Grocery, its tax rate is 0%. That is, what the object “really” is at runtime is less relevant to the problem at hand than how I am presently classifying it.
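A quick way to check this is to look at one object through several variables. The sketch below repeats the hierarchy so that it compiles on its own; the Program class and the variable names are hypothetical.

```csharp
using System;

abstract class Food
{
    public decimal TaxRate { get { return 7.0m; } }
}

abstract class Grocery : Food
{
    public new decimal TaxRate { get { return 0.0m; } }
}

class Cake : Grocery
{
    public new decimal TaxRate { get { return 7.0m; } }
}

static class Program
{
    static void Main()
    {
        Cake cake = new Cake();
        Grocery asGrocery = cake;  // the same object, classified as a grocery
        Food asFood = cake;        // the same object, classified as plain food

        Console.WriteLine(object.ReferenceEquals(cake, asGrocery)); // True
        Console.WriteLine(cake.TaxRate == 7.0m);       // True: Cake.TaxRate
        Console.WriteLine(asGrocery.TaxRate == 0.0m);  // True: Grocery.TaxRate
        Console.WriteLine(asFood.TaxRate == 7.0m);     // True: Food.TaxRate
    }
}
```

Nothing about the object changes between those lines; only the compile-time type of the expression does, and that alone selects the property.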

One might make the argument that this is a violation of the Single Responsibility Principle, that the tax logic should be in a class of its own, that groceryness should be a flag and not a part of the type system, that really what we need is both a Food and a TransactionContext hierarchy, blah blah blah. I am sure that we could come up with a completely different hierarchy of objects which did not require hiding, and that other hierarchy might even be “better” in some ways. We seem to be badly conflating mechanism with policy here, which is worrisome.

But that's not my point. My point is that in C# we want to give you the flexibility to model problems this way if you choose to, if you believe that this is the best way to model them. And I think that’s a good thing.

I wrote a couple of articles about how we designed this same feature into JScript .NET. It is a language greatly complicated by its dynamic nature, so the design decisions became correspondingly harder. See

http://blogs.msdn.com/ericlippert/archive/2004/06/07/150367.aspx
http://blogs.msdn.com/ericlippert/archive/2004/06/08/151209.aspx

The comments are particularly useful in these articles as well.

  • I think the method hiding feature is very useful, yet I think it should not get into the language.

    The reason is because it's a _client_ feature. In other words, if the client is already compiled, replacing the server will not affect the client behavior...

    In this view 'method hiding' is very similar to the 'default parameters' feature, which, as far as I know, was not included in the language for the very same reason.

    So what is the solution? In my opinion it's extension methods, implemented by the _client_, while the server makes all its methods unambiguous.

    Thus my operational proposition is to make extension methods _precede_ class methods in the resolution chain, making the above suggestion possible.

    P.S.

    1. The fact that if I have an extension method ambiguous with the class method, it simply disappears (no warning or similar) looks buggy (csc 3.5).

    2. The possibility of implementing the same interface several times along the class derivation chain is actually an instance of the same 'feature'. Since it's used a lot in the .NET BCL, as a Mono contributor I have several times fixed bugs caused by different implementations of the same interface along the chain.

    3. Q: What about IEnumerable<T>? It does not fit with the guidelines. A: In my opinion IEnumerable<T> shouldn't derive from IEnumerable. In general, derivation for hiding looks problematic to me because of the very hard runtime bugs it enables.

  • How is method hiding different from polymorphism?

  • Konstantin Triger,

    Method hiding is already part of the C# specification, so it is here to stay unless it proves to be used by no one. (Since the CLR uses it, that is not likely)

    Extension methods should be created with naming schemes that avoid that possibility as much as possible. Try, for instance, using less common verbs for what you are doing: AcquireData rather than GetData. That way you get the definition you want, and the peace of mind that the underlying developers will most likely choose the simpler word over the more "complex" one.

    Warning on extension method hiding would be nice, although I suspect that it will be coming in a future version.

    About your #3 P.S.: IEnumerable<T> is an enumerable of objects; it just also happens to be an enumerable of T's. Not having it inherit from IEnumerable would not make logical sense. Shouldn't an enumerable of T be able to be treated as an enumerable of objects?

    Javier,

    They have similar issues, but method hiding requires active work to get to be extremely complex (After all it is designed to be used sparingly), compared to polymorphism's ability to quickly mess up everything if one of your parent classes adds a new method that matches the other parent.

  • To Joshua:

    Thanks for your comments!

    I think this whole thread is a "post mortem" analysis. Clearly we won't be able to remove this feature. Yet we want to understand whether it was the right thing to do, so we will be able to recommend better practices, add FxCop rules etc.

    The main point of my post is that extension methods could give a reasonable solution to the problem, if and only if the compiler made them take precedence in the name resolution chain, i.e.:

    class Y {
       public void test() {
           Console.WriteLine("Y");
       }
    }

    class X : Y {
       public void testEx() {
           Console.WriteLine("X");
       }
    }

    // declared by the user in the user code!
    static class ExtensionX {
       public static void test(this X x) {
           x.testEx();
       }
    }

    ...

    new X().test(); // -> 'X' is printed (currently the output is 'Y')

    So the user will be able to call a 'simpler' extension method.

    About my and your #3 P.S.: Following your logic, IEnumerable<T> should also be IEnumerable<object>, which is not the case. If for some reason you need to convert your IEnumerable<T> to IEnumerable, one way would be a method like this:

    public static IEnumerable ToGenericIEnumerable<T>(IEnumerable<T> e) {
       foreach (T t in e)
           yield return t;
    }

    Of course that's not needed, since IEnumerable<T> derives from IEnumerable, but what would you do if you needed to convert IEnumerable<T> to IEnumerable<X>, provided T derives from X?

  • To me, the usefulness of hiding is to change the result of a method call based on which subtype it gets cast to. This is a good way to "downgrade specialization". Think of a scenario where such a type gets passed to methods where the parameter may be defined as type A or type B - thus altering the behavior of the passed object.

  • @Konstantin,

    The reason IEnumerable<T> needs to implement IEnumerable but doesn't need to worry about implementing IEnumerable<object> is so that a reference to IEnumerable<T> can be used with code that pre-dates generics. For example:

    public void OldFoo(IEnumerable bars)
    {
      // ...
    }

    public void NewFoo<T>(IEnumerable<T> bars)
    {
      // This doesn't work if IEnumerable<T> doesn't implement IEnumerable
      OldFoo(bars);
    }

    While you could say the same thing about IEnumerable<object>, there are well known ways around that problem (using generic methods instead of IEnumerable<object>, for example), whereas with pre-generic code the only workaround is writing a wrapper class. Also, generic covariance, if it ever gets added to the language, will take care of the IEnumerable<object> problem, whereas it can't take care of the IEnumerable problem since they are two different types.

  • Personally, I think programming is not about creating frankenstein concepts. This cake example in my view demonstrates exactly how not to use these concepts.  Consider the scenario where a programmer (possibly new to the project) is tasked with calculating tax on a bunch of items, and creates a list of generic groceries, only to calculate the wrong tax because of the frankenstein relationships.  Programming is hard enough, right?

  • I think this is a fairly decent article.. but frankly I'm not sold on method hiding at all.

    I understand there could be situations where one can make use of method hiding; but it seems pretty non-intuitive by breaking how people expect polymorphism and your API to work.

    I won't say I'll never use it.. but I hope I never do; because it seems that if you're having to do a lot of funky method hiding then your object hierarchy is probably pretty hosed.

    Bottom line I just see very little advantage to using method hiding over method overriding. The latter is much more intuitive and is how most people expect things to work. The examples given are relatively simple yet still somewhat confusing - I don't even want to think how much damage you could do by hiding methods in larger chunks of code.

    cheers,

    - bri

  • I still don't see why the language includes both method overriding and hiding. Can this be explained without a reference to covariance?

  • I personally like this feature... backwards compatibility

    //old communication object
    internal class CommunicationObject : ICommunicationObject
    {
       public CommunicationObject(ISomeInterface someInterface)
       {
            this.SomeInterface = someInterface;
       }

       protected ISomeInterface SomeInterface { get; private set; }
    }

    //new, more advanced, communication object
    internal class CommunicationObjectEx : CommunicationObject, ICommunicationObjectEx
    {
       public CommunicationObjectEx(ISomeInterfaceEx someInterface)
           : base(someInterface) {   }

       //hides the base property to expose the more specific interface type
       new protected ISomeInterfaceEx SomeInterface
       {
            get { return base.SomeInterface as ISomeInterfaceEx; }
       }
    }

    //public facing factory
    public static class CommunicationObjectFactory
    {
       //old factory method
       public static ICommunicationObject CreateCommunicationObject()
       {
           return new CommunicationObjectEx(new SomeClassEx());
       }

       //new factory method that lets the caller initialize a new version of the communication object without breaking old code
       public static ICommunicationObjectEx CreateCommunicationObjectEx()
       {
           return new CommunicationObjectEx(new SomeClassEx());
       }
    }
