Implementing the virtual method pattern in C#, Part Two

(This is part two of a three-part series; part one is here; part three is here.)

So far we've gotten rid of instance methods; they're just static methods that take a hidden "this" parameter. But virtual methods are a bit harder. We're going to implement virtual methods as fields of delegate type containing delegates to static methods.

abstract class Animal
{
  public Func<Animal, string> Complain;
  public Func<Animal, string> MakeNoise;
  public static string MakeNoise(Animal _this)
  {
    return "";
  }
}

OK, everything seems fine so far...

class Giraffe : Animal
{
  public bool SoreThroat { get; set; }
  public static string Complain(Animal _this)
  {
    return _this.SoreThroat ? "What a pain in the neck!" : "No complaints today.";
  }
}

Trouble. "Animal" does not have a property SoreThroat. But Complain cannot take a Giraffe, because then it is not compatible with the delegate type, which, after all, expects an Animal as the "_this" formal parameter.

What we need to make the virtual method pattern work is a guarantee that the caller will never pass in a Cat to a "virtual" method that is expecting a Giraffe. Let's assume that we have such a guarantee. Since we have that guarantee, we can make a conversion:

class Giraffe : Animal
{
  public bool SoreThroat { get; set; }
  public static string Complain(Animal _this)
  {
    return (_this as Giraffe).SoreThroat ? "What a pain in the neck!" : "No complaints today.";
  }
}
class Cat : Animal
{
  public bool Hungry { get; set; }
  public static string Complain(Animal _this)
  {
    return (_this as Cat).Hungry ? "GIVE ME THAT TUNA!" : "I HATE YOU ALL!";
  }
  public static string MakeNoise(Animal _this)
  {
    return "MEOW MEOW MEOW MEOW MEOW MEOW";
  }
}
class Dog : Animal
{
  public bool Small { get; set; }
  public static string Complain(Animal _this)
  {
    return "Our regressive state tax code is... SQUIRREL!";
  }
  public static string MakeNoise(Dog _this)  // Remember, we forgot to say "override"
  {
    return _this.Small ? "yip" : "WOOF";
  }
}
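
To see why that guarantee matters, here is a minimal sketch of what happens when it is broken. The GuaranteeDemo class and its CallWith helper are hypothetical names introduced only for this illustration; the Giraffe.Complain body is the one from above. Because "as" produces null rather than throwing when the conversion fails, handing a Cat to Giraffe.Complain shows up as a null dereference:

```csharp
using System;

abstract class Animal { }

class Giraffe : Animal
{
    public bool SoreThroat { get; set; }

    public static string Complain(Animal _this)
    {
        // "as" yields null when _this is not actually a Giraffe.
        return (_this as Giraffe).SoreThroat ? "What a pain in the neck!" : "No complaints today.";
    }
}

class Cat : Animal { }

// Hypothetical helper, not part of the pattern itself.
static class GuaranteeDemo
{
    // Returns the name of the exception type thrown by Giraffe.Complain,
    // or "none" if the call completes normally.
    public static string CallWith(Animal animal)
    {
        try
        {
            Giraffe.Complain(animal);
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }
}
```

With these definitions, GuaranteeDemo.CallWith(new Giraffe()) returns "none" and GuaranteeDemo.CallWith(new Cat()) returns "NullReferenceException".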

Everything good? Not yet. We forgot to initialize the fields!

Here for the first time we come to something that you actually cannot do in "C# without instance methods". In the CLR, the virtual method "fields" are actually initialized after the memory allocator runs but before the constructor runs. (*) We have no way in C# of doing that. So let's go crazy here; we've already gotten rid of instance methods; instance constructors are basically just instance methods that are called when the object is created. So let's get rid of instance constructors too. Instead, instance constructors will be replaced by the factory pattern, where a static method creates and initializes the object. (We presume that this static method has the powers of a constructor; for example it is allowed to set readonly fields, and so on.) A call to a default constructor now does nothing but allocate memory.

abstract class Animal
{
  public Func<Animal, string> Complain;
  public Func<Animal, string> MakeNoise;
  public static string MakeNoise(Animal _this)
  {
    return "";
  }
  // No factory; Animal is abstract.
  public static void InitializeVirtualMethodFields(Animal animal)
  {
    animal.Complain = null; // abstract!
    animal.MakeNoise = Animal.MakeNoise;
  }
}
class Giraffe : Animal
{
  public bool SoreThroat { get; set; }
  public static string Complain(Animal _this)
  {
    return (_this as Giraffe).SoreThroat ? "What a pain in the neck!" : "No complaints today.";
  }
  public static void InitializeVirtualMethodFields(Giraffe giraffe)
  {
    Animal.InitializeVirtualMethodFields(giraffe);
    giraffe.Complain = Giraffe.Complain;
    // Giraffe did not override MakeNoise, so ignore it.
  } 
  public static Giraffe Create()
  {
    // There are no more instance constructors; this just allocates the memory.
    Giraffe giraffe = new Giraffe();
    // Ensure that virtual method fields are initialized before other code is run.
    Giraffe.InitializeVirtualMethodFields(giraffe);
    // Now do the rest of the initialization that the constructor would have done.
    return giraffe;
  }
}
class Cat : Animal
{
  public bool Hungry { get; set; }
  public static string Complain(Animal _this)
  {
    return (_this as Cat).Hungry ? "GIVE ME THAT TUNA!" : "I HATE YOU ALL!";
  }
  public static string MakeNoise(Animal _this)
  {
    return "MEOW MEOW MEOW MEOW MEOW MEOW";
  }
  public static void InitializeVirtualMethodFields(Cat cat)
  {
    Animal.InitializeVirtualMethodFields(cat);
    cat.Complain = Cat.Complain;
    cat.MakeNoise = Cat.MakeNoise;
  } 
  public static Cat Create()
  {
    Cat cat = new Cat();
    Cat.InitializeVirtualMethodFields(cat);
    // Now do the rest of the initialization that the constructor would have done.
    return cat;
  }
}
class Dog : Animal
{
  public bool Small { get; set; }
  public static string Complain(Animal _this)
  {
    return "Our regressive state tax code is... SQUIRREL!";
  }
  public static string MakeNoise(Dog _this)  // Remember, we forgot to say "override"
  {
    return _this.Small ? "yip" : "WOOF";
  }
  public static void InitializeVirtualMethodFields(Dog dog)
  {
    Animal.InitializeVirtualMethodFields(dog);
    dog.Complain = Dog.Complain;
    // Dog did not override MakeNoise, so ignore it.
  } 
  public static Dog Create()
  {
    Dog dog = new Dog();
    Dog.InitializeVirtualMethodFields(dog);
    // Now do the rest of the initialization that the constructor would have done.
    return dog;
  }
}

What about the call sites?  We rewrite our program as:

string s;
Animal animal = Giraffe.Create();
// Creates a new Giraffe and initializes the Complain field to Giraffe.Complain,
// and initializes the MakeNoise field to Animal.MakeNoise. We continue to
// rewrite the calls so that the "receiver" is passed as "_this" to each delegate:

s = animal.Complain(animal);
// Invokes delegate animal.Complain, which refers to static method Giraffe.Complain

s = animal.MakeNoise(animal);
// Invokes delegate animal.MakeNoise, which refers to static method Animal.MakeNoise

animal = Cat.Create();
// Creates a new Cat and initializes the fields to Cat.Complain and Cat.MakeNoise.

s = animal.Complain(animal);  // I hate you
s = animal.MakeNoise(animal); // meow!

Dog dog = Dog.Create();
// Initializes the fields to Dog.Complain and Animal.MakeNoise

animal = dog;
s = animal.Complain(animal);
s = animal.MakeNoise(animal);
// Invokes delegate animal.MakeNoise, which refers to static method Animal.MakeNoise

s = Dog.MakeNoise(dog); // yip!
// Does not invoke a delegate at all; overload resolution sees that Dog has a valid
// MakeNoise method that is declared on a more-derived type than the delegate
// field of the base class, and chooses to call the more-derived static method.

And we're done; we've successfully emulated virtual and instance methods in a language that only has static methods (and delegates to static methods).
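
The listings above are "C# without instance methods" rather than real C#; in particular, real C# will not let a class declare a field and a method that are both called MakeNoise. Here is a sketch that does compile, assuming we rename the static implementations with an "Impl" suffix. That suffix and the Demo class are inventions of this sketch, not part of the pattern; properties and "new" are still used as in the listings above, and Cat is left out for brevity since it follows the same shape as Giraffe.

```csharp
using System;

abstract class Animal
{
    // The "virtual method slots": one delegate field per virtual method.
    public Func<Animal, string> Complain;
    public Func<Animal, string> MakeNoise;

    // "Impl" suffix is an assumption of this sketch, to avoid the name clash.
    public static string MakeNoiseImpl(Animal _this) { return ""; }

    public static void InitializeVirtualMethodFields(Animal animal)
    {
        animal.Complain = null;            // abstract: no implementation here
        animal.MakeNoise = MakeNoiseImpl;  // base-class implementation
    }
}

class Giraffe : Animal
{
    public bool SoreThroat { get; set; }

    public static string ComplainImpl(Animal _this)
    {
        return (_this as Giraffe).SoreThroat ? "What a pain in the neck!" : "No complaints today.";
    }

    public static void InitializeVirtualMethodFields(Giraffe giraffe)
    {
        Animal.InitializeVirtualMethodFields(giraffe);
        giraffe.Complain = ComplainImpl;
        // Giraffe does not override MakeNoise, so that slot is left alone.
    }

    public static Giraffe Create()
    {
        Giraffe giraffe = new Giraffe();
        Giraffe.InitializeVirtualMethodFields(giraffe);
        return giraffe;
    }
}

class Dog : Animal
{
    public bool Small { get; set; }

    public static string ComplainImpl(Animal _this)
    {
        return "Our regressive state tax code is... SQUIRREL!";
    }

    // Takes a Dog, not an Animal, so it never goes into the MakeNoise slot.
    public static string MakeNoiseImpl(Dog _this)
    {
        return _this.Small ? "yip" : "WOOF";
    }

    public static void InitializeVirtualMethodFields(Dog dog)
    {
        Animal.InitializeVirtualMethodFields(dog);
        dog.Complain = ComplainImpl;
    }

    public static Dog Create()
    {
        Dog dog = new Dog();
        Dog.InitializeVirtualMethodFields(dog);
        return dog;
    }
}

// Hypothetical driver that mirrors the call sites above.
static class Demo
{
    public static string[] Run()
    {
        Animal animal = Giraffe.Create();
        string giraffeComplaint = animal.Complain(animal);  // Giraffe.ComplainImpl
        string giraffeNoise = animal.MakeNoise(animal);     // Animal.MakeNoiseImpl

        Dog dog = Dog.Create();
        dog.Small = true;
        animal = dog;
        string viaSlot = animal.MakeNoise(animal);          // still Animal.MakeNoiseImpl
        string direct = Dog.MakeNoiseImpl(dog);             // plain static call, no delegate

        return new[] { giraffeComplaint, giraffeNoise, viaSlot, direct };
    }
}
```

Calling Demo.Run() from a Main method yields "No complaints today.", an empty string, another empty string, and "yip", in that order. The third result is the interesting one: the dog's slot still holds the base implementation, exactly as it would had we forgotten "override".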

However, this is not very space-efficient. Suppose there were a hundred virtual methods on Animal instead of two. That means that every instance of every class derived from Animal has a hundred fields, and in most of them, those fields are exactly the same, all the time. You make three hundred giraffes and each one of them will have exactly the same delegates in those hundred fields. This seems redundant and wasteful.
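
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each delegate field costs one object reference (8 bytes on a 64-bit runtime) and ignores object headers and the delegate objects themselves; SlotCost is a hypothetical helper, not anything in the framework.

```csharp
using System;

// Hypothetical helper for the arithmetic only.
static class SlotCost
{
    // Bytes spent on per-instance delegate fields, assuming every field is a
    // single reference of referenceSize bytes.
    public static long Bytes(int virtualMethods, int instances, int referenceSize)
    {
        return (long)virtualMethods * instances * referenceSize;
    }
}
```

Under those assumptions, SlotCost.Bytes(100, 300, 8) comes to 240,000 bytes of fields for the three hundred giraffes, every one of which holds the same hundred values.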

Next time we'll solve this memory wastage problem.

-----

(*) The rules of the CLR are that the virtual function "slots" are correctly initialized as soon as the object is created; this is in contrast to the rules of C++, which say that the values of the virtual functions change as the object goes through its construction process. Before I started on this team I was very much in favour of the C++ approach, as you can see from this blog post from 2005 shortly before I joined the C# team. Both approaches have their pros and cons; I now think the way the CLR does it is marginally better, but still it is best to not tempt fate: simply don't call virtual methods in constructors.
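
The CLR behaviour is easy to observe in ordinary C#. In this sketch (Base, Derived and Describe are made-up names), the base constructor makes a virtual call, and the override runs before the derived constructor body has had a chance to assign its field:

```csharp
using System;

class Base
{
    public string Seen;

    public Base()
    {
        // Virtual call from a constructor: the slot already points at the
        // most-derived override, even though Derived's constructor body
        // has not run yet.
        Seen = Describe();
    }

    public virtual string Describe() { return "Base"; }
}

class Derived : Base
{
    private string name;  // still null while Base's constructor is running

    public Derived()
    {
        name = "Derived";
    }

    public override string Describe()
    {
        return "Derived sees name=" + (name ?? "null");
    }
}
```

Here new Derived().Seen is "Derived sees name=null": the call was dispatched to the override, which then ran against a field that had not yet been assigned. That is the fate we are advised not to tempt.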

  • Nice, thanks for making my head hurt.

    Did you mix up Create() and Construct() methods ? Or am I missing something ?

    That was an editing error which I have corrected, thanks. -- Eric

  • You could even use closures to curry the _this parameter during construction. This would make calls to MakeNoise in this scheme look exactly like those in regular C#.

    abstract class Animal
    {
      public Func<string> Complain;
      public Func<string> MakeNoise;
      public static string MakeNoise(Animal _this)
      {
        return "";
      }
      // No factory; Animal is abstract.
      public static void InitializeVirtualMethodFields(Animal animal)
      {
        animal.Complain = null; // abstract!
        animal.MakeNoise = () => Animal.MakeNoise(animal);
      }
    }

  • Why do you use "as" operator instead of a regular cast? Newbies tend to do that and get very surprised to see an unhelpful NullRef Exception instead of a much more specific InvalidCastException. Thanks for the post otherwise!

    I don't see how it matters, since the assumption we're making here is that the compiler is going to get it right. The compiler is not going to allow a virtual call where the "this" argument is null or is of the wrong type. Why generate the code to do the invalid cast exception when that exception is by assumption impossible? -- Eric

  • @Robert, that's just begging the question (sort of). How do you implement closures if you don't have instance methods?

  • A couple of interesting observations here. First, you've chosen to make both the static implementation method as well as the instance delegate properties public - in principle only the Func<> properties need to be public - the static methods could be protected, no? Second, these public delegate properties can be set from outside the type ... which can lead to some interesting problems:

    Animal gir = Giraffe.Create();
    Animal cat = Cat.Create();
    gir.Complain = cat.Complain;
    // we've tried to make a hateful, tuna-loving giraffe
    gir.Complain(gir);  // this fails with NullRefEx

    This problem could be addressed by making the delegates properties ... but that would require instance methods again. An equivalent implementation would be to have the static method call protected Func<> properties instead - whose implementation would be backed by protected static methods. This would complicate the example and obscure the intent.

    It is interesting to note how much work the compiler and CLR are saving the programmer by providing a built-in paradigm for object-oriented mechanisms.

  • In the case of the CLR, the most complicated (but also extremely rare) case is not covered by the delegate desugaring: Generic virtual methods. Since the call site specifies the type parameters, I can think of no way to implement these without using the builtin CLR magic - which is much more complicated than in the usual case, since not only the right method, but also its correct instantiation needs to be looked up.

    Indeed, I'm deliberately ignoring generics. -- Eric

  • @Leo: You'd only need to make the Func<> fields readonly, since they only get set at object creation. Of course, that would mean that you couldn't have InitializeVirtualMethodFields, since only the constructor (or, in this case, Create()) can set readonly fields.

    Of course, if we're assuming that this is compiler syntactic sugar, then it's easy for the compiler to enforce the readonlyness of these fields.

  • I loved reading Andrew Kennedy and Don Syme's paper on the generics in C# and .Net, it did blow my mind in places though. research.microsoft.com/.../DesignAndImplementationOfGenerics.pdf

  • >> Next time we'll solve this memory wastage problem.

    this problem sounds similar to the problem with a class having a large number of custom events....is that right?

  • I think that both the C++ and the CLR approach to the virtuality of the method calls in the constructors are right in the context of the environments they operate in. Calling the overridden virtual method in the CLR is the more natural and expected behaviour, and what is more, it has some uses. On the other hand, in C++ the values of member fields are undefined before they are initialized (i.e. before the constructor runs), so a hypothetical virtual call in C++ may be operating on memory that hasn't been initialized yet, which can lead to undefined behaviour. In contrast, the CLR always has defined behaviour because all fields are initialized with standard default values which are always the same. Therefore the virtual calls will always behave the same.

  • @praveen: I would guess not. The issue with a large number of events is that usually only a few of them are ever used but they are expected to be different per instance, so the typical approach is to store them in a hash table or something. In this case, they are all used, but have the same values per instance. So the approach will most likely be to add another level of indirection where the set of function pointers is assigned once and stored in some static object, and each instance contains a reference to that single object.

  • Re: "as" operator - I'm no "why as operator", but I have the same pet peeve. I get that this is "pretend this is doing the thing we want it to, not what C# spec says", but I've seen far too many newbie C# coders abuse 'as' like that to feel happy seeing it :)

    How would you feel about an imaginary language that errored on a constructor that could call virtual methods? (Since it's imaginary already, also imagine that this is feasible)

    Suppose the base ctor calls an instance method, which then in turn calls a virtual method. Should that be illegal too? What if the base ctor calls an instance method which puts "this" in a static field of a second class, and then invokes a (possibly virtual) method on a third class which calls a virtual method on the object in the field of the second class? Determining if a virtual method is invoked before the construction finishes is not as easy as you might think. (These sorts of awful scenarios help motivate why it is simply illegal to use "this" in any way in a field initializer.) -- Eric

     

    @Shuggy: Thanks for the link - should be interesting, especially how "where IFoo, IBar" works.

  • I always thought vtables were singleton in C++, why would you store these delegates per instance? Stick a pointer to vtable for each concrete type in each instance and you only spend 4 bytes (8 on x64).

    Did you perhaps stop reading the article before getting to the last paragraph? -- Eric

  • Just reading the 2005 article I find your reasoning from 6 years ago a little amusing.

    With .Net logic in vtab / type creation process you at least have a chance of showing your overridden UI. With the C++ logic your overridden method ShowProjectUI() will not be called at all, so either the method should not be virtual or it should not be called from the ctor. So again the right way would be: redesign the original Project class.

    Why is calling a virtual method from a ctor not at least a warning? What are the useful cases?

  • "Why is calling a virtual method from a ctor not at least a warning? What are the useful cases?"

    You want to create some object within the base class constructor conforming to an interface/abstract base class but the exact type is defined by the sub classes.

    Further the construction of the instance requires access to a this reference (otherwise you could do it by passing Func<T> to the base class constructor).
