Why Do Initializers Run In The Opposite Order As Constructors? Part Two

As you might have figured out, the answer to last week's puzzle is "if the constructors and initializers run in their actual order then an initialized readonly field of reference type is guaranteed to be non null in any possible call. That guarantee cannot be met if the initializers run in the expected order."

Suppose counterfactually that initializers ran in the expected order, that is, derived class initializers run after the base class constructor body. Consider the following pathological cases:

class Base
{
    public static ThreadsafeCollection t = new ThreadsafeCollection();
    public Base()
    {
        Console.WriteLine("Base constructor");
        if (this is Derived) (this as Derived).DoIt();
        // would deref null if we are constructing an instance of Derived
        Blah();
        // would deref null if we are constructing an instance of MoreDerived
        t.Add(this);
        // would deref null if another thread calls Base.t.GetLatest().Blah();
        // before derived constructor runs
    }
    public virtual void Blah() { }
}
class Derived : Base
{
    readonly Foo derivedFoo = new Foo("Derived initializer");
    public void DoIt()
    {
        derivedFoo.Bar();
    }
}
class MoreDerived : Derived
{
    public override void Blah() { DoIt(); }
}

Calling methods on derived types from constructors is dirty pool, but it is not illegal. And stuffing not-quite-constructed objects into global state is risky, but not illegal. I'm not recommending that you do any of these things -- please, do not, for the good of us all. I'm saying that it would be really nice if we could give you an ironclad guarantee that an initialized readonly field is always observed in its initialized state, and we cannot make that guarantee unless we run all the initializers first, and then all of the constructor bodies.

Note that of course, if you initialize your readonly fields in the constructor, then all bets are off. We make no guarantees as to the fields not being accessed before the constructor bodies run.
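The difference between the two cases can be seen in a small sketch. The names here (`Trace`, `Note`, `Describe`, `fromInitializer`, `fromConstructor`) are made up for illustration and are not from the example above; the base constructor makes a virtual call, and the override reports what it sees in each readonly field.

```csharp
using System;
using System.Collections.Generic;

static class Trace
{
    public static readonly List<string> Log = new List<string>();

    // Records a message and returns it, so it can be used in a field initializer.
    public static string Note(string message) { Log.Add(message); return message; }
}

class Base
{
    readonly string baseField = Trace.Note("Base initializer");

    public Base()
    {
        Trace.Note("Base constructor");
        Describe(); // virtual call: runs Derived.Describe before Derived's constructor body
    }

    public virtual void Describe() { }
}

class Derived : Base
{
    readonly string fromInitializer = Trace.Note("Derived initializer");
    readonly string fromConstructor; // assigned in the constructor body instead

    public Derived()
    {
        fromConstructor = Trace.Note("Derived constructor");
    }

    public override void Describe()
    {
        Trace.Note("initializer field is " + (fromInitializer == null ? "null" : "set"));
        Trace.Note("constructor field is " + (fromConstructor == null ? "null" : "set"));
    }
}

static class Demo
{
    public static List<string> Run()
    {
        Trace.Log.Clear();
        new Derived();
        return Trace.Log;
    }

    static void Main()
    {
        // Prints, one per line:
        //   Derived initializer
        //   Base initializer
        //   Base constructor
        //   initializer field is set
        //   constructor field is null
        //   Derived constructor
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```

The field with an initializer is already set when the base constructor calls into the derived class; the field assigned in the constructor body is still null at that point.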

Next time on FAIC: how to get a question not answered.

 

  • Carrying on from the previous article... I think the C++ model of (as you put it) "objects that mutate their own runtime type" is appropriate. The constructor is what makes an object of a type. Until the constructor is run, the type's invariants aren't met (a type theorist would say that it's not yet of that type). Once the destructor has run, the invariant is once again not met (it's no longer of the fully-derived type). This leads to some surprises (pure virtual function calls being the obvious ones).

    On the other hand, to me, the CLR model is deeply weird. As I understand things, even before my constructor runs, my member functions can be called, and my member variables can be read or even changed. Any hope of a well-defined notion of a class invariant is lost. Now, you could argue that the problem is the same in both cases -- that essentially, trying to treat a not-yet-constructed object as its derived type before the derived constructor is run is simply an error -- but I would disagree. In C++, you can't get into trouble without explicitly (static_)casting to the derived class, but in C#, you can get into trouble if you call a virtual function from the base class's constructor.

  • Consider this code:

    abstract class B { public B() { if (this is D) Foo.Prop = this as D; } }

    class D : B { }

    You're telling me that when B's constructor runs, "this is D" should return false, but once control leaves B, suddenly it starts being true?

    And you're telling me that it should be possible for an object to report that its current runtime type is abstract?

    That's weird, man. :-)

    The constructor doesn't make an object of a particular type. The allocation of the object makes it of a particular type. The constructor is just a method that runs on the object.

  • To Eric: If a constructor is just a method, why bother to have constructors at all?

  • I strongly agree with Richard, I don't see why Java and .NET took a different design decision than C++. I'll admit that for a novice the C++ behaviour is strange at first glance but looking deeper it makes much more sense. Not being a "type theorist" I still think that an object is not of type X until the X constructor has finished running.

    Proof that the .NET model isn't consistent is that a hack in the initializers' behaviour plugs a hole that regular constructors do not. As Eric said, all bets are off.

  • Bill Wagner published an article on the subject in Visual Studio Magazine in December 2007:

    http://visualstudiomagazine.com/columns/article.aspx?editorialsid=2377

  • > To Eric: If a constructor is just a method, why bother to have constructors at all?

    An excellent question. There are languages which have no constructors -- you want to run code to initialize an object, you go right ahead and run that code.

    The reason we have constructors is that "run a particular method exactly once when an object is created but never again" is a very common pattern, so common that the designers of several languages have deemed it worthy of inclusion in and enforcement by the language and runtime.

  • To Eric: Then why not restrict calls from the constructor to base(), this() and static helpers?

  • Because then you end up with duplicated code. Consider a mutable object which represents an enumerator over some sort of collection. A common pattern is

    class C {
        public C() { Reset(); }
        public void Reset() { ...  }
    }

    With your way, either you have to force the user to call Reset() after construction, which produces an opportunity for a bug, or you duplicate the code in Reset(), which is an opportunity for maintenance problems.

  • To Eric: You're buying the ability to never see a NULL readonly member at the price of allowing derived class members to be called before the derived class constructor runs. This means the derived class cannot implement an invariant that would hold throughout its lifetime (post-constructor and pre-destructor).

  • I'm not following you. How does the order of running initializers before constructors make it impossible to implement an invariant?

  • To Eric: There shall be no way to call instance methods before instance is constructed, so that post constructor will be the first one called after instance construction.

  • Well, the other possibility was to have "A is B" return false (because the object is not B yet) and thus disallow downcasting to a not-yet-constructed type.

    The way you have it, the invariant in

    class A {
        public A() { Invariant(); }
        public virtual void Invariant() { }
    }

    class B : A {
        private int i;
        public B(int i_) { i = i_; }
        // Runs from A's constructor, before the assignment to i above
        public override void Invariant() { System.Diagnostics.Debug.Assert(i == 1); }
    }

    does not hold if A calls Invariant(). In fact, strictly speaking it's accessing an uninitialized variable. It's just that the runtime system went and zeroed out all the fields, so they seem initialized.

  • Correct. That's why it's a bad programming practice to call virtual methods from constructors.

    The reason it is a bad idea to call virtual methods from constructors is because a method on a derived class might run before the derived class constructor runs.

    That is, we are trading the benefit of "an object is always of one type throughout its entire lifetime" against the benefit of "it is always safe to call a virtual method, even from a constructor".

    What I am confused about is your statement that this restriction on when you should call virtual methods is a _consequence_ of the fact that initializers run before constructors. That is saying it backwards. Rather, both the fact that you should not call virtual methods in constructors, AND the fact that initializers run before constructors, are _consequences_ of the fact that object type is not mutable.

    Does that make sense?

  • Okay, from this discussion, I can see why it makes sense to initialize class member variables of both Base and Derived before running any constructors; despite the assumption from the previous article, that behaviour didn't surprise me at all.  Doing things in between calling the Base and Derived constructors would be more surprising to me.

    What I don't understand is why it's necessary to do Derived's members before Base's.  According to my experiments, member initializers can't refer to "this" anyhow, so nobody can (accidentally) get any reference to any of either Base's or Derived's member initializers during the initialization phase.

    If that's true, there seems to be no obvious reason to run Derived's first; it's irrelevant either way.  It will probably never affect me, but just for symmetry with constructors, running Base's initializers before Derived's seems to make more sense.

  • The implementation of the "reverse order" semantics is simple -- have the constructor run the initializers, then call the base class constructor, then run the constructor body.

    Suppose you wanted the base class initializers to run before the derived class initializers. Imagine you are the compiler developer; how would you implement it?
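The "reverse order" implementation described in the last comment can be mimicked with ordinary methods. This is only a sketch of the shape of the generated constructors, not what the compiler actually emits; `Lowered`, `BaseCtor`, `DerivedCtor` and `ObjectCtor` are hypothetical stand-ins, and each "constructor" does the same three steps: its own initializers, then the base constructor, then its own body.

```csharp
using System;
using System.Collections.Generic;

static class Lowered
{
    public static readonly List<string> Log = new List<string>();

    // Stand-in for System.Object's constructor, which has nothing to trace.
    static void ObjectCtor() { }

    // Stand-in for the constructor generated for Base.
    public static void BaseCtor()
    {
        Log.Add("Base initializers"); // 1. this class's field initializers
        ObjectCtor();                 // 2. the base class constructor
        Log.Add("Base body");         // 3. this class's constructor body
    }

    // Stand-in for the constructor generated for Derived.
    public static void DerivedCtor()
    {
        Log.Add("Derived initializers"); // 1. this class's field initializers
        BaseCtor();                      // 2. the base class constructor
        Log.Add("Derived body");         // 3. this class's constructor body
    }

    static void Main()
    {
        DerivedCtor();
        // Prints: Derived initializers -> Base initializers -> Base body -> Derived body
        Console.WriteLine(string.Join(" -> ", Log));
    }
}
```

Because every constructor follows the same three steps and only knows about its own class's initializers, the initializers come out most-derived first and the bodies come out base first, with no extra machinery.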
