Why Do Initializers Run In The Opposite Order As Constructors? Part Two


As you might have figured out, the answer to last week's puzzle is "if the constructors and initializers run in their actual order, then an initialized readonly field of reference type is guaranteed to be non-null in any possible call. That guarantee cannot be met if the initializers run in the expected order."

Suppose counterfactually that initializers ran in the expected order, that is, derived class initializers run after the base class constructor body. Consider the following pathological cases:

class Base
{
    public static ThreadsafeCollection t = new ThreadsafeCollection();
    public Base()
    {
        Console.WriteLine("Base constructor");
        if (this is Derived) (this as Derived).DoIt();
        // would deref null if we are constructing an instance of Derived
        Blah();
        // would deref null if we are constructing an instance of MoreDerived
        t.Add(this);
        // would deref null if another thread calls Base.t.GetLatest().Blah();
        // before derived constructor runs
    }
    public virtual void Blah() { }
}
class Derived : Base
{
    readonly Foo derivedFoo = new Foo("Derived initializer");
    public void DoIt()
    {
        derivedFoo.Bar();
    }
}
class MoreDerived : Derived
{
    public override void Blah() { DoIt(); }
}

Calling methods on derived types from constructors is dirty pool, but it is not illegal. And stuffing not-quite-constructed objects into global state is risky, but not illegal. I'm not recommending that you do any of these things -- please, do not, for the good of us all. I'm saying that it would be really nice if we could give you an ironclad guarantee that an initialized readonly field is always observed in its initialized state, and we cannot make that guarantee unless we run all the initializers first, and then all of the constructor bodies.
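
To see the actual order for yourself, here is a minimal sketch. The Foo class and the Journal helper are my own assumptions for illustration; Foo is never defined in the example above, so I have given it a constructor that simply announces itself. The virtual call from the Base constructor is there only to show that derivedFoo is already set by the time it runs.

```csharp
using System;
using System.Collections.Generic;

static class Journal
{
    // Hypothetical helper: records each event and echoes it to the console.
    public static readonly List<string> Events = new List<string>();
    public static void Say(string message)
    {
        Events.Add(message);
        Console.WriteLine(message);
    }
}

class Foo
{
    // Assumed shape of Foo: the constructor announces itself.
    public Foo(string message) { Journal.Say(message); }
    public void Bar() { Journal.Say("Foo.Bar called"); }
}

class Base
{
    readonly Foo baseFoo = new Foo("Base initializer");
    public Base()
    {
        Journal.Say("Base constructor");
        Blah(); // virtual call from a constructor: legal, but not recommended
    }
    public virtual void Blah() { }
}

class Derived : Base
{
    readonly Foo derivedFoo = new Foo("Derived initializer");
    public Derived() { Journal.Say("Derived constructor"); }
    // Runs while Base() is still executing; derivedFoo is already initialized.
    public override void Blah() { derivedFoo.Bar(); }
}

static class Program
{
    static void Main() { new Derived(); }
}
```

Constructing a Derived prints "Derived initializer", "Base initializer", "Base constructor", "Foo.Bar called" and "Derived constructor", in that order: all the initializers from derived to base, then all the constructor bodies from base to derived.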

Note that of course, if you initialize your readonly fields in the constructor, then all bets are off. We make no guarantees as to the fields not being accessed before the constructor bodies run.
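
A small sketch of that difference, using names I have made up for illustration (Observe, fromInitializer, fromConstructor and the two Seen properties are not from the example above). The base constructor makes a virtual call, and the override records what each readonly field held at that moment.

```csharp
using System;

class Base
{
    public Base() { Observe(); } // runs before any derived constructor body
    public virtual void Observe() { }
}

class Derived : Base
{
    readonly string fromInitializer = "set by initializer";
    readonly string fromConstructor;

    // What the virtual call saw while Base() was still running.
    public string SeenInitializerField { get; private set; }
    public string SeenConstructorField { get; private set; }

    public Derived() { fromConstructor = "set by constructor"; }

    public override void Observe()
    {
        SeenInitializerField = fromInitializer;
        SeenConstructorField = fromConstructor;
    }
}

static class Program
{
    static void Main()
    {
        var d = new Derived();
        Console.WriteLine(d.SeenInitializerField ?? "(null)");
        Console.WriteLine(d.SeenConstructorField ?? "(null)");
    }
}
```

This prints "set by initializer" followed by "(null)": the field with an initializer was observed in its initialized state, while the field assigned in the constructor body was observed before it was assigned.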

Next time on FAIC: how to get a question not answered.

 

  • To Eric: Have "hidden" initializer that runs base initializer then runs itself then you run constructor which calls base constructor.

    By the way, do you know why VB.NET specifies "Java style" initialization in its language definition? Did they have any reasons, or did they do it just because VB programmers (myself included) are stupid and won't know better?

  • > Have "hidden" initializer that runs base initializer then runs itself then you run constructor which calls base constructor.

    And then how does the base constructor know that it doesn't have to run its hidden initializer?  It had better not run it again, otherwise we've just initialized all the base stuff twice.

    > Do you know why VB.NET specifies "Java style" initialization in language definition?

    Nope. I have not attended VB design team meetings since 2001, and even then I was only there as an expert on the differences between VB and VBScript. I have no idea why they made the specific design choices that they did; you should ask a VB expert. Paul Vick, say.

  • Constructor doesn't invoke initializer. Initializers invoke initializers, constructors invoke constructors.

    I asked Paul Vick. His principles are well known and published: Working in a natural way is a higher priority than language purity.

  • It's like baking a cake!

    You source and buy the ingredients (initialising) before mixing & baking (constructing), and once it has finished baking you can use it for whatever purpose you intended - usually eating...

    It wouldn't make sense to start mixing before sourcing the ingredients; you would get halfway through and realise you need to go to the shop for baking powder...

    Or maybe it's just me who sees it like baking a cake...

  • > Working in a natural way is a higher priority than language purity.

    I agree -- but it's not obvious to me which is more unnatural: running all initializers before all constructors, or having supposedly immutable fields take on different values during initialization/construction. I have to say after many years of Java I find the C# approach rather attractive.

  • At least I have found out why the MyClass keyword was introduced in VB.NET. It allows one to call the virtual functions of the class even if they are overridden in the derived class. Thus one can simulate C++ behavior by prepending MyClass to virtual function calls in the constructor.

  • > Constructor doesn't invoke initializer. Initializers invoke initializers, constructors invoke constructors.

    OK, so who gets the ball rolling?  You've got to tell the CLR in the metadata of the assembly which method to call when an object is constructed by "new".  You cannot tell it the method that runs just the initializers, and you cannot tell it the method that runs just the constructors. What are you going to do?

    I anticipate your answer -- generate a third method that runs both, and have that be the "real" constructor.

    So in short, you're suggesting that every constructor declaration potentially create three different methods, one which implements initializers, one which implements constructor bodies, and one which calls the other two.

    This added complication would not maintain any invariant about the class, since the order of initialization makes no difference to the class itself -- the whole point is that the instance is not inspectable until after the initializers run. The only difference it makes is if there is a side effect in two or more of the initializers, and you care about the order in which those side effects are effected, and you want them to go base to derived.

    I do not see "side effects are effected in a different order" as a compelling reason to massively complicate the code generator for constructors. It's complicated enough already, believe me!

  • I got your point. Thank you for your patience with all the clarifications.

  • You're welcome! Thanks for asking a good question and bearing with me through the answer. :-)

  • To Eric: I retract my first comment, I didn't understand the point about C# objects "always being of one type". Everything pretty much follows from that requirement. In fact, you could even initialize derived members first, too. Would create less of a surprise for those derived virtual functions that can be called before the constructor is.

  • Eric said: "You've got to tell the CLR in the metadata of the assembly which method to call when an object is constructed by "new".  You cannot tell it the method that runs just the initializers, and you cannot tell it the method that runs just the constructors. What are you going to do?"

    Have the newobj instruction call the initializer, then the specified constructor.  Since there is only one initializer for any type there's no need to pass it to newobj.

  • People, have a look at the IL, it's quite informative and fairly straightforward.

    .method public hidebysig specialname rtspecialname
           instance void  .ctor() cil managed
    {
     // Code size       37 (0x25)
     .maxstack  8
     .language '{[deleted]}'
    // Source File 'C:\Dev\Program_Original.cs'
    //000027:         readonly Foo derivedFoo = new Foo("Derived initializer");
     IL_0000:  ldarg.0
     IL_0001:  ldstr      "Derived initializer"
     IL_0006:  newobj     instance void EL.Foo::.ctor(string)
     IL_000b:  stfld      class EL.Foo EL.Derived::derivedFoo
    //000028:         public Derived()
     IL_0010:  ldarg.0
     IL_0011:  call       instance void EL.Base::.ctor()
     IL_0016:  nop
    //000029:         {
     IL_0017:  nop
    //000030:             Console.WriteLine("Derived constructor");
     IL_0018:  ldstr      "Derived constructor"
     IL_001d:  call       void [mscorlib]System.Console::WriteLine(string)
     IL_0022:  nop
    //000031:         }
     IL_0023:  nop
     IL_0024:  ret
    } // end of method Derived::.ctor

    Essentially, the compiler sets the type's new() method to be

    exec initializers
    call base ctor()
    exec constructor code

    The JIT'er isn't doing anything special at runtime. This is all compiled code. All the JIT'er guarantees is that new gets called once. There's no special method called 'run initializers'.

    The issue I see is that fields set by initializers are available slightly before other fields. As in your example of calling a virtual method from within a constructor, a developer could really shoot themselves in the foot. Consider a change to released code: someone decides to move code from an initializer into the constructor body. Virtual methods in subclasses may now fail if they, purposely or otherwise, depend on this fragile ordering dependency. You're at the mercy of the base class coder, compiler writers, and colleagues who don't understand the subtlety of what's happening. Before reading this I hadn't really thought about it either.

    Calling a virtual method from a constructor is bad practice, but since you can do it, people will.

    At least it's possible to detect this in code, but very annoying to have to do. I'm not sure what I'd rather have happen, and I don't feel that preventing virtual methods would be better. There are some cases where calling virtual methods from a constructor could be valid. (e.g. variant of the specification pattern)

    Perhaps it would be better to have the compiler use the following order

    call base ctor()
    exec initializers
    exec constructor code

    Better is only better because it's more predictable. A virtual method in a subclass that uses a field will fail predictably if called from the constructor, instead of just sometimes. It should be the case that setting a field in an initializer versus a constructor should be completely transparent. It ought to just be syntactic sugar, but as you've pointed out, it's not.

    Regards

  • As a long time (35+ years) developer, I have been involved in this debate since C++ was introduced.

    There are definite pros and cons to both approaches, and the safest bet is to never call virtual methods from within the constructor. Unfortunately it is very difficult (even with FxCop) to get a call flagged when it goes to a non-virtual member whose body calls a virtual method.

    One BIG advantage of the C++ model applies to the development of library code. In the C++ model (initializers run right before the opening brace of the constructor), the initialization behavior of a base class is 100% invariant with regard to the derived class. This definitely produces more predictable results. However, it does also impose limitations.

  • In C++, if we check the type of an object which is being created, we get an interesting answer: "*this" is of type Base until the Base constructor has finished. And the line "if (this is Derived)" would never be written in C++, because the Base class should never know what Derived will do. If an object can have different types during its lifetime, this problem is solved.
