Inheritance in JavaScript - Part 2


Last time we took a (very) long-winded look at a simple JavaScript program that used basic inheritance. This time, we'll expand on that a little bit and improve on the techniques.

Calling the Constructor Directly

First of all, let's look back at the Point constructor again:

function Point(id, x, y)
{
  this._id = id;
  this._x = x;
  this._y = y;
}

Last time we noted that we just stash the value of id into the _id member because that's the only thing that the Thing constructor does with it. But what if Thing did some more complex processing, or took a larger number of arguments? We could keep copy-and-pasting the code from Thing into Point, but that tiresome habit creates what is known as the "double maintenance problem" and will very quickly lead to bugs.

Much better is to simply let Thing do its, err, thing, for us, by calling into it directly:

function Point(id, x, y)
{
  Thing.call(this, id);
  this._x = x;
  this._y = y;
}

The only potentially weird bit here is that rather than just calling the Thing function directly, we call a function off the Thing function to do some special house-keeping work for us. How so, you ask? Remember that in JScript, functions are first-class objects and can have properties and methods of their own, just like any other object. They just happen to have some internal properties like [[Call]] and [[Construct]] that make them cleverer than the average object.
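To make the "functions are objects" point concrete, here is a minimal sketch. The description property is a made-up name used purely for illustration; call, apply and length are standard members of every function.

```javascript
function Thing(id)
{
  this._id = id;
}

// A function can carry properties of its own, just like any other object.
// ('description' is a hypothetical property invented for this example.)
Thing.description = "base type used in the examples";

// Every function also has built-in members such as call, apply and length.
var hasCall = (typeof Thing.call === "function");
var hasApply = (typeof Thing.apply === "function");
var declaredArgs = Thing.length; // number of declared parameters: 1
```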

So, the Thing function has some functions of its own, one of which is called call. What the call method does is call the function that it belongs to, passing in whatever object you give it as the this object. In ECMAScript you don't usually have to ask yourself "How does my function get a this object anyway?" because it just works. It's only when it doesn't work that you get confused.

Well, when you call a function as a constructor (i.e., as part of a new expression), JavaScript creates a new object for you and then passes it in as the this object. So you can write stuff like this._id = id and the this object just happens to be there for you. And then typically you assign the result of the constructor to a variable -- something like var thing = new Thing(42) -- and now the thing variable holds onto that object. When you call a method off an object, JScript once again figures out the this object for you (it's the object holding the method), and passes it in for you. If you just call a function as if it were a normal function -- i.e., not as part of a new expression, and not off an existing object -- then the script engine will just use the global object as the this object (in non-strict code, at least; ECMAScript 5 strict mode passes undefined instead).
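A small sketch of the first two cases, assuming a getId method and an other object that are invented here for illustration. The same function sees a different this depending on which object it is called off.

```javascript
function Thing(id)
{
  this._id = id;
}

// Hypothetical accessor, added only so we can observe the this object.
Thing.prototype.getId = function Thing_getId()
{
  return this._id;
};

// Constructor call: a brand new object becomes the this object.
var thing = new Thing(42);

// Method call: the object holding the method becomes the this object.
var viaThing = thing.getId();

// Same function, different holder: now 'other' is the this object.
var other = { _id: "other", getId: thing.getId };
var viaOther = other.getId();
```

Here viaThing ends up as 42 and viaOther as "other", even though both calls run exactly the same function.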

So what if you want to call a method on an object you already have? That's the case here -- we already have our this object inside the Point constructor, and we want to ask Thing to do some work for us. But we don't want to create a new Thing -- we're deriving from it, remember -- and we don't want it to operate on the global object either; we want it to operate on our currently existing this object. And that's where the call method (and its cousin apply) comes in -- it lets you call a method and supply an arbitrary object that will become the this object inside the method call. So we just call Thing.call and supply our this object as Thing's this object, and everything is peachy. (The apply method is basically the same as call, but takes the arguments as a single array, which makes it the better fit for a variable number of arguments -- a programming practice you should try to avoid).
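Here is a minimal sketch of call and apply side by side. The plain objects viaCall and viaApply are invented for the example; they are not part of the original script.

```javascript
function Thing(id)
{
  this._id = id;
}

function Point(id, x, y)
{
  Thing.call(this, id); // run Thing against the object Point is building
  this._x = x;
  this._y = y;
}

var point = new Point("p1", 3, 4);

// call: arguments are listed one by one after the this object.
var viaCall = {};
Thing.call(viaCall, "first");

// apply: arguments are supplied as a single array.
var viaApply = {};
Thing.apply(viaApply, ["second"]);
```

After this runs, point._id is "p1", viaCall._id is "first" and viaApply._id is "second"; no Thing object was ever created with new.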

OK, so now we have our Point object calling into Thing to do the work for it, and we have saved ourselves the cost and complexity of maintaining the same code in two places. Are we done?

Nope!

Two things are still pretty ugly about this: this method is still too expensive, and it's not conducive to copy-and-paste programming.

Reducing Duplicate Function Calls

Let's look at the code for Point again:

function Point(id, x, y)
{
  Thing.call(this, id);
  this._x = x;
  this._y = y;
}

Point.prototype = new Thing();
Point.prototype._className = "Point";
Point.prototype.constructor = Point;

Notice something fishy going on here? If we create n Point objects, we actually call the Thing constructor n + 1 times -- that "+ 1" is caused by the initial call to set the Point.prototype property. In this specific example, that's not too bad, because Thing doesn't really do much. But in a more complicated scenario, it could cause problems. For a start, if the Thing constructor is expensive, we will waste some CPU on that first initialisation (hopefully not a huge concern, but you never know).

Secondly, if Thing actually does anything "interesting" in the constructor (and by "interesting" I mean interacting with the world outside of this -- incrementing a global counter, printing information on the screen, creating a row in a database, or whatever), then you will have an additional bogus interaction from that Point.prototype = new Thing() call. Unlike the first point, this is more likely to be of concern, so it's something we should try to avoid.

Thirdly, we construct Thing without any arguments -- so if the constructor expects to receive meaningful parameters, it is likely to throw an exception or perform some other unpleasant acts because it got a bunch of undefined values instead of real data. One way around this would be of course to supply Thing with the parameters it needed, but since they are "throw away" parameters (they aren't really used for any specific Point instance, because we're going to call the constructor with real values later on), it kind of defeats the purpose. And, again, it could put us into uncharted waters further down the road, depending on the complexity of the Thing constructor.
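The extra call and its undefined argument are easy to observe. The sketch below instruments Thing with a counter and a log of the ids it receives; thingConstructorCalls and idsSeen are names made up for this illustration.

```javascript
// Hypothetical instrumentation: count constructor calls and record the ids seen.
var thingConstructorCalls = 0;
var idsSeen = [];

function Thing(id)
{
  thingConstructorCalls++;
  idsSeen.push(id);
  this._id = id;
}

function Point(id, x, y)
{
  Thing.call(this, id);
  this._x = x;
  this._y = y;
}

Point.prototype = new Thing(); // the "+ 1" call; id is undefined here
Point.prototype._className = "Point";
Point.prototype.constructor = Point;

var a = new Point("a", 0, 0);
var b = new Point("b", 1, 1);
var c = new Point("c", 2, 2);
```

Three Point objects give four Thing calls, and the first entry in idsSeen is undefined because the prototype was built without any arguments.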

We can easily avoid all these pitfalls by using a different coding pattern that is tuned more to the way ECMAScript works and less to the way C++ works (unless you write COM objects for a living). Rather than doing any work inside the constructor, I make the constructor do the bare minimum initialisation steps (basically set everything to null / zero / empty string) and then do the real work inside an Initialise method. This avoids duplicating effort inside that initial call to set up the prototype, but still leaves the object in a bare-bones usable state should someone forget to call Initialise before using it. (You could leave the constructor completely empty if you like, choosing to do all the initialisation during the Initialise call, but personally I think that setting up all the properties in the constructor provides better documentation -- you only need to look at the constructor to find out what members the object has).

So, using this technique, the code is modified thusly (only new things are re-printed here, not the whole script):

function Thing()
{
  this._id = "";
}

Thing.prototype.Initialise = function Thing_Initialise(id)
{
  this._id = id;
};

var thing = new Thing();
thing.Initialise('thing');

//////////////////////////////////////////////////

function Point()
{
  this._x = -1;
  this._y = -1;
}

Point.prototype.Initialise = function Point_Initialise(id, x, y)
{
  Thing.prototype.Initialise.call(this, id);
  this._x = x;
  this._y = y;
};

var point = new Point();
point.Initialise('point', 0, 0);

Whilst this style might offend some of the more hard-core C# / Java / etc. enthusiasts, and it is a new coding convention that takes some getting used to, it definitely helps curtail the one-too-many-constructor-calls problem that will come up to bite you in the proverbial sooner or later. I recommend you consider it.
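For reference, here is one way the pieces might fit together once the prototype wiring from earlier is added back in. The initialiseCalls counter is a hypothetical stand-in for any "interesting" work that Initialise might do; treat this as a sketch rather than the definitive layout.

```javascript
// Hypothetical counter standing in for any real side effect of Initialise.
var initialiseCalls = 0;

function Thing()
{
  this._id = "";
}

Thing.prototype.Initialise = function Thing_Initialise(id)
{
  initialiseCalls++;
  this._id = id;
};

function Point()
{
  this._x = -1;
  this._y = -1;
}

Point.prototype = new Thing(); // only the cheap constructor runs here
Point.prototype.constructor = Point;

Point.prototype.Initialise = function Point_Initialise(id, x, y)
{
  Thing.prototype.Initialise.call(this, id);
  this._x = x;
  this._y = y;
};

var callsAfterSetup = initialiseCalls; // still 0: no bogus extra work

var point = new Point();
var idBeforeInitialise = point._id; // "" inherited from the prototype
point.Initialise('point', 0, 0);
```

Setting up the prototype no longer triggers the real work, and a Point that nobody has initialised yet still reports sensible defaults.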

Making Code Copy-And-Paste-able

Everyone knows that a lot of "programming" consists of copying your last project (or someone else's :-) ) and then tweaking it to do whatever you need to do next. In support of this, we should look for coding practices that make it easy to copy-and-paste code in such a way that the "programmer" doesn't have to go over every line with a fine-toothed comb to make sure there are no bugs. Because of JScript's loose typing (and very forgiving semantics), it is possible to copy and paste code that happens to "work" by mistake but isn't doing what you think it is. Days or weeks later you make a (seemingly) unrelated change elsewhere, and suddenly things stop working.

Argh!

So let's look at the code again:

Point.prototype.Initialise = function Point_Initialise(id, x, y)
{
  Thing.prototype.Initialise.call(this, id);
  this._x = x;
  this._y = y;
};

If we're going to copy-and-paste this code a lot of times, we really want it to be as loosely-coupled to Thing as possible... and yet here we have a direct call into a Thing.prototype method right smack bang in the middle of our function (and imagine this is a large code-base where you have lots of methods that make calls to their base class' methods all over the place). The scary thing is that copied code will still work, even if the object you copied it into doesn't derive from Thing, because there exists a Thing and it has an Initialise method. Thing will just happily dork around with the this object, creating new expando properties if it needs to, or reading undefined out of ones that don't exist, and it probably won't crash or do anything terribly noticeable. But the fact is that it won't be doing what you intended it to do, and you might never know that it's not really doing what you expected until it's too late.
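To see how quietly this goes wrong, consider a sketch in which the Point code has been pasted into an unrelated object. Invoice is a hypothetical type invented for this example; it does not derive from Thing at all.

```javascript
function Thing()
{
  this._id = "";
}

Thing.prototype.Initialise = function Thing_Initialise(id)
{
  this._id = id;
};

// Hypothetical unrelated type: note there is no prototype link to Thing.
function Invoice()
{
  this._total = 0;
}

Invoice.prototype.Initialise = function Invoice_Initialise(id, total)
{
  Thing.prototype.Initialise.call(this, id); // pasted from Point, still "works"
  this._total = total;
};

var invoice = new Invoice();
invoice.Initialise("inv-1", 99);

var isThing = invoice instanceof Thing;           // false
var gotExpando = invoice.hasOwnProperty("_id");   // true
```

No exception, no warning: the invoice simply grows an _id expando property that nobody asked for.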

So how do we solve it? We remove any reference to Thing from inside our functions. And how do we do that? I'm glad you asked:

function Point()
{
  this._x = -1;
  this._y = -1;
}

Point.prototype = new Thing();
Point.prototype._className = "Point";
Point.prototype.constructor = Point;
Point._base = Thing.prototype;

Point.prototype.Initialise = function Point_Initialise(id, x, y)
{
  Point._base.Initialise.call(this, id);
  this._x = x;
  this._y = y;
};

Aha! We add a new _base property to the Point object itself (remember, Point is an object too!) that points to the base object (in this case, Thing.prototype) and then we call that instead of Thing.prototype. OK, OK, so you may be saying "But Peter you twit, this didn't solve anything -- when I copy-and-paste my code, I could still end up with Point._base instead of Thing.prototype and the same problems will still ensue!"

How true! The difference, dear reader, is that I expect you are more likely to do a search-and-replace for "Point" when you copy this code than you are to search-and-replace "Thing" (remember, you're copying code from Point to YourNewObject), and so the reference to Point._base will be updated to YourNewObject._base without any hassles. Of course if you don't do any kind of search-and-replace, you have the same problem you always had... life is tough.
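A sketch of what that search-and-replace might produce. Circle and Logger are hypothetical names: Circle is the Point code with "Point" replaced throughout, and Logger is a copy where the programmer renamed everything but never set up a _base. One side effect, which I'd treat as a bonus rather than a guarantee, is that the second mistake now fails loudly instead of silently.

```javascript
function Thing()
{
  this._id = "";
}

Thing.prototype.Initialise = function Thing_Initialise(id)
{
  this._id = id;
};

// Hypothetical: the Point code after replacing "Point" with "Circle".
function Circle()
{
  this._radius = -1;
}

Circle.prototype = new Thing();
Circle.prototype._className = "Circle";
Circle.prototype.constructor = Circle;
Circle._base = Thing.prototype;

Circle.prototype.Initialise = function Circle_Initialise(id, radius)
{
  Circle._base.Initialise.call(this, id);
  this._radius = radius;
};

var circle = new Circle();
circle.Initialise("circle", 5);

// Hypothetical: renamed copy that has no base and never sets Logger._base.
function Logger()
{
  this._lines = [];
}

Logger.prototype.Initialise = function Logger_Initialise(id)
{
  Logger._base.Initialise.call(this, id); // Logger._base is undefined
};

var failure = null;
try
{
  new Logger().Initialise("log");
}
catch (e)
{
  failure = e; // a TypeError, because undefined has no Initialise
}
```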

Why Being Smarter Won't Work

The "obvious" solution to the last problem is to put the _base property onto the this object -- rather than the constructor function itself -- then you can always refer to this._base and never have to worry about referring to the wrong thing ever again, right?

Wrong.

Exactly why it's wrong is a subject for another post (or perhaps a reader can comment...). Try it out for yourself, and see what happens (hint: you'll need more than two classes to see why it doesn't work).

Coming Up...

In the next instalment: a feature that some people claim is impossible with JScript. Just you wait and see!

  • Cool posts, thanks!

    After mulling it over for a while (but not actually trying it - like I should get my hands dirty!  *grin*), I believe that being smarter won't help because you'll end up in an infinite loop.  Construct three classes as you describe in your blog entries:  Thing, Point, and BluePoint; each derives from the previous.

    If we set this._base in the constructors, then an attempt to create a BluePoint goes like this:

    var bp = new BluePoint();
    // generic object is created
    // passed as 'this' to BluePoint function
    // this._base set to the Point.prototype function
    bp.Initialize("some", "parameters");
    // bp doesn't have an Initialize function
    // check its prototype next
    // ah, call BluePoint.prototype.Initialize
    // first thing it does:
    // call this._base.Initialize
    // which is really Point.prototype.Initialize
    // first thing it does:
    // call this._base.Initialize
    // which is really Point.prototype.Initialize
    // first thing it does:
    // call this._base.Initialize
    // which is really Point.prototype.Initialize
    // ...
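The trace in the comment above can be reproduced with a short sketch. BluePoint and its _colour member are assumptions taken from the comment rather than from the article, and the exact error type depends on the engine (V8 reports a RangeError), so the code just catches whatever is thrown.

```javascript
function Thing()
{
  this._id = "";
}

Thing.prototype.Initialise = function Thing_Initialise(id)
{
  this._id = id;
};

function Point()
{
  this._base = Thing.prototype; // the "smarter" idea: _base lives on this
  this._x = -1;
  this._y = -1;
}

Point.prototype = new Thing();
Point.prototype.constructor = Point;

Point.prototype.Initialise = function Point_Initialise(id, x, y)
{
  this._base.Initialise.call(this, id);
  this._x = x;
  this._y = y;
};

// Hypothetical third class, as suggested in the comment.
function BluePoint()
{
  this._base = Point.prototype;
  this._colour = "blue";
}

BluePoint.prototype = new Point();
BluePoint.prototype.constructor = BluePoint;

BluePoint.prototype.Initialise = function BluePoint_Initialise(id, x, y)
{
  this._base.Initialise.call(this, id, x, y);
};

var bp = new BluePoint();
var overflow = null;
try
{
  // Inside Point_Initialise, this._base is still Point.prototype,
  // so the function keeps calling itself until the stack runs out.
  bp.Initialise("bp", 1, 2);
}
catch (e)
{
  overflow = e;
}
```

Because this always refers to the BluePoint instance, this._base never moves up the chain, and Thing's Initialise is never reached.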