A JScript .NET Design Donnybrook, Part Two

I am totally amused that the comments on yesterday's entry are nigh-isomorphic to the argument that we had over this in October 2000. As noted, f.bar(f); does in fact call the base class function, not the derived class function as some might expect.

I was on the fence back in 2000, but leaning towards the derived class, as was most of the rest of the team. Peter Torr was leaning towards the base class, and Herman, the architect, was pretty clearly in the base class camp. My argument was pretty similar to the arguments that most of you guys raised yesterday. In short, I cleaved to the principle:

Calls to methods on objects of unknown type should have the same semantics as if the type was known to be the most derived type. If developers want to restrict a call to a method on a particular type, they can add the annotation.

Though that is clearly a decent principle, in this situation it conflicts with some other design principles, such as:

JScript is all about guessing what the developer meant to say, and muddling on through if it's not clear.

and

Class-based code is a more-strictly-typed alternative to dynamic prototype-based code, and therefore class semantics should opt for increased type safety and efficiency rather than increased dynamism.

It really comes down to guessing what the developer meant. My example was deliberately abstract. Consider a more realistic example. In Canada, there's a Goods and Services Tax which is rather complicated. There are situations where a cake is taxable if you sell it in a restaurant, but not taxable if you sell it in a grocery store. (The rules are considerably more complex than that -- my father owned a restaurant at the time this tax became law, and figuring out when a single muffin vs a box of muffins was taxable was quite tricky.)

class Grocery {
  function get tax() {
    return 0;
  }

  function PrintTaxrate(item) {
    print(item.tax);
  }
}

class Cake extends Grocery {
  hide function get tax() {
    return 7;
  }
}


Here the meaning of the "hide" is "if a cake is being treated as a cake, it's taxable. If it is being treated as a grocery item, it's not taxable."

But uh-oh, the developer of the base class did not annotate the method as taking a Grocery.

Now, suppose that you are developing the Cake class. The Grocery class already exists, and you are extending it -- perhaps the base class is even in an entirely different package. The developer of the Grocery class has forgotten to annotate the method, not anticipating that someone might have a non-virtual override. In order to enable the developer of the Cake class to implement the desired semantics, the code gen for the Grocery class must as its first guess assume that the caller intended this thing to take a Grocery.

Clearly there are arguments on both sides; it's a judgment call whenever you have to guess what the user meant.

Another advantage of this approach is that it is more efficient. Calling late-bound is very inefficient compared to speculatively casting the object to a class, and calling the method if the cast succeeds. The cost of late binding is enormous, and we wanted to do everything possible to eliminate it while still keeping the language scripty.  

Therefore what the code gen does in this case is look for every visible class which has the appropriate method, generate a cast-and-call for each, and finally bottom out to the late-bound call if none of the casts worked. Unless you have a large number of visible classes with the same method names, this is pretty darn efficient. If you choose to always call the most derived type's method then there is no way to generate this code efficiently, because you must always do the late-bound call first.
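To make the cast-and-call idea concrete, here is a rough sketch in Python rather than JScript .NET. It only models the behaviour described above and is not the actual generated code: the names speculative_get and CANDIDATES are made up for illustration, isinstance stands in for the speculative cast, and looking the member up on the class itself stands in for a non-virtual, early-bound call.

```python
class Grocery:
    def tax(self):
        return 0


class Cake(Grocery):
    # Stands in for JScript .NET's "hide": a new member with the same name,
    # not a virtual override of Grocery's.
    def tax(self):
        return 7


# Hypothetical list of visible classes that declare "tax" themselves,
# ordered so that base classes come before derived classes.
CANDIDATES = [Grocery, Cake]


def speculative_get(item, name, candidates=CANDIDATES):
    """Try a cast-and-call against each candidate, then fall back to late binding."""
    for cls in candidates:
        if isinstance(item, cls):  # the speculative cast
            # Early-bound, non-virtual call: use the member declared on cls,
            # regardless of what more-derived classes declare.
            return cls.__dict__[name](item)
    # None of the casts worked, so do the expensive late-bound lookup.
    return getattr(item, name)()
```

With this ordering, speculative_get(Cake(), "tax") returns 0, because the Grocery cast succeeds first. That matches the base-class behaviour of item.tax in the unannotated PrintTaxrate. An object of some unrelated class with its own tax member falls through to the late-bound path.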

Philip Rieck's head exploded shortly after he discovered that the generated code actually checks to see if the argument is an instance of base, and THEN checks to see if it is an instance of derived. Isn't that unreachable code? Put your head back together Philip. Yes, it appears to be a codegen bug. I can't think of any circumstance in which that would not be dead code. I'll mention it to the JScript .NET team the next time I see him.

Peter Torr might remember more reasons than I do why we chose to do it this way. Peter?

 

  • Peter has posted some comments here:

    http://blogs.msdn.com/ericlippert/archive/2004/06/07/150367.aspx#151237

    Thanks Peter!
  • Argh!

    Your post almost exactly mirrors my comment to your last post. It's drivin' me nuts ;-)

    Oh and yeah, it sounds like a codegen bug (my bad on the previous comment). Maybe there was a bug in isinst at some point in time, or maybe it was hard to perform the inheritance graph check with some particular build of Reflection, or maybe it just wasn't worth the effort of bloating the compiler even more to do this check before spitting out the code -- it's not like that extra isinst is going to do you any harm, now is it? :-)

    Come to think of it, say you're implementing this in the compiler. Instead of just having:

    SpeculativeBindingCodeGen(type, methodname)

    and then calling it in a loop for all visible types with a method 'methodName', you now have to have

    SpeculativeBindingCodeGen(type, methodname, previouslyDoneTypes)

    and manage that list of things you've already checked, and do the inheritance checks inside the method, etc. Or do some other similar hack. Probably not really worth it, and all that code is just going to add some other weird bugs to the compiler anyways...
  • And now you post a reply while I post a reply!

    Me nuts have been driven off into the sunset!
  • > you now have to have SpeculativeBindingCodeGen(type, methodname, previouslyDoneTypes)

    Actually, no, we could fix this pretty easily. The way it's implemented, the code gen first creates a list of all visible classes which have the particular method. Then it runs down the list, generating a speculative cast to each.

    It already generates that list of classes in an order such that derived classes always come after base classes. Otherwise the feature we're talking about here wouldn't work!

    But once a base class is on the list, none of its derived classes need to be added. We could skip adding them to the list entirely, and therefore would never generate a speculative cast that could not be exercised.

    Regardless, all this is doing is making the codegen a few bytes unnecessarily larger, and slightly slower in some fairly unlikely cases. It would be nice to fix this, but it's not much of a slowdown.
  • No, my head is one solid part again. A (very minor, IMHO) codegen "bug" is a good explanation -- no matter if it's just because it's easier or a bug, or whatever. The extra code didn't worry me because of the "bloat"...

    Rather, I was deathly afraid there was some extremely clever reason for it that I couldn't understand. Then I'd have to reclassify you people who understand and actually work on JS.NET as evil genius gods, rather than just geniuses. (well, either that or reclassify myself as *not* a god. Nah, when someone asks if you're a god, you say "Yes"!)
  • Huh, and here I had thought you cleverly optimized the case where foo2 and foo3 were in separate compilation units. If the definition of foo3 changes to no longer be a foo, you still get the fast isinst treatment. Although now that I think of it, that's a pretty unlikely thing to optimize for.
  • >But once a base class is on the list, none of its derived classes need to be added. We could skip adding them to the list entirely

    Eric, perhaps I'm not understanding this, but take this code:
    class foo {};
    class foo2 extends foo {
      function bar(ob) {
        print(ob.xyz);
      }
    }
    class foo3 extends foo2 {
      function get xyz() {
        return 10;
      }
    }
    var f = new foo3();
    f.bar(f);

    If what you are saying is true, then calling f.bar(f) the 'ob' type is of 'foo', and then xyz() does not exist. The code compiles and prints '10' as I would expect. I imagine you mean making the speculative casts until the most appropriate base class that supports the calling parameters is hit and not adding any others after this point?

    The above question could probably be clarified by adding a foo4 that extended foo3 and hid the xyz method. Constructing f as a foo4 object and calling f.bar(f) would then make 'ob' in the ob.xyz call a foo3, and not a foo. Correct?
  • Correct -- we only generate speculative early bindings to classes which actually have the method!
  • > here I had thought you cleverly optimized the case where foo2 and foo3 were in separate compilation units

    We considered it, but it turned out to be a lot of work for little gain. Getting the semantics right across separate compilation units is a real pain, particularly when you start to think about versioning issues. Also, at the time we were thinking about allowing circular dependencies amongst compilation units, which makes it even trickier.
  • > JScript is all about guessing what the
    > developer meant to say, and muddling on
    > through if it's not clear.

    Is Eric a candidate for the Understatement Award or what?

  • Here's some back-and-forth from an email conversation I had with a user a while back. Why should one
