I received an e-mail from a customer referencing this newsgroup post and asking two questions about virtual methods and inheritance:
1. Why does it work like this?
2. What's the 'security' implication?
Funnily enough, I just read Eric's post on a very similar topic, but I'm going to talk about it too because I'll (hopefully) end up with a slightly different explanation, and maybe you'll learn something else. You should definitely read Eric's post though (either before or after mine).
So let's say you have the following class declarations (there's a plain English description below in case you're not familiar with JScript or if you just find code hard to read in a blog ;-) ):
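// Base class
class Base {
  function MainMethod() {
    print("Base.MainMethod")
    Helper() // Call the helper function
  }
  // Helper method that's not visible
  // to the general public
  protected function Helper() { print("Base.Helper") }
}
// Derived class
class Derived extends Base {
  protected function Helper() { print("Derived.Helper") } // Override Helper method
}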
Basically you have a class named Base that has two methods, MainMethod and Helper. Users are going to call MainMethod, and it will call the non-public Helper method to do some of its work. Because Helper is marked as protected, only members of Base or its derived types may call it -- the great unwashed masses cannot.
You then have a class Derived that (funnily enough) derives from Base and provides its own implementation of Helper.
Aside: One thing to note here is that in JScript .NET, all non-static methods are virtual and there's no way to declare a non-virtual method. C# and VB allow virtual methods, but they are not the default. We made them the default in JScript because the philosophy of the language was "people should be able to get stuff done," and in general if you write code like the code above the intent is to perform a method override, which requires using virtual methods. C# has a different philosophy, which is "developers should know what they're doing," and so they force developers to mark methods as virtual. I don't know what VB's philosophy on this matter is, but they also force you to explicitly declare a method as Overridable to make it virtual.
So anyway, you have these two classes, and then you write some code like this:
// Create instance of Base
var b : Base = new Base
// Call MainMethod on the Base class
b.MainMethod()
// Create instance of Derived
var d : Derived = new Derived
// Call MainMethod on the Derived class
d.MainMethod()
// Now assign the derived instance to the
// Base variable
b = d
b.MainMethod()
// No, really, call it on the Base class!
Base(b).MainMethod()
And you get some output:
So far so good; you had a Base and you got the MainMethod and the Helper
Still good; you had a Derived and you got the MainMethod and the Helper as expected
What's this? You had a reference typed as Base but you still got the Derived version of Helper
This is just to show you that extra casting is a waste of time ;-). The compiler already knows that b is a Base, so doing an explicit cast won't get you anywhere.
What happens is that when MainMethod goes to call Helper, it doesn't look directly at the function declared in Base and start executing it. (This would be the behaviour if Helper were non-virtual, but remember all methods are virtual in JScript.) Instead, MainMethod consults a special lookup value (a slot in the object's virtual method table) that tells it where to find the most derived implementation of the method, which in this case is Derived.Helper. There's no way to make an instance of Derived call Base's version of the Helper method, even if you call it through a reference of type Base or try a cast. It's just not possible.
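If you'd like to try this yourself without a JScript .NET compiler, here is a rough sketch of the same experiment in Java, which I'm using only because its instance methods are also virtual by default. The class and method names mirror the example above; the returned strings and the DispatchDemo class are my own illustrative assumptions, not the original output.

```java
// Illustrative Java rendering of the Base/Derived example (names are assumptions).
class Base {
    // Public entry point: reports itself, then calls the helper.
    String mainMethod() {
        return "Base.mainMethod -> " + helper();
    }

    // Virtual by default in Java, like every non-static JScript .NET method.
    protected String helper() {
        return "Base.helper";
    }
}

class Derived extends Base {
    // Overrides the helper; mainMethod is inherited unchanged.
    @Override
    protected String helper() {
        return "Derived.helper";
    }
}

class DispatchDemo {
    // A method that only knows about Base.
    static String callThroughBase(Base b) {
        return b.mainMethod();
    }

    public static void main(String[] args) {
        Base b = new Base();
        System.out.println(b.mainMethod());

        Derived d = new Derived();
        System.out.println(d.mainMethod());

        b = d; // same object, now referenced through a Base variable
        System.out.println(b.mainMethod());
        System.out.println(((Base) b).mainMethod()); // redundant cast, same result
        System.out.println(callThroughBase(d));
    }
}
```

Only the first call reaches Base's helper, because only that object really is a Base. Every call made on the Derived instance lands in Derived's helper, whatever the static type of the reference.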
Aside: Note that it is possible for a method to call its immediate base class's implementation via the super keyword (base in C# and MyBase in VB), but this only works from within the derived class itself, and only "up" one level -- you can't do this from outside the class. And although it will call the direct parent's method, any methods that it calls will be subject to the normal rules (i.e., find the most derived implementation).
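A small Java sketch of that aside, since Java's super behaves the same way here. The describe and name methods are hypothetical names I made up for the illustration.

```java
// Sketch: super reaches one level up, but calls made from there still dispatch virtually.
class Base {
    String describe() {
        // name() is a virtual call, so it resolves against the actual object.
        return "Base.describe calls " + name();
    }

    String name() {
        return "Base.name";
    }
}

class Derived extends Base {
    @Override
    String describe() {
        // super is only usable from inside Derived, and only for the direct parent.
        return "Derived.describe -> " + super.describe();
    }

    @Override
    String name() {
        return "Derived.name";
    }
}

class SuperDemo {
    public static void main(String[] args) {
        // prints "Derived.describe -> Base.describe calls Derived.name"
        System.out.println(new Derived().describe());
    }
}
```

Base's describe really does run, but the name() call inside it still finds Derived's override.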
You probably knew that, so why is this a security concern? Let's assume that the Derived class performs some special access checks in the Helper method. Maybe this is where it does a username lookup, or an account balance check, or some other important action that must succeed before the rest of the MainMethod can be allowed to complete. Now if the rule above (you always call the most-derived implementation) were violated, it would be possible to break the system either accidentally or maliciously.
In the accidental case, you have an instance of Derived and you pass it to a method that only knows how to handle Base objects. Since Derived is a subclass of Base, the compiler has no problem with this and will happily pass off the object, and the method you call has no problem since it knows how to deal with instances of Base. But if the rule were violated and that method ever called into MainMethod, then MainMethod would in turn call the Base implementation of Helper (bypassing the security checks) and you'd be in trouble. In the malicious case, you'd deliberately cast the reference to Base in order to circumvent the checks in Helper.
But whilst this behaviour is explicitly designed to help enforce security, it (ironically) presents a security problem of its own -- you can't really be sure what your virtual methods are getting up to. Let's say that instead of having the security checks be performed inside Derived's version of the Helper method, we have the security checks be made inside Base and have Derived's implementation simply return successfully without doing any checks at all. Now we have the same sequence of events -- you pass off an instance of Derived to a method that only knows how to deal with Base objects -- but now instead of guaranteeing that the security checks in Helper are made, it guarantees that the security checks are skipped!
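To make the problem concrete, here's a hedged Java sketch of that second scenario. The "admin" rule, the user parameter, and the BypassDemo class are invented for the example; a real check would look nothing like this.

```java
// Sketch: the security check lives in Base, and an override silently removes it.
class Base {
    // Hypothetical access check: only "admin" may proceed.
    protected boolean helper(String user) {
        return "admin".equals(user);
    }

    String mainMethod(String user) {
        if (!helper(user)) {
            return "denied";
        }
        return "granted";
    }
}

class Derived extends Base {
    // "Returns successfully" without checking anything at all.
    @Override
    protected boolean helper(String user) {
        return true;
    }
}

class BypassDemo {
    // Written against Base, with no idea that a subclass replaced the check.
    static String process(Base b, String user) {
        return b.mainMethod(user);
    }

    public static void main(String[] args) {
        System.out.println(process(new Base(), "guest"));    // check runs
        System.out.println(process(new Derived(), "guest")); // check is skipped
    }
}
```

The process method never mentions Derived, yet handing it a Derived instance is enough to turn "denied" into "granted".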
The way to get around this little problem is to ensure that the method (or perhaps the entire class) is marked as final (sealed in C#; NotOverridable for methods or NotInheritable for classes in VB), which tells the compiler (and, more importantly, the CLR) that no class is allowed to provide a different implementation of this method. Or if there's no need for arbitrary classes to call the method at all, mark it as private or perhaps internal (Friend in VB). Don't let your security code be skipped by a pesky hacker deriving from your type and overriding your methods! (See this note for security concerns around internal virtual methods).
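Here's roughly what that fix looks like in Java, where final plays the same role for methods and classes. As before, the "admin" check and the Locked and FinalDemo classes are hypothetical stand-ins for illustration.

```java
// Sketch: marking the check final so no subclass can replace it.
class Base {
    String mainMethod(String user) {
        return helper(user) ? "granted" : "denied";
    }

    // final: the check cannot be overridden.
    // The "admin" rule is a hypothetical stand-in for a real access check.
    protected final boolean helper(String user) {
        return "admin".equals(user);
    }
}

class Derived extends Base {
    // Uncommenting the next line is a compile-time error in Java,
    // because helper is final in Base:
    // @Override protected boolean helper(String user) { return true; }
}

// Marking the whole class final forbids subclassing altogether.
final class Locked {
    String mainMethod(String user) {
        return "admin".equals(user) ? "granted" : "denied";
    }
}

class FinalDemo {
    // Still written against Base, but now the check is guaranteed to run.
    static String process(Base b, String user) {
        return b.mainMethod(user);
    }

    public static void main(String[] args) {
        System.out.println(process(new Derived(), "guest"));
        System.out.println(process(new Derived(), "admin"));
    }
}
```

Subclassing Base is still allowed here; only the check is locked down, which is often the better trade-off if you want to keep the type extensible.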
Aside: As a rule of thumb, if you don't explicitly intend for people to subclass your types, you should mark them as final as a matter of course. It's kind of the "least privilege" rule applied to software design -- if there's no good reason to enable a scenario, you should explicitly disable it. In fact I often wonder why the C# team didn't make final the default for classes, given their general approach to things, but perhaps that was just a bit too much for their customers to deal with. You do have to weigh the security risks of allowing people to subclass your types with the benefits you get from having a rich, easy-to-extend object model.
The two basic rules then are:
· If you have a base class and you expect derived classes to specialise a method, make sure it is virtual so the specialised method is always called (and remember that all methods are virtual in JScript)
· If you have a base class and you want to ensure that your method is never circumvented by a subclass, make sure it (or the entire class) is marked as final, or that it is not visible to the derived class, or (in languages that support it) it isn't virtual
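The "not visible to the derived class" option in the second rule is easy to overlook, so here's a quick Java sketch of it. In Java a private method never takes part in virtual dispatch; the names below are the same illustrative ones used throughout and are not from any real library.

```java
// Sketch: a private helper cannot be overridden, even by a same-named method.
class Base {
    String mainMethod() {
        return "Base.mainMethod -> " + helper();
    }

    // private: invisible to subclasses, so this call is never redirected.
    private String helper() {
        return "Base.helper";
    }
}

class Derived extends Base {
    // Same name, but a brand-new, unrelated method; it does not override Base.helper.
    String helper() {
        return "Derived.helper";
    }
}

class PrivateDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.mainMethod()); // still uses Base's private helper
        System.out.println(d.helper());     // calls the new method directly
    }
}
```

Callers of mainMethod always get Base's helper, while code that calls helper on a Derived directly gets the new one.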
As a final trick, let's see how you can keep the same function name but get out of the virtual-ness game. In JScript, you can use the hide modifier (new in C# or Shadows in VB) which says "even though this method has the same name as a method that I could override in the base class, I want you to give me a new lookup value so I don't clobber my base class' implementation."
Add the following declarations:
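// Another derived class
class MoreDerived extends Derived {
  // We'll make a *new* Helper!
  hide function Helper() { print("MoreDerived.Helper") }
}
// Most derived class
class MostDerived extends MoreDerived {
  // Override the 'new' Helper!
  function Helper() { print("MostDerived.Helper") }
}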
And the following test code:
// Let's test it out...
// Create instance of MoreDerived
var m : MoreDerived = new MoreDerived
// Call MainMethod on the MoreDerived class
m.MainMethod()
// Explicitly call Helper
m.Helper()
// Create instance of MostDerived
var t : MostDerived = new MostDerived
// Call MainMethod on the MostDerived class
t.MainMethod()
Now we get the following results:
When we call MainMethod, it looks for the most derived version of Helper to call. But since the version in MoreDerived is marked as hide, it actually defines a new method, and so the most derived version that MainMethod can find is the one previously defined in the Derived class. Calling the Helper method explicitly, though, does call the new one we just created (note I didn't mark it as protected, just to make the example easier).
Aside: JScript, like VB, defaults to having members be public, because it fits into the philosophy of "if the user defined a function, it's probably because they want to call it." C# defaults to private, because they want developers to be explicit about how their code should work.
More of the same; Derived.Helper is still the most derived implementation of the "original" Helper method, but MostDerived provides an override for the new Helper.
This can get tricky when you have deep hierarchies of classes with some virtual methods and some non-virtual methods and some hiding methods and some super calls... Anyway, as usual this turned out much longer than I originally anticipated, and probably most of it was "common knowledge," but once I get writing I can't really stop.