Why Can't I Access A Protected Member From A Derived Class? Part Six

Reader Jesse McGrew asks an excellent follow-up question to my 2005 post about why you cannot access a protected member from a derived class. (You probably want to re-read that post in order to make sense of this one.)

I want to be clear in my terminology, so I’m going to define some terms. Suppose we have a call foo.Bar() inside class C. The value of foo is the “receiver” of the call. The compile-time type of foo is the “compile-time type of the receiver”. The “runtime type of the receiver” could potentially be more derived than its compile-time type. The “call site type” is C.

The rule in C# is that if Bar is protected, then the compile-time type of the receiver must be the call site type, or a type derived from the call site type.  It cannot be a base type of the call site type, because then the runtime type of the receiver might have a “cousin” (or sibling, or uncle… but let’s not split hairs, let’s just call them all cousins) relationship to the call site type, rather than an “ancestor/descendant” relationship.
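To make the rule concrete, here is a minimal sketch. The Animal, Giraffe, TallGiraffe and Turtle classes are hypothetical names chosen for illustration; they are not part of the bank example that follows. The call site type is Giraffe, and the one receiver that the compiler rejects is shown commented out so that the rest compiles.

```csharp
using System;

public class Animal
{
    protected string Describe() => "animal";
}

public class Giraffe : Animal
{
    // Every call in this method has call site type Giraffe.
    public static string Inspect(Animal a, Giraffe g, TallGiraffe t)
    {
        // a.Describe();  // error CS1540: the receiver's compile-time type is
        //                // the base type Animal; at runtime it might be a Turtle.
        string viaSame = g.Describe();     // OK: receiver type is the call site type
        string viaDerived = t.Describe();  // OK: receiver type derives from the call site type
        return viaSame + "," + viaDerived;
    }
}

public class TallGiraffe : Giraffe { }

public class Turtle : Animal { }  // a "cousin" of Giraffe

public static class Program
{
    public static void Main()
    {
        // Prints "animal,animal"
        Console.WriteLine(Giraffe.Inspect(new Turtle(), new Giraffe(), new TallGiraffe()));
    }
}
```

Uncommenting the first call should produce the compile-time error, even though a Turtle really does have a Describe method at runtime.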

Jesse quite rightly points out that my original answer to the question was not really complete. There are two questions unanswered:

1) Why would it be a bad thing to allow calling a protected method from a receiver whose runtime type is a “cousin” class?

2) Supposing for the sake of argument that there is a good answer to (1) – why doesn’t that argument apply equally well to calling a protected method via a receiver whose compile-time type is the same as the call site type? The runtime type of the receiver still could be more derived.

I’ll answer the first question first. In this answer I’m going to use a humorous and exaggerated example to illustrate the problem; I want to emphasize that this is not how you should actually design applications that need “big S” Security, like bank accounts. What I want to illustrate here is simply that allowing protected methods to be called via “cousins” makes it difficult to maintain invariants and therefore difficult to write correct, robust code.

// Good.dll:

    public abstract class BankAccount
    {
      abstract protected void DoTransfer(
        BankAccount destinationAccount, 
        User authorizedUser,
        decimal amount);
    }
    public abstract class SecureBankAccount : BankAccount
    {
      protected readonly int accountNumber;
      public SecureBankAccount(int accountNumber)
      {
        this.accountNumber = accountNumber;
      }
      public void Transfer(
        BankAccount destinationAccount, 
        User authorizedUser,
        decimal amount)
      {
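        // Authorized is assumed to be a helper defined elsewhere.
        if (!Authorized(authorizedUser, accountNumber))
          throw new UnauthorizedAccessException();
        this.DoTransfer(destinationAccount, authorizedUser, amount);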
      }
    }
    public sealed class SwissBankAccount : SecureBankAccount
    {
      public SwissBankAccount(int accountNumber) : base(accountNumber) {}
      override protected void DoTransfer(
        BankAccount destinationAccount, 
        User authorizedUser,
        decimal amount)
      {
        // Code to transfer money from a Swiss bank account here.
        // This code can assume that authorizedUser is authorized.
        // We are guaranteed this because SwissBankAccount is sealed, and
        // all callers must go through public version of Transfer from base
        // class SecureBankAccount.
      }
    }
    // Evil.exe:
    class HostileBankAccount : BankAccount
    {
      override protected void DoTransfer(
        BankAccount destinationAccount, 
        User authorizedUser,
        decimal amount) {  }
      public static void Main()
      {
        User drEvil = new User("Dr. Evil");
        BankAccount yours = new SwissBankAccount(1234567);
        BankAccount mine = new SwissBankAccount(66666666);
        yours.DoTransfer(mine, drEvil, 1000000.00m); // compilation error
        // You don't have the right to access the protected member of
        // SwissBankAccount just because you are in a
        // type derived from BankAccount.
      }
    }

Dr. Evil's attempt to steal ONE... MILLION... DOLLARS... from your Swiss bank account has been foiled by the C# compiler.

Obviously this is a silly example, and obviously, fully-trusted code could do anything it wants to your types -- fully-trusted code can start up a debugger and change the code while it's running. Full trust means full trust. Again, do not actually design a real security system this way!

But my point is simply that the "attack" that is foiled here is someone attempting to do an end-run around the invariants set up by SecureBankAccount in order to access the code in SwissBankAccount directly.

The second question is "Why doesn't SecureBankAccount also have this restriction?"  In my example, SecureBankAccount calls DoTransfer with “this” as the receiver.

Clearly "this" is of type SecureBankAccount or something more derived. It could be any value of a more derived type, including a new SwissBankAccount. Doesn’t the same concern apply? Couldn't SecureBankAccount be doing an end-run around SwissBankAccount's invariants?

Yes, absolutely! And because of that, the authors of SwissBankAccount are required to understand and approve of everything that their base class does!  You can't just go deriving from some class willy-nilly and hope for the best. The implementation of your base class is allowed to call the set of protected methods exposed by the base class. If you want to derive from it then you are required to read the documentation for that class, or the code, and understand under what circumstances your protected methods will be called, and write your code accordingly. Derivation is a way of sharing implementation details; if you don't understand the implementation details of the thing you are deriving from then don't derive from it.

And besides, the base class is always written before the derived class. The base class isn't up and changing on you, and presumably you trust the author of the class to not attempt to break you sneakily with a future version. (Of course, a change to a base class can always cause problems; this is yet another version of the brittle base class problem.)

The difference between the two cases is that when you derive from a base class, you have the behaviour of one class of your choice to understand and approve of. That is a tractable amount of work. The authors of SwissBankAccount are required to precisely understand what SecureBankAccount guarantees to be invariant before the protected method is called. But they should not have to understand and trust every possible behaviour of every possible cousin class that just happens to be derived from the same base class. Those guys could be implemented by anyone and do anything. You would have no ability whatsoever to understand any of their pre-call invariants, and therefore you would have no ability to successfully write a working protected method. Therefore, we save you that bother and disallow that scenario.
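Here is a minimal sketch of that contract between one base class and one derived class. The Shape and Square classes and their members are hypothetical and much simpler than the bank example; the point is only that the base class establishes an invariant before calling the protected method through a receiver whose runtime type is more derived, and that the derived class author can read and rely on that one base class.

```csharp
using System;

public abstract class Shape
{
    // Public entry point: establishes the invariant (scale > 0) first.
    public double ScaledArea(double scale)
    {
        if (scale <= 0) throw new ArgumentOutOfRangeException(nameof(scale));
        // "this" has compile-time type Shape, but its runtime type is
        // always something more derived, such as Square.
        return this.ComputeScaledArea(scale);
    }

    protected abstract double ComputeScaledArea(double scale);
}

public sealed class Square : Shape
{
    private readonly double side;

    public Square(double side) { this.side = side; }

    protected override double ComputeScaledArea(double scale)
    {
        // Relies on the documented behaviour of the one base class we chose:
        // scale is positive by the time we get here.
        return side * side * scale * scale;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new Square(2).ScaledArea(3));
    }
}
```

If cousin classes derived from Shape could call ComputeScaledArea on a Square directly, the comment inside Square would no longer be something its author could rely on.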

And besides, we have to allow you to call protected methods on receivers of potentially more-derived classes. Suppose we didn't allow that and deduce something absurd. Under what circumstances could a protected method ever be called, if we disallowed calling protected methods on receivers of potentially-more-derived classes?  The only time you could ever call a protected method in that world is if you were calling your own protected method from a sealed class! Effectively, protected methods could almost never be called, and the implementation that was called would always be the most derived one. What's the point of "protected" in that case? Your "protected" means the same thing as "private, and can only be called from a sealed class". That would make them rather less useful.

So, the short answer to both questions is "because if we didn't do that, it would be impossible to use protected methods at all."  We restrict calls through less-derived receiver compile-time types because if we don't, it's impossible to safely implement any protected method that depends on an invariant.  We allow calls through potential subtypes because if we did not, then we would allow hardly any calls at all.

Finally, an unasked question: what if you are the person writing the base class with a protected method? Essentially you are in the same boat as the person writing any public or virtual method. With a public method, you have to accept that anyone can call it in any way they choose; with a virtual method, you have to accept that any derived class can override it and make it do something completely different, at its discretion. If you write a protected method, you have to accept that any derived class can call that method in any way it chooses, and write the base class code accordingly.

When you write a public method, you have to consider the consequences of bad callers; if there are ways that bad callers can misuse your public method, then you need to consider the cost to the user vs the cost of hardening the method against the potentially bad caller. The same is true of virtual methods; you have to consider the consequences of bad overriders. And the same is true of protected methods; you have to consider the consequence of bad derived classes calling your code.
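As a minimal sketch of what hardening a protected method might look like, consider the hypothetical AuditLog and SloppyLog classes below (neither appears in the example above). The base class validates its argument because it cannot know how any future derived class will call the method.

```csharp
using System;
using System.Collections.Generic;

public class AuditLog
{
    private readonly List<string> entries = new List<string>();

    public int Count => entries.Count;

    // Any derived class may call this in any way it chooses,
    // so the invariant (no blank entries) is enforced here.
    protected void Append(string entry)
    {
        if (string.IsNullOrWhiteSpace(entry))
            throw new ArgumentException("Entry must be non-empty.", nameof(entry));
        entries.Add(entry);
    }
}

public class SloppyLog : AuditLog
{
    // A careless derived class: passes a blank entry to the protected method.
    public bool TryAppendBlank()
    {
        try { Append("   "); return true; }
        catch (ArgumentException) { return false; }
    }

    public void Record(string text) => Append(text);
}

public static class Program
{
    public static void Main()
    {
        var log = new SloppyLog();
        Console.WriteLine(log.TryAppendBlank());
        log.Record("transfer approved");
        Console.WriteLine(log.Count);
    }
}
```

Whether this kind of check is worth its cost depends on the same trade-off described above for public methods.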

Designing robust code that has public methods is hard. Designing code that is robust and has virtual or protected methods is even harder; designing for extendability is a difficult problem in general, and shouldn’t be taken on lightly. Consider sealing classes that aren’t designed for extension.

  • @Brian

    ""Only classes that have been purposefully created with inheritance in mind should be used." I think we could (too late now, obviously) argue that this implies that the default should be sealed."

    Yeah, that's exactly my point. In many classes inheritance doesn't even make sense but it's still used. Like I said before, I've come across tons of code where classes that have no virtual methods are extended via inheritance. These particular cases aren't dangerous per se, as you can't mess with the internals of said class and you can't modify the behavior expected by the consuming code, but it's still wrong in my opinion. Such classes should be extended through other tools at your disposal.

    People tend to abuse inheritance in such cases simply because C# classes are by default not sealed, which really doesn't make any sense. Why aren't methods then virtual by default?  It's completely incoherent with most of how C# works.

  • Eric: Maybe your car has its hood sealed shut, but mine doesn't. Just because the manufacturer didn't explicitly design each part of my car for extensibility doesn't mean I can't extend it. While I understand that there are parts that must be sealed for performance or security reasons (airbags, odometer, alarm system), just about every other part of my car can be changed whether intended or not. Can you imagine if the only parts you could change on your car were ones the manufacturer explicitly allowed? "Sorry, sir, but you can't put snow tires on our convertible models because we didn't design them to be used in the winter."

    Anyway, I suppose there are two schools of thought. One says that extensibility should only be permitted where it is designed; the other says that extensibility should be disallowed only when there's a good reason. People who design libraries and keep getting bitten by implementors using their libraries in unexpected ways prefer the first school, while implementors who keep running into walls where the functionality they need is blocked by internal/private/sealed prefer the second school.

    I read Raymond's blog, so I understand why people are in the first school. However if you read my first comment on this post, it should be obvious why I'm in the second school.

  • @Gabe,

    The hood of your car may not be sealed shut, but I guarantee that you signed all kinds of legal documents when you bought the car, voiding the warranty and relieving the manufacturer of any responsibility should you decide to install third-party parts in the engine. Software companies can't do that; or rather they can, but when they try, they are vilified for it. Programmers scream bloody murder when they try to extend something and it doesn't work. Try browsing the issues on Connect sometime if you don't believe me. Sure, Microsoft (or Sun or Oracle or whoever) can say "it's not our fault you tried to extend our software in a way it wasn't designed for." But then they are accused of being monopolistic and evil and ignoring their customers. So instead they take on a massive support burden, trying to keep their customers from shooting themselves in the foot.

    Moreover, car manufacturers don't have to take versioning into account. There is very little concept of a "breaking change" in car design. But Microsoft, and anyone else producing a library for public consumption, has to pay a lot of attention to breaking changes, because programmers expect to be able to drop in a new version of a library and for everything to just work. This is of course a completely unreasonable expectation, but that doesn't stop people from crying foul when they have to make even the tiniest change to accommodate the new version of a library. The problem is that virtual methods allow customers to take dependencies, not just on the behavior of your methods, but on the timing of those method calls. This makes it impossible to refactor anything which touches a virtual method without incurring a breaking change. I quote Jan Gray from the Framework Design Guidelines (first edition, page 169):

    "If you ship types with virtual members you are promising to forever abide by subtle and complex observable behaviors and subclass interactions. I think framework designers underestimate their peril. For example, we found that ArrayList item enumeration calls several virtual methods per each MoveNext and Current. Fixing those performance problems could (but probably doesn't) break user-defined implementations of virtual methods on the ArrayList class that are dependent on virtual method call order and frequency."

    The bottom line is that people who advocate virtual everything have never had to support or maintain such a library on any kind of scale. The frustration that arises when we as library consumers can't always extend things the way we want is a small price to pay for Microsoft being able to keep its collective sanity.

    All that said, Pavel is right: if library designers, Microsoft included, would design to interfaces instead of implementations, most of this problem would go away. Of course the problem there is that, as hard as it is for the average programmer to grok inheritance, it is actually even harder for them to grok composition. So I am not sure that wouldn't create even more problems than it solves.

  • David Nelson: Since I live in the US, the Magnuson-Moss Warranty Act (http://en.wikipedia.org/wiki/Magnuson%E2%80%93Moss_Warranty_Act) means that the manufacturer of my car cannot require that I only use their parts in the engine or anywhere else in the car. So in fact, I signed no such paperwork.

    Anyway, let's consider the ArrayList example. Assume that I am using some library that uses ArrayLists and has some algorithm that calls IndexOf a lot. Since IndexOf is currently O(n), I can speed things up by orders of magnitude with a fast IndexOf() method. With the right things being virtual, I can derive from ArrayList, add an index, and override the necessary methods (Add, Remove, IndexOf) to use and maintain my index. Then I just have to pass MyFastArrayList objects into the library and everything will be orders of magnitude faster.

    If ArrayList were sealed, I could only create MyFastArrayList by completely reimplementing ArrayList and modifying the few methods that need to maintain the index. Of course, that would be completely useless because there's no way to pass MyFastArrayList instances to the library that expects ArrayList instances. My only recourse here is to REIMPLEMENT the WHOLE library. This scenario certainly puts less burden on the maintainers of ArrayList and the 3rd-party library, but only because I can't use them anymore. In fact, perhaps the 3rd-party library maintainers would prefer the first scenario because at least then I would be their customer.

    Ideally, there would be some IArrayList which is implemented by SealedArrayList and VirtualArrayList. That way all libraries would just take IArrayList instances and most users who don't care about extending it would use the SealedArrayList that can get performance revs, while users who need an index can extend VirtualArrayList.

    Now, imagine a more complex version of scenario 2: I want to use some library of yours but I need to override something that isn't virtual, so I go into Reflector, copy the classes I need, paste them into my code, and change the function that I need to override. What happens when you discover a security flaw in the part of your library that I needed to change? That fix isn't going to automatically propagate to my code because I'm using a copy of it. Is this any better of a maintenance scenario?

  • Perhaps the act of installing a third-party component does not void the warranty, but if that part can be shown to have caused damage to the engine (which makes the analogy more closely match the situation we are talking about), then I promise you are not going to be able to enforce the warranty.

    I am fully aware of the inconvenience that can be caused by the inability to override methods. But as usually happens in these arguments, you are only considering the very narrow situation where you need to change some specific behavior for your specific need. Yes, in that case, non-virtual methods in ArrayList would be a setback. But for ALL other users in ALL other cases that don't require such an override (which is the vast majority of them), non-virtual methods would be a benefit. Microsoft should not micro-optimize for the few at the expense of the many.

    You are right that the real solution to this problem is for APIs to bind to interfaces, not implementations. For ArrayList, that is already the case! There is already an IList interface that has all of the important members of ArrayList, and most APIs already bind to it. So you are already free to create your own implementation of IList when you don't find ArrayList acceptable; there is no need for ArrayList methods to be virtual.

    However, there are many other types and APIs that bind to implementations instead of interfaces. That is the core of the problem that this debate centers on, and that is what we should be driving home to Microsoft, instead of continually rehashing this tired argument about virtual methods, which Microsoft has long since decided, correctly, and is not going to change.

  • "Extensibility is a feature, and all features should be designed for particular business purposes and customer scenarios, not just thrown in for free because it's cheap and easy to do a bad job."

    I tend to agree with you, but it leads me to the question, why aren't all classes sealed by default?  I mean, why do we have a "sealed" keyword instead of an "unsealed" or "extendable" keyword?  It would seem that the language design is at direct odds with the standard guidance in this scenario.

    Obviously the ship has sailed on doing that at the language level.  But the New Class... template in VS could certainly be modified to declare the new class as sealed, which would have the same effect of requiring the programmer to make a conscious choice to unseal a class.

    Another useful feature to support the guidance might be a keyword that means "only extendable within its own assembly."  Designing a class to be extendable by the general public is certainly more work than designing it to be extendable only by people who have the ability to modify the class itself.

  • I don't get why the DoTransfer method in BankAccount has code, if it is an abstract method. I have read on the Stack Overflow forum that abstract methods do not have code; instead, their bodies are implemented by the classes that derive from that class.
