Why Can't I Access A Protected Member From A Derived Class, Part Two: Why Can I?

Why Can't I Access A Protected Member From A Derived Class, Part Two: Why Can I?

Rate This
  • Comments 38

This is a follow-up to my 2005 post on the same subject  which I believe sets a personal record for the longest time between parts of a series. (Of course, I didn't know it was a series when I started it.) Please read the previous article in this series, as this post assumes knowledge of part one.

.......

OK, now that you've read that, it's clear why you can only access a protected member from an instance of an object known to be of a type at least as derived as the current context. You can therefore deduce the answer to the question asked to me by a (very polite) reader this morning: Why did this code compile in C# 2.0 but give an error in C# 3.0?

public abstract class Item{
  private Item _parent;
  public Item Parent {
    get { return _parent; }
    protected set { _parent = value; }
  }
}
public class Bag:Item{
  private List<Item> list = new List<Item>();
  public void Add(Item item)
  {
    item.Parent = this; // Error in C# 3.0
    list.Add(item); 
  }
}
public class Torch : Item { }

It compiled in C# 2.0 because the compiler had a bug. We forgot to enforce the semantics of "protected" access for the property setter. Though the compiler did not generate an error, it did generate code which would not pass the CLR verification check! The code would run if it happened to be fully trusted, but it was not safe to do so. We fixed the bug in C# 3.0 and took the breaking change.

This raises a more interesting point though. Now that we correctly prevent you from setting the "parent" reference in this manner, how would you implement the desired pattern? The reader wanted the following conditions to be met:

  • There are two kinds of items -- "leaf" items, and "container" items. Container items may contain either kind of item, so you could have a box containing a bag, which in turn contains a torch.
  • An item may be in a container, and if it is, its parent reference refers to that container.
  • These classes must be extensible by arbitrary third parties.
  • "Unauthorized" code must not be able to muck around with the parenting invariants. 

The reader was attempting to enforce these invariants by making the parent setter protected. But even in a world where it is legal to access a protected member from an arbitrary derived class, that does not ensure that the invariants are maintained! If you allow any derived class to muck with the parenting, then you are relying upon every third-party derived class to "play nicely" and maintain your invariant. If any of them are buggy or hostile, then who knows what can happen?

When I'm faced with this kind of problem, I try to go back to first principles. Considering each design constraint leads us to an implementation decision which implements that constraint:

  • There are two kinds of abstract items -- items and containers. Therefore, there should be two abstract base classes, not just one.
  • Containers are items. Therefore, the container class should derive from the item class.
  • The parenting invariant must be maintained across all containers. Therefore the invariant implementation must be inside the abstract container base class.
  • Every possible container requires "write" access to the parent state of every possible Item.  We want that access to be as restricted as possible. Ideally, we want it to be private. The only by-design way to get access to a private is to be inside the class.

And now a full solution becomes very straightforward:

using System;
using System.Collections.Generic;

public abstract class Item {
  public Item Parent { get; private set; }
  public abstract class Container : Item {
    private HashSet<Item> items = new HashSet<Item>();
    public void Add(Item item) {
      if (item.Parent != null)
        throw new Exception("Item has inconsistent containment.");
      item.Parent = this;
      items.Add(item);
    }
    public void Remove(Item item) {
      if (!Contains(item))
        throw new Exception("Container does not contain that item.");
      items.Remove(item);
      item.Parent = null;
    }
    public bool Contains(Item item) {
      return items.Contains(item);
    }
    public IEnumerable<Item> Items {
      // Do not just return items. Then the caller could cast it
      // to HashSet<Item> and then make modifications to your
      // internal state! Return a read-only sequence:
      get {
        foreach(Item item in items) yield return item;
      }
    }
  }
}

// These can be in third-party assemblies:

public class Bag : Item.Container { }
public class Box : Item.Container { }
public class Torch : Item { }
public class TreasureMap : Item { }

public class Program {
  public static void Main() {
    var map = new TreasureMap();
    var box = new Box();
    box.Add(map);
    var bag = new Bag();
    bag.Add(box);
    foreach(Item item in bag.Items)
      Console.WriteLine(item);
  }
}

Pretty slick eh?

A couple questions for you to ponder:

1) Suppose you were a hostile third party and you wanted to mess up the parenting invariant. Clearly, if you are sufficiently trusted, you can always use private reflection or unsafe code to muck around with the state directly, so that's not a very interesting attack. Any other bright ideas come to mind for ways that this code is vulnerable to tampering?

2) Suppose you wanted to make this hierarchy an immutable collection, where "Add" and "Remove" returned new collections rather than mutating the existing collection. How would you represent the parenting relationship?

Next time, another oddity involving "protected" semantics. Have a good weekend!

 

  • Like Maxim, I always saw limiting access to members using access modifiers as a good OO practice, not a security consideration. But I agree with you Eric, that "one certainly cannot sensibly make the argument that the intended semantics of the access modifiers are orthogonal to security concerns."

    Adding to your comments about the security meaning of "private", I want to emphasize that especially in the case of a desktop application (Windows.Forms or WPF), relying on access modifiers for security enforcement is always a bad idea. Quoting Keith Brown from April 2004 edition of MSDN Mag. (http://msdn2.microsoft.com/en-us/magazine/cc163990.aspx): "I hope I'm making it clear that it's impossible to give someone an assembly and restrict them from calling certain methods or accessing certain variables. Unless, of course, you control the machine on which they are running your code, in which case you can force them to run in a partially trusted environment. Lots of security guarantees disappear in a fully trusted environment."

    The root of the problem of course, is that almost all Windows users run apps with admin privileges, so theoretically a malicious program can tamper with all the environment settings.

  • 1) My point with the hierarchy getting bigger was the following: What happens if you have serveral objets derived from Items that have their own way of preserving the invariant? In your example I doesn't make that much sense but maybe the following example with the invariant "each action must be validated before it can be executed" may help:

    public abstract class Action

    {

    public void TryExecute()

    {

    if (Validate())

    Execute();

    }

    protected abstract bool Validate();

    protected abstract void Execute();

    }

    No problem if I just dervied other actions from Action and implement the abstract methods.

    But considering the following case: i have sereval actions and I just want to execute the first that works, I thougt i could implement it the following way:

    public class AlternativeActions : Action

    {

    public AlternativeActions(IEnumerable<Action> actions)

    {

    this.actions = actions.ToArray();

    }

    protected override bool Validate()

    {

    return this.actions.Any(action => action.Validate());

    }

    protected override void Execute()

    {

    this.actions.First(action => action.Validate()).Execute();

    }

    private Action[] actions;

    }

    The invariant is still holds but this won't compile; my solution to this (you can probably imagine) was to make the abstract methods protected internal.

    This is more or less taken from a project I had half a year ago (of course heavily abstracted) and I had several actions that needed to call the protected 'sibling' methods and had their own implemantation of preversing the invariant; so stuffing all actions into the abstract base class was not really a solution.

    2) I reread your post. And I really had an eye-opener when I read the comment on Container.Items: I never understood why .net - collections don't implement something like IReadOnlyCollection with properties/methods like Count or read-only accessors to the collection. I thougt by doing this you would be able to get both easy access to the (seemingly immutable) collection and security (for invariants concering the collection and the class where it is contained).

    Now I realize what was wrong about this: just stupid casting...

  • Welcome to the forty-second issue of Community Convergence. The last few weeks have been a busy time

  • I found this is an interesting post, and an interesting solution to the problem. For that thank you. I want to chime in about the discussion on access modifiers. I aggree that access modifiers won't stop a determined hacker from messing with your code, but few things will stop a truly motivated, intelligent hacker.

    The idea of each thing you do in the name of security is to raise the bar and help protect your code / systems from one classification of user. What do I mean by this?

    95% of the code I have ever written runs on servers and is not distributed to users in the first place. Yet I use access modifiers to increase the security and safety of my code 100% of the time. Who am I protecting my code / systems from? The web developer sitting a couple of cubes over who is using my framework.

    By limiting the public surface of any API, I prevent a normal non-malicous user from accidently using my framework in an unintended way. Is it 100% fool-proof. No but nothing is.

    Maxim, you said "Sorry again, but I don't think it is productive usage of time trying to use something that won't work anyway. Unless it is just a mind game with the single goal to train our brain cells :)"

    I have one question for you. Do you lock the doors to your house / apartment? Why bother since it won't stop someone from picking the lock, or bashing down the door.

    jonskeet: That is an easy attack to prevent by verifying that you are not setting the parent to itself; however, something I have struggled with in the past is what happens when you have objects: a,b,c and d.

    And a contains b which contains c which contains d which contains a. It's the same attack just a little bit harder to defend against.

  • Generating a compiler error in that case is counter-intuitive to me. If that's the way it is then that's the way it is (and I had to verify it in VS 2008 because it definitely surprised me at first); but it certainly violates my principle of least surprise, being a long time C++ programmer, and having used C# professionally for the last six years. A protected member SHOULD be available to any arbitrary derived class; if you don't want an arbitrary derived class accessing a protected member, then you shouldn't make it protected in the first place. That seems pretty straightforward to me. Not sure why this scenario needed to be special cased.

  • First off, C++ has the same rule. (The statement of the rule in C++ is "A friend or a member function of a derived class can access a protected nonstatic member of one of its base clases only through a pointer to, reference to, or object of the derived class or any class derived from that class.")

    If you find this surprising about C#, you should find it equally surprising about C++.

    Second, your statement is correct -- a protected member should be available to any arbitrary derived class. It is.  A protected member is available to ANY DERIVED CLASS.  That is the key point: DERIVED CLASS.  

    Is Bag a derived class of Torch?  No, it is not. Therefore, Bag does not get access to Torch's protected member.

  • Josh: There are a number of ways to prevent cycles.  Here are just two:

    Method #1) Suppose you have A<--B<--C, all containers. (The arrow is "contained in".)  When you try to put A into C, check to see that A != C. If A is C, throw an exception. Then look up the parent list of C and see if it already contains A.  If it does, throw an exception.

    Method #2) Right now the code only allows inserts when the inserted item is not contained in anything. You could change this to say "an item may be inserted in another only if both items presently have the same parent and the items are different"  

    So, for example, suppose you have C, and A<--B.  You want to put C into B, but A and C have different parents, so that is not legal. You have to first put C into A, because A and C have the same (null) parent.  Then you can put C into B, because C and B now have the same parent.

    If you have A<--B<--C, you cannot put A into A, because they are the same.  You cannot put A into B or A into C, because they all have different parents. The only way to do that is to remove B from A, then put A into B, then put A into C, which has no cycle.

  • I'm with Maxim.

    I've seen, too many times, bad code written by people who thought accessibility had something to do with "preserving implementation secrets".

  • Another very simply way in which hostile code might be able to tamper with the parenting environment (though probably not reliably) is by multithreading.  None of the methods are thread safe.  Two concurrent executions of .Add could for instance both execute the following lines simultaneously:

    if (item.Parent != null)

           throw new Exception("Item has inconsistent containment.");

    item.Parent = this;

    such that item.Parent = this; is executed twice.

    Other methods are similarly vulnerable.  If you're not very picky about the kind of invariant violation you require this might suffice for an attack.

  • I'm fascinated by the idea of putting one class inside another to restrict access like this. In this situation, it seems a great fit. However, I've found that it doesn't apply well to other situations where I wanted to do something similar, and yet I can't think of a better way.

    Suppose class A can generate and consume instances of classes B and C. I would like B and C to be "black boxes" to users of class A. A (horrible) way to do this is as follows:

    public sealed class C

    {
      private readonly int specialCValue;
      private C(int v) {
        specialCValue = v; 
        public sealed class B {
          private readonly string specialBValue;
          private B(int v) { specialBValue = v; }
          public sealed class A {
            public B MakeB() { ... }
            public C MakeC() { ... }
            public void ConsumeB(B b) { ... }
            public void ConsumeC(C c) { ... }
        }
      }
    }

    I think this sort of thing is was Frederick was referring to in his post. If a class really needs access to private members of more than one other class, the only solution seems to be to stack them like russian dolls.

    The thing is, I can't figure out another way to achieve this with quite the same properties. This way, users of A can only make Bs or Cs using A, and can't mess about with them. ConsumeB and ConsumeC can have certainty (I think) that their arguments (if not null) were made by an instance of A.

    We could do away with the nesting and make B and C public classes with internal constructors and methods for the benefit of A, but that does mean that the internals are exposed to the rest of the assembly, and in general I don't think that's ideal. However, it does seem to be the best solution available.

    We could make B and C private classes inside A, implementing public interfaces. But then ConsumeB and ConsumeC can only take the interface as input, anything about the class that they want to access has to be in the interface, and they can't rely that they will actually get a well-behaved instance of B or C, unless they cast their argument, which seems pretty nasty. If B or C guaranteed useful invariants about their properties,  we can't rely on the same from IB or IC.

    Is there a better way to do this sort of thing? I'm sorry that I've been so abstract, but I'm wary of posting pages and pages explaining the situations where this has come up, boring everyone to tears.

    Best regards, loving this blog,

    Weeble.

  • Ack, the code didn't work correctly. What's the right way to preserve indentation?

  • I fixed the indentation. Not sure how to do it from this interface.

    It seems to me that if you want A to have access to internals of more than one class B, C in your assembly, that's what the aptly named "internal" access modifier was invented for.

  • Thanks. Yes, as I said, "internal" does seem to be the best that's possible. I had just gotten my hopes up of a more rigorous solution after seeing your example of nesting classes. It would have been nice to make sure that even others working in the same project are left in no doubt that they're not supposed to be constructing Bs and Cs, but perhaps that's more of a policy problem than a language one! Cheers.

  • The option using "internal" looks reasonable to me - notice that both the "mananger" who consumes and generates must be aware of the classes it manages and that the "black-box" classes must be aware of the manager; that means you're very unlikely to be able to put the manager and the blackbox in separate assemblies anyhow due to circular dependancies.  Therefore, putting both in a single "trusted" assembly seems reasonable.  Of course, technically, the manager class might provide its encapsulation services to any sort of class (and thus breaking the circular dependancy), but I can't really imagine such a situation in practice...

    The following would be an alternative (but it's not pretty...)

       public class ManageP

       {

           private object key=new object();

           private static Func<P> trustedMaker;

           public static void SetPMaker(P proof, Func<P> suggestedMaker) {

               if (proof != null && trustedMaker == null) trustedMaker = suggestedMaker;

           }

           static ManageP() { P.Initialize(); }

           static void Main(string[] args)        {

               foreach (var i in Enumerable.Range(0,10))           Console.WriteLine(trustedMaker());

           }

       }

       public sealed class P

       {

           static int count=0;

           private P() { count++; }

           public override string ToString() {return "#" + count;}

           public static void Initialize() { ManageP.SetPMaker(new P(), () => new P()); }

       }

  • I think that what I show next could be an

    elegant solution to the "friend class" design problem:

    namespace TwoMutuallyKnownClasses

    {

    public abstract class Item

    //: BaseClass, BaseInterface

    friends Bag

    //where T : SomeType

    {

    private Item _parent;

    public Item Parent

    {

    get { return _parent; }

    }

    // This method is accessible from derived classes (protected) and from friend classes (friends)

    protected friends void SetParent(Item parent)

    {

    _parent = parent;

    }

    }

    public abstract class Bag : Item

    {

    private List<Item> _list = new List<Item>();

    public void Add(Item item)

    {

    item.SetParent(this); // Not an error in C# 4.0

    _list.Add(item);

    }

    }

    }

    Something like this captures the special relationship

    between these two classes.

    The trust relationship that is expressed

    does not pervase to the containing assembly's types,

    by lack of expressiveness of the language.

    That would be the case if the «internal» keyword had to be used,

    in this place, to express the same real relationship.

    The number of types in an assembly can easily get very large,

    and it may not be feasible to assume that they all

    share the same level of trust, as the «internal» keyword obliges.

    Also it is common that several different people contribute to a given assembly,

    possibly at different points in time,

    with decreasing knowledge of the original author's intentions.

    Personally, I think code accessibility and intent expressiveness

    should not (need to) depend on assembly boundaries.

    Just like the «internal» keyword denotes all types of an assembly,

    the «friends» keyword would denote the set of types considered friends

    by a given class.

    Notice that the proposed expressiveness is somehow inverse

    of that of the C++ language,

    where a class «Item» says:

    "Method Bag.Add has access to me"

    or if Item trusts all of Bags methods:

    "Class Bag has access to me".

    Instead, in a CIL language (http://en.wikipedia.org/wiki/Common_Intermediate_Language)  

    we would say:

    "My method SetParent may be accessed by (any method of) any of my friend classes"

    and that:

    "Class Bag is my friend"

    Would this be a good approach?

    Would it bring any problems with it?

Page 2 of 3 (38 items) 123