Why Can't I Access A Protected Member From A Derived Class, Part Two: Why Can I?

Why Can't I Access A Protected Member From A Derived Class, Part Two: Why Can I?

Rate This
  • Comments 38

This is a follow-up to my 2005 post on the same subject  which I believe sets a personal record for the longest time between parts of a series. (Of course, I didn't know it was a series when I started it.) Please read the previous article in this series, as this post assumes knowledge of part one.

.......

OK, now that you've read that, it's clear why you can only access a protected member from an instance of an object known to be of a type at least as derived as the current context. You can therefore deduce the answer to the question asked to me by a (very polite) reader this morning: Why did this code compile in C# 2.0 but give an error in C# 3.0?

public abstract class Item{
  private Item _parent;
  public Item Parent {
    get { return _parent; }
    protected set { _parent = value; }
  }
}
public class Bag:Item{
  private List<Item> list = new List<Item>();
  public void Add(Item item)
  {
    item.Parent = this; // Error in C# 3.0
    list.Add(item); 
  }
}
public class Torch : Item { }

It compiled in C# 2.0 because the compiler had a bug. We forgot to enforce the semantics of "protected" access for the property setter. Though the compiler did not generate an error, it did generate code which would not pass the CLR verification check! The code would run if it happened to be fully trusted, but it was not safe to do so. We fixed the bug in C# 3.0 and took the breaking change.

This raises a more interesting point though. Now that we correctly prevent you from setting the "parent" reference in this manner, how would you implement the desired pattern? The reader wanted the following conditions to be met:

  • There are two kinds of items -- "leaf" items, and "container" items. Container items may contain either kind of item, so you could have a box containing a bag, which in turn contains a torch.
  • An item may be in a container, and if it is, its parent reference refers to that container.
  • These classes must be extensible by arbitrary third parties.
  • "Unauthorized" code must not be able to muck around with the parenting invariants. 

The reader was attempting to enforce these invariants by making the parent setter protected. But even in a world where it is legal to access a protected member from an arbitrary derived class, that does not ensure that the invariants are maintained! If you allow any derived class to muck with the parenting, then you are relying upon every third-party derived class to "play nicely" and maintain your invariant. If any of them are buggy or hostile, then who knows what can happen?

When I'm faced with this kind of problem, I try to go back to first principles. Considering each design constraint leads us to an implementation decision which implements that constraint:

  • There are two kinds of abstract items -- items and containers. Therefore, there should be two abstract base classes, not just one.
  • Containers are items. Therefore, the container class should derive from the item class.
  • The parenting invariant must be maintained across all containers. Therefore the invariant implementation must be inside the abstract container base class.
  • Every possible container requires "write" access to the parent state of every possible Item.  We want that access to be as restricted as possible. Ideally, we want it to be private. The only by-design way to get access to a private is to be inside the class.

And now a full solution becomes very straightforward:

using System;
using System.Collections.Generic;

public abstract class Item {
  public Item Parent { get; private set; }
  public abstract class Container : Item {
    private HashSet<Item> items = new HashSet<Item>();
    public void Add(Item item) {
      if (item.Parent != null)
        throw new Exception("Item has inconsistent containment.");
      item.Parent = this;
      items.Add(item);
    }
    public void Remove(Item item) {
      if (!Contains(item))
        throw new Exception("Container does not contain that item.");
      items.Remove(item);
      item.Parent = null;
    }
    public bool Contains(Item item) {
      return items.Contains(item);
    }
    public IEnumerable<Item> Items {
      // Do not just return items. Then the caller could cast it
      // to HashSet<Item> and then make modifications to your
      // internal state! Return a read-only sequence:
      get {
        foreach(Item item in items) yield return item;
      }
    }
  }
}

// These can be in third-party assemblies:

public class Bag : Item.Container { }
public class Box : Item.Container { }
public class Torch : Item { }
public class TreasureMap : Item { }

public class Program {
  public static void Main() {
    var map = new TreasureMap();
    var box = new Box();
    box.Add(map);
    var bag = new Bag();
    bag.Add(box);
    foreach(Item item in bag.Items)
      Console.WriteLine(item);
  }
}

Pretty slick eh?

A couple questions for you to ponder:

1) Suppose you were a hostile third party and you wanted to mess up the parenting invariant. Clearly, if you are sufficiently trusted, you can always use private reflection or unsafe code to muck around with the state directly, so that's not a very interesting attack. Any other bright ideas come to mind for ways that this code is vulnerable to tampering?

2) Suppose you wanted to make this hierarchy an immutable collection, where "Add" and "Remove" returned new collections rather than mutating the existing collection. How would you represent the parenting relationship?

Next time, another oddity involving "protected" semantics. Have a good weekend!

 

  • Comme l'explique Eric Lippert dans son dernier post , en C# 3.0, le code suivant ne compilera pas : public

  • one way you can mess with things is to use a badly behaved GetHashCode() and/or Equals function in a new Item implementation.

    This would exploit the HashSet guts of the collection you could add something to one collection, then remove it but exploit knowledge of the number of calls to GetHashCode() and/or Equals that would be used to do:

    firstContainer.Add(nastyFoo)

    nastyFoo.ActivateChangeAssumingRemoveSemantics();

    firstContainer.Remove(nastyFoo);

    //   if (!Contains(item))  => this underlying call  to GetHashCode and Equals should behave normally

    // items.Remove(item);  => this underlying call should not, change hashcode and or Equals

    // item.Parent = null; => this neatly breaks the Parent invariant

    nastyFoo.RestoreOriginalSemantics();

    secondContainer.Add(nastyFoo);

    firstContainer.Contains(nastyFoo) && secondContainer.Contains(nastyFoo);; // true!

  • I should point out that you can guard against it by forcing the GetHashCode() and Equals() implementations in the two abstract base classes respectively, object equality is probably sufficient for this example so

    public override sealed GetHashCode()

    {

     return base.GetHashCode();

    }

    public override sealed Equals(object o)

    {

     return base.Equals(o);

    }

    (plus an IEquatable<Item> implementation if you wanted)

  • A slightly more mundane "attack" in terms of modeling something strange:

    box.Add(box);

    It wouldn't cause anything hugely nasty until someone tried to walk their way up the (now broken) tree with something like:

    Item current = someItem;

    while (current != null)

    {

       // Do something with current

       current = current.Parent;

    }

  • I'm still waiting for compiler-enforced immutable objects...

    ... and a declarative (e.g. Attribute-based) way to implement GetHashCode and Equals and similar methods - i.e. have non-virtual implementations of these methods and Ignore-Attributes on fields or something alike, and implemented in a performant way (e.g. auto-generated code).

  • The idea of having "hostile third parties", who are needed to be fighted against, is at least strange (if speaking politely). You are abusing access modifiers yourself, when you use them as a firewall agains malicious users.

    The intended usage of access modifiers is documenting your commitments about future API changes. They have one advantage against plain comments - the compliance to them can be checked automatically by compilers. And like for any other documentation, the 80/20 rule applies to them. That is, if you can't document your commitment using access modifiers in your favorite programming language, just write a comment.

    Unfortunately, many devs think, access modifiers are a tool to protect "their" code from unintendent usage. This is very ridiculous, because of course you can never be made responsible for any problems when your code is used in an unappropriate manner. So why bother? (If you ARE being made responsible for such kind of things, change your employer).

  • I always find it strange when people who are not on the language design team tell me what the intended usage of language features is.

    I hate to be contradictory, but no, for better or for worse, access modifiers in the CLR are security features. The rules for access are thoroughly conflated with the security and type safety systems.  

    One could make the argument -- I have made it myself -- that this is perhaps not the best idea. One could make the argument that a better design would be to make security and type safety much more orthogonal. One could make the argument that features like Restricted Skip Visibility make the situation worse, not better. One could even argue that the CLR security system is baroque and too complicated for its own good.

    But one certainly cannot sensibly make the argument that the intended semantics of the access modifers are orthogonal to security concerns.  That is emphatically NOT how they were designed, nor is it how they were intended to be used, nor is it how they were implemented.  

    Perhaps you would like it better if that had been the design, intention or implementation.  Frankly, there are days when I would too. But that's not the world we actually live in, so making unsupported claims to the contrary is counterproductive.

    To address your second point, of course we are responsible for the misuse of our code. Everyone who writes code has a moral responsibility to ensure that it cannot be misused to harm users. Do you really think that, say, Microsoft has no moral obligation to fix security holes in Internet Explorer?  Or that Sun has no moral obligation to fix security holes in Java?  Of course we do. We all have a moral responsibility to our users to provide software which can be used with confidence. Not to mention, we have a fiscal responsibility to our shareholders to provide the highest quality implementations we can.

    Absolutely I am responsible for the security of every bit of code I write, and absolutely I am responsible for doing everything in my power to ensure that it cannot be misused by hostile people in order to hurt the customers who trusted me to do my job. Am I responsible for the actions of hostile people? No, of course not. If they commit crimes, they're the one's going to jail when they're caught, not me. But it is thoroughly specious to conclude that because I am not responsible for the crime, that I am also not responsible for doing everything in my power to prevent it!  THAT I most certainly am responsible for, and I take that responsibility very seriously.

    Accessiblity modifiers, and their integration into the .NET security system, are just one tool in my toolbox for making high-quality secure implementations of software that enforces the invariants desired by my customers. But as we've seen in this article, it's just one tool, not a panacea. There's lots of other ways that this code can get messed up; you cannot rely solely upon "private" to save you, but it is definitely a necessary step in the right direction.

  • Sorry, but as opposed to most things you write on your blog I don't like the solution you presented, Eric. My major concern is: What happens if your class hierarchy gets bigger - that means many things derived from Items with individual behaviour etc. - and you still have to represent this hierarchy in one class. Not really nice imo.

    C# is missing something like a 'friend' keyword in C++ here. The only approach that I can think of is to make the Parent internal instead of protected.

    public abstract class Item{

     private Item _parent;

     public Item Parent {

       get { return _parent; }

       protected set { _parent = value; }

     }

    }

    But ... exactly like in your previous example replacing 'protected' with "protected internal' compiles at least with my VS 2008. This can't be the other "oddity involving "protected" semantics" you mentioned, can it?

  • Hey I have written this comment BEFORE your answer to the presceding post - my English just sucks, so I am slow at typing. Unfair :(

  • Thank you Eric for sharing to us a bit of access modifiers design. I had to speculate about why access modifiers have been included in C#, because there is no public C# design spec. So I've chosen the best reason I could think of. Sorry for being a little provocative.

    Security as intended usage of access modifiers... Gosh. Is it me or it is really clear for anybody, that this won't work? Access modifiers is a language feature, not even VM feature! Hackers are not bound to any particular language or VM. So until CLR is a public standard and there are things like Mono available, you can always hack on levels much deeper than C# with its rules. Besides that, we have Reflection and we have Reflector. So you can either access private members in run-time or decompile accemblies. And I hope (HOPE) that you won't restrict this functionality for private/protected members in next releases, because it has plenty of its legal uses.

    Sorry again, but I don't think it is productive usage of time trying to use something that won't work anyway. Unless it is just a mind game with the single goal to train our brain cells :)

    About responsibilities: I'm so happy that hammer manufacturers don't feel themselves responsible for the possible usage of a hammer as a killing tool. Therefore, hammers are cheap and very usable because they can be kept simple. In other words, if somebody blame Microsoft for something, it doesn't mean automatically Microsoft is responsible for that :)

  • Maxim I fear you have a serious lack of understanding of the specifications of the CLR.

    Protection levels on fields *are* part of the spec. If you are using reflection then you require high trust levels, if you don't understand what that means don't spout off about the security features controlled in part through access restrictions, it makes you look silly, especially when you're doing it on the blog of someone who *really* knows what they are talking about...

  • Matt,

    CLR itself is not protected. Take Mono, change it to ignore protection levels, run your "protected" software and do with it anything you want. It is a silly idea trying to achieve any security with software being executed by a hacker. In the enviroment of the hacker.

    Even native software (written on C or C++) is being easily hacked using IDA or some virtualization tools. CLR made things *simpler*, not *harder* for hackers. This is the reality how I can see it, not some abstract specification.

    Security happens only on the server side today (server side in broader sence, i.e. an interner browser is a "server" from the point of view of a malicious web site).

  • Six things:

    1) Frederik, I'm not following your point about the class hierarchy getting bigger. You can make this hierarchy get as big as you want and the invariant that items are correctly parented remains invariant. (Modulo the other problems people have mentioned of course.)  Can you expand on this?

    2) Your English is fine!

    3) C# does have "friend" semantics but only at the assembly/internal level, not at the class/private level. That is, you can say "this other assembly has access to this assemblies internals". This is done with the InternalsVisibleTo attribute.

    4) "Until the CLR is a public standard"... but, the CLR spec _is_ a public standard, published by ECMA. I'm not following your point here.

    5) Of course you can always use a decompiler -- or, for that matter, a debugger -- to look at the contents of private state. Like I said before, there are any number of ways that FULLY TRUSTED code can look at private state. That's what "fully trusted" means.  That's not interesting.  The interesting attacks which we must protect against are attacks where PARTIALLY TRUSTED code that may be hostile is running on the system.

    6) Microsoft does not make anything so simple as a hammer, so that's a bad analogy. Microsoft makes servers which run the businesses of massive multinational corporations. We make web browsers which hundreds of millions of people use for business and entertainment. We make productivity tools and operating systems.  All of these systems are massively more complex than a hammer.  We are not responsible for the actions of those criminals who use flaws in that software to attack our customers. But we are responsible for doing the best we possibly can to eliminate those flaws.

    Pick a better analogy. Cars, say. Car manufacturers must meet stringent safety and reliability standards so that the public which relies upon their products knows that a certain level of dilligence has gone into ensuring that the car works well even when other people abuse it. Car manufacturers cannot just say "we've designed this car to work well in "normal use", and someone else crashing into you at high speed is not a normal use of this car. If someone else crashes into you, that's their fault, and we bear no responsibility for ensuring that the design of our car is safe in that eventuality."  Would you buy a car from a company who took that attitude towards safety?

    We take security very, very seriously here, because people DO rely upon their software being robust against misuse by hostile third parties. If we as an industry cannot get a simple invariant like "parented by its container" right, how can we possibly get complex invariants right, like "bank account information is only visible to authorized parties" ?

  • I've said this a couple of times but I want to emphasize it again.

    Maxim, you are misunderstanding the security meaning of "private".  "private" does not mean "no code can possibly read/write this secret".  Obviously you can read/write private data much more easily than writing your own CLR by modifying mono -- you can just run a debugger.

    The security meaning of private is "not accessible to code that does NOT have sufficient access to break the security system".  

    This is like any other security system. You don't put stuff in a vault to protect it from the people who have the combination. Nor do you put it in a vault to protect it from people who are uninterested in attacking you. You put it in a vault to protect it from people who are both hostile AND unauthorized.

  • LINQ in Portuguese ( Direct ) http://www.linqpad.net Eric Lippert Why Can't I Access A Protected Member

Page 1 of 3 (38 items) 123