Protected Semantics, Part Five: More on immutability

Protected Semantics, Part Five: More on immutability

Rate This
  • Comments 5

I asked a second follow-up question back in Part Two:

Suppose you wanted to make this hierarchy an immutable collection, where "Add" and "Remove" returned new collections rather than mutating the existing collection. How would you represent the parenting relationship?

The short answer is "I wouldn't". First off, it sounds an awful lot like what is intended to be represented here is some sort of highly mutable system, where objects are being added and removed from containers. It might be more sensible to represent that thing using mutable data structures rather than immutable data structures.

If you did want to represent the thing with immutable data structures, then I would think twice before attempting to represent parenting at all. Here's why: when you have an immutable tree structure without parents, and you want to make a new tree structure that is based on an old one, typically the only stuff you have to change is from the edited node on up to the root node; usually hierarchies are shallow, so this is not very many nodes.

But this logically implies that the root node changes on every edit. If the root node changes, then all of its children have a new parent, and all of their children then have a new parent... and you've just rewritten the entire tree structure, not just a few nodes.

Now, it is often highly convenient to know what the parent of a given node is. To achieve that, what I'd probably do is try to not need to query the parent of any node until I was at a point where I wanted to ask about a lot of parents, and the tree was likely to remain unedited while I was asking. A quick visitor pass that builds up a hash table mapping every node to its parent would be easy enough to write, and then the parent table could be discarded and its memory reclaimed once I was done.

  • PingBack from http://www.alvinashcraft.com/2008/05/06/dew-drop-may-6-2008/

  • Eric,

    Could you post an example of how to actually use such immutable collections?

    As I understood it, you would need a mutable reference, to the current collection. If you want to share it with other classes - allowing them to have "write access" - it would have to be a dedicated container object

    I also seem to have missed a point on thread safety. If used in a multithreaded environment, then every modification of the reference would also have to be locked either. Given an example with a producer, a consumer, and a buffering queue, it could be easy to get unwanted results when there is no locking involved:

    class Example

    {

    Queue<int> queue = Queue<int>.Empty;

    void Producer(object dummy)

    {

    const int count=100;

    Random r = new Random();

    for (int i=0; i<count; i++)

    {

    int val = r.Next(1000);

    queue = queue.Enqueue(val);

    }

    }

    void Consumer(object dummy)

    {

    while (!queue.IsEmpty)

    {

    Queue<int>curr = queue;

    queue = curr.Dequeue();

    Thread.Sleep(curr.Peek());

    }

    }

    void RunExample()

    {

    ThreadPool.QueueUserWorkItem(new WaitCallBack(Producer));

    Thread.Sleep(0);

    ThreadPool.QueueUserWorkItem(new WaitCallBack(Consumer));

    }

    If the consumer finishes Queue.Dequeue() and updates the queue reference while producer is running Queue.Enqueue(), the dequeued element will be in the queue again once the producer updates the reference.

    And if the producer finishes Queue.Enqueue() and updates the queue ref while the consumer is running Queue.Dequeue(), the last Enqueued item is lost.

    So, it seems to me that using immutable objects is as thread(un)safe (requires same locking) as using mutables.

  • > As I understood it, you would need a mutable reference, to the current collection.

    Your question assumes that your program has such a thing as "the current collection", and that the value of "the current collection" can change. Why is that a warranted assumption?

    If your program is modelling something with exactly one collection which can change throughout time, then a mutable collection seems like a reasonable data structure to choose. But not all programs do that.

    > Given an example with a producer, a consumer, and a buffering queue,

    ... then I would use a threadsafe mutable collection, rather than serializing access to a mutable variable containing an immutable collection. It is probably more efficient to allow the collection to handle its own thread safety.

    Your making a straw man argument here. You're taking examples where mutable data structures are clearly superior and then saying "hey, an immutable data structure would be inferior here", because you're talking about scenarios in which mutability is a key aspect of the scenario.

    Think outside the box of mutable programming for a bit. Let me give you a real-world example: expression trees.

    Expression trees are immutable trees. In programs that manipulate expression trees, there is no one central "current tree" which is then updated. Rather, you have lots of little bits of trees all over the place. You build up new trees out of old ones, you pull old trees apart, but the trees never change.

    This means that the trees themselves are threadsafe.  I can take an expression tree representing (a*1)+b*c, and pass it to a method that logs expression trees on one thread, and pass it to an optimizer to produce the expression tree a+b*c on another thread without worrying that the threads are going to interfere with each other. They cannot, because turning (a*1)+b*c into a+b*c does not modify the original structure at all.  

    That's the kind of thread safety I'm talking about.  Mutable collections are "threadsafe" only in a very strange way -- it just doesn't seem strange because you're used to it.  For example, the following operation is NOT "threadsafe" even for a threadsafe mutable collection:

    Stack s = mystack;

    if (!s.IsEmpty) s.Pop();

    because of course IsEmpty might be false, and then another thread pops the stack, and now the s.Pop() throws a stack-is-empty exception.  What kind of craziness is that?  Immutable collections do not have this problem; they are truly threadsafe. If you have an immutable collection:

    Stack s = mystack;

    if (!s.IsEmpty) s = s.Pop();

    then you are guaranteed that this never throws.  Mutable collection "thread safetly" in fact requires you to have a full, global understanding of exactly what every thread is doing to the collection at all times. That nonlocal analysis is terribly difficult to get right. Much easier with immutable collections, where all the analysis is local. You just have to give up the whole idea of "the canonical mutable collection", that's all.

  • Thanks for your reply,

    my post wasn't ment as an attack, but rather stated my understanding of this (which obviously hasn't evolved much from the mutable-dominated world). I thought of the immutable approach that it could be a drop-in replacement which eases life - wich it is, of course, but not for the examples i was able to think of then...

  • Rather than place the links to the most recent C# team content directly in Community Convergence , I

Page 1 of 1 (5 items)