Immutability in C# Part Four: An Immutable Queue

Immutability in C# Part Four: An Immutable Queue

Rate This
  • Comments 35

An immutable queue is a bit trickier than an immutable stack, but we’re tough, we can handle it.

First off, the interface is straightforward; enqueuing or dequeuing results in a new queue:

    public interface IQueue<T> : IEnumerable<T>
    {
        bool IsEmpty { get; }
        T Peek();
        IQueue<T> Enqueue(T value);
        IQueue<T> Dequeue();
    }

But how ever are we to implement this? It’s not immediately obvious how to make an immutable first-in-first-out list.

The first insight we need is that a stack is essentially a “backwards” queue. Because I think it will come in handy later, I’m going to declare an extension method right now which takes a stack and reverses it:

        static public IStack<T> Reverse<T>(this IStack<T> stack )
        {
            IStack<T> r = Stack<T>.Empty;
            for (IStack<T> f = stack; !f.IsEmpty; f = f.Pop())
              r = r.Push(f.Peek());       
            return r;
        }

This is an O(n) operation for a stack of size n.

Now that we’ve got that we can make a first cut at our queue implementation. Let’s say that a queue is backed by a stack such that every time we need to get at the “dequeue end”, we’ll just reverse the stack:

sealed class Queue<T> : IQueue<T> { 
  private static readonly IQueue<T> empty = new Queue<T>(Stack<T>.Empty);
  public static IQueue<T> Empty { return empty; }
  private readonly IStack<T> backwards;
  private Queue(IStack<T> backwards) { this.backwards = backwards; }
  public T Peek() { return backwards.Reverse().Peek(); }
  public IQueue<T> Enqueue(T t) { return new Queue<T>(backwards.Push(t)); }
  public IQueue<T> Dequeue(T t) { return new Queue<T>(backwards.Reverse().Pop().Reverse()); }
  public bool IsEmpty { get { return backwards.IsEmpty; } }
  public IEnumerator<T> GetEnumerator() { return backwards.Reverse().GetEnumerator(); }
  IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator();}
}

“Surely this is incredibly inefficient!” I hear you say. Yes, most certainly it is! Imagine enqueuing a thousand elements and then dequeuing them all. The thousand enqueues are each cheap, but the first dequeue requires us to entirely reverse a stack of 1000 items and then reverse another stack of 999 items.  The next dequeue requires reversing stacks of 999 and 998 items. Dequeing a single item costs O(n) in the size of the queue! The total cost for all the dequeues is the reversal of 2000 stacks with an average size of about 500, so that’s about a million operations to dequeue the whole thing.

When we reversed the stack to dequeue it the first time, we had a stack that was in the right order for the rest of the dequeues. Why didn’t we keep that thing around? Only because the “forwards” stack that resulted is not in the right order for enqueuing new items. So let’s keep two stacks around, one in “forwards” order, ready to be dequeued, and one in “backwards” order, ready to be enqueued. If we go to dequeue and find that the forward stack is empty, only then will we reverse the backwards stack. (And let's make the empty queue a singleton, like we did with the empty stack.)

    public sealed class Queue<T> : IQueue<T>
    {
        private sealed class EmptyQueue : IQueue<T>
        {
            public bool IsEmpty { get { return true; } }
            public T Peek () { throw new Exception("empty queue"); }
            public IQueue<T> Enqueue(T value)
            {
                return new Queue<T>(Stack<T>.Empty.Push(value), Stack<T>.Empty);
            }
            public IQueue<T> Dequeue() { throw new Exception("empty queue"); }
            public IEnumerator<T> GetEnumerator() { yield break; }
            IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
        }
        private static readonly IQueue<T> empty = new EmptyQueue();
        public static IQueue<T> Empty { get { return empty; } }
        public bool IsEmpty { get { return false; } }
        private readonly IStack<T> backwards;
        private readonly IStack<T> forwards;
        private Queue(IStack<T> f, IStack<T> b)
        {
            forwards = f;
            backwards = b;
        }
        public T Peek() { return forwards.Peek(); }
        public IQueue<T> Enqueue(T value)
        {
            return new Queue<T>(forwards, backwards.Push(value));
        }
        public IQueue<T> Dequeue()
        {
            IStack<T> f = forwards.Pop();
            if (!f.IsEmpty)
                return new Queue<T>(f, backwards);
            else if (backwards.IsEmpty)
                return Queue<T>.Empty;
            else
                return new Queue<T>(backwards.Reverse(), Stack<T>.Empty);
        }
        public IEnumerator<T> GetEnumerator()
        {
            foreach (T t in forwards) yield return t;
            foreach (T t in backwards.Reverse()) yield return t;
        }
        IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator(); }
    }
}

Isn’t this just as inefficient? No. Reason it through: every element (except the first one) is pushed on the backwards stack once, popped off the backwards stack once, pushed on the forwards stack once and popped off the forwards stack once. If you enqueue a thousand items and then dequeue two, yes, the second dequeue will be expensive as the backwards list is reversed, but the 998 dequeues following it will be cheap. That’s unlike our earlier implementation, where every dequeue was potentially expensive. Dequeuing is on average O(1), though it is O(n) for a single case.

Next time: OK, so we can do stacks and queues. Can we make an immutable binary search tree with guaranteed good search performance? But of course we can.

  • Alright, I understand your point.

  • I can smell my cortext burning. :)

  • Jomo Fisher--Reading one of my favorite blogs this morning--Eric Lippert's Fabulous Adventures in Coding--I

  • You argue for the efficiency of this solution, but you only speak of speed, not of memory. This solution assumes that you are willing to shoulder the significant extra memory burden of keeping multiple copies of the queue for each queue object. Of course, if you are using immutable objects, then memory concerns are probably not at the top of your priority list, and .NET in general does not have have memory efficiency as a primary design goal. However, whereas the immutable stack does not take any more memory than the mutable stack (and can in fact take less when you have multiple instances derived from each other), the immutable queue will always take significantly more memory than its mutable counterpart. I can see the lack of transparency into these kinds of implementation details leading to significant problems down the road.

  • I'm not following your train of thought here.  You agree that an immutable stack takes no more memory than a mutable stack, and sometimes far less when multiple instances share state.  An immutable queue is built out of two immutable stacks, so why shouldn't an immutable queue gain the same sharing benefit?  

    I do not see why an immutable queue would take up significantly more memory than a mutable queue.  I agree that an immutable queue spends more _time_ allocating memory than a mutable queue, but the garbage collector will throw away the old state if it is no longer alive. Why should the working set of an immutable queue be any larger than the working set of a mutable queue?

  • "An immutable queue is built out of two immutable stacks..."

    And therefore uses up roughly twice the memory, because it has to keep both the forwards and backwards versions in memory at the same time, whereas with a normal queue the backwards version doesn't even need to exist. Is that not obvious?

  • I'm still not following you.

    Can you show me a sequence of enqueues and dequeues which results in a queue which logically has n items, where the total size of the forward and backwards stacks added together is not n?

  • David, look at it this way.

    Define f as the size of the forwards stack, b as the size of the backwards stack.  The memory size of the queue is f + b.

    If you enqueue you push something onto the backwards stack. So enqueueing goes from memory size f + b to f + (b + 1).

    If you dequeue, there are three cases. The queue starts with memory size f + b:

    if f >= 2 then the new queue has size (f - 1) + b

    else if f == 1 && b == 0 then the new queue has size 0

    else if f == 1 && b > 0 then the new queue has size b + 0

    Which in every case goes from f + b to f + b - 1.

    So, enqueuing increases the memory size by one, dequeuing decreases the memory size by one.  

    Is that clear?

  • Eric wouldn't it be simpiler to point out that f and b don't contain duplicates of each others contents ?

    Dr Calvin: "I make the robots appear more human"

    Det. Spooner: "There wasn't that easier to say?"

    Dr Calvin: "No, not really"

    ;)

  • Eric, your articles are very interesting, thank you!

    Stack of Pushed objects and lazy instantiated stack-cache of elements to Pop - wow, you're genious!

    But this queue is very unreliable in multithreaded environment.

    Look at this line (pushing new elements):

    IQueue<T> _myQueue = _myQueue.Push(newElement);

    We are going to lose elements here where several threads do this!

    I think we'll always have to use locks anyway.

  • No, the problem is that you are narrowly defining what you mean by "reliable in a mulithreaded environment".  You are defining 'reliable' to mean 'two threads can enqueue "the same" queue and both enqueued items are present in the resulting queue.  This is a very "procedural programming" way to think about thread safety -- thread safety is PRESERVATION OF ALL SIDE EFFECTS.

    If that's the kind of reliability you want then you should probably be using a mutable threadsafe queue! If you try to use an immutable queue for this purpose then essentially you will have to synchronize access not to the queue -- the queue is threadsafe -- but to the variable which contains the "canonical" version of the queue.

    Immutable queues provide a _different_ kind of reliability, namely "two threads can enqueue the same queue at the same time, and the result on each thread is NOT polluted by the result on the other."  This is a very functional programming way of thinking about thread safety -- thread safety is IMMUNITY FROM SIDE EFFECTS. Enqueueing X onto queue Q1 should result in the queue which is Q1 plus X, always, independent of any nonsense happening to Q1 on another thread.

    Once you embrace a world where side effects are to be eliminated, not preserved, things get much easier to reason about.

  • There is a little mess with immutable and lock-free structures in comments. It is a different things. Immutable structures are thread safe utilizing the "don't worry about it" principle, but they don't provide the data coherency. Lock-free structures are thread safe by the same mean and also coherent.

  • Eric,

    Two features that would make this sort of thing easier would be case classes a la Scala, and pattern matching. That way you could write Haskell style algebraic data types easily without all that boiler plate.

  • I got thinking about immutability when I was trying to use Enums to create lists of things the IDE recognized.  As you know,

    Example:

      public Enum fruitType

      {

         Apple,

         Orange,

         Pear  

       }

    This is so much better than creating lists of things that the IDE doesn't recognize, i.e. Hashtable dictionaries.  

      public Hashtable fruitTypeHT = new Hashtable ();

      fruitTypeHT["Apple"] = null;

      fruitTypeHT["Orange"] = null;

      fruitTypeHT["Pear"] = null;

    In the case of the Enumerated list, if you wish to change Pear to AsianPear it is simple to use the refactoring to rename all instances of Pear to AsianPear.  There is no such help for the Hashtable version, plus, it is possible to still modify the hashtable dictionary contents during run-time.

    It would be fantastic if we could have "popsicle" Enums then we could fill them from the database during program startup, yet still refer to concrete items in the Enum list in the code.  I know this may seem like wanting to have your cake and eat it too; maybe so.  But this is the basic difficulty with working with meta-data that is held in database tables.   Your code has to refer to specific items of meta-data, but you want a way to easily find or change all specific instances referenced.  This would be a great problem to solve, and I am convinced immutability will be a key part of the solution.

    One thing I tried to my hand at was creating an Immutable Hashtable and an Immutable ArrayList.  They are wrappers for standard Hashtables and ArrayLists, but there is a makeReadOnly () method in each that freezes the list from that point on.  I use this to handle some forms of meta-data from the database, where I don't want my program to ever attempt to change or add items to the list pulled from the DB.

    http://www.codeproject.com/KB/cs/ImmutableHashtable.aspx

    http://www.codeproject.com/KB/cs/ImmutableArrayList.aspx

    What do we have to do to get MS to beef up Enums?  I'm already standing on my head!!

  • For some reason, there's been a lot of buzz lately around immutability in C#. If you're interested in

Page 2 of 3 (35 items) 123