Immutability in C# Part Four: An Immutable Queue

Immutability in C# Part Four: An Immutable Queue

Rate This
  • Comments 35

An immutable queue is a bit trickier than an immutable stack, but we’re tough, we can handle it.

First off, the interface is straightforward; enqueuing or dequeuing results in a new queue:

    public interface IQueue<T> : IEnumerable<T>
    {
        bool IsEmpty { get; }
        T Peek();
        IQueue<T> Enqueue(T value);
        IQueue<T> Dequeue();
    }

But how ever are we to implement this? It’s not immediately obvious how to make an immutable first-in-first-out list.

The first insight we need is that a stack is essentially a “backwards” queue. Because I think it will come in handy later, I’m going to declare an extension method right now which takes a stack and reverses it:

        static public IStack<T> Reverse<T>(this IStack<T> stack )
        {
            IStack<T> r = Stack<T>.Empty;
            for (IStack<T> f = stack; !f.IsEmpty; f = f.Pop())
              r = r.Push(f.Peek());       
            return r;
        }

This is an O(n) operation for a stack of size n.

Now that we’ve got that we can make a first cut at our queue implementation. Let’s say that a queue is backed by a stack such that every time we need to get at the “dequeue end”, we’ll just reverse the stack:

sealed class Queue<T> : IQueue<T> { 
  private static readonly IQueue<T> empty = new Queue<T>(Stack<T>.Empty);
  public static IQueue<T> Empty { return empty; }
  private readonly IStack<T> backwards;
  private Queue(IStack<T> backwards) { this.backwards = backwards; }
  public T Peek() { return backwards.Reverse().Peek(); }
  public IQueue<T> Enqueue(T t) { return new Queue<T>(backwards.Push(t)); }
  public IQueue<T> Dequeue(T t) { return new Queue<T>(backwards.Reverse().Pop().Reverse()); }
  public bool IsEmpty { get { return backwards.IsEmpty; } }
  public IEnumerator<T> GetEnumerator() { return backwards.Reverse().GetEnumerator(); }
  IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator();}
}

“Surely this is incredibly inefficient!” I hear you say. Yes, most certainly it is! Imagine enqueuing a thousand elements and then dequeuing them all. The thousand enqueues are each cheap, but the first dequeue requires us to entirely reverse a stack of 1000 items and then reverse another stack of 999 items.  The next dequeue requires reversing stacks of 999 and 998 items. Dequeing a single item costs O(n) in the size of the queue! The total cost for all the dequeues is the reversal of 2000 stacks with an average size of about 500, so that’s about a million operations to dequeue the whole thing.

When we reversed the stack to dequeue it the first time, we had a stack that was in the right order for the rest of the dequeues. Why didn’t we keep that thing around? Only because the “forwards” stack that resulted is not in the right order for enqueuing new items. So let’s keep two stacks around, one in “forwards” order, ready to be dequeued, and one in “backwards” order, ready to be enqueued. If we go to dequeue and find that the forward stack is empty, only then will we reverse the backwards stack. (And let's make the empty queue a singleton, like we did with the empty stack.)

    public sealed class Queue<T> : IQueue<T>
    {
        private sealed class EmptyQueue : IQueue<T>
        {
            public bool IsEmpty { get { return true; } }
            public T Peek () { throw new Exception("empty queue"); }
            public IQueue<T> Enqueue(T value)
            {
                return new Queue<T>(Stack<T>.Empty.Push(value), Stack<T>.Empty);
            }
            public IQueue<T> Dequeue() { throw new Exception("empty queue"); }
            public IEnumerator<T> GetEnumerator() { yield break; }
            IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
        }
        private static readonly IQueue<T> empty = new EmptyQueue();
        public static IQueue<T> Empty { get { return empty; } }
        public bool IsEmpty { get { return false; } }
        private readonly IStack<T> backwards;
        private readonly IStack<T> forwards;
        private Queue(IStack<T> f, IStack<T> b)
        {
            forwards = f;
            backwards = b;
        }
        public T Peek() { return forwards.Peek(); }
        public IQueue<T> Enqueue(T value)
        {
            return new Queue<T>(forwards, backwards.Push(value));
        }
        public IQueue<T> Dequeue()
        {
            IStack<T> f = forwards.Pop();
            if (!f.IsEmpty)
                return new Queue<T>(f, backwards);
            else if (backwards.IsEmpty)
                return Queue<T>.Empty;
            else
                return new Queue<T>(backwards.Reverse(), Stack<T>.Empty);
        }
        public IEnumerator<T> GetEnumerator()
        {
            foreach (T t in forwards) yield return t;
            foreach (T t in backwards.Reverse()) yield return t;
        }
        IEnumerator IEnumerable.GetEnumerator() {return this.GetEnumerator(); }
    }
}

Isn’t this just as inefficient? No. Reason it through: every element (except the first one) is pushed on the backwards stack once, popped off the backwards stack once, pushed on the forwards stack once and popped off the forwards stack once. If you enqueue a thousand items and then dequeue two, yes, the second dequeue will be expensive as the backwards list is reversed, but the 998 dequeues following it will be cheap. That’s unlike our earlier implementation, where every dequeue was potentially expensive. Dequeuing is on average O(1), though it is O(n) for a single case.

Next time: OK, so we can do stacks and queues. Can we make an immutable binary search tree with guaranteed good search performance? But of course we can.

  • Amazing article & code!

    Looking forward for more immutables (what about hash table? :)) )

  • This sounds terribly familiar... It reminds me strongly of Chris Okasaki's book "Purely Functional Data Structures", which I believe is based on his thesis: http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

    It's pretty heavy going, but it starts with examples like this Queue and goes on to implement some impressively complex immutable/purely functional data structures. No hash tables (surely that's not even possible?) but it does have red-black trees and heaps.

  • I cannot see (I am neww to all this) how immutable Queue will help us with ten threads queueing stuff and ten threads dequeueing stuff without a need for the synchronization.

  • I did wonder how you would do this. Having seen your solution it brings railroad tracks to mind. Shunt the coaches into a railroad siding. When you need to "dequeue" them reverse them into another siding before dequeuing the first coach.

  • Well, start by worrying about two threads, not ten threads. There are no problems involving ten threads that are not also problems involving two threads, so keep it simple.

    If your goal is to have one queue which is shared by a producing thread and a consuming thread, then you are better off modeling the problem like that: a single threadsafe queue that can be stuck into a shared location.  Immutable data structures do not make that scenario easier. (Rather, they move the problem from the queue provider's plate onto the queue user's plate; the queue user must then synchronize access to the variable shared across threads.)

    However, if your goal is to have one thread build up data structures and then hand them off to other thread to be consumed later, then immutable data structures are your friends. Once they are created, you don't need to worry about whether they are being accessed on multiple threads at a time because they are read-only structures. As many threads as you want can access them at once.

  • If we passing the data structure for the 'read-only' consumption, why do we care that it was Queue, or Stack (i.e. had some behavior)? It had some meaning for the creator of the data, but for the consumer it is just some readonly vector (or sequence/stream).

  • Maybe. Or maybe that thread intends to do some further transformation of it and hand it off to a third thread.

  • Your solution is fine for a single thread, but has a weakness with respect to a multithreaded situation, namely that if you enqueue a thousand items and then pass that one queue off to a thousand threads which each dequeue twice, all those threads will incur considerable expense. In "Purely Functional Data Structures", Okasaki discusses solutions to this problem.

  • Right. I deliberately glossed over that one. If we were really buff then we'd do something like memoize the result of the dequeue so that the cost was only born once -- but then of course you have the problem of implementing a threadsafe memoizer! There is no free lunch.

  • There ain't no free lunch, but some lunches are cheaper than others -- and if memoization is part of your infrastructure, then someone else is paying. I guess that's why Okasaki uses Haskell.

  • Another great post. Thanks!

    Are readers permitted to use the source code?

  • You're welcome, and do whatever you want with the code.

  • These immutable data structures are nice, but I don't really understand the link between these and a hypothetical immutability-related new langage feature. I'm sure you have something in mind when writing these articles about immutable data stuctures.

    I guess I'm too impatient.

  • I don't understand it either. I would like to give some examples of the kind of coding style that works today as a starting point for answering the question "how can we make this easier and more powerful?" You don't know what sorts of problems even need solving until you try it a few times.

    For instance, we've seen this pattern:

    private sealed class EmptyThing : IThing {/*...*/}

    private static readonly EmptyThing = new EmptyThing();

    public static IThing Empty { get { return empty; } }

    And any time I see a pattern like that, I wonder if the language ought to be making that easier. But it depends on how you slice it. One way would be to attack the singleton:

    private singleton class EmptyThing : IThing { /*...*/}

    public static IThing Empty { get { return new EmptyThing(); // singleton ctor always returns same thing } }

    one way would be to attack the property initializer:

    public static IThing Empty { get ; } = new EmptyThing();

    etc.  

  • This presentation (actually handout), gives nice graphical representation of how functional data structures work:

    www.cis.upenn.edu/~bcpierce/courses/advprog/lectures/lec6.ppt

Page 1 of 3 (35 items) 123