What’s new for the coordination data structures in Beta 2?


Rejoice!  Visual Studio 2010 Beta 2 is upon us and it includes lots of great changes for the Parallel Extensions.  TPL guru Danny Shih has already covered what’s new in TPL and PLINQ aficionado Ed Essey has covered the best way to use LINQ (specifically between Beta 1 and Beta 2) but your favorite set of thread-safe collections and synchronization primitives are in on the action too. 

So what’s changed?  Overall, we’ve been fixing bugs, of course, but we’ve also spent a lot of time fine-tuning these types to get the best performance possible.  That said, performance varies greatly from architecture to architecture and from scenario to scenario, so make sure you stick on your detective hat, fire up the Concurrency Visualizer and go sleuthing around your app’s performance profile to ensure that you’re using the right types in the right way.  We’re always happy to see perf results and field perf-related questions on our forum.

I’m not going to talk about specific bugs or perf improvements here but let’s jump into functionality that’s been added or changed:

Barrier
In Beta 1, Barrier tracked phase numbers using a 32-bit integer.  While most Barrier scenarios come nowhere near exhausting a range of 4+ billion phases, for long-running services it’s not infeasible.  That’s why we’ve decided to change Barrier to track phases using a 64-bit integer (so check any code that uses the phase numbers returned from AddParticipant{s} or read from CurrentPhaseNumber).  We encourage you to take comfort in the fact that your tasks and threads can joyously join together in cyclic fashion for an eternity*.

Also, in previous releases, if you decided to give Barrier a Post Phase Action to execute when all threads completed a phase, any exceptions thrown from that action would be thrown unwrapped and only on the thread that actually ended up executing the action.  We’ve since realized that that’s a little silly.  If your Post Phase Action faulted, more than likely, your algorithm is corrupt but only one thread would know about it.  In Beta 2, all of your threads participating in a Barrier are now privy to these kinds of failures: any exceptions thrown from your action are wrapped in a BarrierPostPhaseException and thrown on all the threads that participated in that phase.
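To see the new behavior in action, here’s a minimal sketch (the phase-validation logic and names are purely illustrative): a post-phase action that throws causes every participant’s SignalAndWait to surface a BarrierPostPhaseException, and CurrentPhaseNumber is now a 64-bit long:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class BarrierPostPhaseDemo
{
    static void Main()
    {
        // CurrentPhaseNumber is now a long, so long-running services
        // won't wrap around after 4+ billion phases.
        var barrier = new Barrier(3, b =>
        {
            // Simulate a validation check failing at the end of phase 0.
            if (b.CurrentPhaseNumber == 0)
                throw new InvalidOperationException("phase validation failed");
        });

        int observers = 0;
        var tasks = new Task[3];
        for (int i = 0; i < 3; i++)
        {
            tasks[i] = Task.Factory.StartNew(() =>
            {
                try
                {
                    barrier.SignalAndWait();
                }
                catch (BarrierPostPhaseException)
                {
                    // In Beta 2, every participant observes the failure,
                    // not just the thread that ran the post-phase action.
                    Interlocked.Increment(ref observers);
                }
            });
        }
        Task.WaitAll(tasks);
        Console.WriteLine(observers); // all 3 participants saw the exception
    }
}
```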

*Not really an eternity. A really really long time though. 

BlockingCollection<T>
Good ol’ BlockingCollection<T>. It gives producer and consumer threads everywhere a common ground upon which to communicate, brimming with thread-safety at every API.  What’s that you say?  In Beta 1, BlockingCollection<T>.Add() and BlockingCollection<T>.CompleteAdding() were not thread-safe with respect to each other?  Using them concurrently might have corrupted the data structure, so in Beta 1 you couldn’t do something like a nice parallel graph traversal?

var targetNode = …;
var bc = new BlockingCollection<Node>(startingNodes);

// Since we expect GetConsumingEnumerable to block, limit parallelism to the
// number of procs, avoiding too much thread injection.
var parOpts = new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount };

Parallel.ForEach(bc.GetConsumingEnumerable(), parOpts, (node, loop) =>
{
    if (node == targetNode)
    {
        Console.WriteLine("hooray!");
        bc.CompleteAdding();
        loop.Stop();
    }
    else
    {
        foreach (var neighbor in node.Neighbors)
            bc.Add(neighbor);
    }
});

Well worry no longer.  Our team of concurrency-geniuses has crammed even more thread-safety into BlockingCollection<T> for your graph-traversing pleasure by making these two methods thread-safe. (Note that Add() will now throw an exception if called after CompleteAdding() has been called so with Beta 2, you’ll need to put a catch block around the call to Add in this example.)
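As a minimal sketch of that new behavior (deliberately not a full producer/consumer app): a call to Add that arrives after CompleteAdding now throws InvalidOperationException rather than corrupting the collection, so a racing producer can catch it and stop producing:

```csharp
using System;
using System.Collections.Concurrent;

class CompleteAddingDemo
{
    static void Main()
    {
        var bc = new BlockingCollection<int>();
        bc.Add(1);
        bc.CompleteAdding();

        // In Beta 2, Add and CompleteAdding are thread-safe with respect
        // to each other; an Add that loses the race simply throws.
        try
        {
            bc.Add(2);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("adding is complete; producer stops");
        }
    }
}
```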

ConcurrentDictionary<TKey,TValue>
Since Beta 1 was released, we’ve gotten a lot of feedback, both internal and external, about ConcurrentDictionary<TKey,TValue>.  CD is slightly more special than the other concurrent collections added in .NET 4: unlike the other collections, it provides direct access to all of its elements.  This, of course, supports a whole different set of scenarios.  While ConcurrentStack<T>, ConcurrentQueue<T>, and ConcurrentBag<T> are really about processing arbitrary elements, ConcurrentDictionary<TKey,TValue> is about processing elements associated with known keys.  With the former, threads only race on access to the end elements of the collection.  With the latter, access to every element may result in a race.  That said, in Beta 1, it was very difficult to do any multi-step operation on the elements of a ConcurrentDictionary<TKey,TValue>.  Suppose you just wanted to create a simple thread-safe cache.  You were forced to check whether the value existed and add it if it did not.  Of course, in the time between checking for the item and adding it, an element with the same key could have been added, forcing you to check for that condition and try again.

Enter two new atomic multi-step operations* on CD: GetOrAdd and AddOrUpdate.

GetOrAdd allows you to specify a key as well as a value (or a delegate to produce the value).  The method will atomically check for the existence of the given key and, if the key already exists, return the existing value.  If the key does not exist, the method will add the value you’ve specified (or the one produced by your delegate).  For example:

private ConcurrentDictionary<string, Data> _cache = 
    new ConcurrentDictionary<string,Data>();

// on multiple threads
public Data ProcessRequest(string input)
{
    return _cache.GetOrAdd(input, (key) =>
    {
         ...;
         return ...;
    });
}

AddOrUpdate enables you to either add a new element at a given key or update an existing one if that key already exists, great for scenarios that need thread-safe lookup-tables like counting the number of occurrences of a given string in parallel:

private ConcurrentDictionary<string,int> _counts = 
    new ConcurrentDictionary<string,int>();

// on multiple threads
public int CountRequest(string input)
{
    return _counts.AddOrUpdate(input, 1, (key,old) => old + 1);
}

With AddOrUpdate, you specify the key that you’re interested in, the value to add if the key is not present (or a delegate to create that value), and, finally, a delegate that will generate the new value in the case that the key is already present.  That last delegate receives both the key and the existing value, so it can use them to determine the new value.

Nifty!

*Atomic with regard to other mutating methods on the collection (e.g. TryAdd/TryUpdate/TryRemove/etc.), excluding the execution of the user-provided delegate.  This means that if multiple threads race to call GetOrAdd, all of their delegates may be called, and some delegates may be called multiple times.  Users are expected to roll back any unwanted side effects.
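If those duplicate delegate invocations are a problem (say, the factory is expensive or has side effects), one common workaround, sketched here with a hypothetical cache, is to store Lazy<T> values in the dictionary: GetOrAdd may still construct several Lazy<T> wrappers under a race, but only the one that wins publication ever runs its factory:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class OncePerKeyCache
{
    static int _factoryRuns;
    static readonly ConcurrentDictionary<string, Lazy<int>> _cache =
        new ConcurrentDictionary<string, Lazy<int>>();

    // The dictionary stores Lazy<int>, not int. Racing threads may each
    // create a Lazy wrapper, but only the published one's Value is ever
    // materialized, so the expensive work runs at most once per key.
    static int GetValue(string key)
    {
        var lazy = _cache.GetOrAdd(key,
            k => new Lazy<int>(() =>
            {
                Interlocked.Increment(ref _factoryRuns);
                return k.Length * 2; // stand-in for expensive work
            }));
        return lazy.Value;
    }

    static void Main()
    {
        Console.WriteLine(GetValue("hello")); // 10
        Console.WriteLine(GetValue("hello")); // 10, served from the cache
        Console.WriteLine(_factoryRuns);      // 1: factory ran exactly once
    }
}
```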

ConcurrentLinkedList<T> and ConcurrentLinkedListNode<T>
In each and every software professional’s career there comes a point where they might have to swallow their pride and let a creation they love go.  For some reason or another, their fancy invention ultimately just doesn’t provide enough value to justify its existence.  Now I know we went and got you all excited about ConcurrentLinkedList<T> in Beta 1, but we had to let it go (we did warn you though!).  Unfortunately, with the time we had available, we just couldn’t get it to be both usable and performant.  There are many thread-safe linked list implementations that are very scalable, but usually that scalability is based on some assumption or odd caveat in the design that severely degrades the practicality of the type.  It hurts us deeply to take CLL<T> out, but its performance just wasn’t good enough to ship.  No need to send any flowers.

Lazy<T>, LazyInitializer, and ThreadLocal<T>
Oh boy has Lazy<T> been fun!  In Beta 1, we told you that we renamed LazyInit<T> to Lazy<T>, added a new struct-based version called LazyVariable<T> and moved the thread-local mode over to its own type (ThreadLocal<T>).  Well, in the interest of making these primitives as usable as possible, we’ve gone and done it again!  Here’s a breakdown of the lazy-initialization primitives in Beta 2:

  • Lazy<T> is still a class-based lazy-initialization primitive that’s safe to pass around.  We’ve given it a simple constructor that accepts a bool to indicate whether it needs to be thread-safe or not.* 

    Also, we now detect recursive calls to Value in the valueFactory delegate.  If your valueFactory delegate attempts to access the Value property on the Lazy<T> to which it was passed, an InvalidOperationException will be thrown. 

    Furthermore, we firmed up our stance on exception handling.  In the interest of keeping the type observationally pure, if a valueFactory delegate throws an exception, that exception will be cached and subsequent calls to Value will result in that same exception being rethrown. 
  • LazyExecutionMode has been removed now that we have only two simple initialization modes on Lazy<T>*
  • LazyVariable<T> has also been removed (we said it would be!).  There were just too many issues with having a struct-based version.  If you need lighter-weight lazy initialization, use LazyInitializer.  If you need the lighter weight and an object to pass around, it’s simple enough to create your own struct that uses LazyInitializer’s methods internally. 
  • LazyInitializer still contains all the advanced static methods for lazy initialization patterns.
  • ThreadLocal<T> remains unchanged other than some great perf improvements!

*Though in Beta 2 we reduced the number of execution modes to simply thread-safe and not-thread-safe, customer requests have proven that we still need the three.  By RTM, there will be an additive constructor to support advanced developers that need tighter control over how thread-safety is achieved.  Mostly, this is to enable library writers to use third-party initialization delegates without having to worry if they take locks and will potentially cause a deadlock.  Regardless, feel free to use the constructors provided today as any changes are purely additive.
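As a quick sketch of the Beta 2 behaviors described above (the bool constructor, exactly-once initialization, and exception caching), assuming nothing beyond what’s documented here:

```csharp
using System;

class LazyDemo
{
    static void Main()
    {
        // The bool constructor selects between the two Beta 2 modes:
        // true = thread-safe initialization, false = not thread-safe.
        int runs = 0;
        var lazy = new Lazy<int>(() => { runs++; return 42; }, true);
        Console.WriteLine(lazy.IsValueCreated); // False: nothing run yet
        Console.WriteLine(lazy.Value);          // 42: factory runs here
        Console.WriteLine(runs);                // 1: factory ran exactly once

        // Exceptions thrown from the valueFactory are cached: every
        // subsequent access to Value rethrows the same exception.
        var faulty = new Lazy<int>(
            () => { throw new InvalidOperationException("boom"); }, true);
        for (int i = 0; i < 2; i++)
        {
            try { var v = faulty.Value; }
            catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
        }
    }
}
```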

Serialization
In Beta 1, ConcurrentBag<T>, ConcurrentDictionary<TKey,TValue>, ConcurrentQueue<T>, ConcurrentStack<T>, and Lazy<T> all implemented ISerializable.  Turns out we were a little behind the times: ISerializable is not version tolerant.  We’ve remedied the situation, and all of these types now use Version Tolerant Serialization (VTS) instead, with little to no impact on you!
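Since the change is transparent to callers, a sketch like the following (using the standard BinaryFormatter purely as an example formatter) should round-trip one of these collections unchanged:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

class SerializationDemo
{
    static void Main()
    {
        var dict = new ConcurrentDictionary<string, int>();
        dict.TryAdd("answer", 42);

        // Serialize and immediately deserialize; the VTS-based
        // serialization requires no special handling from the caller.
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, dict);
            stream.Position = 0;
            var copy = (ConcurrentDictionary<string, int>)formatter.Deserialize(stream);
            Console.WriteLine(copy["answer"]); // the value survives the round trip
        }
    }
}
```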

Have fun with Beta 2 and drop us a line. We’d love any feedback you have on the changes!

Josh and the coordination data structures crew

  • In a class I wrote (primarily for web caching) called PartitionedDictionary<TKey, TValue> (it has 16 internal buckets to lower lock contention) I  extensively rely on its implementation of :

    ISafeKeyedCollection.TryAddValue

     bool TryAddValue(T key, T value, out T contains)

    Where contains is either the new addition or the existing value, and it returns true if Add was successful (no prior key).

    The bool is important and allows this kind of functionality :

    CacheLine reservation = new CacheLine(key);
    CacheLine cacheLine;

    if (App.Sessions.TryAddValue(key, reservation, out cacheLine))
    {
        // Did not already exist.
        cacheLine.Value = GetEntity();
    }
    else
    {
        // Existed
        cacheLine.KeepWarm();
    }

    return cacheLine.Value;

    The interface means I can swap out the container for something else in the future, possibly even a wrapper on MS Velocity.

  • Just noticed that I've put T as the type of the key and value.

    Also there's a race on CacheLine.Value - looking at my actual code it turns out that in a caching situation I use it rather differently, including locking a CacheLine.SyncRoot property, bad example above, but you get the point.

  • Hey Josh, pretty cool to see you incorporated (among others) my feedback in the changes you made to ConcurrentDictionary and Lazy<>. You guys really do listen to customers! Keep up the good work.

  • Without running any code (I'm not using Visual Studio and C# for development) - is it true that the parallel graph traversal demo will not stop if node is not found in the graph?

  • gabr - That's true.  This isn't a complete example; it's just meant to illustrate the concept being discussed (namely, that prior to Beta 2 you could corrupt the data structure by calling CompleteAdding and Add concurrently, whereas as of Beta 2 doing so is safe and won't corrupt the data structure).

  • This means that if multiple threads race to call GetOrAdd all of their delegates may be called and some delegates may be called multiple times. Users are expected to rollback any unwanted side-effects.

    Can you explain that in more depth?  Why would a delegate be called multiple times, and how would we even know to "rollback any unwanted side-effects?"

  • If GetOrAdd finds that the key is not in the dictionary, it will invoke the valueFactory delegate in order to produce the value to be added with the specified key.  If multiple threads concurrently call GetOrAdd, they may both concurrently find that the dictionary does not contain the key, and as a result they'll both invoke the valueFactory delegate provided (which may of course be different for each invocation).  Both threads will then proceed to add that new key/value pair into the dictionary, but only one will succeed in doing so; the other will see that another thread added the key/value pair, and thus will return the same value added by the other thread.

    More information is available in the documentation at http://msdn.microsoft.com/en-us/library/ee378677(VS.100).aspx.

  • Can you please explain in detail the parallel situation with AddOrUpdate? The documentation is extremely sparse on this important matter!

    If I have two AddOrUpdate calls at the same time, is it true that my add delegate may get called multiple times, as well as my update delegate? But I am guaranteed that in one of the calls the add delegate result is used and in the other the last update delegate call is used? I want the "value" to be a set, and I don't mind if the update gets called multiple times for the loser, as long as it is called at least against the correct value object, so i can successfully insert into the set.

    I asked this on StackOverflow too here: stackoverflow.com/.../can-you-use-a-concurrentdictionary-for-one-to-many-mapping
