All about Async/Await, System.Threading.Tasks, System.Collections.Concurrent, System.Linq, and more…
ConcurrentDictionary<TKey,TValue> is a new type in the .NET Framework 4, living in the System.Collections.Concurrent namespace. As noted in the MSDN documentation, ConcurrentDictionary “represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently.”
While ConcurrentDictionary implements IDictionary<TKey, TValue> just as does Dictionary<TKey,TValue>, it also has several unique mechanisms for adding / updating key/value pairs in the dictionary, mechanisms specific to its concurrent focus. Here is a summary of when to use which mechanism.
All of these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary. The only caveat to the atomicity of each operation is for those which accept a delegate, namely AddOrUpdate and GetOrAdd. For modifications / writes to the dictionary, ConcurrentDictionary employs fine-grained locking to ensure thread-safety (reads on the dictionary are performed in a lock-free manner); however, these delegates are invoked outside of the locks in order to avoid the myriad of problems that can arise from executing unknown code under a lock. As such, the code executed by these delegates is not subject to the atomicity of the operation.
It is such an awesome type. I found it a pleasure to use in a recent project. GetOrAdd is particularly useful - nice work.
Hi Stephen,
Thanks for the informative overview post on ConcurrentDictionary. I can see that there are a lot of niceties coming out of ConcurrentDictionary; some arising from the need of atomicity, but others are just, well, nice.
For example, AddOrUpdate() sounds like a wonderful addition which would greatly ease the typical dance of "is this item in the dictionary? if not add a default and then modify it, if so take out the value and modify it..."
I've been missing this ever since I used Python's setdefault() method for the built-in dictionary type. (http://docs.python.org/library/stdtypes.html#dict.setdefault)
So, ramblings aside, I have a question: Will conveniences like AddOrUpdate() be back-ported to plain old Dictionary (through extension methods, for instance) so that people aren't tempted to use ConcurrentDictionary just for the conveniences even where they don't need concurrency? (I'd certainly be tempted!)
Thanks,
Daniel
Matt G, I'm glad to hear you've found the type so useful! If you wouldn't mind sharing the details of how you used the type and what kind of project you used it in, we'd be very interested. Feel free to post here or to email me personally at stoub at microsoft dot com.
Daniel, you're very welcome, and I'm also glad you like the design of the type. I've forwarded your comment/request to members of the base class libraries team, as they'll be best able to respond about whether these methods will ever find their way into the non-concurrent counterpart. In the meantime, as you suggest, you can certainly implement such functionality for the non-concurrent Dictionary<TKey,TValue> as extension methods. For example:
public static TValue GetOrAdd<TKey,TValue>(
this Dictionary<TKey,TValue> dictionary,
TKey key,
Func<TKey, TValue> valueFactory)
{
TValue value;
if (!dictionary.TryGetValue(key, out value))
value = valueFactory(key);
dictionary.Add(key, value);
}
return value;
public static TValue AddOrUpdate<TKey,TValue>(
Func<TKey, TValue> addValueFactory,
Func<TKey, TValue, TValue> updateValueFactory)
if (dictionary.TryGetValue(key, out value))
value = updateValueFactory(key, value);
else
value = addValueFactory(key);
dictionary[key] = value;
Daniel, as a follow-up, the recommendation from the base class libraries team is to file a Connect suggestion at https://connect.microsoft.com/VisualStudio/content/content.aspx?ContentID=14631.
Thanks for the follow-up Stephen.
I've filed the following Connect sugggestion:
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=524451
You're right that it's trivial to implement these extension methods yourself... I'm just greedy and want all the convenience right out of the box :-)
(Another added benefit is that it would make for a more consistent API between the ConcurrentDictionary and existing Dictionary types.)
First word typo: ConcurrentyDictionary :)
Thanks, Mark! Fixed :)
In a previous comment on this class I alluded to a very similar class I wrote with many internal buckets to lower lock contention.
I extracted the important thread safe members from the class to form:
public interface ISafeKeyedCollection<K, T>
IDictionary<K, T> AsSnapshot { get; }
void Clear();
void ForceAddValue(K key, T value);
bool Remove(K key);
bool Remove(KeyValuePair<K, T> item);
bool TryAddValue(K key, T value, out T contains);
bool TryGetValue(K key, out T value);
Will you be doing a similar thing?
Hi Luke-
ConcurrentDictionary does use fine-grained locking internally, i.e. multiple locks rather than a single lock in order to minimize lock contention. The number of locks employed is actually controllable as well through the concurrencyLevel parameter available on several of the dictionary's constructors. Note, however, that this locking is done only for writes; reads on the dictionary are implemented in a lock-free manner.
Thanks and sorry, I've clearly confused you (either that or you're politely telling me it's a rubbish idea ;)) but...
I meant about the interface to allow methods to use polymorphism rather than the specific type, and to allow for future as-yet-un-dreamt-of keyed collections with the same behaviour.
In my app I pass my cache around using its ISafeKeyedCollection interface so I can switch it out for a more optimal collection or one that uses an alternative backing store or perhaps even a wrapper that uses your cool collections.
Thanks, Luke
Ah, I did misunderstand what you were suggesting. It would certainly be technically possible to implement such an interface on ConcurrentDictionary, though that would of course mean such an interface would need to ship in the Framework, and we're typically skittish about doing such a thing when we don't already have multiple types that would be able to implement the interface, e.g. those "future as-yet-un-dreamt-of keyed collections with the same behaviour" ;)
You can of course create a wrapper for ConcurrentDictionary that implements your own ISafeKeyedCollection interface, delegating to the wrapped dictionary, and then still get the benefits of it.
Hope that helps.
Hi Stephen:
Awesome class. Question on GetOrAdd(TKey key, Func<TKey, TValue> valueFactory), which seems to be an obvious choice for a cache implementation. Ideally, I'd like it to behave like this:
1. I ask for data for IBM. The data isn't in the cache, so I go and retrieve it, then cache it.
2. I ask for data for MSFT. The data isn't in the cache, so I begin retrieving it.
3. While still getting data for MSFT, I ask for data for IBM, and immediately receive it from the cache.
4. While still getting data for MSFT, I ask for data for GOOG. The data isn't in the cache, so I begin retrieving it.
5. While still getting data for MSFT, I ask for data for MSFT again, and, since that key's data is already currently being retrieved, I don't begin a new retrieval but wait for the first one to complete and then return the data from the first MSFT retrieval.
From reading the description, it sounds like GetOrAdd(TKey key, Func<TKey, TValue> valueFactory) supports steps 1, 2, 3, and 4. It's not clear to me whether it supports 5. Any insight here?
Thanks much,
Hi Daniel-
ConcurrentDictionary itself doesn't provide the functionality for #5, but it's easily layered in. In fact, in our Beta 2 samples, there's a type called AsyncCache that provides the functionality you're looking for (and more). Basically, it's a very thin wrapper around a ConcurrentDictionary<TKey, Lazy<Task<TValue>>>. The Lazy<> ensures that one and only one Task instance is generated per key, and the Task enables multiple things, including synchronously waiting for the data to be returned or asynchronously getting callbacks when the data has arrived. Since Task can be used to represent asynchronous I/O operations without blocking threads, you can get some great scalability out of this approach.
Look for the type in \ParallelExtensionsExtras\CoordinationDataStructures\AsyncCoordination\AsyncCache.cs.
And for reference the samples are posted at http://code.msdn.microsoft.com/ParExtSamples.
Stephen - perfect, thanks! Using a Lazy<Task> makes perfect sense.