ConcurrentDictionary’s support for adding and updating

ConcurrentDictionary’s support for adding and updating

  • Comments 21

ConcurrentDictionary<TKey,TValue> is a new type in the .NET Framework 4, living in the System.Collections.Concurrent namespace.  As noted in the MSDN documentation, ConcurrentDictionary “represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently.” 

While ConcurrentDictionary implements IDictionary<TKey, TValue> just as does Dictionary<TKey,TValue>, it also has several unique mechanisms for adding / updating key/value pairs in the dictionary, mechanisms specific to its concurrent focus.  Here is a summary of when to use which mechanism.

  • If you want to…
    • Add a new item to the dictionary only if the key doesn’t currently exist in the dictionary…
      • Use TryAdd.  TryAdd accepts the key and the value, and adds the pair to the dictionary if the key doesn’t currently exist in the dictionary.  The method returns whether or not the new pair was added.
        • public bool TryAdd(TKey key, TValue value)
    • Update the value for an existing key in the dictionary, but only if the existing value for that key is equal to a specific value…
      • Use TryUpdate.  TryUpdate accepts the key and the new value to set that key to, but it also accepts the value that the key in the dictionary must currently contain in order for the update to succeed.  You can think of TryUpdate like Interlocked.CompareExchange but for an element in the dictionary.
        • public bool TryUpdate(TKey key, TValue newValue, TValue comparisonValue)
    • Store a key/value pair into the dictionary unconditionally, overwriting any value for that key if the key already exists…
      • Use the indexer’s setter, e.g. dictionary[key] = newValue.
        • public TValue this[TKey key] { get; set; }
    • Add a key/value pair to the dictionary if the key doesn’t exist in the dictionary, or if the key does exist, update the value for the key based on the key’s existing value…
      • Use AddOrUpdate.  AddOrUpdate has two overloads:
        • One overload accepts the key as well as two delegates.  The first delegate is used if the key doesn’t exist in the dictionary; it accepts the key and returns the value that should be added for the key.  The second delegate is used if the key does exist: it accepts both the key and the current value for the key, and it returns the new value that should be set for the key.
          • public TValue AddOrUpdate(TKey key, Func<TKey, TValue> addValueFactory, Func<TKey, TValue, TValue> updateValueFactory)
        • The other overload accepts the key, an add value, and the update delegate.  This is exactly the same as the former overload, except that if the key isn’t in the dictionary, the provided value is added for the key, rather than invoking a delegate to get the necessary value
          • public TValue AddOrUpdate(TKey key, TValue addValue, Func<TKey, TValue, TValue> updateValueFactory)
    • Get the value for a key in the dictionary, adding the value to the dictionary (and returning it) if the key doesn’t exist…
      • Use GetOrAdd.  You can think of this as lazy-initialization for a key/value pair in the dictionary, getting the value if it’s there, or adding it and then getting it if it’s not.  As with AddOrUpdate, GetOrAdd provides two overloads: one takes the value to be added if the key doesn’t exist, and the other takes a delegate that will generate the value if the key doesn’t exist.
        • public TValue GetOrAdd(TKey key, TValue value)
        • public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)

All of these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary.  The only caveat to the atomicity of each operation is for those which accept a delegate, namely AddOrUpdate and GetOrAdd.  For modifications / writes to the dictionary, ConcurrentDictionary employs fine-grained locking to ensure thread-safety (reads on the dictionary are performed in a lock-free manner); however, these delegates are invoked outside of the locks in order to avoid the myriad of problems that can arise from executing unknown code under a lock.  As such, the code executed by these delegates is not subject to the atomicity of the operation.

Leave a Comment
  • Please add 5 and 8 and type the answer here:
  • Post
  • It is such an awesome type. I found it a pleasure to use in a recent project. GetOrAdd is particularly useful - nice work.

  • Hi Stephen,

    Thanks for the informative overview post on ConcurrentDictionary. I can see that there are a lot of niceties coming out of ConcurrentDictionary; some arising from the need of atomicity, but others are just, well, nice.

    For example, AddOrUpdate() sounds like a wonderful addition which would greatly ease the typical dance of "is this item in the dictionary? if not add a default and then modify it, if so take out the value and modify it..."

    I've been missing this ever since I used Python's setdefault() method for the built-in dictionary type. (http://docs.python.org/library/stdtypes.html#dict.setdefault)

    So, ramblings aside, I have a question: Will conveniences like AddOrUpdate() be back-ported to plain old Dictionary (through extension methods, for instance) so that people aren't tempted to use ConcurrentDictionary just for the conveniences even where they don't need concurrency? (I'd certainly be tempted!)

    Thanks,

    Daniel

  • Matt G, I'm glad to hear you've found the type so useful!  If you wouldn't mind sharing the details of how you used the type and what kind of project you used it in, we'd be very interested.  Feel free to post here or to email me personally at stoub at microsoft dot com.

    Daniel, you're very welcome, and I'm also glad you like the design of the type.  I've forwarded your comment/request to members of the base class libraries team, as they'll be best able to respond about whether these methods will ever find their way into the non-concurrent counterpart.  In the meantime, as you suggest, you can certainly implement such functionality for the non-concurrent Dictionary<TKey,TValue> as extension methods.  For example:

    public static TValue GetOrAdd<TKey,TValue>(

       this Dictionary<TKey,TValue> dictionary,

       TKey key,

       Func<TKey, TValue> valueFactory)

    {

       TValue value;

       if (!dictionary.TryGetValue(key, out value))

       {

           value = valueFactory(key);

           dictionary.Add(key, value);

       }

       return value;

    }

    public static TValue AddOrUpdate<TKey,TValue>(

       this Dictionary<TKey,TValue> dictionary,

       TKey key,

       Func<TKey, TValue> addValueFactory,

       Func<TKey, TValue, TValue> updateValueFactory)

    {

       TValue value;

       if (dictionary.TryGetValue(key, out value))

       {

           value = updateValueFactory(key, value);

       }

       else

       {

           value = addValueFactory(key);

       }

       dictionary[key] = value;

       return value;

    }

  • Daniel, as a follow-up, the recommendation from the base class libraries team is to file a Connect suggestion at https://connect.microsoft.com/VisualStudio/content/content.aspx?ContentID=14631.

  • Thanks for the follow-up Stephen.

    I've filed the following Connect sugggestion:

    https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=524451

    You're right that it's trivial to implement these extension methods yourself... I'm just greedy and want all the convenience right out of the box :-)

    (Another added benefit is that it would make for a more consistent API between the ConcurrentDictionary and existing Dictionary types.)

    Daniel

  • First word typo: ConcurrentyDictionary :)

  • Thanks, Mark!  Fixed :)

  • In a previous comment on this class I alluded to a very similar class I wrote with many internal buckets to lower lock contention.

    I extracted the important thread safe members from the class to form:

       public interface ISafeKeyedCollection<K, T>

       {

           IDictionary<K, T> AsSnapshot { get; }

           void Clear();

           void ForceAddValue(K key, T value);

           bool Remove(K key);

           bool Remove(KeyValuePair<K, T> item);

           bool TryAddValue(K key, T value, out T contains);

           bool TryGetValue(K key, out T value);

       }

    Will you be doing a similar thing?

  • Hi Luke-

    ConcurrentDictionary does use fine-grained locking internally, i.e. multiple locks rather than a single lock in order to minimize lock contention.  The number of locks employed is actually controllable as well through the concurrencyLevel parameter available on several of the dictionary's constructors. Note, however, that this locking is done only for writes; reads on the dictionary are implemented in a lock-free manner.

  • Thanks and sorry, I've clearly confused you (either that or you're politely telling me it's a rubbish idea ;)) but...

    I meant about the interface to allow methods to use polymorphism rather than the specific type, and to allow for future as-yet-un-dreamt-of keyed collections with the same behaviour.

    In my app I pass my cache around using its ISafeKeyedCollection interface so I can switch it out for a more optimal collection or one that uses an alternative backing store or perhaps even a wrapper that uses your cool collections.

    Thanks, Luke

  • Hi Luke-

    Ah, I did misunderstand what you were suggesting.  It would certainly be technically possible to implement such an interface on ConcurrentDictionary, though that would of course mean such an interface would need to ship in the Framework, and we're typically skittish about doing such a thing when we don't already have multiple types that would be able to implement the interface, e.g. those "future as-yet-un-dreamt-of keyed collections with the same behaviour" ;)

    You can of course create a wrapper for ConcurrentDictionary that implements your own ISafeKeyedCollection interface, delegating to the wrapped dictionary, and then still get the benefits of it.

    Hope that helps.

  • Hi Stephen:

    Awesome class. Question on GetOrAdd(TKey key, Func<TKey, TValue> valueFactory), which seems to be an obvious choice for a cache implementation.  Ideally, I'd like it to behave like this:

    1. I ask for data for IBM.  The data isn't in the cache, so I go and retrieve it, then cache it.

    2. I ask for data for MSFT.  The data isn't in the cache, so I begin retrieving it.

    3. While still getting data for MSFT, I ask for data for IBM, and immediately receive it from the cache.

    4. While still getting data for MSFT, I ask for data for GOOG. The data isn't in the cache, so I begin retrieving it.

    5. While still getting data for MSFT, I ask for data for MSFT again, and, since that key's data is already currently being retrieved, I don't begin a new retrieval but wait for the first one to complete and then return the data from the first MSFT retrieval.

    From reading the description, it sounds like GetOrAdd(TKey key, Func<TKey, TValue> valueFactory) supports steps 1, 2, 3, and 4.  It's not clear to me whether it supports 5.  Any insight here?

    Thanks much,

    Daniel

  • Hi Daniel-

    ConcurrentDictionary itself doesn't provide the functionality for #5, but it's easily layered in.  In fact, in our Beta 2 samples, there's a type called AsyncCache that provides the functionality you're looking for (and more).  Basically, it's a very thin wrapper around a ConcurrentDictionary<TKey, Lazy<Task<TValue>>>.  The Lazy<> ensures that one and only one Task instance is generated per key, and the Task enables multiple things, including synchronously waiting for the data to be returned or asynchronously getting callbacks when the data has arrived.  Since Task can be used to represent asynchronous I/O operations without blocking threads, you can get some great scalability out of this approach.

    Look for the type in \ParallelExtensionsExtras\CoordinationDataStructures\AsyncCoordination\AsyncCache.cs.

  • And for reference the samples are posted at http://code.msdn.microsoft.com/ParExtSamples.

  • Stephen - perfect, thanks!  Using a Lazy<Task> makes perfect sense.

Page 1 of 2 (21 items) 12