Would you like a MultiDictionary?

Would you like a MultiDictionary?

Rate This
  • Comments 73

We’ve recently shipped new collection types on NuGet with our Immutable Collections package. NuGet allows us to ship prerelease and experimental versions of libraries to gather feedback from the community. In this post, our software developer intern Ian Hays will talk about his intern project: an experimental NuGet package containing advanced collection types. -- Immo

Dictionary provides a mapping between a key and a single value, and is one of the most used collection types in the .NET Framework. Programs often need a mapping between one key and multiple values. While the functionality can be composed using existing collection types, it can be error prone due to corner cases.

Today we’re releasing an experimental NuGet package with a new related type, MultiDictionary. The MultiDictionary is a simple, intuitive collection that essentially functions like a Dictionary<TKey, ICollection<TValue>> but abstracts the ICollection<TValue>. A more precise definition is that it is a Dictionary<TKey, TValue> that allows multiple TValues to be added for any TKey (i.e. keys don’t have to be unique).

What’s wrong with Dictionary?

The .NET Framework already includes an efficient dictionary implementation that can be used with an ICollection as the value type parameter, so why bother making MultiDictionary at all? The short answer is simplicity. The long answer is also simplicity.

What’s your favorite data structure? Mine is the dictionary; I love the near constant time operations, the huge number of use cases, the cleanliness! Although the dictionary has a wide variety of uses, there are times when I want to add multiple values per key and Dictionary just doesn’t quite cut it. In those situations the solution is simple: just build a Dictionary<TKey, List<TValue>>!

The issue with the dictionary of list is nearly every call to the Dictionary has to be wrapped in logic to check the current state of the dictionary before adding/removing/indexing etc. I’m never satisfied with the idea of surrounding my dictionary calls with a series of if statements, so I end up coding an entirely new data structure to wrap my dictionary of lists. I’ve had to do this more times than I’m proud of, which is why I’m pleased to code up that data structure just one last time.

Introducing MultiDictionary

I could go into detail on the API and characteristics of MultiDictionary, but I’ll save that for later; let’s first look at some examples of typical usage for MultiDictionary.

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.Add("key", 1);
myDictionary.Add("key", 2);
myDictionary.Add("key", 3);
//myDictionary["key"] now contains the values 1, 2, and 3

When we index into our myDictionary, we get an ICollection<int> that contains the elements 1, 2, and 3. If the key wasn’t in the MultiDictionary, then an empty ICollection associated with that key will be returned.

All ICollection instances returned by indexing into the MultiDictionary function as indirections to the collections inside of our MultiDictionary, which means that as the MultiDictionary changes so does the ICollection and vice versa. Consider the following example that illustrates this:

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.Add("key", 1);
ICollection<int> myCollection = myDictionary["key"];
myCollection.Add(2);
//myCollection now contains the values 1, and 2

The MultiDictionary also has methods for adding or removing one key-value pair at a time as well as adding or removing multiple values per key.

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.AddRange("key1", new int[] { 1, 2, 3 });
myDictionary.AddRange("key2", new int[] { 1, 2, 3 });
myDictionary.Remove("key1");
myDictionary.RemoveItem("key2", 2);
//myDictionary now contains key2 with values 1 and 3

There are a few more interesting and useful methods inside of the MultiDictionary, but I’ll let you explore those on your own!

Why should I use MultiDictionary?

Let’s look at some benefits of the MultiDictionary:

  • Adding a single key-value pair is far simpler with a MultiDictionary than with a Dictionary of lists

    //Adding with a MultiDictionary<TKey,TValue>
    myDictionary.Add(1, 2);

    //Adding with a Dictionary<TKey, ICollection<TValue>>
    if (singleDictionary.ContainsKey(1))
        singleDictionary[1].Add(2);
    else
        singleDictionary.Add(1, new int[] { 2 });
  • Adding multiple values to a Key is supported in the MultiDictionary through the AddRange method

    //Adding multiple values with a MultiDictionary<TKey,TValue>
    myDictionary.AddRange(1, new int[] { 1, 2, 3 });

    //Adding multiple values with a Dictionary<TKey, ICollection<TValue>>
    ICollection<int> singleDictionaryCollection;
    if (singleDictionary.TryGetValue(1, out singleDictionaryCollection))
    {
        foreach (int toAdd in (new int[] { 1, 2, 3 }))
            singleDictionaryCollection.Add(toAdd);
    }
    else
    {
        singleDictionary.Add(1, new int[] { 1, 2, 3 });
    }
  • Indexing into the MultiDictionary will never throw an exception (unless the key is null) and will always return an ICollection that changes as the MultiDictionary changes and vice versa.

  • You can remove a single key value pair or all of the values associated with a key through the RemoveItem and Remove methods, respectively.

  • The Values property returns an ICollection<TValue> instead of an ICollection<ICollection<TValue>> like a Dictionary<TKey, ICollection<TValue>> would. This makes it easier to iterate through the values in the dictionary.

Try it out!

Enough reading, try it out in your favorite .NET language and let us know what you think! The Alpha release of the MultiDictionary is available on NuGet. Please let us know what you think by leaving a comment on this post or by contacting us via the contact page.

Thanks for reading, and enjoy!

Leave a Comment
  • Please add 5 and 5 and type the answer here:
  • Post
  • This looks like a nice data structure as long as you're not working with a multi-threaded application. As soon as you think going multi-thread, you'll have a lot of synchronization management. This gets even worst with the ICollections you get back if they actually get "updated" as you add/remove items from the MultiDictionary. If getting the ICollection out of the MultiDictionary would return an immutable collection, it would already help a lot with multi-threading scenarios.

    I haven't looked at the source code, but how will the MultiDictionary handle multi-thread inserts/removes? I guess best practice will be to lock on the SyncRoot property?

    Don't get me wrong, this is a nice addition to the available collections and I am looking forward to use it in specific cases. I would be even happier to use it in more places if it was designed with multi-threading at its core!

  • @biiiipy Good catch! The second code block should read:

    MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();

    myDictionary.Add("key", 1);

    ICollection<int> myCollection = myDictionary["key"];

    myDictionary.Add("key", 2);

    //myCollection now contains the values 1, and 2

    although your correction is also an equally valid and useful example:

    MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();

    myDictionary.Add("key", 1);

    ICollection<int> myCollection = myDictionary["key"];

    myCollection.Add("2);

    //myDictionary["key] now contains the values 1, and 2

    @Aron Parker I hope so!

  • The code for adding multiple values to Dictionary<TKey, ICollection<TValue>> contains a bug (you can't Add() to an array). Was that intentional to show another issue with that approach?

  • @Kai Eichinger My pleasure! I'm not sure what the process for open-sourcing something like this would be, but my impression is that it's rather difficult. I'll let @Immo Landwerth [MSFT] answer :)

  • @biiiipy: Good catch. Fixed.

    @Aron Parker: We'll most likely make this a NuGet package. The current NuGet package (Microsoft.Collections.Experimental) will not the permanent home for this type.

    @Kai Eichinger: OSS is on our radar (see www.dotnetfoundation.org) but instead of creating yet another project we're thinking of how we would do OSS the right way. Stay tuned.

    @Miky Lestat, fly: The inconsistency do Dictionary<K,V> came to my mind as well. However, when you look at scenarios where V is a collection you most often want the behavior MultiDictionary has. May be should create another API for this, though.

    Controlling the underlying collection type is an interesting idea, we'll look into that.

    @xor88: Excellent points.

    @Sinix: Agreed, ILookup would certainly make sense.

    @Simon Le Bourdais-Cabana: Most of our collections aren't thread-safe (except for concurrent and immutable collections). Our goal was to create a great mutable MultiDictionary<K,V> that fits well with Dictionary<K, V>, so supporting multi-threading wasn't a goal.

    What threading scenarios would you like us to support?

    @svick: Which array code sample are you referring to?

  • I've been using PowerCollection's MultiDictionary since .NET 2.0. A nice feature of PowerCollection's implementation is that it allows you to specify whether the values are stored in a set or a bag (allowDuplicateValues). The class is also extensible so you can implement it with your own value collection. I will definitely give this a try.

  • Personally I'd have preferred if the return type was IReadOnlyList<V> although I like that additions to the MD are reflected into the retrieved collections

  • @Immo, the one with the comment "Adding multiple values with a Dictionary<TKey, ICollection<TValue>>".

  • Nice, but won't get into our production code until it ships with the .net framework.  Our enterprise applications are prevented from beta, alpha, or unsupported, or supported by 1 developer libraries.

  • @Guy Godin and @Miky Lestat Adding other options for the ICollection type is something we're looking into :) Would the ability to choose between duplicate/unique values be fine, or did you also want the ability to use any class that implemented ICollection as the internal ICollection type?

    @Renato The reasoning behind returning an ICollection instead of an IReadOnlyCollection is that with the ICollection we can allow modification of the MultiDictionary through this possibly more convenient live-view. Why would you prefer that they be ReadOnly?

  • Thank you so much guys! This is surely a very wanted data structure.

    Speaking of "controlling underlying collection", I just used a "Dictionary<string, Hash<Tuple<string, FileSystemWatcher>>>" in WebEssential's code here: github.com/.../files

    With MultiDictionary, it would become MultiDictionary<string, Tuple<string, FileSystemWatcher>>, but if we want ICollection to be HashSet type (especially to perform Contains() in O(1)), can we do that with this with current implementation?

  • Concurrent multidictionary

  • +1 for ILookup/IGrouping support. I've messed around with my own implementations far more often than I've liked.

  • Thanks a lot! I discovered that I need a MultiDictionary just 2 weeks ago, emulated it with Dictionary<K, List<V>>.

  • It was implemented 4 years ago in Ninject named "Multimap"

    github.com/.../Multimap.cs

Page 2 of 5 (73 items) 12345