Would you like a MultiDictionary?

Would you like a MultiDictionary?

Rate This
  • Comments 75

We’ve recently shipped new collection types on NuGet with our Immutable Collections package. NuGet allows us to ship prerelease and experimental versions of libraries to gather feedback from the community. In this post, our software developer intern Ian Hays will talk about his intern project: an experimental NuGet package containing advanced collection types. -- Immo

Dictionary provides a mapping between a key and a single value, and is one of the most used collection types in the .NET Framework. Programs often need a mapping between one key and multiple values. While the functionality can be composed using existing collection types, it can be error prone due to corner cases.

Today we’re releasing an experimental NuGet package with a new related type, MultiDictionary. The MultiDictionary is a simple, intuitive collection that essentially functions like a Dictionary<TKey, ICollection<TValue>> but abstracts the ICollection<TValue>. A more precise definition is that it is a Dictionary<TKey, TValue> that allows multiple TValues to be added for any TKey (i.e. keys don’t have to be unique).

What’s wrong with Dictionary?

The .NET Framework already includes an efficient dictionary implementation that can be used with an ICollection as the value type parameter, so why bother making MultiDictionary at all? The short answer is simplicity. The long answer is also simplicity.

What’s your favorite data structure? Mine is the dictionary; I love the near constant time operations, the huge number of use cases, the cleanliness! Although the dictionary has a wide variety of uses, there are times when I want to add multiple values per key and Dictionary just doesn’t quite cut it. In those situations the solution is simple: just build a Dictionary<TKey, List<TValue>>!

The issue with the dictionary of list is nearly every call to the Dictionary has to be wrapped in logic to check the current state of the dictionary before adding/removing/indexing etc. I’m never satisfied with the idea of surrounding my dictionary calls with a series of if statements, so I end up coding an entirely new data structure to wrap my dictionary of lists. I’ve had to do this more times than I’m proud of, which is why I’m pleased to code up that data structure just one last time.

Introducing MultiDictionary

I could go into detail on the API and characteristics of MultiDictionary, but I’ll save that for later; let’s first look at some examples of typical usage for MultiDictionary.

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.Add("key", 1);
myDictionary.Add("key", 2);
myDictionary.Add("key", 3);
//myDictionary["key"] now contains the values 1, 2, and 3

When we index into our myDictionary, we get an ICollection<int> that contains the elements 1, 2, and 3. If the key wasn’t in the MultiDictionary, then an empty ICollection associated with that key will be returned.

All ICollection instances returned by indexing into the MultiDictionary function as indirections to the collections inside of our MultiDictionary, which means that as the MultiDictionary changes so does the ICollection and vice versa. Consider the following example that illustrates this:

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.Add("key", 1);
ICollection<int> myCollection = myDictionary["key"];
myCollection.Add(2);
//myCollection now contains the values 1, and 2

The MultiDictionary also has methods for adding or removing one key-value pair at a time as well as adding or removing multiple values per key.

MultiDictionary<string, int> myDictionary = new MultiDictionary<string, int>();
myDictionary.AddRange("key1", new int[] { 1, 2, 3 });
myDictionary.AddRange("key2", new int[] { 1, 2, 3 });
myDictionary.Remove("key1");
myDictionary.RemoveItem("key2", 2);
//myDictionary now contains key2 with values 1 and 3

There are a few more interesting and useful methods inside of the MultiDictionary, but I’ll let you explore those on your own!

Why should I use MultiDictionary?

Let’s look at some benefits of the MultiDictionary:

  • Adding a single key-value pair is far simpler with a MultiDictionary than with a Dictionary of lists

    //Adding with a MultiDictionary<TKey,TValue>
    myDictionary.Add(1, 2);

    //Adding with a Dictionary<TKey, ICollection<TValue>>
    if (singleDictionary.ContainsKey(1))
        singleDictionary[1].Add(2);
    else
        singleDictionary.Add(1, new int[] { 2 });
  • Adding multiple values to a Key is supported in the MultiDictionary through the AddRange method

    //Adding multiple values with a MultiDictionary<TKey,TValue>
    myDictionary.AddRange(1, new int[] { 1, 2, 3 });

    //Adding multiple values with a Dictionary<TKey, ICollection<TValue>>
    ICollection<int> singleDictionaryCollection;
    if (singleDictionary.TryGetValue(1, out singleDictionaryCollection))
    {
        foreach (int toAdd in (new int[] { 1, 2, 3 }))
            singleDictionaryCollection.Add(toAdd);
    }
    else
    {
        singleDictionary.Add(1, new int[] { 1, 2, 3 });
    }
  • Indexing into the MultiDictionary will never throw an exception (unless the key is null) and will always return an ICollection that changes as the MultiDictionary changes and vice versa.

  • You can remove a single key value pair or all of the values associated with a key through the RemoveItem and Remove methods, respectively.

  • The Values property returns an ICollection<TValue> instead of an ICollection<ICollection<TValue>> like a Dictionary<TKey, ICollection<TValue>> would. This makes it easier to iterate through the values in the dictionary.

Try it out!

Enough reading, try it out in your favorite .NET language and let us know what you think! The Alpha release of the MultiDictionary is available on NuGet. Please let us know what you think by leaving a comment on this post or by contacting us via the contact page.

Thanks for reading, and enjoy!

Leave a Comment
  • Please add 7 and 5 and type the answer here:
  • Post
  • @Ian Both actually. On a side note, a Bag collection would be useful. There's a ConcurrentBag but no regular Bag collection which allows duplicate values and fast Contains operations.

    Also extension methods on IGrouping would be helpful (ToMultiDictionary)

  • @Miky Lestat, fly: I disagree. Dictionary throw a KeyNotFoundException because when you call the indexer, you expect to get exactly one value, and if there is no value for this key, there is nothing to do but throw.

    However, for MultiDictionary, the contract is different: you don't expect one value, but a collection of values ; whether this collection is empty or not is mostly irrelevant. So it makes perfect sense to return an empty collection.

    Furthermore, the contract fulfilled by this class is mostly the one defined by the ILookup interface (except that it's writable), whose only implementation (the Lookup class) returns an empty collection if the key is not found. So returning an empty collection is consistent with the existing Lookup behavior.

    Personally I like the idea of this class, but I'm not so sure about its name ; it's more like a Lookup than like a Dictionary, so the name MultiDictionary is confusing. I would prefer something with Lookup in the name...

  • MultiDictionary is a nice addition. It always annoyed me to write a bunch of code to handle collection in dictionary.

  • I use this pattern all the time! I would love this data structure to be baked into .NET

  • I would like this too. I've written  my own implementation in the past using a special class that act as wrapper for the values.

  • I like this a lot, however my more common use case is nested dictionaries which come with the same issues that building DIctionary<TKey, List<TValue>>, so any chance we'll see the equivalent of <Dictionary<Tkey, Dictionary<TKey2,TValue>> from you guys?

  • This is interesting and I can see it's application, but often I've found that the List<Value> probably indicates a missing object in my data model, and then you are back to the normal IDictionary<TKey, TValue> scenario.

  • To answer the question, yes, this is very useful. I've encountered the need for this several times over the past couple of years. It would reduce a lot of scrappy code.

  • This could be very useful indeed.

    But how about the opposite, a dictionary with multiple keys? (like a MultiKeyDict<TKey1, TKey2, TValue>)

  • +1 to the "Dictionary with multiple keys" idea. Maybe then I'd have a solution to Eric Lippert's lambasting of my answer here: codereview.stackexchange.com/.../33857

  • A few thoughts:

    - I think it should throw if the key is not present, so the behavior is consistent with Dictionary.

    - I think both Dictionary and MultiDictionary should have a GetOrDefault method. For Dictionary, this would return the value type's default. For MultiDictionary, this would return an empty collection.

    - This is one of those types that everyone either a) has their own implementation of or b) uses a 3rd party implementation of. There are two other kinds of Dictionaries I also find useful:

    MultiAllKeyDictionary- where you can provide any number of keys for a given value. The hashcode is computed by combining the hashcodes of all the keys. To get a value *all* members of a given key are required. You can imagine strongly typed permutations for 2 keys, 3 keys, etc (I generate these with a T4 template), as well as a more general version that takes an array of objects for a set of keys that is not determined at compile time. This is much nicer than working with a Dictionary<Tuple<T1,T2,>,TValue>.

    MultiAnyKeyDictionary- similar to the above dictionary, but here a value can be retrieved by *any* of the keys by which it was input. The hashcodes for individual keys are used to index internally into separate Dictionaries. You can also imagine strongly typed implementations for 2 keys, 3 keys, etc. This is very useful when you have some entity that you might want to index in more than one way. For example, I have a customer, and sometimes I need to look them up by Id and other times by Name. This dictionary lets me do that without having to manually keep two separate Dictionaries in sync.

  • Great news, I have been missing this on a number of occasions.

    * Please add support for PCL profile 158

    * I would also like to se a Read only implementation (IReadOnlyMultiDictionary + readonly implementation)

  • I wasn't initially sure about the fact that the returned ICollection instances were mutable; it seemed to me like that would have all sorts of downsides. Having looked at the implementation with reflector, though, it seems that the ICollection implementation really is a very thin abstraction over the MultiDictionary's own storage (i.e. it's not a copy). Given that implementation, I'm now fairly happy with it.

    I'd probably only ever want to use it in a manner that the ICollections were treated as read-only (is that ILookup?), but I guess that should be easy do with a wrapper around your implementation.

  • Thread safety properties?

  • Yes I would!

Page 3 of 5 (75 items) 12345