MultiDictionary becomes MultiValueDictionary


We just shipped an update to our experimental implementation of a multi value dictionary. In this post, our software developer intern Ian Hays talks about the changes. -- Immo

Goodbye MultiDictionary

In my last post I went over MultiDictionary, officially available on NuGet as the prerelease package Microsoft.Experimental.Collections. We received great feedback, questions and commentary in the comments, and it was clear that this was something that a lot of you felt passionately about (70 comments? Awesome!). We’ve read all of your comments and taken them into consideration for this next iteration of Microsoft.Experimental.Collections.

You should also check out our interview on Channel 9.

Hello MultiValueDictionary

First off, let’s talk about the name. It was a bit ambiguous what the “Multi” in “MultiDictionary” referred to: at first glance, “multi” could mean there were multiple keys per value, or a dictionary of dictionaries, or that it was a bi-directional dictionary. To make it explicit and leave room for other variants in the future, we’ve renamed the type to MultiValueDictionary to clarify that the type allows multiple values for a single key.

Let’s get right to the meat of the post: what’s changed? We’ll go into some of the major design decisions and changes that make up the new MultiValueDictionary in the next sections.

IEnumerable of…?

MultiDictionary could be thought of as Dictionary<TKey, TValue> where we could have multiple elements with the same TKey. MultiValueDictionary is more akin to a Dictionary<TKey, IReadOnlyCollection<TValue>> with a number of methods to enable easy modification of the internal IReadOnlyCollections. This distinction may seem subtle, but it affects how you consume the data structure.

For example, let’s look at the Count and Values properties. MultiDictionary would return the number of values and a collection of values, while MultiValueDictionary returns the number of keys and a collection of IReadOnlyCollections of values.

// MultiDictionary

var multiDictionary = new MultiDictionary<string, int>();
multiDictionary.Add("key", 1);
multiDictionary.Add("key", 2);
//multiDictionary.Count == 2
//multiDictionary.Values contains elements [1,2]

// MultiValueDictionary

var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
//multiValueDictionary.Count == 1
//multiValueDictionary.Values contains elements [[1,2]]

This behavioral change affects the enumerator in the same way that it affects the Values property. Previously the dictionary was flattened when enumerating, as it implemented IEnumerable<KeyValuePair<TKey, TValue>>. MultiValueDictionary now implements IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>>, so each enumerated element pairs a key with all of its values.

var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
multiValueDictionary.Add("anotherKey", 3);

foreach (KeyValuePair<string, IReadOnlyCollection<int>> key in multiValueDictionary)
{
  foreach (int value in key.Value)
  {
    Console.WriteLine("{0}, {1}", key.Key, value);
  }
}
// key, 1
// key, 2
// anotherKey, 3

As Sinix pointed out in the comments on the previous blog post, this is very similar to another type in the .NET Framework, ILookup<TKey, TValue>. MultiValueDictionary shouldn’t implement both the dictionary and lookup interfaces, because through interface inheritance it would then implement two different versions of IEnumerable: IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>> and IEnumerable<IGrouping<TKey, TValue>>. It wouldn’t be clear which version you would get when using foreach. But since MultiValueDictionary logically implements the concept, we’ve added an AsLookup() method to MultiValueDictionary which returns an implementation of the ILookup interface.

var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
multiValueDictionary.Add("anotherKey", 3);

var lookup = multiValueDictionary.AsLookup();
foreach (IGrouping<string, int> group in lookup)
{
  foreach (int value in group)
  {
    Console.WriteLine("{0}, {1}", group.Key, value);
  }
}
// key, 1
// key, 2
// anotherKey, 3

Indexing and TryGetValue

In the first iteration of the MultiDictionary we followed the precedent of the Lookup returned by LINQ’s ToLookup() with regard to the way indexing into the MultiDictionary worked. In a regular Dictionary, if you attempt to index into a key that isn’t present you’ll get a KeyNotFoundException, but like a Lookup, the MultiDictionary returned an empty list instead. This was mostly to match the functionality of the Lookup class, which is conceptually similar to the MultiDictionary, but also because this behavior was more practical for the kinds of things you’d be using the MultiDictionary for.
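The difference in missing-key behavior between the two precedents can be seen with base class library types alone. The following is a minimal sketch (the class name and sample data are only illustrative): a Lookup built with LINQ's ToLookup() yields an empty sequence for an absent key, whereas Dictionary throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MissingKeyDemo
{
    static void Main()
    {
        // Illustrative sample data: two values stored under the same key.
        var pairs = new[]
        {
            new KeyValuePair<string, int>("key", 1),
            new KeyValuePair<string, int>("key", 2),
        };

        // Lookup: indexing a missing key yields an empty sequence.
        ILookup<string, int> lookup = pairs.ToLookup(p => p.Key, p => p.Value);
        Console.WriteLine(lookup["notkey"].Count()); // 0

        // Dictionary: indexing a missing key throws.
        var dictionary = new Dictionary<string, int> { { "key", 1 } };
        try
        {
            Console.WriteLine(dictionary["notkey"]);
        }
        catch (KeyNotFoundException)
        {
            Console.WriteLine("KeyNotFoundException");
        }
    }
}
```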

With the behavior changes brought on by the MultiValueDictionary and the addition of the AsLookup() method, this old functionality doesn’t quite fit anymore. We heard feedback that this inconsistency between MultiDictionary and Dictionary was confusing, so the MultiValueDictionary will now throw a KeyNotFoundException when indexing on a key that isn’t present. We’ve also added a TryGetValue method to accommodate the new behavior.

var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
//multiValueDictionary["notkey"] throws a KeyNotFoundException
IReadOnlyCollection<int> collection = multiValueDictionary["key"];
multiValueDictionary.Add("key", 2);
//collection contains values [1,2]
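
For callers that cannot be sure a key is present, TryGetValue avoids the exception. The sketch below assumes that the Microsoft.Experimental.Collections prerelease package is referenced, that the type is visible through System.Collections.Generic, and that TryGetValue has the same shape as Dictionary's (a bool result plus an out parameter of the indexer's return type). None of these details are spelled out in this post, so treat them as assumptions.

```csharp
using System;
using System.Collections.Generic;

static class TryGetValueDemo
{
    static void Main()
    {
        var multiValueDictionary = new MultiValueDictionary<string, int>();
        multiValueDictionary.Add("key", 1);
        multiValueDictionary.Add("key", 2);

        // Assumed signature:
        // bool TryGetValue(TKey key, out IReadOnlyCollection<TValue> value)
        IReadOnlyCollection<int> values;
        if (multiValueDictionary.TryGetValue("key", out values))
        {
            // Key is present: iterate over all of its values.
            foreach (int value in values)
            {
                Console.WriteLine("key, {0}", value);
            }
        }

        // Key is absent: no exception, just a false result.
        if (!multiValueDictionary.TryGetValue("notkey", out values))
        {
            Console.WriteLine("notkey is not present");
        }
    }
}
```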

A related change to the MultiValueDictionary indexer is its return type. Previously we returned a mutable ICollection<TValue>. Adding and removing values from the returned ICollection<TValue> updated the MultiDictionary. While there are uses for this functionality, it can be unexpected and create unintentional coupling between parts of an application. To address this we’ve changed the return type to IReadOnlyCollection<TValue>. The read-only collection will still update with changes to the MultiValueDictionary.

When a List just doesn’t cut it

One limitation of the MultiDictionary was that internally, it used a Dictionary<TKey, List<TValue>> and there was no way to change the inner collection type. With the MultiValueDictionary we’ve added the ability to specify your own inner collection.

Showing a simple example of how this works is probably easier than trying to describe it first, so let’s do that.

var multiValueDictionary = MultiValueDictionary<string, int>.Create<HashSet<int>>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 1);
//multiValueDictionary["key"].Count == 1

Above, we replace the default List<TValue> with a HashSet<TValue>. As the example shows, HashSet combines duplicate TValues.

For every constructor there is a parallel generic static Create method that takes the same parameters but allows specification of the interior collection type. Note that this doesn’t affect the return type of the indexer or TryGetValue: both return a limited IReadOnlyCollection<TValue> regardless of the inner collection type.

If you want a little bit more control over how your custom collection is instantiated, there are also the more specific Create methods that allow you to pass a delegate to specify the inner collection type:

// myHashSetFactory is assumed to be a delegate that returns a new, empty HashSet<int>
var multiValueDictionary = MultiValueDictionary<string, int>.Create<HashSet<int>>(myHashSetFactory);
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 1);
//multiValueDictionary["key"].Count == 1

In either case, the specified collection type must implement ICollection<TValue> and must not have IsReadOnly set to true by default.
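
As one illustration of why a factory can be useful, the inner collection may need constructor arguments such as a comparer. The following sketch assumes the delegate is a Func<TValueCollection> (this post does not state the delegate type) and that the type is available through System.Collections.Generic once the prerelease package is referenced. The factory and variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class FactoryDemo
{
    static void Main()
    {
        // Hypothetical factory: each key gets a case-insensitive HashSet<string>.
        // Assumption: Create accepts a Func<HashSet<string>> as its delegate.
        Func<HashSet<string>> caseInsensitiveSetFactory =
            () => new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        var tags = MultiValueDictionary<string, string>
            .Create<HashSet<string>>(caseInsensitiveSetFactory);
        tags.Add("post", "DotNet");
        tags.Add("post", "dotnet"); // a duplicate under the case-insensitive comparer

        // If the factory's set is used as described, only one value is stored.
        Console.WriteLine(tags["post"].Count);
    }
}
```

A parameterless Create<HashSet<string>>() could not express this, since it has no way to pass the comparer to the set's constructor.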

And that’s all!

You can download the new MultiValueDictionary from NuGet and try it out for yourself! If you have any questions or if you just want to give feedback, please leave a comment or contact us.

Comments
  • Thanks! It's surprisingly a lot of work to be done to make the collection usable as every 'simple' feature turns out to be not so easy:)

  • The new name is better, less confusing; but I'd still rather call that a Lookup than a Dictionary. It's a shame that the name Lookup<TKey, TValue> is already taken, because that would have been a good choice.

    I don't like the new semantics. The goal of this class is to associate multiple values to a key. But in this new version, the public interface doesn't really reflect that intent; instead, it mostly reflects the internal implementation, i.e. the fact that it's basically a dictionary where the values are collections. IMO, this is wrong: when you ask "give me all the values associated with that key", you expect to receive a collection; if no values are associated with that key, you should receive a collection with 0 element, not an exception. The exception makes sense for Dictionary, because you expect exactly 1 value, and there is no other way to convey the fact that there is no such value; but when you expect multiple values, there is no reason to treat 0 as a special case: an empty collection is much more convenient to the caller than an exception. I could understand the will to make it consistent with Dictionary, but IMO it's more important to make it easy to use than to make it consistent with something that is only vaguely similar. And this class would definitely be easier to use as a Lookup than as a Dictionary...

  • (forgot that part in my previous post...)

    The possibility to choose the type of collection for the values is a nice touch. However I think the factory delegate should take the key as a parameter; it doesn't cost anything, and it's more flexible.

  • I also do not like the KeyNotFoundException. When I read these comments on the MultiDictionary, I thought the guys did not really understand the purpose of this class. This is NOT a traditional dictionary so of course its semantics are different. I have implemented a very similar class like this myself so I know what I could use it for. Having to use TryGet is always a bit cumbersome with the current C# semantics (looking forward to C# vNext). It is much more convenient to just check the size of the returned collection.

  • I agree with Thomas KeyNotFoundException is stupid. Exceptions are for... exceptional situations. A key not being present in a dictionary is a pretty standard scenario.

  • Awesome stuff!

  • Looks good, except for the KeyNotFoundException, which will cause unwieldy code only for the sake of consistency with an only slightly related type. It would be a lot simpler to just return an empty collection if the key isn't found

  • I like the changes, even the 'KeyNotFoundException'. To me trying to retrieve a value for a key that was never stored is an exception/unanticipated situation in code. This behavior is also inline with my expectations of a Dictionary (be it regular or MultiValueDictionary). I assume we can use the ContainsKey and TryGetValue methods like we did with a Dictionary where key not present is a check.

    I welcome the new name, I had always felt that MultiDictionary was ambiguous when it was introduced.

    Overall good changes, Cheers.

  • Not a fan of the KeyNotFoundException from a productivity stand point. In most cases you will be doing a for each on the values for a given key when consuming the MultiDictionary. Doing so will now require you to first do a TryGetValue, then check that it returned True, store the returned value and then iterate. I've been using PowerCollections' MultiDictionary for 10 years so maybe I'm biased but I think simplicity of use is more important than "being like a dictionary".

    The create method that allows you to specify the type/factory for the ValueCollection is neat though.

  • I agree that it should not throw a KeyNotFoundException. 100% agree with Thomas Levesque's arguments. Like Guy Godin, I'm a user of PowerCollection's MultiDictionary which doesn't throw an exception and this behavior is exactly what I needed in the past.

  • @Chris Marisic, a value in a dictionary without a key is common? This IS an exception! Think about a real dictionary, where you have a definition of something without its name.

  • KeyNotFoundException should be thrown for sure.

    I miss a copy constructor. It would be nice to copy a MultiValueDictionary like this:

    var asd = new MultiValueDictionary<string, int>(oldMultiValueDictionary);

    It also works for List<> and Dictionary<,>.

  • Keep it simple => vote for removal of KeyNotFoundException

  • I really like the new datatype.

    However I am missing an easy way to initialize the MultiValueDictionary using object initializers.

    With a dictionary it is possible to use the following syntax:

           private Dictionary<int, int[]> graph = new Dictionary<int, int[]>()
           {
               {1, new []{2,3,4}}
           };

           private MultiValueDictionary<int, int> graph = new MultiValueDictionary<int, int>()
           {
                //not possible?
           };

    Any suggestions?

  • With the latest update, it's not worth using MultiValueDictionary. You still need to add validation for keys that are not found, so why not use Dictionary<T, List<Y>> instead? No extra dependencies needed. And it's easier to create a Dictionary<T, Stack<Y>> without MultiValueDictionary.

    If the KeyNotFoundException is necessary to keep consistency with IDictionary, why not remove the IDictionary interface declaration? I thought the reason behind MultiValueDictionary was to have a simple way to manage multiple values associated with one key; the code to add and read values should be clean and easy to use. I don't know, OneToMany<string, int>?
