Welcome to MSDN Blogs Sign in | Join | Help

Fast Switching with LINQ

Jomo Fisher—I’ve run into a performance problem several times in my career that I’ve never found a truly satisfying answer for until now. The problem can be distilled down to this:

Look up a value based on a string key. The set of strings is fixed and no unknown key values can occur.

Consider my personal rankings of the Seven Dwarves on a scale of 1-10:

                happy=9, sneezy=2, doc=7, sleepy=8, dopey=9, grumpy=2, bashful=6

I need a function that takes a name and returns a ranking. The default solution for this problem should be to use a Dictionary or HashTable. The implementations that ship with the .NET framework are extremely fast.

However, these container classes offer more features than I need for this problem. New keys can be added but my set is fixed. They can tell you whether a key is valid or not but I know a priori that the keys are limited to a fixed set. They can enumerate the set of keys or the set of values but I don’t need this capability.

Any time I see a data structure with a capability I’m not using it makes me wonder whether I can trade that capability for something I do need—in this case a speed boost.

If I was really desperate for a performance gain then it was possible to write a function like:

        static int Lookup(string s) {

            if (s.Length < 6) {

                if (s.Length < 5) return 7;

                else return 9;

            } else {

                if (s[0] < 's') {

                    if (s[0] < 'g') return 6;

                    else return 2;

                } else {

                    if (s[1] < 'n') return 8;

                    else return 2;

                }

            }

        }

Notice that we only have to look at the length and the first two characters of the string to make the decision. This function is fast but nearly impossible to maintain or debug. Also, I have to know the set of strings at compile time. All of these drawbacks make this approach significantly less useful.

Enter the LINQ Expression Compiler that will be available in the .NET 3.5. With the new expression support it’s possible to write an expression at runtime and then compile this expression into straight IL (which will be jitted into machine code). It turned out to be pretty easy to write a switch compiler that would generate an appropriate lookup expression on the fly. I wrote a class called SwitchCompiler to do the heavy lifting. Here’s how it’s called:

            var dict = new Dictionary<string, int> {

                    {"happy", 9}, {"sneezy", 2},

                    {"doc", 7}, {"sleepy", 8},

                    {"dopey", 9}, {"grumpy", 2},

                    {"bashful", 6}

                };

            var Lookup = SwitchCompiler.CreateSwitch(dict);

            var v = Lookup("doc");

            Debug.Assert(v == 7);

SwitchCompiler first builds up an expression that looks like this:

s.Length < 6 ? (s.Length < 5 ? 7 : 9) :

                (s[0] < 's' ? (s[0] < 'g' ? 6 : 2) : (s[1] < 'n' ? 8 : 2));

And then compiles this into a delegate returned into Lookup. (It should also be possible to generate a perfect minimal hash in the Compile method. I haven't tried this).

The actual implementation of SwitchCompiler is about 65 lines of code. You can see it in the attachment.

On my machine, looking up each value in a tight loop 1 million times takes 70ms while the corresponding loop using Dictionary takes 697ms. That’s close to a 900% performance improvement.

This posting is provided "AS IS" with no warranties, and confers no rights.

kick it on DotNetKicks.com
Published Wednesday, March 28, 2007 3:37 PM by Jomo Fisher
Filed under: , ,

Attachment(s): SwitchCompiler.cs

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# re: Fast Switching with LINQ

That's pretty impressive performance for a VM-based language.  The C++ equivelent, 1 million iterations, runs in about 55 ms, on a Pentium 4, 3.4 Ghz.  Here's the code:

struct KeyValue {

 const char* key;

 int value;

 inline bool operator < (const KeyValue& that) const {

   return strcmp(this->key, that.key) < 0;

 }

};

static const KeyValue kvs[] = {

 {"bashful", 6},

 {"doc", 7},

 {"dopey", 9},

 {"grumpy", 2},

 {"happy", 9},

 {"sleepy", 8},

 {"sneezy", 2},

}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

inline int Lookup(const char* s) {

 KeyValue key = { s };

 return std::lower_bound(kvs, kvs_end, key)->value;

}

int _tmain(int argc, _TCHAR* argv[])

{

 LARGE_INTEGER begin, end, frequency;

 QueryPerformanceCounter(&begin);

 const KeyValue* k = kvs;

 int total = 0;  // Just so the compiler doesn't completely optimize out the for loop.

 for (int i = 0; i < 1000000; ++i) {

   total  += Lookup(k->key);

   if (++k == kvs_end)

     k = kvs;

 }

 QueryPerformanceCounter(&end);

 QueryPerformanceFrequency(&frequency);

 __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

 cout << "Total:\t" << total << endl;

 cout << "Milliseconds:\t" << ms << endl;

}

One minor benefit of the C++ code is that there is no overhead of creating a lookup table before the first call to Lookup().

Wednesday, March 28, 2007 4:38 PM by Jeff

# re: Fast Switching with LINQ

Hi Jeff,

I tried your code on my machine and amazingly got exactly 55ms. That makes your orange look a little more like my apple.

Your comment made me realize my implementation has one more feature that I could possibly throw away: there's an array bounds check on the string index operations. I think I could get rid of this by emitting 'unsafe' code. Unfortunately, I think I'd have to bypass the expression compiler to do this.

I seem to be within spitting distance of optimized C++, that's pretty cool considering how amazing the VS C++ optimizer is.

Jomo

Thursday, March 29, 2007 2:43 PM by Jomo Fisher

# re: Fast Switching with LINQ

Here's my test code in case anyone wants to reproduce my result:

/*

Use of included script samples are subject to the terms specified at http://www.microsoft.com/resources/sharedsource/licensingbasics/permissivelicense.mspx

Written by Jomo Fisher

*/

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using System.Reflection;

using FastSwitch;

namespace ConsoleApplication1 {

   class Program {

       static void Main(string[] args) {

           var dict = new Dictionary<string, int> {

                   {"happy", 9}, {"sneezy", 2},

                   {"doc", 7}, {"sleepy", 8},

                   {"dopey", 9}, {"grumpy", 2},

                   {"bashful", 6}

               };

           const int count = 1000000;

           DateTime start = DateTime.Now;

           for (int i = 0; i < count; ++i) {

               var v = dict["happy"];

               v = dict["sneezy"];

               v = dict["doc"];

               v = dict["sleepy"];

               v = dict["dopey"];

               v = dict["grumpy"];

               v = dict["bashful"];

           }

           TimeSpan elapsed = DateTime.Now - start;

           Console.WriteLine("Dictionary: {0} ms", elapsed.TotalMilliseconds);

           var compiled = SwitchCompiler.CreateSwitch(dict);

           start = DateTime.Now;

           for (int i = 0; i < count; ++i) {

               var v = compiled("doc");

               System.Diagnostics.Debug.Assert(v == 7);

               v = compiled("happy");

               System.Diagnostics.Debug.Assert(v == 9);

               v = compiled("sneezy");

               System.Diagnostics.Debug.Assert(v == 2);

               v = compiled("sleepy");

               System.Diagnostics.Debug.Assert(v == 8);

               v = compiled("dopey");

               System.Diagnostics.Debug.Assert(v == 9);

               v = compiled("grumpy");

               System.Diagnostics.Debug.Assert(v == 2);

               v = compiled("bashful");

               System.Diagnostics.Debug.Assert(v == 6);

           }

           elapsed = DateTime.Now - start;

           Console.WriteLine("Compiled: {0} ms", elapsed.TotalMilliseconds);

       }

   }

}

Thursday, March 29, 2007 3:09 PM by Jomo Fisher

# Fast Switching with LINQ

You've been kicked (a good thing) - Trackback from DotNetKicks.com

Friday, March 30, 2007 7:34 PM by DotNetKicks.com

# re: Fast Switching with LINQ

Looks like you want a perfect hash or a minimal perfect hash? See e.g. http://burtleburtle.net/bob/hash/perfect.html

I don't know about a c# implementation though.

Saturday, March 31, 2007 8:52 AM by tom

# re: Fast Switching with LINQ

The c++ code and c# code are timing 2 different things.

the c++ code is cycling through each key for a total of 1,000,000 lookups.  the c# code is looking each key up 1,000,000 times for a total of 7,000,000 lookups.

Saturday, March 31, 2007 9:36 AM by Justin

# re: Fast Switching with LINQ

Justin,

You're right. I should have noticed the C++ code was doing 1/7th of the lookups as the C# code. When I adjust the C++ code, it dutifully reports 385ms which is pretty close to what the plain old .NET dictionary was doing.

So I guess the LINQ algorithm is in the cat-bird seat, at least for a the time-being.

I also noticed the C++ answer requires that the set of keys be sorted alphabetically before lookups can commence, so there *is* a little preparation time.

Monday, April 02, 2007 12:12 PM by Jomo Fisher

# re: Fast Switching with LINQ

Tom,

I had considered a minimal perfect hash (and I had even looked at the exact page you sent me). I still think the LINQ algorithm should perform better in many cases because that perfect hashing algorithm is O(n) on the number of bytes in the input key. The LINQ algorithm is O(log N) on the number of bytes in the key. In particular, I think the LINQ algorithm will outpace the other when the size of the keys is large and the length of the keys is non-uniform.

Of course the big-O notation can hide a potentially large constant factor, but I don't really see why it would in this case here.

Also, keep in mind that that particular minimal perfect hash implementation is throwing away a feature that I wanted to keep: It can't easily accomodate key sets that are available only at run time.

Monday, April 02, 2007 12:24 PM by Jomo Fisher

# re: Fast Switching with LINQ

The C++ compiler optimized out the entire loop in my first version.  I added the total variable, and it still optimized out the whole loop.  I added the cout statement at the bottom and it finally executed the code.

Any sign the C# compiler is optimizing out the whole loop in the C# code?

Tuesday, April 03, 2007 5:28 PM by Jeff

# re: Fast Switching with LINQ

The C# compiler is definitely not removing the loop--for one, it has no way to determine whether the delegate has side-effects that need to be preserved.

This can be seen by increasing the number of iterations--2M loops run in twice the time as 1M.

Tuesday, April 03, 2007 6:15 PM by Jomo Fisher

# re: Fast Switching with LINQ

Wow, your switch compiler and the .Net VM is very impressive indeed.  My fastest run of the following C++ code was 88 ms.  Note also that I kind of cheated and used std::find instead of std::lower_bound, because when there are only 1 or 2 elements in the vector, std::find runs much faster.

#include "stdafx.h"

struct KeyValue {

 const char* key;

 int value;

};

struct CharValue {

 int char_;

 bool terminal_;

 union {

   int value_;

   size_t next_index_;

   vector<CharValue>* next_;

 };

 ptrdiff_t char_step_;

};

inline bool operator < (const CharValue& value, int c) {

 return value.char_ < c;

}

inline bool operator < (int c, const CharValue& value) {

 return c < value.char_;

}

inline bool operator == (const CharValue& value, int c) {

 return value.char_ == c;

}

static const KeyValue kvs[] = {

 {"bashful", 6},

 {"doc", 7},

 {"dopey", 9},

 {"grumpy", 2},

 {"happy", 9},

 {"sleepy", 8},

 {"sneezy", 2},

}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

class KnownSetTrie {

public:

 typedef vector<CharValue> CharMap;

 vector<CharMap> tree_;

 ptrdiff_t first_step_;

 template <typename Iter>

 KnownSetTrie(Iter begin, Iter end) {

   first_step_ = Compile(begin, end, 0) - 1;

   // Now that the tree_ is stable, convert indices to node pointers.

   for(vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {

     for (CharMap::iterator j = i->begin(); j != i->end(); ++j)

       if (!j->terminal_)

         j->next_ = &tree_[j->next_index_];

   }

 }

 template <typename Iter>

 ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {

   CharMap map;

   Iter start = begin, i = begin + 1;

   while (true) {

     CharValue value;

     value.char_ = start->key[char_index];

     if (i != end && i->key[char_index] == start->key[char_index]) {

       do {

         ++i;

       } while (i != end && i->key[char_index] == start->key[char_index]);

       value.terminal_ = false;

       value.char_step_ = Compile(start, i, char_index + 1);

       value.next_index_ = tree_.size() - 1;

     } else {

       value.terminal_ = true;

       value.value_ = start->value;

     }

     map.push_back(value);

     if (i == end)

       break;

     start = i++;

   }

   if (1 == map.size() && !map.front().terminal_) {

     // Pass through node:

     return map.front().char_step_ + 1;

   }

   tree_.push_back(map);

   return 1;

 }

 inline int Lookup(const char* s) const {

   const CharMap* node = &tree_.back();

   s += first_step_;

   while (true) {

     const CharMap::const_iterator value = find(node->begin(), node->end(), *s);

     if (value->terminal_)

       return value->value_;

     s += value->char_step_;

     node = value->next_;

   }

 }

};

int _tmain(int argc, _TCHAR* argv[])

{

 KnownSetTrie trie(kvs, kvs_end);

 LARGE_INTEGER begin, end, frequency;

 QueryPerformanceCounter(&begin);

 int total = 0;  // Just so the compiler doesn't completely optimize out the for loop.

 for (int i = 0; i < 1000000; ++i) {

   for (const KeyValue* k = kvs; k != kvs_end; ++k)

     total  += trie.Lookup(k->key);

 }

 QueryPerformanceCounter(&end);

 QueryPerformanceFrequency(&frequency);

 __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

 cout << "Total:\t" << total << endl;

 cout << "Milliseconds:\t" << ms << endl;

}

Tuesday, April 03, 2007 7:37 PM by Jeff

# LINQ is seemingly more powerful than simple ORM

It seems the LINQ guys have done a great job of optimization inside of the LINQ namespace (System.Linq)....

Wednesday, April 04, 2007 5:47 AM by CedarLogic - Shawn Cicoria

# re: Fast Switching with LINQ

I got the C++ code down to 70 ms.  It took a __fastcall and some fun struct packing, though.

I can't wait to use C# .Net!

#include "stdafx.h"

struct KeyValue {

 const char* key;

 int value;

};

struct CharValue {

 int char_ : 8;

 int terminal_ : 1;

 int char_step_ : 23;

 union {

   int value_;

   size_t next_index_;

   vector<CharValue>* next_;

 };

};

inline bool operator < (const CharValue& value, int c) {

 return value.char_ < c;

}

inline bool operator < (int c, const CharValue& value) {

 return c < value.char_;

}

inline bool operator == (const CharValue& value, int c) {

 return value.char_ == c;

}

static const KeyValue kvs[] = {

 {"bashful", 6},

 {"doc", 7},

 {"dopey", 9},

 {"grumpy", 2},

 {"happy", 9},

 {"sleepy", 8},

 {"sneezy", 2},

}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

class KnownSetTrie {

public:

 typedef vector<CharValue> CharMap;

 vector<CharMap> tree_;

 ptrdiff_t first_step_;

 CharMap* root_;

 template <typename Iter>

 KnownSetTrie(Iter begin, Iter end) {

   first_step_ = Compile(begin, end, 0) - 1;

   // Now that the tree_ is stable, convert indices to node pointers.

   for(vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {

     for (CharMap::iterator j = i->begin(); j != i->end(); ++j)

       if (!j->terminal_)

         j->next_ = &tree_[j->next_index_];

     root_ = &tree_.back();

   }

 }

 template <typename Iter>

 ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {

   CharMap map;

   Iter start = begin, i = begin + 1;

   while (true) {

     CharValue value;

     value.char_ = start->key[char_index];

     if (i != end && i->key[char_index] == start->key[char_index]) {

       do {

         ++i;

       } while (i != end && i->key[char_index] == start->key[char_index]);

       value.terminal_ = false;

       value.char_step_ = Compile(start, i, char_index + 1);

       value.next_index_ = tree_.size() - 1;

     } else {

       value.terminal_ = true;

       value.value_ = start->value;

     }

     map.push_back(value);

     if (i == end)

       break;

     start = i++;

   }

   if (1 == map.size() && !map.front().terminal_) {

     // Pass through node:

     return map.front().char_step_ + 1;

   }

   tree_.push_back(map);

   return 1;

 }

 int __fastcall Lookup(const char* s) const {

   const CharMap* node = root_;

   s += first_step_;

   while (true) {

     const CharMap::const_iterator value = find(node->begin(), node->end(), *s);

     if (value->terminal_)

       return value->value_;

     s += value->char_step_;

     node = value->next_;

   }

 }

};

int _tmain(int argc, _TCHAR* argv[])

{

 KnownSetTrie trie(kvs, kvs_end);

 LARGE_INTEGER begin, end, frequency;

 QueryPerformanceCounter(&begin);

 int total = 0;  // Just so the compiler doesn't completely optimize out the for loop.

 for (int i = 0; i < 1000000; ++i) {

   for (const KeyValue* k = kvs; k != kvs_end; ++k)

     total += trie.Lookup(k->key);

 }

 QueryPerformanceCounter(&end);

 QueryPerformanceFrequency(&frequency);

 __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

 cout << "Total:\t" << total << endl;

 cout << "Milliseconds:\t" << ms << endl;

}

Wednesday, April 04, 2007 11:52 AM by Jeff

# LINQ is seemingly more powerful than simple ORM

It seems the LINQ guys have done a great job of optimization inside of the LINQ namespace (System.Linq).

Wednesday, April 04, 2007 1:02 PM by CedarLogic
Shawn Cicoria

# re: Fast Switching with LINQ

Hey Jeff,

That's a really nice solution. I was wondering whether a trie could help here.

Wednesday, April 04, 2007 1:31 PM by Jomo Fisher

# re: Fast Switching with LINQ

Okay, this is insane. How could you rate grumpy and sneezy as a 2? Unless of course 1 is the best and 10 is the worst.

This "SwitchCompiler.CreateSwitch" is absolutely amazing. It's a best of both world's sort of solution -- we can focus on the high level problem, and have Linq provide optimisations that would otherwise be utterly unmaintainable and difficult to perform manually.

Brilliant stuff. Thank you.

lb

Wednesday, April 04, 2007 9:13 PM by lb

# Community Convergence XXIV

Recently my time has been taken up with a series of internal issues involving Beta1, C# Samples and various

Monday, April 09, 2007 2:41 AM by Charlie Calvert's Community Blog

# re: Fast Switching with LINQ

A faster solution than a trie is a hand-built binary search tree, which is what I imagine case statement compiler does.  This code runs around 55 ms, and doesn't cheat like my earlier code with __fastcalls or std::find() in place of std::lower_bound():

Sorry for turning this into a language shoot-out, but this looked like such an interesting case to play with.  And again the .Net runtime performs astoundingly well.

#include "stdafx.h"

struct KeyValue {

 const char* key;

 int value;

};

static const KeyValue kvs[] = {

 {"bashful", 6},

 {"doc", 7},

 {"dopey", 9},

 {"grumpy", 2},

 {"happy", 9},

 {"sleepy", 8},

 {"sneezy", 2},

}, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

struct DTreeNode;

struct DTreeStep {

 bool terminal;

 union {

   int retval;

   int char_step;

 };

 union {

   DTreeNode* next;

   size_t next_index;

 };

};

struct DTreeNode {

 char c;

 DTreeStep less_step, ge_step;

};

class DTree {

public:

 vector<DTreeNode> nodes_;

 DTreeStep root_step_;

 template <typename Iter>

 DTree(Iter begin, Iter end) {

   root_step_ = Compile(begin, end, 0);

   // Convert indices into pointers.

   for (vector<DTreeNode>::iterator i = nodes_.begin(); i != nodes_.end(); ++i) {

     if (!i->less_step.terminal)

       i->less_step.next = &*(nodes_.begin() + i->less_step.next_index);

     if (!i->ge_step.terminal)

       i->ge_step.next = &*(nodes_.begin() + i->ge_step.next_index);

   }

   root_step_.next = &*(nodes_.begin() + root_step_.next_index);

 }

 template <typename Iter>

 DTreeStep Compile(Iter begin, Iter end, ptrdiff_t char_index) {

   ptrdiff_t count = end - begin;

   DTreeStep step;

   if (1 == count) {

     step.terminal = true;

     step.retval = begin->value;

     return step;

   }

   Iter middle = begin + (end - begin) / 2,

     middle_begin = middle, middle_end = middle + 1;

   while (true) {

     if (middle_begin == begin)

       break;

     Iter prior = middle_begin - 1;

     if (prior->key[char_index] == middle->key[char_index])

       middle_begin = prior;

     else

       break;

   }

   while (true) {

     if (middle_end == end)

       break;

     if (middle_end->key[char_index] == middle->key[char_index])

       ++middle_end;

     else

       break;

   }

   const ptrdiff_t less_count = middle_begin - begin,

     equal_count = middle_end - middle_begin,

     greater_count = end - middle_end;

   if (0 == less_count) {

     if (0 == greater_count) {

       // All middles, pass through

       step = Compile(begin, end, char_index + 1);

       step.char_step += 1;

       return step;

     } else {

       // middles and greaters

       middle_begin = middle_end;

     }

   }

   DTreeNode node;

   node.c = middle_end->key[char_index];

   node.less_step = Compile(begin, middle_begin, char_index);

   node.ge_step = Compile(middle_begin, end, char_index);

   step.terminal = false;

   step.char_step = 0;

   step.next_index = nodes_.size();

   nodes_.push_back(node);

   return step;

 }

 int Lookup(const char* s) const {

   const DTreeStep* step = &root_step_;

   while (true) {

     const DTreeNode* node = step->next;

     step = *s < node->c ? &node->ge_step : &node->less_step;

     if (step->terminal)

       return step->retval;

     s += step->char_step;

   }

 }

};

int _tmain(int argc, _TCHAR* argv[])

{

 DTree dtree(kvs, kvs_end);

 LARGE_INTEGER begin, end, frequency;

 QueryPerformanceCounter(&begin);

 int total = 0;  // Just so the compiler doesn't completely optimize out the for loop.

 for (int i = 0; i < 1000000; ++i) {

   for (const KeyValue* k = kvs; k != kvs_end; ++k)

     total += dtree.Lookup(k->key);

 }

 QueryPerformanceCounter(&end);

 QueryPerformanceFrequency(&frequency);

 __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

 cout << "Total:\t" << total << endl;

 cout << "Milliseconds:\t" << ms << endl;

}

Monday, April 09, 2007 6:37 PM by Jeff

# re: Fast Switching with LINQ

ahh, I have no idea how you lot managed to get such results on 3.4 ghz machines for the standard dictionary tests.

heres my code.. using standard generic dictionary and I get wihout fail 92-93ms lookup times for 1 million loops.

I havent tried the linq version on my machine as I'm only running .net 2.0  but even comparing the results for linq here (even though this is distorted due to difference in machines used) the difference is a max of 20%..  nothing like the 900% claimed.

by the way..  my tests were all run on a p4 2.4 ghz machine...   i'm sure using a cpu that clocks 30% faster will also reduce the performance gap between my results and those posted in the original statement.

PS.  dont use datetime for monitoring performance as datetime is not updated every tick..  my experience is that datetime as a +/- difference of 30-40 ms on a good day vs the stopwatch class.

anyway heres the code I run my tests with.

Dictionary<string, int> index = new Dictionary<string, int>();

           index.Add("happy", 9);

           index.Add("sneezy", 2);

           index.Add("doc", 7);

           index.Add("sleepy", 8);

           index.Add("dopey", 9);

           index.Add("grumpy", 2);

           int found = 0;            

           Stopwatch sw = new Stopwatch();

           sw.Start();

           for (int i = 0; i < 1000000; i++)

               found = index["doc"];

           long elapsed = sw.ElapsedMilliseconds;

           sw.Stop();

           sw = null;

Friday, April 13, 2007 4:55 AM by Zan

# re: Fast Switching with LINQ

Zan,

Your test is only looking up one value in a loop. Our tests are looking up seven values in a loop. You can see the actual test code in the comments above. If you adjust your test to match you'll get ~650ms.

Monday, April 16, 2007 7:45 PM by Jomo Fisher

# re: Fast Switching with LINQ

I found that using KeyValuePair<string,TR> instead of Case<TR> allows to expose the function CreateSwitch to have a IEnumerable<KeyValuePair<string,TR>> parameter

this allows these uses:

       public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<string> strings, Func<string, TR> func)

       {

           return CreateSwitch(from s in strings select new KeyValuePair<string, TR>(s, func(s)));

       }

       public static Func<string, IEnumerable<TR>> EnumerableInvert<TR>(this IEnumerable<TR> values, Func<TR, string> name)

I added the fllowing check in the function:

       public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<KeyValuePair<string, TR>> caseDict)

       {

           if (caseDict.Count() != caseDict.Distinct().Count())

               throw new ArgumentException("duplicate names");

           var cases = caseDict.ToArray();

           var p = Expression.Parameter(typeof(string), "p");

           var expr = Expression.Lambda<Func<string, TR>>(

               BuildStringLength( p, cases.ToOrderedArray(s => s.Key.Length), 0, cases.Length  - 1),

               new ParameterExpression[] { p }

           );

           var del = expr.Compile();

           return del;

       }

       {

           return CreateSwitch(from v in values group v by name(v) into byname select new KeyValuePair<string, IEnumerable<TR>>(byname.Key, byname));

       }

       public static Func<string,TR> Invert<TR>(this IEnumerable<TR> values, Func<TR, string> name)

       {

           return CreateSwitch(from v in values group v by name(v) into byname select new KeyValuePair<string, TR>(byname.Key, byname.First()));

       }

       public static Func<string, TR> ParseEnum<TR>()

           //where TR: Enum

       {

           return CreateSwitch(from v in (TR[])Enum.GetValues(typeof(TR)).OfType<TR>() select new KeyValuePair<string, TR>(Enum.GetName(typeof(TR), v), v));

       }

Tuesday, May 22, 2007 11:44 AM by Gilles Michard

# LINQ to SQL Beta2 Performance Numbers and the Dynamic Compilation Pattern

Rico Mariani has been posting about LINQ to SQL perfomance and has finally posted the performance numbers

Friday, July 06, 2007 11:58 AM by Jomo Fisher -- C#, LINQ and Whatnot

# LINQ to SQL Beta2 Performance Numbers and the Dynamic Compilation Pattern

Rico Mariani has been posting about LINQ to SQL perfomance and has finally posted the performance numbers

Friday, July 06, 2007 12:56 PM by Noticias externas

# The Least You Need to Know about C# 3.0

Jomo Fisher—A lot of people (myself included) have written about LINQ in the next version of C#. LINQ

Monday, July 23, 2007 5:38 PM by Jomo Fisher -- C#, LINQ and Whatnot

# re: Fast Switching with LINQ

Great stuff Jomo!

Unfortunately it doesn't work with this dictionary:

{"Ado", i++},

{"A2o", i++},

{"A2i", i++},

{"B2o", i++},

{"AdoX", i++},

{"AdPX", i++}

It crashes with an IndexOutOfRangeException in BuildStringChar() because it gets a set of strings that are of different lengths. It seems to me that both your if statements that deal with middle are incorrect. If I replace middle with upper in those and in both calls to FirstDifferentDown() it almost works.

Another problem I had to fix is that ToOrderedArray() orders the strings in a way that is not consistent with the Char comparisons in the resulting delegate. In this case it breaks down because 'o' < 'P' is false while "AdoX".CompareTo("AdPX") is -1. My solution was to make a new ToOrderedArray() that takes an IComparer<string> and passes it on to OrderBy(). I then pass it the Instance member from this class:

internal class StringCharComparer : IComparer<string>

{

   public static StringCharComparer Instance = new StringCharComparer();

   public int Compare(string x, string y)

   {

       for (int i = 0; i < Math.Min(x.Length, y.Length); i++)

       {

           if (x[i] > y[i])

               return 1;

           else if (x[i] < y[i])

               return -1;

       }

       if (x.Length == y.Length)

           return 0;

       else if (x.Length > y.Length)

           return 1;

       else

           return -1;

   }

}

By the way, it was pure luck that the dictionaries I tried exposed these bugs.

Thoughts?

Sunday, August 05, 2007 1:19 PM by Raptor-75

# re: Fast Switching with LINQ

Raptor, I think you're right on both accounts. Thanks for pointing this out.

Thursday, August 09, 2007 12:14 PM by Jomo Fisher

# re: Fast Switching with LINQ

Nice, one thing I did was add the optimized Lookup into the testing for comparison. This is where one decides if the lookup is a dynamic list and if the dictionary will change over time. Even though the optimized Lookup is hard to maintain and debug that would be OK if it list was static and did not change over time. But if it does change then one has choices to build a Compiled Expression versus a conventional way. The new idea would be to take a conversional way of doing something and turning it into an Expression that could be compiled and may have performance gains. I like how you take something as simple as a lookup that one might think as being optimized but showing that by building an optimized expression the performance can be even faster.

Dictionary: 700.07 ms  ~ 7.8x slower

Compiled: 170.017 ms   ~ 1.89 - 2x slower

Lookup: 90.009 ms

Dictionary: 700.07 ms  ~

Compiled: 170.017 ms   ~ 4.12x faster

Thanx for the Expresion Building logic that helps whith how all various static methods in System.Linq.Expressions are used to build a dynamic function.

Tuesday, August 28, 2007 3:27 PM by Some One

# re: Fast Switching with LINQ

Nothing really to add of value, but kind of interesting. If you define the following class:

class EqualityCompareStrings : IEqualityComparer<string>

{

   public bool Equals(string x, string y) { return string.Equals(x, y); }

   public int GetHashCode(string obj) { return obj.GetHashCode(); }

}

And then create the dictionary with an instance of that class like

Dictionary<string, int> dict = new Dictionary<string, int>(new EqualityCompareStrings());

Then you get (from my test!) a time on the non-SwitchCompiler code of about 80%.

So as I said nothing really to do with the anything much, although considering how often strings are used as keys it might be a trivial optimization to put into System.Collections.Generic.EqualityComparer<T>. CreateComparer() to handle it as a special case...

Assuming that I haven't done anything wrong :-)

(But then you might as well use the Jomo special SwitchCompiler - but if you want to keep using Dictionary<string, ...>)

Tuesday, November 06, 2007 11:13 PM by Paul Westcott

# re: Fast Switching with LINQ

Hello

I've also hit on the bug that Raptor-75 pointed out. Using his StringCharComparer dindn't solve it all. If you have a complete fix can you please post it?

Best Regards,

Gustavo Guerra

Monday, December 24, 2007 6:18 AM by Gustavo Guerra

# Using LINQ to 'Hand-Optimize' Your Code

On my current project, we're in a 'make it faster' mode right now.&#160; We've already knocked out a

Thursday, April 10, 2008 10:23 PM by Jonathan Rupp

# StaticStringDictionary - Fast Switching with LINQ revisited

StaticStringDictionary - Fast Switching with LINQ revisited

Thursday, August 28, 2008 4:58 PM by Code Beside

# re: Fast Switching with LINQ

Hi, I posted a version that fixed the bugs found by Raptor-75 in my blog: http://blog.codebeside.org/archive/2008/08/28/staticstringdictionary-fast-switching-with-linq-revisited.aspx

Thursday, August 28, 2008 5:01 PM by Gustavo Guerra

# Update on Fast Switching with LINQ

Jomo Fisher—A while back I wrote about using LINQ to get close-to-the-metal performance in a particular

Friday, August 29, 2008 11:44 AM by Jomo Fisher -- Sharp Things

Leave a Comment

(required) 
required 
(required) 

  
Enter Code Here: Required
 
Page view tracker