<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx</link><description>Somebody asked me a question how to set the initial capacity of System.Collections.Generic.Dictionary&amp;lt;K,V&amp;gt; if one knows that the Dictionary will contain 1000 entries. The short answer Dictionary&amp;lt;int,string&amp;gt; numbers = new Dictionary&amp;lt;int,string&amp;gt;(1000);</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#210312</link><pubDate>Sat, 07 Aug 2004 04:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:210312</guid><dc:creator>Eric Newton</dc:creator><description>on chaining, why the next prime number?  why not just use 110% of capacity asked for?  </description></item><item><title>re: System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#210368</link><pubDate>Sat, 07 Aug 2004 07:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:210368</guid><dc:creator>Steve Hall</dc:creator><description>Better yet, why isn't the &amp;quot;growth factor&amp;quot; a user-writable property, so the user can tune his application if smart enough?  Of course, a default could be supplied.  I too have problems with &amp;quot;next prime number&amp;quot; as a default growth factor, since this increases the growth factor %-wise FASTER for larger collections, which might not be desirable.&lt;br&gt;&lt;br&gt;I'm thinking of collections in the millions of items...  Primes get farther and farther apart and will cause an attempt to increase by larger and larger %'s as the collection grows.  A lot of apps I write would probably run out of memory prematurely that way, but not if I was able to tune it with a percentage...and be able to change, e.g., decrease, the growth factor as it grows.&lt;br&gt;&lt;br&gt;Basically, ANY rule-of-thumb default will ALWAYS have &amp;quot;exceptions to the rule&amp;quot;, and you should allow for the app. to be smart about allocations by overriding the default, esp. those app's that have large collections constrained by the 2/3GB address space size.</description></item><item><title>re: System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#210420</link><pubDate>Sat, 07 Aug 2004 09:06:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:210420</guid><dc:creator>Scott Mitchell</dc:creator><description>Not to be too pedantic, but I think the comment re: Chaining reallocation is a triffle off.  You say: &amp;quot;Therefore the Dictionary&amp;lt;K,V&amp;gt; can be completely saturated, we don’t use loadFactor, and we simply take the next prime number after the capacity to determine how many buckets to allocate.&amp;quot;  Rather, it gets the closest prime number twice as large as the last prime number.  This makes sense, since certain prime numbers are near by one another, like 1783 and 1787.  You don't want to go through all the work of resizing only to increase the size by a small number.  (In fact, IIRC from my school days, it is assumed that there is an infinite # of twin primes - prime numbers that differ by 2 [&lt;a target="_new" href="http://primes.utm.edu/top20/page.php?id=1"&gt;http://primes.utm.edu/top20/page.php?id=1&lt;/a&gt;].)&lt;br&gt;&lt;br&gt;Steve, your comment about primes always getting further apart as they get bigger is bit off, as you can see visually at this list of the first 1,000 primes: &lt;a target="_new" href="http://www.utm.edu/research/primes/lists/small/1000.txt"&gt;http://www.utm.edu/research/primes/lists/small/1000.txt&lt;/a&gt;&lt;br&gt;&lt;br&gt;For the Dictionary, it might make sense to let the developer specify the rate at which the bucket size increases, although doubling size for a cache is a good heuristic.  For the Hashtable, however, it is important that the next number be a prime, as for probing it is important that the hashed value and the number of buckets be relatively prime, which is guaranteed if the number of buckets is prime itself.  Granted, you could still allow a differnt scaling factor, and just manually find the next prime.&lt;br&gt;&lt;br&gt;For more on Hashtables, check out my article at: &lt;a target="_new" href="http://msdn.microsoft.com/vcsharp/default.aspx?pull=/library/en-us/dv_vstechart/html/datastructures_guide2.asp"&gt;http://msdn.microsoft.com/vcsharp/default.aspx?pull=/library/en-us/dv_vstechart/html/datastructures_guide2.asp&lt;/a&gt;</description></item><item><title>re: System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#210642</link><pubDate>Sat, 07 Aug 2004 23:33:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:210642</guid><dc:creator>Peter Golde</dc:creator><description>It would be interesting to know why chaining was used instead of probing in Dictionary&amp;lt;K,V&amp;gt;. Off the top of my head I would expect probing to use less memory than chaining, which should be an advantage.</description></item><item><title>re: System.Collections.Generic Dictionary&lt;K,V&gt; Capacity, Hashing, and Collisions.</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#214302</link><pubDate>Fri, 13 Aug 2004 21:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:214302</guid><dc:creator>Krzysztof Cwalina [MSFT]</dc:creator><description>Scott, you are right. We do take the closest prime after 2 x the current number of buckets.&lt;br&gt;&lt;br&gt;Steve, we have considered pluggable growth algorithms, but decided against it for performance and API complexity reasons. We may consider it in the future though.&lt;br&gt;&lt;br&gt;Peter, we chose chaining after some perf analysis of common scenarios. Both of the implementations were virtually same in the majority of benchmarks. Chaining was slightly better in memory consumption of some specific scenarios that we felt were more importnat than the scenarios were it was actually slightly worse (as you observed), but I cannot recall what they were now. To tell you the truth, I don’t think any of the implementations is a clear winner. Possibly the main reason we ended up with chaining is that we had a brand new implementation with cleaner code that the developers liked more :-)</description></item><item><title>Jordy Baart : System.Collections vs. System.Collection.Generic and System.Collections.ObjectModel</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#766155</link><pubDate>Fri, 22 Sep 2006 16:38:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:766155</guid><dc:creator>Jordy Baart : System.Collections vs. System.Collection.Generic and System.Collections.ObjectModel</dc:creator><description>PingBack from &lt;a rel="nofollow" target="_new" href="http://bloggingabout.net/blogs/jordy/archive/2006/09/22/System.Collections-vs.-System.Collection.Generic-and-System.Collections.ObjectModel.aspx"&gt;http://bloggingabout.net/blogs/jordy/archive/2006/09/22/System.Collections-vs.-System.Collection.Generic-and-System.Collections.ObjectModel.aspx&lt;/a&gt;</description></item><item><title>Objekt Identit?t | hilpers</title><link>http://blogs.msdn.com/kcwalina/archive/2004/08/06/210297.aspx#9347923</link><pubDate>Tue, 20 Jan 2009 18:27:55 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9347923</guid><dc:creator>Objekt Identit?t | hilpers</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://www.hilpers.com/1041242-objekt-identitaet"&gt;http://www.hilpers.com/1041242-objekt-identitaet&lt;/a&gt;&lt;/p&gt;
</description></item></channel></rss>