<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Rico Mariani's Performance Tidbits : quiz</title><link>http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx</link><description>Tags: quiz</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Performance Quiz #13 -- Linq to SQL compiled query cost -- solution</title><link>http://blogs.msdn.com/ricom/archive/2008/01/14/performance-quiz-13-linq-to-sql-compiled-query-cost-solution.aspx</link><pubDate>Mon, 14 Jan 2008 20:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7110071</guid><dc:creator>ricom</dc:creator><slash:comments>18</slash:comments><comments>http://blogs.msdn.com/ricom/comments/7110071.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=7110071</wfw:commentRss><description>&lt;P&gt;Well is there really a "solution" at all in general?&amp;nbsp; This particular case I think I constrained enough that you can claim an answer but does it generalize?&amp;nbsp; Let's look at what I got first, the raw results are pretty easy to understand.&lt;/P&gt;
&lt;P&gt;The experiment I conducted was to run a fixed number of queries (5000 in this case) but to break them up so that the compiled query was reused a decreasing amount.&amp;nbsp; The first run is the "best" 1 batch of 5000 selects all using the compiled query.&amp;nbsp; Then 2 batches of 2500, and so on down to 5000 batches of 1.&amp;nbsp; As a control I also run the uncompiled case at each step expecting of course that it makes no difference.&amp;nbsp; Note the output indicates we selected a total of 25000 rows of data -- that is 5 per select as expected.&amp;nbsp; Here are the raw results:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Testing 1 batches of 5000 selects &lt;BR&gt;5000 selects uncompiled 9200.0ms 25000 records total 543.48 selects/sec &lt;BR&gt;5000 selects compiled 5401.0ms 25000 records total 925.75 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 2 batches of 2500 selects &lt;BR&gt;5000 selects uncompiled 9181.0ms 25000 records total 544.60 selects/sec &lt;BR&gt;5000 selects compiled 5402.0ms 25000 records total 925.58 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 5 batches of 1000 selects &lt;BR&gt;5000 selects uncompiled 9169.0ms 25000 records total 545.32 selects/sec &lt;BR&gt;5000 selects compiled 5432.0ms 25000 records total 920.47 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 100 batches of 50 selects &lt;BR&gt;5000 selects uncompiled 9184.0ms 25000 records total 544.43 selects/sec &lt;BR&gt;5000 selects compiled 5511.0ms 25000 records total 907.28 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 1000 batches of 5 selects &lt;BR&gt;5000 selects uncompiled 9166.0ms 25000 records total 545.49 selects/sec &lt;BR&gt;5000 selects compiled 6526.0ms 25000 records total 766.17 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 2500 batches of 2 selects &lt;BR&gt;5000 selects uncompiled 9165.0ms 25000 records total 545.55 selects/sec &lt;BR&gt;5000 selects compiled 7892.0ms 25000 records total 633.55 selects/sec &lt;/P&gt;
&lt;P&gt;Testing 5000 batches of 1 selects &lt;BR&gt;5000 selects uncompiled 9157.0ms 25000 records total 546.03 selects/sec &lt;BR&gt;5000 selects compiled 10825.0ms 25000 records total 461.89 selects/sec&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;And there you have it.&amp;nbsp; Even at 2 uses the compiled query still wins but at 1 use it loses.&amp;nbsp; In fact, the magic number for this particular query is about 1.5 average uses to break even.&amp;nbsp; But why?&amp;nbsp; And how might it change?&lt;/P&gt;
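&lt;P&gt;One rough way to see where a number like 1.5 comes from is to fit a simple linear model to the raw results above.&amp;nbsp; The model (total time = batches * setup + selects * per-run cost) is an assumption made for this sketch, not something the measurements prove, and the function name is made up for illustration.&amp;nbsp; With the figures from the 1-batch and 5000-batch runs it lands at roughly 1.4 uses.&lt;/P&gt;
&lt;pre&gt;
```python
# Figures copied from the raw results above (milliseconds for 5000 selects).
SELECTS = 5000
UNCOMPILED_MS = 9200.0               # 1 batch of 5000, uncompiled
COMPILED_1_BATCH_MS = 5401.0         # 1 batch of 5000, compiled
COMPILED_5000_BATCHES_MS = 10825.0   # 5000 batches of 1, compiled


def break_even_uses(selects, uncompiled_ms, one_batch_ms, many_batches_ms, many_batches):
    """Estimate how many uses a compiled query needs to match the uncompiled cost.

    Assumed model (hypothetical, for illustration only):
        total_ms = batches * setup + selects * per_run
    """
    # Two measurements, two unknowns: solve for setup and per_run.
    setup = (many_batches_ms - one_batch_ms) / (many_batches - 1)
    per_run = (one_batch_ms - setup) / selects
    per_uncompiled = uncompiled_ms / selects
    # n uses break even when setup + n * per_run == n * per_uncompiled.
    return setup / (per_uncompiled - per_run)


estimate = break_even_uses(SELECTS, UNCOMPILED_MS, COMPILED_1_BATCH_MS,
                           COMPILED_5000_BATCHES_MS, 5000)
print(round(estimate, 2))
```
&lt;/pre&gt;
&lt;P&gt;The intermediate runs (2500 batches of 2, 1000 batches of 5) fit the same model only approximately, so treat the result as a ballpark figure.&lt;/P&gt;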
&lt;P&gt;Well, as has been observed in the comments, Linq query compilation isn't like regular expression compilation.&amp;nbsp; Compiling the query doesn't do anything that isn't going to have to happen anyway.&amp;nbsp; In fact, actually creating the compiled query with CompiledQuery.Compile hardly does anything at all; it's all deferred until the query is run, just as it would have been had the query not been compiled.&amp;nbsp; So what is the overhead?&amp;nbsp; Why is it slower at all?&amp;nbsp; And what's the point of it?&lt;/P&gt;
&lt;P&gt;Well the main purpose of that compiled query object is to have an object, of the correct type, that also has the correct lifetime.&amp;nbsp; The compiled query can live across DataContexts, in fact it could potentially live for the entire life of your program.&amp;nbsp; And since it has no shared state in it, it's thread-safe and so forth.&amp;nbsp; It exists to:&lt;/P&gt;
&lt;P&gt;1) Give the Linq to SQL system a place to store the results of analyzing the query (i.e. the actual SQL plus the delegate that will be used to extract data from the result set)&lt;/P&gt;
&lt;P&gt;2) Allow the user to specify the "variable parts" of the query.&amp;nbsp; The most common case isn't that the query is exactly the same from run to run, usually it's "nearly" the same... That is it's the same except that perhaps the search string is different in the where clause, or the ID being fetched is different.&amp;nbsp; The shape is the same.&amp;nbsp; Creating a delegate with parameters allows you to specify which things are fixed and which things are variable.&lt;/P&gt;
&lt;P&gt;Now there was some debate about how to make compiled queries durable, automatically caching them was considered, but this was something I was strongly against.&amp;nbsp; Largely because of the object lifetime issues it would cause.&amp;nbsp; First, you would have to do complicated matching of a created query against something that was already in the cache -- something I'd like to avoid.&amp;nbsp; Secondly you have to decide where to store the cache, if you associate it with the DataContext then you get much less query re-use because you only get a benefit if you run the same query twice in the same data context.&amp;nbsp; To get the most benefit you want to be able to re-use the query across DataContexts.&amp;nbsp; But then, do you make the cache global?&amp;nbsp; If you do you have threading issues accessing it, and you have the terrible problem that you don't know when is a good time to discard items from the cache.&amp;nbsp; Ultimately this was my strongest point, at the Linq data level we do not know enough about the query patterns to choose a good caching policy, and, as I've written many times before, when it comes to caching good policy is crucial.&amp;nbsp; In fact, analogously, we had to make changes in the regular expression caching system back in Whidbey precisely because we were seeing cases where our caching assumptions were resulting in catastrophically bad performance (Mid Life Crisis due to retained compiled regular expressions in our cache) --&amp;nbsp; I didn't want to make that mistake again.&lt;/P&gt;
&lt;P&gt;So that's roughly how we end up at our final design.&amp;nbsp; Any Linq to SQL user can choose how much or how little caching is done.&amp;nbsp; They control the lifetime, they can choose an easy mechanism (e.g. stuff it in a static variable forever) or a complicated recycling method depending on their needs.&amp;nbsp; Usually the simple choice is adequate.&amp;nbsp; And they can easily choose which queries to compile and which to just run in the usual manner.&lt;/P&gt;
&lt;P&gt;Let's get back to the overhead of compiled queries.&amp;nbsp; Besides the one-time cost of creating the delegate there is also a little extra delegate indirection on each run of the query, plus the more complicated thing we have to do: since the compiled query can span DataContexts, we have to make sure that the DataContext we are given in any particular execution of a compiled query is compatible with the DataContext that was provided when the query was compiled the first time.&lt;/P&gt;
&lt;P&gt;Other than that the code path is basically the same, which means you come out ahead pretty quickly.&amp;nbsp; This test case was, as usual, designed to magnify the typical overheads so we can observe them.&amp;nbsp; The result set is a small number of rows, it is always the same rows, the database is local, and the query itself is a simple one.&amp;nbsp; All the usual costs of doing a query have been minimized.&amp;nbsp; In the wild you would expect the query to be more complicated, the database to be remote, the actual data returned to be larger and not always the same data.&amp;nbsp; This of course both reduces the benefit of compilation in the first place but also, as a consolation prize, reduces the marginal overhead.&lt;/P&gt;
&lt;P&gt;In short, if you expect to reuse the query at all, there is no performance-related reason not to compile it.&amp;nbsp; &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=7110071" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/databases/default.aspx">databases</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #13 -- Linq to SQL compiled queries cost</title><link>http://blogs.msdn.com/ricom/archive/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost.aspx</link><pubDate>Fri, 11 Jan 2008 23:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7078749</guid><dc:creator>ricom</dc:creator><slash:comments>28</slash:comments><comments>http://blogs.msdn.com/ricom/comments/7078749.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=7078749</wfw:commentRss><description>&lt;P&gt;I've written a few articles about Linq now and you know I was a big fan of &lt;A href="http://blogs.msdn.com/ricom/archive/2007/06/22/dlinq-linq-to-sql-performance-part-1.aspx"&gt;compiled queries in Linq&lt;/A&gt; but what do they cost?&amp;nbsp; Or more specifically, how many times do you have to use a compiled query in order for the cost of compilation to pay for itself?&amp;nbsp; With regular expressions for instance it's usually a mistake to compile a regular expression if you only intend to match it against a fairly small amount of text.&lt;/P&gt;
&lt;P&gt;Let's do a specific experiment to get an idea.&amp;nbsp; Using the ubiquitous Northwinds database and getting the same data over and over to control for the cost of the database accesses (and magnify any Linq overheads) we run this query:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;var q = (from o in nw.Orders &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select new { &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; OrderID = o.OrderID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CustomerID = o.CustomerID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeID = o.EmployeeID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ShippedDate = o.ShippedDate &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }).Take(5);&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;and compare it against:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;var fq = CompiledQuery.Compile &lt;BR&gt;( &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; (Northwinds nw) =&amp;gt; &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (from o in nw.Orders &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select new &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; { &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; OrderID = o.OrderID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CustomerID = o.CustomerID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; EmployeeID = o.EmployeeID, &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ShippedDate = o.ShippedDate &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }).Take(5) &lt;BR&gt;);&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;So now the quiz:&amp;nbsp; How many times do I have to use the compiled version of the query in order for it to be cheaper to compile than it would have been to just use the original query directly?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=7078749" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/databases/default.aspx">databases</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #12 -- The Cost of a Good Hash -- Solution</title><link>http://blogs.msdn.com/ricom/archive/2007/01/26/performance-quiz-12-the-cost-of-a-good-hash-solution.aspx</link><pubDate>Fri, 26 Jan 2007 21:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1537591</guid><dc:creator>ricom</dc:creator><slash:comments>6</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1537591.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1537591</wfw:commentRss><description>&lt;P&gt;
&lt;STYLE&gt;
.mytable {
font-family: arial;
font-size: 8pt;
}
&lt;/STYLE&gt;
Well once again there have been many thoughtful replies to both the &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/22/performance-quiz-12-the-cost-of-a-good-hash.aspx"&gt;original question&lt;/A&gt; and the &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/24/performance-quiz-12-the-cost-of-a-good-hash-some-help.aspx"&gt;followup with hints&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Recall the original question asked a bit of trivia:&amp;nbsp; could you name 5 implementations of GetHashCode in the framework that do things you might not expect to see in a hash function?&amp;nbsp; It's a little bit of a vague question and also requires&amp;nbsp;a good deal of arcane knowledge of the CLR.&amp;nbsp;The most interesting thing about the discussion I think is just what people considered "unusual."&amp;nbsp; But as always there was a method to my madness.&lt;/P&gt;
&lt;P&gt;So without further ado here's my list of "unusual" hash functions, there are 53 of them.&lt;/P&gt;
&lt;TABLE class="mytable"&gt;
&lt;CAPTION&gt;&lt;FONT size=2&gt;&lt;STRONG&gt;GetHashCode Methods that Allocate&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/CAPTION&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class=""&gt;java.lang.reflect.Method.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;4.2&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.lang.MulticastDelegate.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;4.2&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.lang.Delegate.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;4.2&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Uri.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.8&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.HashMembershipCondition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.8&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;java.lang.reflect.Field.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.wfc.ui.Region.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.wfc.ui.Pen.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.wfc.ui.Font.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.wfc.ui.Color.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;com.ms.wfc.ui.Brush.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.NetCodeGroup.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.FileCodeGroup.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.CodeGroup.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.PermissionSet.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.3&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.NamedPermissionSet.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.3&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Configuration.SettingElement.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.1&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Diagnostics.ListenerElement.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Configuration.ConfigurationElement.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;3.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;java.math.BigDecimal.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.8&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.TagPrefixInfo.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.8&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.TransformerInfo.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.4&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.TagMapInfo.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.4&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.ProfileGroupSettings.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.4&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.CustomError.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.4&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.BuildProvider.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.4&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.NamespaceInfo.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.1&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.StrongNameMembershipCondition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;2.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Xml.Serialization.CaseInsensitiveKeyComparer.System.Collections.IEqualityComparer.GetHashCode(System.Object)&lt;/TD&gt;
&lt;TD class=""&gt;1.9&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.PublisherMembershipCondition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.7&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Xml.Schema.KeySequence.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.6&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Data.SqlTypes.SqlString.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.6&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Data.SqlTypes.SqlDecimal.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.6&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;java.text.CollationKey.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.UrlMembershipCondition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Policy.SiteMembershipCondition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Globalization.SortKey.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.5&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Net.Cookie.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.3&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Management.ManagementBaseObject.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.3&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;java.math.BigInteger.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Windows.Forms.TableLayoutPanelCellPosition.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Windows.Forms.DataGridViewCellStyle.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Windows.Forms.DataGridViewAdvancedBorderStyle.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Windows.Forms.DataGridView+HitTestInfo.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.UI.ControlCachedVary.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.UI.AttributeCollection.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.FileSecurity+DaclComparer.System.Collections.IEqualityComparer.GetHashCode(System.Object)&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Configuration.RootProfilePropertySettingsCollection.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Web.Caching.CachedVary.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.Cryptography.X509Certificates.X509CertificateCollection.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Security.AccessControl.GenericAce.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Drawing.FontFamily.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;System.Configuration.ConfigurationElementCollection.GetHashCode&lt;/TD&gt;
&lt;TD class=""&gt;1.0&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;&lt;BR&gt;And you can already see why I think they are unusual:&amp;nbsp; They allocate.&amp;nbsp; In my opinion any "normal" hash function should be able to do its job without creating any temporary objects.&amp;nbsp; Given the frequency with which things can be hashed I think that's important.&amp;nbsp; That doesn't mean that I'm going to go to Red Alert about any of the above but it does mean that we should think more carefully about creating big hash tables of the above.&lt;/P&gt;
&lt;P&gt;OK, so, great, an interesting observation.&amp;nbsp; What's the point?&lt;/P&gt;
&lt;P&gt;Well, while I believe that the particulars of this question are not profound, the notion that you can have a cost metric applied universally to the runtime -- even one that is as simple-minded as the one I just proposed -- *is* profound.&lt;/P&gt;
&lt;P&gt;Even though my "allocation complexity" metric may be feeble it is already useful and actually it is my hope that many people find it feeble and write about how their metric is much better and by the way here it is the much better metric in text format etc.&amp;nbsp;&amp;nbsp;etc.&amp;nbsp; Imagine if we had the methods scored in a variety of ways all readily available and pluggable into intellisense.&amp;nbsp; You could have scores relating to synchronization, i/o, memory, algorithmic complexity, even measured data where appropriate.&amp;nbsp;&amp;nbsp;Other engineering disciplines have abundant information about their raw materials but for those of us in the software business it's often either guesswork or our own hard-won data.&lt;/P&gt;
&lt;P&gt;Let's look at what you can do with even my feeble metric.&amp;nbsp; In my &lt;A href="http://blogs.msdn.com/ricom/archive/2004/03/12/88715.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2004/03/12/88715.aspx"&gt;very first quiz&lt;/A&gt; I asked (in part) whether it would be better to call Write three times on a stream or create a format string and call it once.&amp;nbsp; Can we answer that question with the published metric?&lt;/P&gt;
&lt;P&gt;Here are the relevant two lines from &lt;A href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip" mce_href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip"&gt;my costs file&lt;/A&gt;:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;System.IO.TextWriter.Write(System.String) 0 &lt;BR&gt;System.IO.TextWriter.Write(System.String,System.Object[]) 1.5&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;That's not as good as the actual measurement but, wow, that's something now isn't it?&amp;nbsp; Rough guidance to give you a clue!&amp;nbsp; A great "tip" that the var-args formatting option has some underlying cost.&lt;/P&gt;
&lt;P&gt;Even these rough costs are very powerful in terms of helping you write code.&amp;nbsp; Suppose you're writing a new&amp;nbsp;"Foo" and you want it to have a cost that is roughly the same as the existing "Foo".&amp;nbsp; You could go and look at how the existing "Foo" methods&amp;nbsp;score, which gives you some idea what costs you can afford in your new "Foo".&lt;/P&gt;
&lt;P&gt;Importantly you must observe Rico's Rule:&amp;nbsp; if you are writing a new "Foo" method and you want that method to have a cost of X (on any metric) then you must not use any methods whose&amp;nbsp;cost is known to be&amp;nbsp;greater than&amp;nbsp;X.&amp;nbsp; Just that simple rule can save you from many costly mistakes.&lt;/P&gt;
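&lt;P&gt;As a rough sketch of how Rico's Rule could be checked mechanically, the snippet below takes a cost budget and a list of callees and reports the ones whose known cost is over budget.&amp;nbsp; The two costs are the ones quoted from the costs file above; the function name and the choice to skip methods with no published cost are assumptions made for illustration.&lt;/P&gt;
&lt;pre&gt;
```python
# Costs copied from the two costs-file lines quoted above.
COSTS = {
    "System.IO.TextWriter.Write(System.String)": 0.0,
    "System.IO.TextWriter.Write(System.String,System.Object[])": 1.5,
}


def over_budget(budget, callees, costs):
    """Return the callees whose known cost is greater than the budget."""
    # Methods with no published cost are skipped (an assumption of this
    # sketch): the rule only speaks about costs that are known.
    return [name for name in callees if name in costs and costs[name] > budget]


# A hypothetical new method that wants to stay at cost 1.0 or below.
violations = over_budget(1.0, list(COSTS), COSTS)
print(violations)
```
&lt;/pre&gt;
&lt;P&gt;With a budget of 1.0 only the var-args overload is flagged, which matches the "tip" above.&lt;/P&gt;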
&lt;P&gt;Finally you have some way to answer the question:&amp;nbsp; Is it even remotely reasonable to use a given method in a given context.&amp;nbsp; Certainly this particular metric is imperfect, maybe even feeble,&amp;nbsp;but it's &lt;EM&gt;&lt;STRONG&gt;something&lt;/STRONG&gt;&lt;/EM&gt;.&amp;nbsp; And it will only get better if we all work on it.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1537591" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #12 -- The Cost of a Good Hash -- Some Help</title><link>http://blogs.msdn.com/ricom/archive/2007/01/24/performance-quiz-12-the-cost-of-a-good-hash-some-help.aspx</link><pubDate>Wed, 24 Jan 2007 22:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1523243</guid><dc:creator>ricom</dc:creator><slash:comments>32</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1523243.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1523243</wfw:commentRss><description>&lt;P&gt;I continue to be astounded by what my readers can come up with.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;As usual I had a purpose for posing &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/22/performance-quiz-12-the-cost-of-a-good-hash.aspx"&gt;my last question&lt;/A&gt;&amp;nbsp;and that purpose was to show that basically it's hard to get a handle on what things cost in any kind of omnibus way.&amp;nbsp; But imagine my surprise when Frank and&amp;nbsp;Shuggy turned this into an interesting discussion of hashing techniques and combining hashes.&amp;nbsp; To say nothing of some of the other regulars (you know who you are) that have insightful comments.&amp;nbsp; Those comments are definitely worth a read and thanks very much for presenting them!&amp;nbsp; &lt;/P&gt;
&lt;P&gt;But I don't know that anyone has answered my question quite squarely (except maybe Alois) and I think this makes my point that it is hard to answer this question.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;But what if you were to download &lt;A href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip" mce_href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip"&gt;this file&lt;/A&gt;...&amp;nbsp; I think you'd find it quite helpful.&lt;/P&gt;
&lt;P&gt;What you see is a compilation of one cost metric for the entire framework.&amp;nbsp; It's not the only cost metric by any means, you could imagine many others, but it is an interesting one.&amp;nbsp;&amp;nbsp; The cost is computed by doing a static analysis of all allocations made in the call tree of each method.&amp;nbsp; It's the metric that I discussed in this article on &lt;A href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx"&gt;performance signatures&lt;/A&gt;&amp;nbsp;including the exception for methods&amp;nbsp;that look like they do a one-time allocation.&amp;nbsp; The cost is logarithmic, base 10, offset by 1, i.e.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;if (allocation_count == 0)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; cost = 0;&lt;BR&gt;else&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; cost = 1 + log10(allocation_count);&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;And I'm only reporting one digit after the decimal because I want to give the idea that this is just a rough cost; it's not supposed to be super-precise, it's just supposed to give you a rough idea of whether a method is very complex or not so much.&lt;/P&gt;
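&lt;P&gt;To get a feel for the scale, here is a small sketch of the formula above together with its inverse, so you can turn a published cost back into an approximate allocation count.&amp;nbsp; The function names are made up for illustration, and rounding to one decimal simply mirrors the one-digit reporting described above.&lt;/P&gt;
&lt;pre&gt;
```python
import math


def allocation_cost(allocation_count):
    """Cost as described above: 0 for no allocations, else 1 + log10(count)."""
    if allocation_count == 0:
        return 0.0
    # One digit after the decimal, matching how the costs are reported.
    return round(1 + math.log10(allocation_count), 1)


def approx_allocations(cost):
    """Invert the metric: the rough allocation count implied by a cost."""
    if cost == 0:
        return 0
    return round(10 ** (cost - 1))


print(allocation_cost(1), allocation_cost(10), allocation_cost(3))
print(approx_allocations(2.0))
```
&lt;/pre&gt;
&lt;P&gt;So a cost of 1.0 means a single allocation in the call tree, 2.0 means about ten, and each additional point is another factor of ten.&lt;/P&gt;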
&lt;P&gt;I'm trying to think of a good name for this metric, so far I'm thinking something like Allocation Complexity but I'm open to better ideas.&lt;/P&gt;
&lt;P&gt;Anyway, snoop the file, then see if you can answer my question.&amp;nbsp; For bonus points look at some other low-level functions that might be interesting, like say&amp;nbsp; Compare, Equals, or GetEnumerator.&amp;nbsp; It's very interesting to see that some methods that seem like they should be very low level are actually quite high level.&lt;/P&gt;
&lt;P&gt;I think costs like this one can be very useful in making decisions about which framework features to use in which contexts.&lt;/P&gt;
&lt;P&gt;Again, this is only one kind of cost and it's only approximate, but I think it's interesting nonetheless.&lt;/P&gt;
&lt;P&gt;Now would you like to try the original question again:&lt;/P&gt;
&lt;P&gt;Q: Name&amp;nbsp;five implementations of GetHashCode in the framework that do things that you wouldn't expect a hashing function to do.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1523243" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item><item><title>Performance Quiz #12 -- The Cost of a Good Hash</title><link>http://blogs.msdn.com/ricom/archive/2007/01/22/performance-quiz-12-the-cost-of-a-good-hash.aspx</link><pubDate>Mon, 22 Jan 2007 22:38:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1509368</guid><dc:creator>ricom</dc:creator><slash:comments>28</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1509368.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1509368</wfw:commentRss><description>&lt;p&gt;This quiz is a little bit different than some of my others.&amp;nbsp; Here is System.String.GetHashCode&lt;/p&gt;&lt;pre&gt;public override unsafe int GetHashCode()
{
      fixed (char* text1 = ((char*) this))
      {
            char* chPtr1 = text1;
            int num1 = 0x15051505;
            int num2 = num1;
            int* numPtr1 = (int*) chPtr1;
            for (int num3 = this.Length; num3 &amp;gt; 0; num3 -= 4)
            {
                  num1 = (((num1 &amp;lt;&amp;lt; 5) + num1) + (num1 &amp;gt;&amp;gt; 0x1b)) ^ numPtr1[0];
                  if (num3 &amp;lt;= 2)
                  {
                        break;
                  }
                  num2 = (((num2 &amp;lt;&amp;lt; 5) + num2) + (num2 &amp;gt;&amp;gt; 0x1b)) ^ numPtr1[1];
                  numPtr1 += 2;
            }
            return (num1 + (num2 * 0x5d588b65));
      }
}
&lt;/pre&gt;
&lt;p&gt;It's a fine little hash function that does pretty much the sorts of things you'd expect a hash function to do.&lt;/p&gt;
&lt;p&gt;So now the quiz.&amp;nbsp; It's a bit of trivia but I'm going somewhere with it.&lt;/p&gt;
&lt;p&gt;Can you tell me 5 implementations of GetHashCode in the framework that do things that you wouldn't expect a hashing function to do?&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1509368" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #11: Ten Questions on Value-Based Programming : Solution</title><link>http://blogs.msdn.com/ricom/archive/2006/09/07/performance-quiz-11-ten-questions-on-value-based-programming-solution.aspx</link><pubDate>Fri, 08 Sep 2006 00:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:745085</guid><dc:creator>ricom</dc:creator><slash:comments>11</slash:comments><comments>http://blogs.msdn.com/ricom/comments/745085.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=745085</wfw:commentRss><description>&lt;P&gt;In &lt;A href="http://blogs.msdn.com/ricom/archive/2006/08/31/733887.aspx"&gt;my last quiz&lt;/A&gt; I asked a few questions about a few hypothetical classes that might appear in a value-rich context.&amp;nbsp; I styled my example in the form of some graphics library classes but the idea is a general one.&amp;nbsp; Many contexts can and should be rich in values to get nice data density and the resulting good performance that comes with it.&amp;nbsp; I allude to some of these notions in my posting on &lt;A href="http://blogs.msdn.com/ricom/archive/2006/08/22/713396.aspx"&gt;real-time managed programming&lt;/A&gt;, so in a way this posting is sort of a continuation of that discussion.&lt;/P&gt;
&lt;P&gt;Now as usual there is more than one way to do this properly, so I will not pretend to be giving The Definitive Way To Write These Classes, but my example does tend to show some important issues that are worth talking about.&amp;nbsp; So without further ado, here are my thoughts on the matter.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #1: &lt;/STRONG&gt;Point3d is a struct, not a class.&amp;nbsp; Why?&amp;nbsp;&amp;nbsp; &lt;/P&gt;
&lt;P&gt;The main reason goes right back to data density.&amp;nbsp; There is a strong expectation that this type will be embedded in other types such as Vertex, and you can imagine many others.&amp;nbsp; It is likely that there will be temporaries of this type, and importantly &lt;EM&gt;arrays&lt;/EM&gt; of this type.&amp;nbsp; Using a value type will make it so that walking through sets of related points does not require following multiple pointers and therefore data density is likely to be enhanced.&amp;nbsp; And of course, there is storage overhead for classes -- the object header and the method table pointer -- these will be absent in packed arrays of structures.&lt;/P&gt;
&lt;P&gt;Contrariwise the usual benefits of being a class are not especially valuable.&amp;nbsp; Will we be subtyping?&amp;nbsp; No.&amp;nbsp; Synchronizing?&amp;nbsp; No.&amp;nbsp; Do we want point "identity" (e.g. the &lt;EM&gt;canonical&lt;/EM&gt; instance of point (1,2,-1))?&amp;nbsp; No.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;This leads to a pretty clear decision for me.&lt;/P&gt;
&lt;P&gt;What rules am I breaking?&amp;nbsp; I just made a &lt;EM&gt;public&lt;/EM&gt; &lt;EM&gt;mutable&lt;/EM&gt; value type that is &lt;EM&gt;larger than 16 bytes.&amp;nbsp; &lt;/EM&gt;Three strikes and I have barely begun:)&lt;/P&gt;
&lt;P&gt;Oh, I'm sorry about that last part, I didn't make it clear but I intended all of these to be public types.&amp;nbsp; &lt;A href="http://blogs.msdn.com/kcwalina/default.aspx"&gt;Krzysztof&lt;/A&gt;&amp;nbsp;of course busted my chops about that.&amp;nbsp; Thanks :)&lt;/P&gt;
&lt;P&gt;On to...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #2: &lt;/STRONG&gt;Point3d.x is a field, not a property.&amp;nbsp; Why?&lt;/P&gt;
&lt;P&gt;Common expected usage of the point primitive will involve things like translations, scaling, and other user defined operations which I can't possibly predict.&amp;nbsp; The Point structure itself has the lovely property that &lt;EM&gt;there are no illegal values &lt;/EM&gt;so I truly do not care how it is manipulated.&amp;nbsp; The opportunity that I might have for adding any kind of side-effect to the point calculations is vanishingly small.&amp;nbsp; Contrariwise the statement that I make by making the member a field is a powerful one.&amp;nbsp; It's a field.&amp;nbsp; I expect you to edit it.&amp;nbsp; It will never be other than a field.&amp;nbsp; Could I even, truly, afford even a modest overhead (a few clocks) for an operation so primitive as writing a single coordinate of a point?&amp;nbsp; Even something that small would double the write costs.&lt;/P&gt;
&lt;P&gt;The property setting code can certainly be no better than the field code, and in this case it's actually a tad worse (Alois discovered this in his comment).&amp;nbsp; Furthermore, adding additional IL for the inliner to try to fold can only make the situation worse when these field operations are composed with other operations and we would like the larger operation inlined.&lt;/P&gt;
&lt;P&gt;But for me, the main reason is to make the openness completely apparent.&amp;nbsp; It's a strong promise and an important one.&lt;/P&gt;
&lt;P&gt;I've broken another guideline now: there is no public property for this field.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #3:&amp;nbsp;&lt;/STRONG&gt;Vertex is a struct, not a class.&amp;nbsp; Why?&amp;nbsp;&amp;nbsp; Same reason as #1?&lt;/P&gt;
&lt;P&gt;Substantially yes, the same reasons as #1.&amp;nbsp; Density, no need for identity, and substantial likelihood of composition.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #4:&amp;nbsp;&lt;/STRONG&gt;Vertex.location is a field, not a property.&amp;nbsp; Why?&amp;nbsp; Same reason as #2? 
&lt;P&gt;The reasons given in #2 all apply, but now there is an additional reason.&amp;nbsp; Vertex is not a leaf value type; it has embedded members.&amp;nbsp; We expect the vertex itself to be mutated and we would like to write things like&amp;nbsp; Vertex.normal.dx = 0&amp;nbsp; or something.&amp;nbsp; All manner of transformations might be possible.&amp;nbsp; However the result of a property is not a true l-value, so we cannot do something like Vertex.Normal.dx = 0.&amp;nbsp; A struct-typed property returns a copy, so chaining properties gives very unexpected results. 
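&lt;P&gt;To make the l-value point concrete, here is a small sketch.&amp;nbsp; Vertex and Vector3d are trimmed-down versions of the quiz types; VertexWithProperty and SetNormalDx are hypothetical, added only to show what the property-based alternative forces you to write.&lt;/P&gt;
&lt;PRE&gt;using System;

struct Vector3d
{
    public double dx, dy, dz;
}

// field-based vertex, in the style of the quiz (trimmed to one member)
struct Vertex
{
    public Vector3d normal;
}

// hypothetical property-based variant, for comparison only
struct VertexWithProperty
{
    private Vector3d normal;

    public Vector3d Normal
    {
        get { return normal; }
        set { normal = value; }
    }
}

static class LValueSketch
{
    // With a property we have to copy out, edit the copy, and copy back.
    public static VertexWithProperty SetNormalDx(VertexWithProperty p, double dx)
    {
        // p.Normal.dx = dx;    // does not compile (error CS1612): Normal returns a copy
        Vector3d n = p.Normal;  // copy out
        n.dx = dx;              // edit the copy
        p.Normal = n;           // copy back
        return p;
    }

    static void Main()
    {
        Vertex v = new Vertex();
        v.normal.dx = 1.0;      // fields chain as l-values: this writes into v itself

        VertexWithProperty p = SetNormalDx(new VertexWithProperty(), 1.0);

        Console.WriteLine(v.normal.dx);
        Console.WriteLine(p.Normal.dx);
    }
}
&lt;/PRE&gt;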
&lt;P&gt;&lt;BR&gt;&lt;STRONG&gt;Question #5:&lt;/STRONG&gt; Quad has no methods.&amp;nbsp; Why? 
&lt;P&gt;Importantly, Quad has no &lt;EM&gt;context &lt;/EM&gt;and therefore it cannot hope to add any value other than as a bucket of 4 integers, for which we need no methods.&amp;nbsp; Quad cannot have any context because we would not wish every Quad to know things like what Mesh it is a part of; the cost of such a pointer would be astonishing.&amp;nbsp; Indeed a frugal graphics package might limit the Quad to using short integers for the vertex numbers if that were reasonable (though that would limit the mesh size significantly).&amp;nbsp; Lacking context we can't do things like range check the integers, check for consistency in the normals, or any other sort of thing we might like to do on a single face.&amp;nbsp; We need to put those sorts of functions at the next higher level (the Mesh).&amp;nbsp; 
&lt;P&gt;Since we can't do those sorts of things, let's be very open about the fact that this is nothing more than a 4 slot integer container; it's not "smart"&amp;nbsp;and never will be.&amp;nbsp; As a reward we get the benefits indicated in #1. 
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;P&gt;&lt;STRONG&gt;Question #6: &lt;/STRONG&gt;MeshSection is a class with private members.&amp;nbsp; Why? 
&lt;P&gt;Well now we've reached a level of implementation where the class actually has some idea what's going on and this class should probably be following all the regular sorts of rules you would expect.&amp;nbsp; I'm showing some of the internals for illustration purposes, such as arrays of Vertex objects and Quad objects, but naturally these should not be exposed.&amp;nbsp; A MeshSection might wish to have&amp;nbsp;some growable&amp;nbsp;array form, or chunks, or some other dense representation that might evolve over time.&amp;nbsp; The MeshSection can do validation on the items when they are passed in and provide high level semantic functions such as normal adjustment, tessellation, morphs, and other operations of that ilk.&amp;nbsp; 
&lt;P&gt;Importantly, these representation choices, even the more flexible ones,&amp;nbsp;are rich in represented data and extremely low in overhead.&amp;nbsp; The number of pointers present is virtually nil, so garbage collection costs even in the presence of massive pieces of geometry would be virtually nil.&amp;nbsp; Effectively the Quad objects are integer 'handles' to vertices as I discussed in my article on &lt;A href="http://blogs.msdn.com/ricom/archive/2006/08/22/713396.aspx"&gt;real-time managed programming&lt;/A&gt;.&amp;nbsp; Yet I can still have a nice abstracted and maintainable interface. 
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;P&gt;&lt;STRONG&gt;Question #7: &lt;/STRONG&gt;Why do I suggest that MeshSection methods accept arrays of Vertices, Quads, and the like rather than singletons? 
&lt;P&gt;Two significant reasons really.&amp;nbsp; The first is that we might be dealing with MeshSections having millions of vertices and so it seems unwise to talk about operations that affect singleton primitives.&amp;nbsp; While it's possible to imagine adding and connecting single vertices to a MeshSection, it's&amp;nbsp;as likely that we might want to merge significant pieces of geometry.&amp;nbsp; If this kind of operation required millions of function calls we would have significant overhead.&amp;nbsp; So perhaps the&amp;nbsp;most important reason of all is that chunky arrays as arguments offer a nice unit of work. 
&lt;P&gt;Secondly,&amp;nbsp;the suggested prohibition against large value types (greater than say 16 bytes) is not without merit.&amp;nbsp; But actually we give that advice not so much because large value types are inherently bad but because if used directly there can be an inordinate amount of pushing and popping of big objects on the stack as well as lost inlining opportunities.&amp;nbsp; Using arrays neatly avoids both of these problems, as we merely pass around array references, and indeed the array could be reused. 
&lt;P&gt;This leads to an important observation, which is, wait for it... 
&lt;P&gt;It's not the &lt;EM&gt;size&lt;/EM&gt; of the value type that matters.&amp;nbsp; It's how you use it :) 
&lt;P&gt;Ahem.&amp;nbsp; Moving swiftly along...&amp;nbsp; :) 
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;P&gt;&lt;STRONG&gt;Question #8:&lt;/STRONG&gt; No mention is made of synchronization here at all, is that just an oversight? 
&lt;P&gt;Certainly I would put no synchronization in any of the structs.&amp;nbsp; The operations on those are so small that any synchronization primitive, even the cheapest, would be deadly at that level.&amp;nbsp; But this is actually not the pivotal point.&amp;nbsp; It is much more important to put synchronization at a semantic level where the code has some understanding of what the unit of work is.&amp;nbsp; That is, how big of an operation is intended to be atomic?&amp;nbsp; What are the high level operations that need to be protected?&amp;nbsp; The soonest you could imagine having this knowledge is at the MeshSection level in this program -- but even there, the code has no thread awareness per se -- it's just doing some vertex math.&amp;nbsp; Lacking knowledge of how things might be scheduled it again becomes difficult to make any interesting synchronization choices.&amp;nbsp; Since the contract for MeshSection is nice and chunky (see #7) you could imagine again leaving the synchronization to the caller,&amp;nbsp;as we do in virtually all of our framework collection classes. 
&lt;P&gt;Ultimately, I don't believe these classes have anything of value to add in terms of synchronization assistance and so I decided not to include it.&amp;nbsp; For more on this, see my article &lt;A href="http://blogs.msdn.com/ricom/archive/2006/05/01/587750.aspx"&gt;"Putting your synchronization at the correct level."&lt;/A&gt; 
&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;P&gt;&lt;STRONG&gt;Question #9: &lt;/STRONG&gt;How useful is the "foreach" construct likely to be when working with arrays of vertices or quads etc? 
&lt;P&gt;Not very.&amp;nbsp; If you were to do foreach (Vertex v in SomeVertices) { whatever }&amp;nbsp; you'd find that in the whatever block you can't modify the vertices because of course it's a value type.&amp;nbsp; You'll likely be having custom iterators or using for loops to manage the array arguments to the MeshSection methods.&amp;nbsp; Unfortunate but in my opinion it's the right thing to do. 
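&lt;P&gt;Here is a small sketch of the difference, using trimmed-down versions of the quiz types.&amp;nbsp; The TranslateX helper is hypothetical; it just stands in for the kind of in-place work a MeshSection method might do on an array argument.&lt;/P&gt;
&lt;PRE&gt;using System;

struct Point3d
{
    public double x, y, z;
}

struct Vertex
{
    public Point3d location;
}

static class ForeachSketch
{
    // hypothetical helper: slide every vertex along x, in place
    public static void TranslateX(Vertex[] vertices, double delta)
    {
        // foreach (Vertex v in vertices) { v.location.x += delta; }
        // does not compile (error CS1654): v is a read-only copy of the element

        for (int i = 0; i &amp;lt; vertices.Length; i++)
        {
            // array elements are variables, so this edits the stored vertex
            vertices[i].location.x += delta;
        }
    }

    static void Main()
    {
        Vertex[] vertices = new Vertex[3];
        TranslateX(vertices, 2.5);
        Console.WriteLine(vertices[0].location.x);
    }
}
&lt;/PRE&gt;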
&lt;P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #10:&lt;/STRONG&gt; How many rules did I break? :) 
&lt;P&gt;I lost count.&amp;nbsp; I think three big ones but more than once each I suppose. 
&lt;P&gt;
&lt;P&gt;Thank you very much for all the excellent responses!&lt;/P&gt;
&lt;P&gt;P.S. The annotated design guidelines can be found &lt;A href="http://www.amazon.com/exec/obidos/tg/detail/-/0321246756?v=glance"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=745085" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #11: Ten Questions on Value-Based Programming</title><link>http://blogs.msdn.com/ricom/archive/2006/08/31/performance-quiz-11-ten-questions-on-value-based-programming.aspx</link><pubDate>Thu, 31 Aug 2006 21:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:733887</guid><dc:creator>ricom</dc:creator><slash:comments>22</slash:comments><comments>http://blogs.msdn.com/ricom/comments/733887.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=733887</wfw:commentRss><description>&lt;P&gt;Some of you have probably heard one or more of my talks, or read the annotations I made in the &lt;A href="http://msdn2.microsoft.com/en-us/library//ms229042.aspx"&gt;Design Guidelines&lt;/A&gt;.&amp;nbsp; If you have&amp;nbsp;then you already know that I don't always agree with every suggested guideline.&amp;nbsp; Especially not in every context.&amp;nbsp; It's probably fair to say that one of the greatest areas of disagreement has to do with the handling of value types and the prevalence of properties vs. fields in the framework.&lt;/P&gt;
&lt;P&gt;Generally, my feeling is that properties are highly overrated and fields terribly under-utilized.&amp;nbsp; Did I mention that not everyone agrees with this position? :)&lt;/P&gt;
&lt;P&gt;So suffice to say that the guidelines are still good guidelines but of course they are only that, and so you should know when it might be a good time to consider disregarding some of them.&amp;nbsp; Today I wanted to write a motivational example for you that shows the kind of situation that might call for an approach that is more "old-school" if you like.&lt;/P&gt;
&lt;P&gt;To try to make it more real, I've cooked it up in terms of some graphics library primitives that might happen in a real graphics library that was dealing with objects made up of 4-sided polygons (quads) with some basic texturing.&amp;nbsp; Naturally this example isn't complete but hopefully there is enough there that you could imagine what the other primitives might look like.&lt;/P&gt;
&lt;P&gt;Below is the code written in the style that I would suggest.&amp;nbsp; Following it are some key questions, which are intended to suggest&amp;nbsp;the origin of my thinking on some of the more important points.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;    // typical 3d coordinate
    struct Point3d
    {
        public double x;
        public double y;
        public double z;
        // handy math methods
    }

    // projected coordinates in texture space
    struct Point2d
    {
        public double u;
        public double v;
        // handy math methods
    }

    // useful for normals etc.
    struct Vector3d
    {
        public double dx;
        public double dy;
        public double dz;
        // handy vector math methods
    }

    // a single vertex, with location, and normal
    struct Vertex
    {
        public Point3d location;
        public Point2d uvmap;
        public Vector3d normal;
        // manipulation methods
    }

    struct Quad
    {
        // vertex indices within a mesh
        public int corner1;
        public int corner2;
        public int corner3;
        public int corner4;
    }

    // at last a reference type
    class MeshSection
    {
        private Vertex[] vertices;
        private Quad[] quads;
        private TextureMap texture; // defined elsewhere

        // assorted methods that accept arrays of vertices and quads
        // and insert them into the structure and things like that
    }

&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;Question #1: &lt;/STRONG&gt;Point3d is a struct, not a class.&amp;nbsp; Why?&amp;nbsp;&amp;nbsp; &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #2: &lt;/STRONG&gt;Point3d.x is a field, not a property.&amp;nbsp; Why?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #3:&amp;nbsp;&lt;/STRONG&gt;Vertex is a struct, not a class.&amp;nbsp; Why?&amp;nbsp;&amp;nbsp; Same reason as #1?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #4:&amp;nbsp;&lt;/STRONG&gt;Vertex.location is a field, not a property.&amp;nbsp; Why?&amp;nbsp; Same reason as #2?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #5:&lt;/STRONG&gt; Quad has no methods.&amp;nbsp; Why?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #6: &lt;/STRONG&gt;MeshSection is a class with private members.&amp;nbsp; Why?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #7: &lt;/STRONG&gt;Why do I suggest that MeshSection methods accept arrays of Vertices, Quads, and the like rather than singletons?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #8:&lt;/STRONG&gt; No mention is made of synchronization here at all, is that just an oversight?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #9: &lt;/STRONG&gt;How useful is the "foreach" construct likely to be when working with arrays of vertices or quads etc?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Question #10:&lt;/STRONG&gt; How many rules did I break? :)&lt;/P&gt;
&lt;P&gt;Full Disclosure:&lt;/P&gt;
&lt;P&gt;Of course Brad and Krzysztof (the principal authors of the guidelines) both understand that the guidelines need to be broken sometimes.&amp;nbsp; Sometimes we disagree about when the bar for breaking them has been reached, but I think that's actually a good thing because that kind of tension makes for good professional growth for everyone and better discourse for our customers.&lt;/P&gt;
&lt;P&gt;Whenever you decide to go off the recommended path it's still good to be familiar with the guidelines in the area because they can give you some valuable information about what the consequences might be so that you can make an informed decision.&amp;nbsp; In this particular case, for example, you might learn about some serialization issues you'll have to deal with since properties were not chosen.&lt;/P&gt;
&lt;P&gt;Try to answer yourself before you peek at the comments; there are good thoughts down there. :)&lt;/P&gt;
&lt;P&gt;Update:&amp;nbsp; I posted my solution &lt;A href="http://blogs.msdn.com/ricom/archive/2006/09/07/745085.aspx"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=733887" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #10 -- Thread local storage --  Solution</title><link>http://blogs.msdn.com/ricom/archive/2006/07/18/performance-quiz-10-thread-local-storage-solution.aspx</link><pubDate>Wed, 19 Jul 2006 02:25:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:670314</guid><dc:creator>ricom</dc:creator><slash:comments>6</slash:comments><comments>http://blogs.msdn.com/ricom/comments/670314.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=670314</wfw:commentRss><description>&lt;P&gt;I actually posted &lt;A href="http://blogs.msdn.com/ricom/archive/2006/06/16/634480.aspx"&gt;quiz #10&lt;/A&gt; quite a while ago but a comment with the correct solution came in so quickly that I wasn't very motivated to post a followup.&amp;nbsp; There are excellent links&amp;nbsp;in the comments (thank you readers!)&amp;nbsp; But now I'll have to make the quizzes harder :)&lt;/P&gt;
&lt;P&gt;The problem was to see what overhead is associated with various methods of creating flexible thread local storage.&amp;nbsp; I suggested two ways of having named storage.&lt;/P&gt;
&lt;P&gt;I've posted a &lt;A href="http://blogs.msdn.com/ricom/articles/670296.aspx"&gt;sample benchmark&lt;/A&gt; that expands on this and shows four different approaches (some less general than others).&amp;nbsp; &lt;/P&gt;
&lt;P&gt;On my machine I observed the following times:&lt;/P&gt;
&lt;P&gt;Test1: Named Slot 7,991ms&lt;BR&gt;Test2: Numbered Slot 4,136ms&lt;BR&gt;Test3: Thread-local dictionary 2,006ms&lt;BR&gt;Test4: Thread-local direct 704ms&lt;BR&gt;&lt;BR&gt;So, what's going on?&amp;nbsp; Well I looked into it with our profiler and got these results which show the extra costs pretty clearly.&amp;nbsp; Have a look at all the helper functions under Test1 and Test2.&amp;nbsp; &lt;/P&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 border=0&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;B&gt;Exclusive&lt;/B&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&lt;B&gt;Inclusive&lt;/B&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD&gt;&lt;B&gt;Function Name&lt;/B&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.39 % &lt;/TD&gt;
&lt;TD width=100&gt;89.92 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD&gt;&lt;/TD&gt;
&lt;TD width="100%"&gt;Quiz10.Program.Main (string[])&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.78 % &lt;/TD&gt;
&lt;TD width=100&gt;53.07 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;Quiz10.Program.Test1 ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.95 % &lt;/TD&gt;
&lt;TD width=100&gt;25.19 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.LocalDataStoreMgr.GetNamedDataSlot (string)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.18 % &lt;/TD&gt;
&lt;TD width=100&gt;12.14 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;JIT_MonReliableEnter (class Object *,bool *)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;5.76 % &lt;/TD&gt;
&lt;TD width=100&gt;8.06 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.Collections.Hashtable.get_Item (object)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;3.05 % &lt;/TD&gt;
&lt;TD width=100&gt;3.11 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;@JIT_MonExitWorker@4&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;3.49 % &lt;/TD&gt;
&lt;TD width=100&gt;22.31 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;NativeArrayMarshalerBase::NativeArrayMarshalerBase (class CleanupWorkList *)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.43 % &lt;/TD&gt;
&lt;TD width=100&gt;5.97 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;ThreadStore::LockDLSHash (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.14 % &lt;/TD&gt;
&lt;TD width=100&gt;5.41 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;CantAllocThreads::MarkThread (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.04 % &lt;/TD&gt;
&lt;TD width=100&gt;2.80 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;EEHashTableBase&amp;lt;int,class EEIntHashTableHelper,0&amp;gt;::FindItem (int)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.77 % &lt;/TD&gt;
&lt;TD width=100&gt;2.19 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;FrameWithCookie&amp;lt;class HelperMethodFrame_1OBJ&amp;gt;::FrameWithCookie&amp;lt;class HelperMethodFrame_1OBJ&amp;gt; (void *,struct LazyMachState *,unsigned int,class Object * *)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.78 % &lt;/TD&gt;
&lt;TD width=100&gt;1.59 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.Threading.Thread.get_LocalDataStoreManager ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.16 % &lt;/TD&gt;
&lt;TD width=100&gt;1.22 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;ThreadNative::GetDomainLocalStore (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.57 % &lt;/TD&gt;
&lt;TD width=100&gt;1.16 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.LocalDataStore.GetData (class System.LocalDataStoreSlot)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.66 % &lt;/TD&gt;
&lt;TD width=100&gt;26.72 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;Quiz10.Program.Test2 ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;3.73 % &lt;/TD&gt;
&lt;TD width=100&gt;21.79 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;NativeArrayMarshalerBase::NativeArrayMarshalerBase (class CleanupWorkList *)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.46 % &lt;/TD&gt;
&lt;TD width=100&gt;5.79 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;ThreadStore::LockDLSHash (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.18 % &lt;/TD&gt;
&lt;TD width=100&gt;5.13 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;CantAllocThreads::MarkThread (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.05 % &lt;/TD&gt;
&lt;TD width=100&gt;3.13 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;EEHashTableBase&amp;lt;int,class EEIntHashTableHelper,0&amp;gt;::FindItem (int)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.57 % &lt;/TD&gt;
&lt;TD width=100&gt;1.62 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;FrameWithCookie&amp;lt;class HelperMethodFrame_1OBJ&amp;gt;::FrameWithCookie&amp;lt;class HelperMethodFrame_1OBJ&amp;gt; (void *,struct LazyMachState *,unsigned int,class Object * *)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.11 % &lt;/TD&gt;
&lt;TD width=100&gt;1.19 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;ThreadNative::GetDomainLocalStore (void)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.44 % &lt;/TD&gt;
&lt;TD width=100&gt;1.08 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.Threading.Thread.get_LocalDataStoreManager ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.53 % &lt;/TD&gt;
&lt;TD width=100&gt;1.05 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.LocalDataStore.GetData (class System.LocalDataStoreSlot)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.25 % &lt;/TD&gt;
&lt;TD width=100&gt;8.43 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;Quiz10.Program.Test3 ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.55 % &lt;/TD&gt;
&lt;TD width=100&gt;7.07 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.Collections.Generic.Dictionary`2.get_Item (!0)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;2.38 % &lt;/TD&gt;
&lt;TD width=100&gt;6.52 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;|&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;System.Collections.Generic.Dictionary`2.FindEntry (!0)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR vAlign=top&gt;
&lt;TD width=100&gt;0.20 % &lt;/TD&gt;
&lt;TD width=100&gt;1.30 % &lt;/TD&gt;
&lt;TD&gt;
&lt;TABLE style="FONT-SIZE: 10pt; FONT-FAMILY: verdana, arial, helvetia, sans-serif" cellPadding=0 width="100%" border=0&gt;
&lt;TBODY&gt;
&lt;TR vAlign=top&gt;
&lt;TD style="FONT-SIZE: 10pt; FONT-FAMILY: 'lucidia console', courier, serif"&gt;&amp;nbsp;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="100%"&gt;Quiz10.Program.Test4 ()&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;The table above shows all functions starting from Main with an inclusive cost &amp;gt;= 1% and a depth of no more than 3 -- so things are missing but it's good for discussion. Under Test1 there's a good deal of locking and marshalling... looks like there is a big oops here. The good news is that the contract is sound so hopefully this could be addressed. But really I'm not sure why I would even bother.&amp;nbsp; The other approach, using [ThreadStatic], is much cleaner and much faster.&amp;nbsp; I don't know why anyone would ever want to use the slots.&lt;/P&gt;
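&lt;P&gt;For reference, here is a minimal sketch of the [ThreadStatic] approach.&amp;nbsp; The ThreadItems class and its Get/Set helpers are hypothetical names for illustration, not the code that was profiled above.&amp;nbsp; One detail worth knowing: an initializer on a [ThreadStatic] field runs only once, on whichever thread triggers the static constructor, so the sketch creates the dictionary lazily on each thread instead.&lt;/P&gt;
&lt;PRE&gt;using System;
using System.Collections.Generic;

// Hypothetical helper, not from the original test program.
static class ThreadItems
{
    // Each thread sees its own copy of this field.  No initializer here:
    // it would only run for the first thread and leave the others with null.
    [ThreadStatic]
    static Dictionary&amp;lt;string, object&amp;gt; myItems;

    static Dictionary&amp;lt;string, object&amp;gt; Items
    {
        get
        {
            if (myItems == null)
                myItems = new Dictionary&amp;lt;string, object&amp;gt;();
            return myItems;
        }
    }

    public static object Get(string key)
    {
        object value;
        Items.TryGetValue(key, out value);
        return value;   // null when this thread has no entry for the key
    }

    public static void Set(string key, object value)
    {
        Items[key] = value;
    }
}
&lt;/PRE&gt;
&lt;P&gt;The slot-based equivalent would go through Thread.AllocateDataSlot(), Thread.SetData(slot, value) and Thread.GetData(slot).&lt;/P&gt;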
&lt;P&gt;For my part rather than fix this I think I will ask that the relevant functions be &lt;STRONG&gt;deprecated -- &lt;/STRONG&gt;the [ThreadStatic] approach seems better in every way&lt;STRONG&gt;.&amp;nbsp; &lt;/STRONG&gt;The slot methods hereby&amp;nbsp;have my personal deprecation for what that's worth.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=670314" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #10 -- Thread local storage</title><link>http://blogs.msdn.com/ricom/archive/2006/06/16/performance-quiz-10-thread-local-storage.aspx</link><pubDate>Fri, 16 Jun 2006 22:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:634480</guid><dc:creator>ricom</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/ricom/comments/634480.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=634480</wfw:commentRss><description>&lt;P&gt;It's time for another quiz!&lt;/P&gt;
&lt;P&gt;A very short one this time.&amp;nbsp; If I need some thread local storage and I might need several entries should I use:&lt;/P&gt;
&lt;P&gt;Thread.GetData(slot) and Thread.SetData(slot, object)&lt;/P&gt;
&lt;P&gt;or should I make my own static member like this&lt;/P&gt;
&lt;P&gt;&amp;nbsp;[ThreadStatic]&lt;BR&gt;&amp;nbsp;static&amp;nbsp;Dictionary&amp;lt;String,Object&amp;gt; myItems = new Dictionary&amp;lt;String,Object&amp;gt;();&lt;/P&gt;
&lt;P&gt;and get things through the dictionary?&lt;/P&gt;
&lt;P&gt;Warning: spoilers below in the comments ... already :)&amp;nbsp; Maybe my quizzes are too easy :)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=634480" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #9 -- extra innings due to dispatch stubs</title><link>http://blogs.msdn.com/ricom/archive/2006/03/21/performance-quiz-9-extra-innings-due-to-dispatch-stubs.aspx</link><pubDate>Wed, 22 Mar 2006 03:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:557439</guid><dc:creator>ricom</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ricom/comments/557439.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=557439</wfw:commentRss><description>&lt;P&gt;Well you'll recall that in the &lt;A href="http://blogs.msdn.com/ricom/archive/2006/03/12/549987.aspx"&gt;Performance Quiz #9 solution&lt;/A&gt;&amp;nbsp;there was a surprising result where Test7 was actually faster than Test5 even though they appear to be doing basically exactly the same work.&amp;nbsp; So my new challenge to you was to see if you can explain it.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;So, now this is your last chance to go and look for yourself before I give away the answer... so stop here if you want to work on the problem.&lt;/P&gt;
&lt;P&gt;OK if you're still reading then you're just dying to hear what was going on.&amp;nbsp; Well there's a lot of serendipity here.&amp;nbsp; My colleague &lt;A href="http://blogs.msdn.com/vancem"&gt;Vance&lt;/A&gt;&amp;nbsp;had just written a blog about how to &lt;A href="http://blogs.msdn.com/vancem/archive/2006/03/13/550529.aspx"&gt;disassemble dispatch stubs&lt;/A&gt;.&amp;nbsp; In what I can only categorize as a total unmitigated fluke (because believe it or not we didn't discuss this with each other at all) his blog which I linked to in the original has the answer right there on a silver platter.&lt;/P&gt;
&lt;P&gt;As &lt;A id=_ctl0____ctl0____ctl1___Comments___Comments__ctl4_NameLink title="Ivan Stoev" href="http://blogs.msdn.com/user/Profile.aspx?UserID=1001" target=_blank&gt;&lt;FONT color=#006bad&gt;Ivan Stoev&lt;/FONT&gt;&lt;/A&gt; wrote in the solution comments:&lt;/P&gt;
&lt;DIV class=commentsbody&gt;"The difference is caused by different interface call stubs (as explained by Vance Morrison) in Test5 and Test7 since [the] foreach loop calls ([to] IEnumerator&amp;lt;ushort&amp;gt; MoveNext and Current) in SumSpecial are only on List&amp;lt;ushort&amp;gt; instance while [the] same calls in SumForeach are on both ushort[] (in Test4)&amp;nbsp;and List&amp;lt;ushort&amp;gt; (in Test5). If we comment [out the] Test4 call, they would have similar speed (Test5 would be a little bit faster as expected). &lt;BR&gt;Very interesting, thanks Rico and Vance!"&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Which is right on the mark!&amp;nbsp;&amp;nbsp; Full points to Ivan!&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;SumSpecial has two call sites to the foreach methods: the first one always calls on an array instance so it can use the monomorphic stub (which is faster).&amp;nbsp; The second one always calls on the List&amp;lt;T&amp;gt; instance so it too can use the monomorphic stub.&amp;nbsp; SumForeach is handicapped because it's called first on an array and then on a List so the second time we decide to use a more flexible but slower stub to do the calling.&amp;nbsp; That makes Test5 slower.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;As Ivan writes you can see the difference go away by simply commenting out the call to Test4 in program.cs at which point all the call sites in question are monomorphic.&amp;nbsp; Or equivalently you can create a duplicate of SumForeach called SumForeach2 and use that in Test5 and again all the call sites are monomorphic.&lt;/DIV&gt;
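&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;A rough sketch of that second experiment, assuming SumForeach is nothing more than a foreach over the IList.&amp;nbsp; The real code is in program.cs from the solution post; the Sums class and the method bodies here are a guess at its shape, for illustration only.&lt;/DIV&gt;
&lt;PRE&gt;using System.Collections.Generic;

// Hypothetical container class; only the two method names come from the post.
static class Sums
{
    // Called with both ushort[] and List&amp;lt;ushort&amp;gt;, so the interface call
    // sites inside it see more than one concrete type.
    public static int SumForeach(IList&amp;lt;ushort&amp;gt; indices)
    {
        int result = 0;
        foreach (ushort value in indices)
            result += value;
        return result;
    }

    // Same body, but a separate method means separate call sites.  If only
    // List&amp;lt;ushort&amp;gt; ever reaches this copy, its call sites see a single type.
    public static int SumForeach2(IList&amp;lt;ushort&amp;gt; indices)
    {
        int result = 0;
        foreach (ushort value in indices)
            result += value;
        return result;
    }
}
&lt;/PRE&gt;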
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Remember, as Vance writes,&amp;nbsp;stub selection is &lt;EM&gt;per call site&lt;/EM&gt;.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;And now that we have that little bit of data we might want to rewrite program.cs to factor out stub selection in the reported times.&amp;nbsp; But really that doesn't change the results a whole lot.&amp;nbsp; The main thing that was going on was the array special helpers as discussed.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Chalk up another one for powerful secondary effects in the performance biz!&lt;/DIV&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=557439" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #9 :  IList&amp;lt;T&amp;gt; List and array speed -- solution</title><link>http://blogs.msdn.com/ricom/archive/2006/03/12/performance-quiz-9-ilist-lt-t-gt-list-and-array-speed-solution.aspx</link><pubDate>Sun, 12 Mar 2006 19:46:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:549987</guid><dc:creator>ricom</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/ricom/comments/549987.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=549987</wfw:commentRss><description>&lt;P&gt;Last week I posted &lt;A href="http://blogs.msdn.com/ricom/archive/2006/03/09/548097.aspx"&gt;Performance Quiz #9&lt;/A&gt;&amp;nbsp;for discussion. Well the comments have been just awesome and many people went right ahead and measured the results.&amp;nbsp; Some of you have discovered the key point in this particular test:&lt;/P&gt;
&lt;P&gt;Arrays are magic.&lt;/P&gt;
&lt;P&gt;In the test case as given there's a great deal of repeated work extracting the length of the array and accessing the items.&amp;nbsp; This is because of the unusual property that arrays have -- they implement IList&amp;lt;T&amp;gt; for potentially more than one T because of inheritance (a string[], for example, is both an IList&amp;lt;string&amp;gt; and an IList&amp;lt;object&amp;gt;).&amp;nbsp; In the interest of economy alone then it is worthwhile to consolidate the IList&amp;lt;T&amp;gt; implementations into some kind of secret helper but this has some consequences.&lt;/P&gt;
&lt;P&gt;Luckily there are also alternatives -- the example that I provided is perhaps the worst performing choice.&amp;nbsp; As one reader pointed out you could make it much better by extracting the count from the IList once and then using that in the loop.&amp;nbsp; Note that the compiler cannot do this because IList&amp;lt;T&amp;gt;.Count is a virtual method and so we cannot assume anything about the implementation (it could send a letter to grandma every time it is called and grammy wants her email!)&lt;/P&gt;
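&lt;P&gt;A minimal sketch of that reader's suggestion follows.&amp;nbsp; The names Sums and SumHoisted are made up for illustration, and the rewrite is only equivalent to the original loop if nothing changes the list's length while the loop runs.&lt;/P&gt;
&lt;PRE&gt;using System.Collections.Generic;

// Hypothetical names; only the loop shape comes from the quiz.
static class Sums
{
    public static int SumHoisted(IList&amp;lt;ushort&amp;gt; indices)
    {
        int result = 0;
        int count = indices.Count;       // one interface call, made up front
        for (int i = 0; i &amp;lt; count; i++)
            result += indices[i];        // still one interface call per item
        return result;
    }
}
&lt;/PRE&gt;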
&lt;P&gt;Here are the results from my program (source code attached, see link at the very bottom)&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;STRONG&gt;Test Case&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;STRONG&gt;Milliseconds&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test1: Array&lt;/TD&gt;
&lt;TD&gt;54&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test2: List&amp;lt;&amp;gt;&lt;/TD&gt;
&lt;TD&gt;8&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test3: ArrayWrapper&lt;/TD&gt;
&lt;TD&gt;14&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test4: Array via foreach&lt;/TD&gt;
&lt;TD&gt;9&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test5: List&amp;lt;&amp;gt; via foreach&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/TD&gt;
&lt;TD&gt;11&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test6: Array via special&lt;/TD&gt;
&lt;TD&gt;6&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Test7: List&amp;lt;&amp;gt; via special&lt;/TD&gt;
&lt;TD&gt;8&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;The first two rows are the problem as given and we can easily see that the array is performing much more slowly than the List.&amp;nbsp; Let's&amp;nbsp;see where the cost is:&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;STRONG&gt;Excl%&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;STRONG&gt;Incl%&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;STRONG&gt;Name&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;0.00&lt;/TD&gt;
&lt;TD&gt;51.27&lt;/TD&gt;
&lt;TD&gt;ArrayTest.Program.Test1(uint16[])&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1.64&lt;/TD&gt;
&lt;TD&gt;51.27&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;ArrayTest.Program.Test.Sum(System.Collections.Generic.IList`1&amp;lt;uint16&amp;gt;)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;2.38&lt;/TD&gt;
&lt;TD&gt;25.64&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;STRONG&gt;System.SZArrayHelper.get_Item(int32)&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;5.47&lt;/TD&gt;
&lt;TD&gt;23.26&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;JIT_IsInstanceOfArray(...)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;4.72&lt;/TD&gt;
&lt;TD&gt;7.26&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ArrayIsInstanceOfNoGC(...)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;2.63&lt;/TD&gt;
&lt;TD&gt;10.02&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SigTypeContext::InitTypeContext(...)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;0.49&lt;/TD&gt;
&lt;TD&gt;22.40&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;STRONG&gt;System.SZArrayHelper.get_Count()&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;5.36&lt;/TD&gt;
&lt;TD&gt;21.91&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;JIT_IsInstanceOfArray(...)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;4.30&lt;/TD&gt;
&lt;TD&gt;6.99&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ArrayIsInstanceOfNoGC(...)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;2.40&lt;/TD&gt;
&lt;TD&gt;9.07&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SigTypeContext::InitTypeContext(...)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;Well it pretty much leaps off the page, the cost is in those helpers and you can see they're doing a good bit of internal work to verify that we are talking about a simple array of a non-collectable type (ushort) and then more work happens to get to the real data.&amp;nbsp; The tragedy here is that the work done to validate what we need to do far exceeds the actual job at hand -- which in both cases is just extracting one integer from a well known location.&amp;nbsp; The price of abstraction....&amp;nbsp; Though I'm pretty sure there is room for improvement in there.&lt;/P&gt;
&lt;P&gt;Let's look at some of the other alternatives.&amp;nbsp; First how did the List&amp;lt;T&amp;gt; fare:&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;STRONG&gt;Excl%&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;STRONG&gt;Incl%&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;STRONG&gt;Name&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;0.00&lt;/TD&gt;
&lt;TD&gt;8.62&lt;/TD&gt;
&lt;TD&gt;ArrayTest.Program.Test2(System.Collections.Generic.List`1&amp;lt;uint16&amp;gt;)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1.06&lt;/TD&gt;
&lt;TD&gt;8.62&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ArrayTest.Program.Test.Sum(System.Collections.Generic.IList`1&amp;lt;uint16&amp;gt;)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;5.82&lt;/TD&gt;
&lt;TD&gt;5.82&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;CLRStubOrUnknownAddress&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;1.54&lt;/TD&gt;
&lt;TD&gt;1.54&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;System.Collections.Generic.List`1.get_Item(int32)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;In the list case there's still a good bit of dispatch code but you can see it's much cheaper.&amp;nbsp; We're on the sweet path for regular interface dispatch.&amp;nbsp; The actual count function calls are still there if we were to look at lower percentages but I pruned anything with inclusive time below 1% in this example so they're not showing up.&amp;nbsp; We're doing a lot better.&lt;/P&gt;
&lt;P&gt;Now what about some of the other results?&amp;nbsp; Well, now we're in the bonus marks zone and I don't want to drown you in perf results you can get yourself so I'll summarize a bit.&lt;/P&gt;
&lt;P&gt;In Test 3 I made a generic wrapper class to hold the array.&amp;nbsp; This adds a level of indirection but it avoids all the magic array processing.&amp;nbsp; My wrapper is highly lame (it only implements count and get item) but you can see that it's doing fairly well performance wise.&amp;nbsp; Much better than the raw array and it's a cheap object.&lt;/P&gt;
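&lt;P&gt;The actual wrapper is in the attached source.&amp;nbsp; A sketch of what such a class could look like is below, assuming that every member other than Count and the indexer getter simply throws; that choice, and the member layout, are assumptions made for illustration.&lt;/P&gt;
&lt;PRE&gt;using System;
using System.Collections;
using System.Collections.Generic;

// Illustrative only; the real ArrayWrapper lives in the attachment.
class ArrayWrapper&amp;lt;T&amp;gt; : IList&amp;lt;T&amp;gt;
{
    private readonly T[] items;

    public ArrayWrapper(T[] items) { this.items = items; }

    // The only two members the Sum loop needs.
    public int Count { get { return items.Length; } }

    public T this[int index]
    {
        get { return items[index]; }
        set { throw new NotSupportedException(); }
    }

    // Everything else is deliberately unsupported in this sketch.
    public bool IsReadOnly { get { return true; } }
    public int IndexOf(T item) { throw new NotSupportedException(); }
    public void Insert(int index, T item) { throw new NotSupportedException(); }
    public void RemoveAt(int index) { throw new NotSupportedException(); }
    public void Add(T item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Contains(T item) { throw new NotSupportedException(); }
    public void CopyTo(T[] array, int arrayIndex) { throw new NotSupportedException(); }
    public bool Remove(T item) { throw new NotSupportedException(); }
    public IEnumerator&amp;lt;T&amp;gt; GetEnumerator() { throw new NotSupportedException(); }
    IEnumerator IEnumerable.GetEnumerator() { throw new NotSupportedException(); }
}
&lt;/PRE&gt;
&lt;P&gt;With that in place the call becomes Sum(new ArrayWrapper&amp;lt;ushort&amp;gt;(tmp)), and the interface calls land on an ordinary class rather than on the array helpers.&lt;/P&gt;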
&lt;P&gt;But the fun continues:&amp;nbsp; What if we replace that for loop in the original test with a foreach?&amp;nbsp; We can't get the ultra-special array foreach construct but we can get a nice standard foreach over an IList in both cases.&amp;nbsp; Both Test4 and Test5 do introduce one allocation for the enumerator (which could be an issue if there were gazillions of these lists) but for our test case that's not even measurable (indeed I get no samples in the allocator).&lt;/P&gt;
&lt;P&gt;And the result?&amp;nbsp; It's pretty sweet:&amp;nbsp; The array problem goes away entirely because we only have to do the array testing one time.&amp;nbsp; After that we've got a nice bound up enumerator that knows all the details of the array -- no more redundant computations.&amp;nbsp;&amp;nbsp; The array works great... but the list is a bit slower?&amp;nbsp; Why would that be?&amp;nbsp; Well there are some extra safety checks in the enumerator access code for List&amp;lt;T&amp;gt; to make sure the enumerator isn't being used on a List that is being modified for instance.&amp;nbsp; Those checks slow down the list access a bit.&lt;/P&gt;
&lt;P&gt;And the last two tests:&amp;nbsp; What if we special case the array in the sum function, assuming that was a very common/important case?&amp;nbsp; You can see that Test6 gives the best result of all -- now there isn't any enumerator and we're on the most optimized code path of any -- foreach over a plain array.&lt;/P&gt;
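&lt;P&gt;One plausible shape for such a special-cased sum is sketched below.&amp;nbsp; The attached program.cs has the real SumSpecial; the body here is an assumption based only on the description above (check for a plain array first, fall back to the interface otherwise), and the Sums class is a made-up container.&lt;/P&gt;
&lt;PRE&gt;using System.Collections.Generic;

// Hypothetical container class; the method body is a guess at the real one.
static class Sums
{
    public static int SumSpecial(IList&amp;lt;ushort&amp;gt; indices)
    {
        int result = 0;

        // The extra test: is this really a plain ushort[]?
        ushort[] array = indices as ushort[];
        if (array != null)
        {
            // foreach over an array compiles to a simple indexed loop,
            // with no enumerator object and no interface calls.
            foreach (ushort value in array)
                result += value;
        }
        else
        {
            // Anything else goes through the IList enumerator.
            foreach (ushort value in indices)
                result += value;
        }
        return result;
    }
}
&lt;/PRE&gt;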
&lt;P&gt;But why is the List case faster in Test7?&amp;nbsp; Shouldn't it be pretty much the same as Test5?&amp;nbsp; After all there's an extra test and otherwise it's the same.&amp;nbsp; Now there's a mystery...&amp;nbsp; I actually ran a variety of control experiments, doing the tests in different order and with different array members&amp;nbsp;to make sure this wasn't&amp;nbsp;an anomalous result -- you can imagine with all these iterations cache effects would be important.&amp;nbsp; But no matter what I did the code in Test7 always went faster than the code in Test5.&amp;nbsp; Adding more iterations actually magnifies the effect.&lt;/P&gt;
&lt;P&gt;The profile isn't especially helpful either -- both Test5 and Test7 have exactly the same call shape but Test7 spends less time in CLR stubs apparently.&amp;nbsp; At the moment my best theory is that we happen to get better cache alignment in Test7 than Test5 -- fewer instructions split across cache lines or&amp;nbsp;fewer unfortunate cache evictions because of happenstance.&lt;/P&gt;
&lt;P&gt;Such is the life of the performance engineer -- we sometimes get bit by a secondary effect.&lt;/P&gt;
&lt;P&gt;So shall we continue?&amp;nbsp; What is happening to make Test7 faster than Test5?&lt;/P&gt;
&lt;P&gt;And this seems like a great place to thank my colleague &lt;A href="http://blogs.msdn.com/vancem/"&gt;Vance Morrison&lt;/A&gt;&amp;nbsp;for&amp;nbsp;chiming in with&amp;nbsp;the foreach and special case approaches&amp;nbsp;when I first was bouncing this analysis around.&amp;nbsp; Thanks a ton Vance :)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=549987" width="1" height="1"&gt;</description><enclosure url="http://blogs.msdn.com/ricom/attachment/549987.ashx" length="5853" type="text/plain" /><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #9 :  IList&amp;lt;T&amp;gt; List and array speed</title><link>http://blogs.msdn.com/ricom/archive/2006/03/09/performance-quiz-9-ilist-lt-t-gt-list-and-array-speed.aspx</link><pubDate>Fri, 10 Mar 2006 06:31:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:548097</guid><dc:creator>ricom</dc:creator><slash:comments>34</slash:comments><comments>http://blogs.msdn.com/ricom/comments/548097.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=548097</wfw:commentRss><description>&lt;P&gt;A short and sweet quiz with lots of juicy discussion possibilities:&lt;/P&gt;
&lt;P&gt;&lt;FONT size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#0000ff&gt;public&lt;/FONT&gt; &lt;FONT color=#0000ff&gt;int&lt;/FONT&gt; Sum(&lt;FONT color=#008080&gt;IList&lt;/FONT&gt;&amp;lt;&lt;FONT color=#0000ff&gt;ushort&lt;/FONT&gt;&amp;gt; indices)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;{&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=#0000ff&gt;int&lt;/FONT&gt; result = 0;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=#0000ff&gt;for&lt;/FONT&gt; (&lt;FONT color=#0000ff&gt;int&lt;/FONT&gt; i = 0; i &amp;lt; indices.Count; i++)&amp;nbsp;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; result += indices[i];&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=#0000ff&gt;return&lt;/FONT&gt; result;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;/FONT&gt;&lt;FONT size=1&gt;&lt;/P&gt;&lt;/FONT&gt;
&lt;P&gt;Considering only the time it takes to do the Sum (i.e. assuming we had already set up the array/list) which gives better performance and why?&lt;/P&gt;&lt;FONT size=1&gt;
&lt;P&gt;&lt;/FONT&gt;&lt;FONT size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#0000ff&gt;// #1&lt;/FONT&gt;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#0000ff&gt;ushort&lt;/FONT&gt;[] tmp = new &lt;FONT color=#0000ff&gt;ushort&lt;/FONT&gt;[500000]; // this doesn't count&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Sum(tmp);&amp;nbsp;// this is what we are timing&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;OR&lt;/P&gt;
&lt;P&gt;&lt;FONT size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#0000ff&gt;// #2&lt;/FONT&gt;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#008080&gt;List&lt;/FONT&gt;&amp;lt;&lt;FONT color=#0000ff&gt;ushort&lt;/FONT&gt;&amp;gt; tmp = &lt;FONT color=#0000ff&gt;new&lt;/FONT&gt; &lt;FONT color=#008080&gt;List&lt;/FONT&gt;&amp;lt;&lt;FONT color=#0000ff&gt;ushort&lt;/FONT&gt;&amp;gt;(500000); // this doesn't count&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT color=#0000ff&gt;for&lt;/FONT&gt; (&lt;FONT color=#0000ff&gt;int&lt;/FONT&gt; i = 0; i &amp;lt; 500000; i++)&amp;nbsp;tmp.Add(0); // this doesn't count&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Sum(tmp); // this is what we are timing&lt;/FONT&gt;&lt;FONT size=1&gt;&lt;/P&gt;
&lt;P&gt;&lt;/FONT&gt;What say you gentle readers?&lt;/P&gt;
&lt;P&gt;(my solution is now posted &lt;A href="http://blogs.msdn.com/ricom/archive/2006/03/12/549987.aspx"&gt;here&lt;/A&gt;)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=548097" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #8 -- The problems with parsing -- Part 5</title><link>http://blogs.msdn.com/ricom/archive/2005/11/30/performance-quiz-8-the-problems-with-parsing-part-5.aspx</link><pubDate>Wed, 30 Nov 2005 22:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:498546</guid><dc:creator>ricom</dc:creator><slash:comments>6</slash:comments><comments>http://blogs.msdn.com/ricom/comments/498546.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=498546</wfw:commentRss><description>&lt;P&gt;And today we conclude our little saga with one final experiment, a little analysis and some comments.&amp;nbsp; I hope you've enjoyed reading this series as much as I've enjoyed doing the experiments.&lt;/P&gt;
&lt;P&gt;First, as promised I created yet another version of this little parser.&amp;nbsp; This version is very similar to the last except now it's using our new lightweight code generation to actually make native code for each predicate.&amp;nbsp; You can find &lt;a href="http://blogs.msdn.com/ricom/articles/498520.aspx"&gt;parser_jitting.cs&lt;/A&gt;&amp;nbsp;in my articles area like the others.&lt;/P&gt;
&lt;P&gt;And how does this new parser combination fare?&amp;nbsp; Surely generating native code is going to win over my lame little stack machine, yes?&amp;nbsp; And I've even stacked the deck by doing an admittedly unrealistic 1,000,000 iterations over the predicates for just one compilation.&lt;/P&gt;
&lt;P&gt;The results are as follows:&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD align=left&gt;&lt;STRONG&gt;test case&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=middle&gt;&lt;STRONG&gt;predicates&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; iterations&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;time&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_compiling.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8&lt;/TD&gt;
&lt;TD align=right&gt;&amp;nbsp;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;225ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_jitting.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8&lt;/TD&gt;
&lt;TD align=right&gt;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;339ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_original.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8&lt;/TD&gt;
&lt;TD align=right&gt;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;&amp;nbsp; 4,839ms&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P&gt;Wait a second, it's slower!&amp;nbsp; What happened?&amp;nbsp; &lt;/P&gt;
&lt;P&gt;OK I admit it; once again I threw you a curve ball.&amp;nbsp; What's going on here is that the predicates in the initial test case are fairly simple.&amp;nbsp; There just isn't very much code.&amp;nbsp; On the other hand the preamble for a call through a delegate is not totally trivial.&amp;nbsp; Normally it doesn't matter much but here those functions are so small (maybe a half dozen instructions or so) that the function overhead is just killing us.&amp;nbsp;&amp;nbsp; So the delegates actually lose!&amp;nbsp;&amp;nbsp; Now for kicks I made a version that has the predicates hard coded; that one clocked in at a brisk 70ms (same test case), so that's a good baseline for the fastest this could possibly be.&amp;nbsp; I used that version as a sanity check and as a template for how to create the IL in my jitting version.&lt;/P&gt;
&lt;P&gt;But all is not lost.&amp;nbsp; As has been pointed out to me, these predicates are not especially complicated, maybe unrealistically so.&amp;nbsp; So while, strictly speaking, the analysis for the benchmark as given is done at this point, let's do a little more bonus work and see what we get.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://strangelights.com/blog/archive/2005/11/26/1268.aspx"&gt;Robert Pickering&lt;/A&gt; conveniently provided some more complicated predicates in his &lt;A href="http://strangelights.com/blog/archive/2005/11/26/1268.aspx"&gt;F# port&lt;/A&gt; of the parser, so I pasted those into my array making a total of 17 predicates, 9 of which are more complicated.&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD align=left&gt;&lt;STRONG&gt;test case&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=middle&gt;&lt;STRONG&gt;predicates&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; iterations&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;time&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_compiling.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8+9&lt;/TD&gt;
&lt;TD align=right&gt;&amp;nbsp;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;1,378ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_jitting.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8+9&lt;/TD&gt;
&lt;TD align=right&gt;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;716ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_original.cs&lt;/TD&gt;
&lt;TD align=middle&gt;8+9&lt;/TD&gt;
&lt;TD align=right&gt;1,000,000&lt;/TD&gt;
&lt;TD align=right&gt;&amp;nbsp; not tested&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;BR&gt;Now we can see that if we&amp;nbsp;can pay for the overhead of invoking a delegate then things go a lot better for the jitted version.&amp;nbsp; But what about those compilation costs?&amp;nbsp; Surely we shouldn't just disregard the cost of invoking the JIT.&amp;nbsp; So again, while it wasn't in my original formulation, let's go on for even more bonus points and see what the cost of compilation is in a more realistic mix of compiling and evaluating.&amp;nbsp; Here I added a LOT more predicates, duplicating the 9 above to get us above 2000 predicates.&amp;nbsp; Then I turned the iteration count way down to 20, which I think is more like what I'll see in the application that motivated this whole problem.&amp;nbsp; Finally I eliminated the test output and moved the timer start to the top of Main.&amp;nbsp; And here are the results for the "more realistic mix." 
&lt;P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD align=left&gt;&lt;STRONG&gt;test case&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=middle&gt;&lt;STRONG&gt;predicates&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp; iterations&lt;/STRONG&gt;&lt;/TD&gt;
&lt;TD align=right&gt;&lt;STRONG&gt;time&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_compiling.cs&lt;/TD&gt;
&lt;TD align=middle&gt;2258&lt;/TD&gt;
&lt;TD align=middle&gt;20&lt;/TD&gt;
&lt;TD align=right&gt;27ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_jitting.cs&lt;/TD&gt;
&lt;TD align=middle&gt;2258&lt;/TD&gt;
&lt;TD align=middle&gt;20&lt;/TD&gt;
&lt;TD align=right&gt;631ms&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD align=left&gt;parser_original.cs&lt;/TD&gt;
&lt;TD align=middle&gt;2258&lt;/TD&gt;
&lt;TD align=middle&gt;20&lt;/TD&gt;
&lt;TD align=right&gt;&amp;nbsp; 213ms&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
&lt;P&gt;I think it's very clear which one I should pick for my actual project at this point.&amp;nbsp; There just aren't enough evaluations to pay for the cost of jitting.&amp;nbsp; Even the original parser is holding up all right;&amp;nbsp; with no compilation costs and fewer evaluations its problems are not being multiplied.&lt;/P&gt;
&lt;P&gt;Now let me go back and answer my original questions directly.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Q1: What's wrong with this parser from a performance perspective?&amp;nbsp;&amp;nbsp; &lt;BR&gt;&lt;/STRONG&gt;&lt;BR&gt;Fundamentally the problem is that evaluation has not been separated from parsing and that as a consequence each piece of input is being examined much more often than it needs to be.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Q2: What should we be doing to improve it?&amp;nbsp; &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Separate the costs of parsing from the costs of evaluation.&amp;nbsp; Reduce the cost of evaluation by using a better truth mechanism.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Q3: How big of a difference is it likely to make?&lt;/STRONG&gt;&lt;BR&gt;&lt;BR&gt;We can guess at this by thinking about how often we look at any given character of input.&amp;nbsp;&amp;nbsp; We scan it in getc, then we switch on it (hash and equality), then we do a lookup (hash+equality), then we copy it onto the heap.&amp;nbsp; There is no per-iteration cost of compacting it off the heap, because dead strings on the heap do not have to be moved (only live strings are moved).&amp;nbsp; So I'm seeing memory traffic of at least 8 times the sum of all the parse strings (per iteration).&amp;nbsp; In contrast, if the strings were pre-tokenized then the memory traffic is more like 1/2 the sum of all the strings (per iteration).&amp;nbsp; So I'd ballpark the gains available to be around a factor of 16 just on that basis.&amp;nbsp; We ended up getting somewhat more as we did more than just eliminate the pressure on the memory system.&lt;/P&gt;
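&lt;P&gt;Written out, the ballpark looks like this.&amp;nbsp; The symbol S is only shorthand introduced here for the sum of the lengths of all the parse strings; the 8 and the 1/2 are the rough per-iteration estimates from the paragraph above, not measurements.&lt;/P&gt;
&lt;PRE&gt;
```latex
\[
\frac{\text{memory traffic per iteration, original}}
     {\text{memory traffic per iteration, pre-tokenized}}
\approx \frac{8\,S}{\tfrac{1}{2}\,S} = 16
\]
```
&lt;/PRE&gt;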
&lt;P&gt;&lt;STRONG&gt;Q4: If you hadn't written all the code yet, what would you do to get a better idea what was going to matter and what wasn't?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Well I think I gave away the answer in Q3.&amp;nbsp; I'd approach this by looking at how much work I do per character of input on average and go from there.&amp;nbsp; That gives us good ballpark numbers without too much work.&lt;/P&gt;
&lt;P&gt;Thank you very much for all the emails and comments!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=498546" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #8 -- The problems with parsing -- Part 4</title><link>http://blogs.msdn.com/ricom/archive/2005/11/29/performance-quiz-8-the-problems-with-parsing-part-4.aspx</link><pubDate>Tue, 29 Nov 2005 22:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:498009</guid><dc:creator>ricom</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/ricom/comments/498009.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=498009</wfw:commentRss><description>&lt;P&gt;In my &lt;a href="http://blogs.msdn.com/ricom/archive/2005/11/28/497595.aspx"&gt;last posting&lt;/A&gt; I made some recommendations about how to drastically improve the evaluation process based on the key observations that even the basic operations required to evaluate the truth of the facts were quite costly -- more so than all of the extra allocations.&lt;/P&gt;
&lt;P&gt;I suggested a plan like this one:&lt;/P&gt;
&lt;LI&gt;preparse the predicates so that no string processing is required for evaluation 
&lt;LI&gt;assign facts a number in sequence as they are encountered in the predicates 
&lt;LI&gt;the&amp;nbsp;preparsed predicate is a mix of fact numbers and operators probably in postfix order 
&lt;LI&gt;when looking up facts simply access an array of truth bits indexed&amp;nbsp;by fact number 
&lt;LI&gt;as facts change simply change the array contents and then reevaluate predicates as desired 
&lt;LI&gt;do the evaluation itself with a simple stack machine; no recursion in the evaluation 
&lt;P&gt;And as promised here's some code&amp;nbsp;that does just that:&amp;nbsp; &lt;a href="http://blogs.msdn.com/ricom/articles/498000.aspx"&gt;parser_compiling.cs&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Now how do we fare with this code?&amp;nbsp; Well the raw number I get on my machine is now 225ms to do those same predicate evaluations.&amp;nbsp; That's down from 4.839s -- so it's over 21 times faster.&amp;nbsp; Let's have a look at how the CPU usage is distributed now:&lt;/P&gt;
&lt;P&gt;
&lt;TABLE style="WIDTH: 302pt; BORDER-COLLAPSE: collapse" cellSpacing=0 cellPadding=0 width=402 border=0 x:str&gt;
&lt;COLGROUP&gt;
&lt;COL style="WIDTH: 243pt; mso-width-source: userset; mso-width-alt: 11849" width=324&gt;
&lt;COL style="WIDTH: 59pt; mso-width-source: userset; mso-width-alt: 2852" width=78&gt;
&lt;TBODY&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD class=xl25 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; WIDTH: 243pt; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" width=324 height=17&gt;&lt;FONT face=Arial size=2&gt;&lt;STRONG&gt;Module/Function&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl25 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; WIDTH: 59pt; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" width=78&gt;&lt;STRONG&gt;&lt;FONT face=Arial size=2&gt;Exclusive %&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;ParserCompiling.exe&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="93.501000000000005"&gt;&lt;FONT face=Arial size=2&gt;93.50&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.EvaluatePredicate(int32[])&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;76.52&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.Main(string[])&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="16.981000000000002"&gt;&lt;FONT face=Arial size=2&gt;16.98&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.CompilePredicate(string)&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.CompileOptionalNot()&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.GetToken()&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.CompileAnd()&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT size=2&gt;&lt;FONT face=Arial&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ParserCompiling.Program.CompileOr()&lt;/FONT&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;mscorwks.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="3.145"&gt;&lt;FONT face=Arial size=2&gt;3.15&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;ntdll.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="1.468"&gt;&lt;FONT face=Arial size=2&gt;1.47&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;mscorjit.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="0.629"&gt;&lt;FONT face=Arial size=2&gt;0.63&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;Unknown&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num="0.41899999999999998"&gt;&lt;FONT face=Arial size=2&gt;0.42&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;SHLWAPI.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.21&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;KERNEL32.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.21&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;mscorlib.ni.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.21&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;MSVCR80.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.21&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;shell32.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 12.75pt" height=17&gt;
&lt;TD style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17&gt;&lt;FONT face=Arial size=2&gt;mscoree.dll&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=xl24 style="BORDER-RIGHT: #dcd9df; BORDER-TOP: #dcd9df; BORDER-LEFT: #dcd9df; BORDER-BOTTOM: #dcd9df; BACKGROUND-COLOR: transparent" align=right x:num&gt;&lt;FONT face=Arial size=2&gt;0.00&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
&lt;P&gt;Things have really improved at this point: the vast majority of the time (76.52%) is now being spent directly doing predicate evaluation -- that's just where we want the time to go.&amp;nbsp; Our new implementation is also much more cache friendly as the number of facts increases because now we simply have an array of booleans (it could be even denser if it were an array of bits).&amp;nbsp; Even if there were a few thousand facts we'd still have a tight representation -- the hash table approach tends to scatter things (deliberately) throughout a larger table.&amp;nbsp; There are no redundant string comparisons, no hashing, it just goes voom :)&lt;/P&gt;
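&lt;P&gt;For anyone who doesn't want to open the full listing, here is a minimal sketch of the evaluation idea: a postfix array of ints run against an array of truth values.&amp;nbsp; The opcode values and the names OpAnd, OpOr, OpNot, and Evaluate are invented for this illustration and are not taken from the linked source, which may well be organized differently.&lt;/P&gt;
&lt;PRE&gt;
```csharp
using System;

public static class StackMachineSketch
{
    // Hypothetical encoding: values of 0 or more are fact numbers,
    // negative values are operators.
    public const int OpAnd = -1;
    public const int OpOr = -2;
    public const int OpNot = -3;

    // Evaluates a predicate that has already been compiled to postfix order.
    // facts[n] holds the current truth value of fact number n.
    public static bool Evaluate(int[] code, bool[] facts)
    {
        bool[] stack = new bool[code.Length];
        int sp = 0; // index of the next free stack slot

        foreach (int op in code)
        {
            switch (op)
            {
                case OpAnd:
                    sp--;
                    if (!stack[sp]) stack[sp - 1] = false;
                    break;
                case OpOr:
                    sp--;
                    if (stack[sp]) stack[sp - 1] = true;
                    break;
                case OpNot:
                    stack[sp - 1] = !stack[sp - 1];
                    break;
                default:
                    // A fact number: push its truth value, no strings involved.
                    stack[sp++] = facts[op];
                    break;
            }
        }
        return stack[0];
    }

    public static void Main()
    {
        // Postfix form of "fact0 and ((not fact1) or fact2)".
        int[] code = { 0, 1, OpNot, 2, OpOr, OpAnd };
        bool[] facts = { true, true, false };

        Console.WriteLine(Evaluate(code, facts)); // False

        facts[2] = true; // a fact changes: flip the entry and re-evaluate
        Console.WriteLine(Evaluate(code, facts)); // True
    }
}
```
&lt;/PRE&gt;
&lt;P&gt;The sketch allocates its little stack on every call to keep things short; a version meant for timing would presumably reuse one buffer across evaluations.&lt;/P&gt;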
&lt;P&gt;Now, what if we get out the big guns and start doing native code generation.&amp;nbsp; Could we do even better?&amp;nbsp; After all, this is managed code, we have the JIT at our disposal.&lt;/P&gt;
&lt;P&gt;Tune in tomorrow :)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;NOTE:&lt;/STRONG&gt;&amp;nbsp; When I originally posted this I reported the times for the debug build... I had to restate the results because the release build is approximately three times faster.&amp;nbsp; While I was at it I re-verified that my previous numbers were on the release build and they were.&amp;nbsp; Sorry about that...&lt;BR&gt;&lt;BR&gt;Concluded in &lt;a href="http://blogs.msdn.com/ricom/archive/2005/11/30/498546.aspx"&gt;Performance Quiz #8 -- The problems with parsing -- Part 5&lt;/A&gt;&lt;/P&gt;&lt;/LI&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=498009" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item><item><title>Performance Quiz #8 -- The problems with parsing -- Part 3</title><link>http://blogs.msdn.com/ricom/archive/2005/11/28/performance-quiz-8-the-problems-with-parsing-part-3.aspx</link><pubDate>Mon, 28 Nov 2005 22:53:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:497595</guid><dc:creator>ricom</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/ricom/comments/497595.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=497595</wfw:commentRss><description>&lt;P&gt;The thing about performance work is that it's very easy to be fooled into looking into the wrong areas.&amp;nbsp; That's why you want your changes to be firmly grounded in data whenever they can be.&amp;nbsp; Or if you're planning, you want to be thinking about what your customer experience needs to be, what the key metrics will be, and how your overall architecture will support those needs.&amp;nbsp; Whatever it is that you're doing should be as grounded as you can make it for the stage you are in.&amp;nbsp; After that you should be doing whatever you can to verify 
your assumptions, as often as you can with reasonable cost.&amp;nbsp; The trick is to not go crazy applying too much effort doing all this because that's just as bad.&lt;/P&gt;
&lt;P&gt;Take this little parser example.&amp;nbsp; You could conclude that all that string allocation in GetToken() is the crux of the problems -- and you'd be partly correct -- but it's not really the full picture.&lt;/P&gt;
&lt;P&gt;To try to see what was going on, one of the first things I did was try to eliminate the GC as a source of possible issues while leaving the rest as similar as possible.&amp;nbsp; This actually illustrates a cool technique even though it turns out that it's not really the right approach.&amp;nbsp; I wanted to eliminate all of those Substring calls while parsing -- since I expect we'll be parsing over and over again.&amp;nbsp; The substring itself is just one example of looking at the input more than once -- a definite no-no in any high performance parser.&amp;nbsp; But let's see where this leads us.&lt;/P&gt;
&lt;P&gt;What I want to do is add a member that will hold the various fact and token strings that occur in the predicates once and for all.&amp;nbsp; I'd like to have something like this:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;private static System.Xml.NameTable ntTokens = new System.Xml.NameTable();&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Now the reason I like this class so much is that it's one of the few that offers you the opportunity to create strings without having to first allocate a string.&amp;nbsp; You can call it starting with a character buffer and get a string back.&amp;nbsp; That's just what we need... only the rest of the plumbing isn't quite right.&lt;/P&gt;
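&lt;P&gt;Here is a small standalone sketch of that behavior, separate from the parser itself.&amp;nbsp; It relies only on the NameTable.Add(char[], int, int) overload; the class name, the variable names, and the sample input are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;
```csharp
using System;
using System.Xml;

public static class NameTableDemo
{
    public static void Main()
    {
        NameTable names = new NameTable();
        char[] input = "alpha | beta | alpha".ToCharArray();

        // Atomize the token "alpha" from two different offsets in the buffer.
        string first = names.Add(input, 0, 5);
        string second = names.Add(input, 15, 5);

        Console.WriteLine(first); // alpha

        // The second call finds the existing entry and hands back the same
        // string object instead of allocating a new one.
        Console.WriteLine(Object.ReferenceEquals(first, second)); // True
    }
}
```
&lt;/PRE&gt;
&lt;P&gt;So once a token has been seen, looking it up again costs a hash and a comparison against the buffer but no new string.&lt;/P&gt;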
&lt;P&gt;I want to make these changes:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return strInput.Substring(idx, ichInput - idx);&lt;BR&gt;&lt;/FONT&gt;becomes&lt;FONT face="Courier New" size=2&gt;&lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // returns an existing string if there is one&lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return ntTokens.Add(arrayInput, idx, ichInput - idx); &lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;and&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return strInput.Substring(idx, 1);&lt;BR&gt;&lt;/FONT&gt;becomes&lt;BR&gt;&lt;FONT face="Courier New" size=2&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // returns an existing string if there is one&lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return ntTokens.Add(arrayInput, idx, 1);&lt;/FONT&gt;&lt;BR&gt;&lt;BR&gt;To do that I also have to change the input from a string to a char array&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; private static string strInput = null;&lt;BR&gt;&lt;/FONT&gt;becomes&lt;BR&gt;&lt;FONT face="Courier New" size=2&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; private static char[] arrayInput = null;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&lt;FONT face="Courier New" size=2&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; static bool EvaluatePredicate(string predicate)&lt;BR&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR&gt;&amp;lt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; strInput = predicate;&lt;BR&gt;&lt;/FONT&gt;becomes&lt;BR&gt;&lt;FONT face="Courier New" size=2&gt;&amp;gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; static bool EvaluatePredicate(char[] predicate)&lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR&gt;&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; arrayInput = predicate;&lt;BR&gt;&lt;/FONT&gt;&lt;BR&gt;There are a couple more places where strInput has to be changed to arrayInput, like the getc() function,&amp;nbsp;but these are very straightforward.&lt;/P&gt;
&lt;P&gt;For my test case I wanted the predicates in character arrays now,&amp;nbsp;so I did this quick and dirty thing (in a real program you'd likely read the predicates from a file so you could just arrange for them to be in char arrays in the first place):&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; struct CharArray&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; public char[] ptr;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; static CharArray[] predicateChars = new CharArray[predicates.Length];&lt;BR&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;and then later:&lt;BR&gt;&lt;BR&gt;&lt;FONT face="Courier New" size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for (i = 0; i&amp;lt;len;i++)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; predicateChars[i].ptr = predicates[i].ToCharArray();&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for (i = 0; i&amp;lt; len; i++)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Console.WriteLine("{0} =&amp;gt; {1}", predicates[i], &lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; EvaluatePredicate(predicateChars[i].ptr));&lt;BR&gt;&lt;BR&gt;&lt;/FONT&gt;And for&amp;nbsp;the test loop itself simply call EvaluatePredicate using predicateChars.&lt;/P&gt;
&lt;P&gt;OK, that little experiment is about the smallest change that removes any need to create substrings while changing nothing else.&lt;/P&gt;
&lt;P&gt;On my test system the original code was taking 4.839s to run; the new version with these improvements takes 4.680s.&amp;nbsp; That's about a 3.3% savings.&lt;/P&gt;
&lt;P&gt;Now I have to tell you in all honesty that if you had asked me to comment on that change offhand, I would have told you to expect a much greater impact than just 3.3%.&amp;nbsp; Not that 3.3% is anything to sneeze at (see &lt;a href="http://blogs.msdn.com/ricom/archive/2005/10/17/481999.aspx"&gt;The Performance War -- Win it 5% at a time&lt;/A&gt;), but the substring garbage clearly isn't the killer issue.&lt;/P&gt;
&lt;P&gt;Let's go back and look at what the performance report shows and see if any of this makes sense.&lt;/P&gt;
&lt;P&gt;Well the telling line is this one right here:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size=2&gt;Exclusive% Inclusive% Function Name&lt;BR&gt;0.01&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4.62&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WKS::GCHeap::Alloc(unsigned int,unsigned long)&lt;BR&gt;&lt;BR&gt;&lt;/FONT&gt;The sum total of the allocations on that substring path, including all the garbage collections, was only 4.62%.&amp;nbsp; It's no surprise then that we got less than a 4% improvement by only eliminating the garbage and doing nothing else.&amp;nbsp; The collector was actually doing a pretty darn good job of cleaning up that trash on the cheap!&lt;/P&gt;
&lt;P&gt;Even looking further at the string creation costs, you'll see that InternalSubStringWithChecks has an inclusive time of about 22%. That means the bulk of the time spent creating these substrings had more to do with setting up for the array copy, moving the memory, checking the indexes for out of bounds, and so forth than with allocation.&amp;nbsp; We can only attribute 4.62% of the time to actually allocating the memory.&amp;nbsp; All of that checking and rechecking is costing us.&lt;/P&gt;
&lt;P&gt;Let's look elsewhere:&amp;nbsp; the getc() function is taking a total of 10.49% of the time -- that's more than the collector!&amp;nbsp; It does nothing but a couple of checks into an in-memory array,&amp;nbsp;but it's called an awful&amp;nbsp;lot.&lt;/P&gt;
&lt;P&gt;Looking&amp;nbsp;at&amp;nbsp;some other interesting operations, you can see that&amp;nbsp;just hashing strings in this run, so that we can do fact lookups, is costing us 2.9%.&amp;nbsp; That's an astounding figure because you'll recall that ALL of our allocation activity was only 4.62%.&amp;nbsp; Who would have expected the allocation figure to be so low that the string hashes alone could compete with it?&amp;nbsp; And that's not all:&amp;nbsp;if you look at the full comparison situation&amp;nbsp;you'll see that&amp;nbsp;checking for existing entries in the hashtable accounts for 22% of the run time.&amp;nbsp; Again, that whole operation is something we do merely to find out whether a given fact is true or false.&lt;/P&gt;
&lt;P&gt;Well, all of this really changed my mind about how to attack this problem.&amp;nbsp;&amp;nbsp; My plan looks like this:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;preparse the predicates so that no string processing is required for evaluation 
&lt;LI&gt;assign facts a number in sequence as they are encountered in the predicates 
&lt;LI&gt;the&amp;nbsp;preparsed predicate is a mix of fact numbers and operators probably in postfix order 
&lt;LI&gt;when looking up facts simply access an array of truth bits indexed&amp;nbsp;by fact number 
&lt;LI&gt;as facts change simply change the array contents and then reevaluate predicates as desired 
&lt;LI&gt;do the evaluation itself with a simple stack machine; no recursion in the evaluation&lt;/LI&gt;&lt;/UL&gt;
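&lt;P&gt;To make the plan concrete, here is a rough sketch of the kind of evaluator it describes.&amp;nbsp; This is not the code from the next part; the class name, the opcode values, and the convention that negative numbers are operators are all assumptions made for illustration, and the sketch assumes the preparsed predicate is well formed.&lt;/P&gt;

```csharp
using System;

// Sketch only: the opcode values and all names here are assumptions for illustration.
public static class PostfixSketch
{
    // Negative values are operators; zero and up are fact numbers.
    public const int OpAnd = -1;
    public const int OpOr = -2;
    public const int OpNot = -3;

    // Evaluates a preparsed predicate held in postfix order.
    // facts[n] is the current truth value of fact number n.
    // Assumes the postfix sequence is well formed.
    public static bool Evaluate(int[] code, bool[] facts)
    {
        // A postfix program never needs more stack slots than it has tokens.
        bool[] stack = new bool[code.Length];
        int sp = 0; // index of the next free slot

        foreach (int token in code)
        {
            switch (token)
            {
                case OpAnd:
                    sp--;                                   // pop the right operand
                    if (!stack[sp]) stack[sp - 1] = false;  // left = left AND right
                    break;
                case OpOr:
                    sp--;                                   // pop the right operand
                    if (stack[sp]) stack[sp - 1] = true;    // left = left OR right
                    break;
                case OpNot:
                    stack[sp - 1] = !stack[sp - 1];         // negate the top of the stack
                    break;
                default:
                    stack[sp++] = facts[token];             // push the fact's truth value
                    break;
            }
        }

        // Exactly one value should remain: the result of the predicate.
        if (sp != 1)
            throw new ArgumentException("malformed postfix predicate");

        return stack[0];
    }
}
```

&lt;P&gt;With this encoding, a predicate meaning (fact 0 AND fact 1) OR NOT fact 2 would be stored as { 0, 1, OpAnd, 2, OpNot, OpOr }.&amp;nbsp; Changing a fact is then just a write into the facts array followed by another call to Evaluate, with no strings, hashing, or recursion on the evaluation path.&lt;/P&gt;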
&lt;P&gt;I'll blog some code that does this in the next part and we'll see how it does.&lt;/P&gt;
&lt;P&gt;Any predictions based on what we've seen so far?&lt;/P&gt;
&lt;P&gt;See the continuation in&amp;nbsp;&lt;a href="http://blogs.msdn.com/ricom/archive/2005/11/29/498009.aspx"&gt;&lt;FONT color=#003399&gt;Performance Quiz #8 -- The problems with parsing -- Part 4&lt;/FONT&gt;&lt;/A&gt;&lt;/P&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category></item></channel></rss>