<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Rico Mariani's Performance Tidbits : design advice</title><link>http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx</link><description>Tags: design advice</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Visual Studio: Why is there no 64 bit version?  (yet)</title><link>http://blogs.msdn.com/ricom/archive/2009/06/10/visual-studio-why-is-there-no-64-bit-version.aspx</link><pubDate>Thu, 11 Jun 2009 06:34:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9725681</guid><dc:creator>ricom</dc:creator><slash:comments>78</slash:comments><comments>http://blogs.msdn.com/ricom/comments/9725681.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=9725681</wfw:commentRss><description>&lt;P&gt;&lt;EM&gt;Disclaimer: This is yet another of my trademarked "approximately correct" discussions&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;From time to time customers or partners ask me about our plans to create a 64 bit version of Visual Studio. When is it coming? Why aren’t we making it a priority? Haven’t we noticed that 64 bit PC’s are very popular? Things like that. We just had an internal discussion about “the 64 bit issue” and so I thought I would elaborate a bit on that discussion for the blog-o-sphere.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;So why not 64 bit right away?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Well, there are several concerns with such an endeavor.&lt;/P&gt;
&lt;P&gt;First, from a performance perspective the pointers get larger, so data structures get larger, and the processor cache stays the same size. That basically results in a raw speed hit (your mileage may vary).&amp;nbsp; So you start in a hole and you have to dig yourself out of that hole by using the extra memory above 4G to your advantage.&amp;nbsp; In Visual Studio this can happen in some large solutions but I think a preferable thing to do is to just use less memory in the first place.&amp;nbsp; Many of VS’s algorithms are amenable to this.&amp;nbsp; Here’s an old article that discusses the performance issues at some length: &lt;A href="http://blogs.msdn.com/joshwil/archive/2006/07/18/670090.aspx" mce_href="http://blogs.msdn.com/joshwil/archive/2006/07/18/670090.aspx"&gt;http://blogs.msdn.com/joshwil/archive/2006/07/18/670090.aspx&lt;/A&gt;&lt;/P&gt;
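&lt;P&gt;To make the “pointers get larger” point concrete, here is a minimal sketch in Python (not part of the original discussion). The node layout, the names Node32 and Node64, and the 64-byte cache line are all assumptions chosen for illustration; the pointers are modeled as fixed-width integers so the comparison does not depend on the machine that runs it.&lt;/P&gt;
&lt;PRE&gt;
```python
import ctypes

# Hypothetical binary-tree node. The "pointers" are modeled as fixed-width
# integers so both layouts can be measured on any machine.
class Node32(ctypes.Structure):
    _fields_ = [("left", ctypes.c_uint32),
                ("right", ctypes.c_uint32),
                ("key", ctypes.c_int32)]

class Node64(ctypes.Structure):
    _fields_ = [("left", ctypes.c_uint64),
                ("right", ctypes.c_uint64),
                ("key", ctypes.c_int32)]

def nodes_per_cache_line(node_type, line_bytes=64):
    # Whole nodes that fit in one cache line of the assumed size.
    return line_bytes // ctypes.sizeof(node_type)

size32 = ctypes.sizeof(Node32)
size64 = ctypes.sizeof(Node64)
print("bytes per node:", size32, size64)
print("nodes per cache line:",
      nodes_per_cache_line(Node32), nodes_per_cache_line(Node64))
```
&lt;/PRE&gt;
&lt;P&gt;The payload (the key) is identical in both layouts; only the pointers grew. Exact sizes depend on the platform's alignment rules, but the wider node always packs fewer entries into the same cache line, which is the “hole” you start in.&lt;/P&gt;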
&lt;P&gt;Secondly, from a cost perspective, probably the shortest path to porting Visual Studio to 64 bit is to port most of it to managed code incrementally and then port the rest.&amp;nbsp; The cost of a full port of that much native code is going to be quite high and of course all known extensions would break and we’d basically have to create a 64 bit ecosystem pretty much like you do for drivers.&amp;nbsp; Ouch.&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;[Clarification 6/11/09: The issue is this:&amp;nbsp; If all you wanted to do was move the code to 64 bit then yes the shortest path is to do a direct port.&amp;nbsp; But that’s never the case.&amp;nbsp; In practice porting has an opportunity cost, it competes with other desires.&amp;nbsp; So what happens is more like this:&amp;nbsp; you get teams that have&amp;nbsp;C++ code written for 32 bits and they say “I want to write feature X, if I port to managed I can do feature X plus other things more easily, that seems like a good investment” so they go to managed code for other reasons.&amp;nbsp; But now they also have a path to 64 bit.&amp;nbsp; What’s happening in practice is that more and more of the&amp;nbsp;Visual Studio&amp;nbsp;is becoming managed for reasons unrelated to bitness. Hence a sort of net-hybrid porting strategy over time.]&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;So, all things considered, my feeling is that the best place to run VS for this generation is in the 32 bit emulation mode of a 64 bit operating system; this doubles your available address space without taking the data-space hit and it gives you extra benefits associated with that 64 bit OS.&amp;nbsp; More on those benefits later.&lt;/P&gt;
&lt;P&gt;Having said that, I know there are customers that &lt;EM&gt;would&lt;/EM&gt; benefit from a 64 bit version but I actually think that amount of effort would be better spent in reducing the memory footprint of the IDE’s existing structures rather than doing a port.&amp;nbsp; There are many tradeoffs here and&amp;nbsp; the opportunity cost of the port is high.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Is it expensive because the code is old and of poor quality?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;It’s not so much about the quality of the code – a lot of it is only a few releases old – as it is about the &lt;I&gt;amount&lt;/I&gt; of code involved.&amp;nbsp; Visual Studio is huge and most of its packages wouldn’t benefit from 64 bit addressing, but nearly all of it would benefit from using lazier algorithms – the tendency to load too much about the current solution is a general problem which results in slowness even when there is enough memory to do the necessary work.&amp;nbsp; Adding more memory to facilitate doing even more work that we shouldn’t be doing in the first place tends to incent the wrong behavior.&amp;nbsp; I want to load less, not more.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Doesn’t being a 64 bit application save you all kinds of page faults and so forth?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;A 64 bit address space for the &lt;EM&gt;process &lt;/EM&gt;isn’t going to help you with page faults except in maybe indirect ways, and it will definitely hurt you in direct ways because your data is bigger.&amp;nbsp; In contrast a 64 bit &lt;EM&gt;operating system &lt;/EM&gt;could help you a lot!&amp;nbsp; If you’re running as a 32 bit app on a 64 bit OS then you get all of the 4G address space and all of that could be backed by physical memory (if you have the RAM) even without you using 64 bit pointers yourself.&amp;nbsp;&amp;nbsp; You’ll see potentially huge improvements related to the size of the disk cache (not in your address space) and the fact that your working set won’t need to be eroded in favor of other processes as much.&amp;nbsp; Transient components and data (like C++ compilers and their big .pch files) stay cached&amp;nbsp; in physical memory, but not in your address space.&amp;nbsp; 32 bit processes accrue all these benefits just as surely as 64 bit ones.&lt;/P&gt;
&lt;P&gt;In fact, the only direct benefit you get from having more address space for your process is that you can allocate more total memory, but if we’re talking about scenarios that already fit in 4G then making the pointers bigger could cause them to not fit and certainly will make them take more memory, never less.&amp;nbsp; If you don’t have abundant memory that growth&amp;nbsp;might make you page, and even if you do have the memory it will certainly make you miss the cache more often.&amp;nbsp; Remember the cache size does not grow in 64 bit mode but your data structures do.&amp;nbsp; Where you might get savings is if the bigger address space allowed you to have less fragmentation and more sharing.&amp;nbsp; But Vista+ &lt;A href="http://en.wikipedia.org/wiki/Address_space_layout_randomization" mce_href="http://en.wikipedia.org/wiki/Address_space_layout_randomization"&gt;auto-relocates&lt;/A&gt; images efficiently anyway for other reasons so this is less of a win.&amp;nbsp; You might also get benefits if the 64 bit instruction set is especially good for your application (e.g. if you do a ton of 64 bit math).&lt;/P&gt;
&lt;P&gt;So, the only way you’re going to see serious benefits is if you have scenarios that simply will not fit into 4G at all.&amp;nbsp; But, in Visual Studio anyway, when we don’t fit into 4G of memory I have never once found myself thinking “wow, System X needs more address space”; I always think “wow, System X needs to go on a diet.”&lt;/P&gt;
&lt;P&gt;Your mileage may vary and you can of course imagine certain VS packages (such as a hypothetical data analytics debugging system) that might require staggering amounts of memory, but those should be handled as special cases. And it is possible for us to do a hybrid plan that includes some 64 bit slave processes.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;I do think we might seem less cool because we’re 32 bit only but I think the right way to fight that battle is with good information, and a great product.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Then why did Office make the decision to go 64 bit?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This section is &lt;EM&gt;entirely recreational speculation &lt;/EM&gt;because I didn’t ask them (though frankly I should). But I think I can guess why. Maybe a kind reader can tell me how wrong I am :)&lt;/P&gt;
&lt;P&gt;First, some of the hardest porting issues aren’t about getting the code to run properly but are about making sure that the file formats the new code generates remain compatible with previous (and future) versions of those formats. Remember, the ported code now thinks it has 64 bit offsets in some data structures.&amp;nbsp; That compatibility could be expensive to achieve because these things find their way into subtle places – potentially any binary file format could have pointer-size issues. However, Office already did a pass on all its file formats to standardize them on compressed XML, so they cannot possibly have embedded pointers anymore. That’s a nice cost saver on the road to 64 bit products.&lt;/P&gt;
&lt;P&gt;Secondly, on the benefit side, there are customers out there that would love to load enormous datasets into Excel or Access and process them interactively. Now in Visual Studio I can look you in the face and say “even if your solution has more than 4G of files I shouldn’t have to load it all for you to build and refactor it” but that’s a much harder argument to make for say Excel.&lt;/P&gt;
&lt;P&gt;In Visual Studio if you needed to do a new feature like debugging of a giant analytics system that used a lot of memory I would say “make that analytics debugging package 64 bit, the rest can stay the way they are” but porting say half of Excel to 64 bits isn’t exactly practical.&lt;/P&gt;
&lt;P&gt;So the Office folks have different motivations and costs and therefore came to different conclusions -- the above are just my personal uninformed guesses as to why that might be the case.&lt;/P&gt;
&lt;P&gt;One thing is for sure though: I definitely think that the benefits of the 64 bit operating system are huge for everyone. Even if it was nothing more than using all that extra memory as a giant disk cache, just that can be fabulous, and you get a lot more than that!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9725681" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/ramblings/default.aspx">ramblings</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/visual+studio/default.aspx">visual studio</category></item><item><title>Performance Advice, Southern Style </title><link>http://blogs.msdn.com/ricom/archive/2008/11/28/performance-advice-southern-style.aspx</link><pubDate>Sat, 29 Nov 2008 00:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9153447</guid><dc:creator>ricom</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ricom/comments/9153447.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=9153447</wfw:commentRss><description>&lt;P&gt;While I was at PDC2008 I was introduced to Keith and Woody -- pretty soon there was a microphone in front of me and we were doing a podcast.&amp;nbsp; Now I already liked these guys but when they used a picture of me from about 1998 I really liked them a lot more.&amp;nbsp; I wish I still looked like that :).&lt;/P&gt;
&lt;P&gt;And the interview was pretty fun too. ;)&lt;/P&gt;
&lt;P&gt;Keith, Woody,&amp;nbsp;thanks for having me.&lt;/P&gt;
&lt;P&gt;You guys can check it out here:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;A href="http://deepfriedbytes.com/podcast/episode-21-talking-performance-with-performance-preacher-rico-mariani/" mce_href="http://deepfriedbytes.com/podcast/episode-21-talking-performance-with-performance-preacher-rico-mariani/"&gt;http://deepfriedbytes.com/podcast/episode-21-talking-performance-with-performance-preacher-rico-mariani/&lt;/A&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9153447" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/recommendations/default.aspx">recommendations</category></item><item><title>Linq Compiled Queries Q &amp; A</title><link>http://blogs.msdn.com/ricom/archive/2008/08/25/linq-compiled-queries-q-a.aspx</link><pubDate>Mon, 25 Aug 2008 22:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8894803</guid><dc:creator>ricom</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/ricom/comments/8894803.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=8894803</wfw:commentRss><description>&lt;P&gt;I did a series of postings on &lt;A href="http://blogs.msdn.com/ricom/archive/2007/06/22/dlinq-linq-to-sql-performance-part-1.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2007/06/22/dlinq-linq-to-sql-performance-part-1.aspx"&gt;Linq Compiled Queries&lt;/A&gt; last year, I recently got some questions on those postings that I thought would be of general interest.&lt;/P&gt;
&lt;P&gt;Q1:&lt;/P&gt;
&lt;P&gt;Why use the 'new' keyword in this snippet?&lt;/P&gt;
&lt;P&gt;var q = from o in nw.Orders &lt;BR&gt;select &lt;STRONG&gt;new &lt;/STRONG&gt;{o.everything …};&lt;/P&gt;
&lt;P&gt;A:&lt;/P&gt;
&lt;P&gt;If you did just:&lt;/P&gt;
&lt;P&gt;var q = from o in nw.Orders &lt;BR&gt;select o;&lt;/P&gt;
&lt;P&gt;You'd be getting editable orders. Linq then has to track them in case you change them and want to submit the changes. If you use &lt;STRONG&gt;new&lt;/STRONG&gt;, you're effectively making a copy of the orders that is not going to be change tracked. That's faster for read-only cases. The other thing you can do is mark the query context as read-only (in Linq to SQL, by setting the DataContext's ObjectTrackingEnabled property to false) and then you get the same effect.&amp;nbsp; When I wrote that test case, that feature wasn't available yet so I used &lt;STRONG&gt;new &lt;/STRONG&gt;to simulate it.&lt;/P&gt;
&lt;P&gt;Q2:&amp;nbsp; &lt;/P&gt;
&lt;P&gt;What do you mean when you say that linq will 'Create custom methods that bind the data perfectly' ?&lt;/P&gt;
&lt;P&gt;A:&lt;/P&gt;
&lt;P&gt;Whenever you use linq to sql to read data from a database it has to do two important things for you. The first is convert your Linq query into SQL. The second is to make a method that takes the stream of data that comes back from the database and converts it into the managed objects you required. That's the data-binding step. Linq creates the necessary methods automatically, and it makes the perfect code for doing this.&lt;/P&gt;
&lt;P&gt;Q3:&lt;/P&gt;
&lt;P&gt;How did Linq to SQL beat your ADO.Net code for insert times?&amp;nbsp; Shouldn't a tie be the best possible result?&lt;/P&gt;
&lt;P&gt;A:&lt;/P&gt;
&lt;P&gt;The SQL I used in my test case was pretty much the standard simplest SQL you would use for such a job. The automatically generated SQL from Linq was better than what I wrote by hand because they had parameterized the insert statements which I never bothered to do. Had I changed my SQL to what they created it would have been a tie. This is kind of like when the C++ compiler finds a machine code pattern that is better than what you would have written doing it by hand because it did something you don't usually bother doing with hand tuned machine code. But you *could* replace what you wrote with what the compiler generated.&lt;/P&gt;
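&lt;P&gt;For readers who haven't seen the difference, here is a small sketch of the two insert styles. It is an illustration only: it uses Python's built-in sqlite3 module rather than ADO.Net or Linq to SQL, and the orders table and its columns are made up for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

rows = [(1, "ALFKI", 10.5), (2, "O'Neil", 20.0), (3, "BONAP", 7.25)]

# Hand-rolled style: a new SQL string per row with the values spliced in.
# Every statement text is different, so nothing keyed on the SQL text can be
# reused, and quoting has to be handled by hand.
for order_id, customer, total in rows[:1]:
    escaped = customer.replace("'", "''")
    conn.execute(
        f"INSERT INTO orders VALUES ({order_id}, '{escaped}', {total})")

# Parameterized style: one statement text, values bound separately.
insert_sql = "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)"
conn.executemany(insert_sql, rows[1:])
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)
```
&lt;/PRE&gt;
&lt;P&gt;Both styles put the same rows in the table; the parameterized form simply gives the database one statement shape to prepare and reuse. How much that matters on a given engine is something to measure rather than assume.&lt;/P&gt;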
&lt;P&gt;Q4: &lt;/P&gt;
&lt;P&gt;What are the downsides to precompiled queries?&lt;/P&gt;
&lt;P&gt;A:&lt;/P&gt;
&lt;P&gt;There is no penalty to precompiling (&lt;A href="http://blogs.msdn.com/ricom/archive/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2008/01/11/performance-quiz-13-linq-to-sql-compiled-queries-cost.aspx"&gt;see Quiz #13&lt;/A&gt;). The only way you might lose performance is if you precompile a zillion queries and then hardly use them at all -- you'd be wasting a lot of memory for no good reason.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;But measure :)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8894803" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/databases/default.aspx">databases</category></item><item><title>Rico's Instrumentation Aphorisms</title><link>http://blogs.msdn.com/ricom/archive/2007/11/30/rico-s-instrumentation-aphorisms.aspx</link><pubDate>Sat, 01 Dec 2007 05:14:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6622408</guid><dc:creator>ricom</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ricom/comments/6622408.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=6622408</wfw:commentRss><description>&lt;P mce_keep="true"&gt;A few months ago, Mary Gray of the Management Practices Team came to talk to me about good practices for creating performance counters and doing measurements generally.&amp;nbsp; She interviewed me on the topic for about an hour and was madly scribbling notes the whole time while I talked a mile a minute.&amp;nbsp; What's below is a slightly edited version of what she took away from the interview.&amp;nbsp; I thought it was interesting enough that you guys might like to see it so here it is. &lt;/P&gt;
&lt;P&gt;Mary, thank you for allowing me to share. 
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Adding instrumentation in the form of events and performance counters to your software is one of the most important things you can do to make your component or application more manageable by IT personnel, more supportable by CSS, and more easily tuned and debugged by developers and testers. 
&lt;P&gt;The OS already has performance counters you can use for resources such as CPU, disk, memory, and network. These are the primary resources that you will need to track for most software. You don't need to add a lot of performance counters or events to your software for raw resources; the trick is to correlate what your software thinks it is doing with the operating system resource impact of those operations. 
&lt;P&gt;Judiciously added instrumentation allows you to more easily pinpoint the states that lead to poor performance or failure. Well designed events inform monitoring software and IT admins about whether the software is operating normally, in a degraded state, or has failed completely. Good tracing events in conjunction with perf counters related to the work of the software allow diagnosis and tracking of trends. Events targeted to the administrator can identify what work was being done for which user context when a failure occurs. 
&lt;P&gt;&lt;B&gt;Rico's Instrumentation Aphorisms&lt;/B&gt; 
&lt;P&gt;&lt;STRONG&gt;Instrumentation aphorism #1: Attribute the cost, don't describe it.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;To attribute costs, the important word is "correlation". You want to correlate what your software thinks it is doing to what the operating system knows about resource usage. You can use (e.g.) ETW tracing events to mark the beginning and end of "jobs" or transactions in your software's work life.&lt;/P&gt;
&lt;P&gt;What is a “transaction” in the runtime life of your software?&amp;nbsp; Is it a mouse click event?&amp;nbsp; A business transaction of some kind?&amp;nbsp; An HTML page delivered to the user?&amp;nbsp; A database query performed?&amp;nbsp; Whatever it is, look at your critical resources and consider the cost per unit of work.&amp;nbsp; For example, consider CPU cycles per transaction, network bytes per transaction, disk i/o’s per transaction, etc.&amp;nbsp;&amp;nbsp; 
&lt;P&gt;Tracing events, to be useful, need to be associated with the higher level transactions of the software rather than associated with the life of single objects. You can have too many events and events at too low a level or marking time intervals that are too short to be useful. This use of events and perf counters just creates overwhelming noise and does not allow you to see trends easily. 
&lt;P&gt;This correlation between the work of the software and resources should also be used in administrator events marking changes of state, not just tracing events. Administrators are running the software for a reason and have every interest in knowing why (e.g.) MOM 2005 is reporting a degraded state for it - why the system is slowing down or why the software is banging away at the disks continuously. These events, as opposed to tracing events, should provide actionable advice. 
&lt;P&gt;&lt;STRONG&gt;Instrumentation aphorism #2: Account for consumption. &lt;/STRONG&gt;
&lt;P&gt;To account for consumption, you will want to calculate rates rather than just measure occurrences. Look at the resource costs per unit of work. What is your software accomplishing to justify its consumption of CPU, memory, disk, network, or other resources? Expressing resource costs in a per-unit-of-work fashion will help you to see which costs are reasonable and which are problems. You want to be able to trace, or to inform administrators, what resources are being used. 
&lt;P&gt;The operating system already gives you a variety of performance counters that measure CPU consumption, disk I/Os, memory usage, and network activity. These are the primary measuring sticks you need to compare to what your software is doing. The performance counters you add are most useful when they calculate the rate of work accomplished. 
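&lt;P&gt;As a rough illustration of a rate counter (a sketch under assumptions, not a real performance counter or ETW API), the Python below attributes CPU time to completed units of work. The CostPerUnit class and the handle_request function are hypothetical names invented for this example; a real system would publish the figure through its own counter or event mechanism.&lt;/P&gt;
&lt;PRE&gt;
```python
import time

class CostPerUnit:
    """Hypothetical helper: attributes CPU time to completed units of work."""

    def __init__(self):
        self.units = 0
        self.cpu_seconds = 0.0

    def record(self, work, *args):
        # Mark the start and end of one transaction and charge it for the
        # CPU time this process consumed in between.
        start = time.process_time()
        result = work(*args)
        self.cpu_seconds += time.process_time() - start
        self.units += 1
        return result

    def cpu_per_unit(self):
        # A rate (cost per transaction), not a raw occurrence count.
        return self.cpu_seconds / self.units if self.units else 0.0

def handle_request(n):
    # Stand-in for one "transaction" in the software's work life.
    return sum(i * i for i in range(n))

meter = CostPerUnit()
for _ in range(100):
    meter.record(handle_request, 10_000)
print(meter.units, meter.cpu_per_unit())
```
&lt;/PRE&gt;
&lt;P&gt;The same shape works for other resources, for example bytes sent or disk I/Os per transaction, as long as the numerator comes from what the system measures and the denominator from what the software believes it accomplished.&lt;/P&gt;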
&lt;P&gt;You can generate tracing events that tell you the rate of work and what the user context is. The combination of events that mark the start and end of transactions with rate counters allows developers and CSS people to pinpoint the resource that is being pinched and wrecking performance or starting a death spiral to failure. 
&lt;P&gt;If you are considering a design which sequesters a chunk of memory which your software manages, you may want to think twice about it. The OS already tracks memory resources. If you manage your own memory, then you have to duplicate the operating system plumbing to be able to diagnose performance problems and failures. The programming and maintenance costs for this may outweigh the hoped-for design benefits.&amp;nbsp;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6622408" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Performance Threat Models</title><link>http://blogs.msdn.com/ricom/archive/2007/11/13/performance-threat-models.aspx</link><pubDate>Tue, 13 Nov 2007 23:37:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6182851</guid><dc:creator>ricom</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ricom/comments/6182851.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=6182851</wfw:commentRss><description>&lt;p&gt;I've been meaning to post this for ages and somehow I kept forgetting.&lt;/p&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/jmeier/"&gt;J.D.&lt;/a&gt; and I have long thought that many of the techniques used to do a security threat model are actually directly applicable to doing performance analysis as well.&amp;nbsp; The idea of threats and mitigations is quite general but more importantly a direct analysis of the &lt;em&gt;architecture &lt;/em&gt;is invaluable and it's something you can do very early in the lifecycle of a product.&amp;nbsp; Think of it as "testing" the architecture while it's still just a diagram.&lt;/p&gt; &lt;p&gt;A while ago J.D. 
produced this analysis which I think you might find useful: &lt;a href="http://blogs.msdn.com/jmeier/archive/2007/08/28/performance-threats.aspx"&gt;http://blogs.msdn.com/jmeier/archive/2007/08/28/performance-threats.aspx&lt;/a&gt;&lt;/p&gt; &lt;p&gt;The idea of testing the architecture is something I want to do a lot more of in the next version of Visual Studio (but more on that another time)&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6182851" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/recommendations/default.aspx">recommendations</category></item><item><title>Database Performance, Correctness, Compostion, Compromise, and Linq too</title><link>http://blogs.msdn.com/ricom/archive/2007/08/31/database-performance-correctness-compostion-compromise-and-linq-too.aspx</link><pubDate>Sat, 01 Sep 2007 04:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4678887</guid><dc:creator>ricom</dc:creator><slash:comments>16</slash:comments><comments>http://blogs.msdn.com/ricom/comments/4678887.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=4678887</wfw:commentRss><description>&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Introduction and Disclaimer&lt;/STRONG&gt; 
&lt;P&gt;Regular readers of my blog are already familiar with my goal to provide brief and useful information that is approximately correct and that illustrates some key truths. Most of the time my articles are not authoritative and that is especially true in this case. I am certainly not an Official Microsoft Authority on databases and data systems, I just have a good bit of experience in that area, and I wanted to convey some things I learned that I thought were important, and that I’ve never seen assembled as a whole before, so I’ve written this article. This article uses Linq to SQL for its examples but I think it is actually more broadly applicable, with due caution. 
&lt;P&gt;&lt;STRONG&gt;Performance in Many Tier Systems&lt;/STRONG&gt; 
&lt;P&gt;Again, if you read my blog you’ll know that I always talk about the importance of &lt;A href="http://blogs.msdn.com/ricom/archive/2006/12/21/do-performance-analysis-in-context.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2006/12/21/do-performance-analysis-in-context.aspx"&gt;measuring performance in context&lt;/A&gt;.&amp;nbsp; This is especially important in systems with multiple tiers because making a choice in one tier can profoundly impact the consequences in other tiers. For instance, a client-side cache might take a lot of load off of the middle tier. Sounds great, right? Oops, you of course remembered that if you cached the contents on the client then they are going to be at least a little out of date, right? You’re ok with that? Or did you add a scheme to periodically recheck validity? Oops, now you have more traffic to the middle tier again. No, wait, you have a nice periodic discard policy where you only keep local data for a known interval? Ah yes, but now your cache contents aren’t necessarily self-consistent because they were fetched with different queries at different times and unified to create the cache.&amp;nbsp; Does it ever end? 
&lt;P&gt;There is a dance here, and it is a complicated one. Only by understanding how all the dancers play together in your system can you truly create systems that have solid correctness characteristics and good performance. And it’s a bad idea to look at the performance of any one of the dancers independently. 
&lt;P&gt;&lt;STRONG&gt;Key Factors&lt;/STRONG&gt; 
&lt;P&gt;I break it down into a fairly small list of considerations, and these are of course deeply entangled. 
&lt;UL&gt;
&lt;LI&gt;Locality&lt;/LI&gt;
&lt;LI&gt;Isolation&lt;/LI&gt;
&lt;LI&gt;Unit of Work&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;You might wonder why I’m not mentioning things like network, schema, and so forth… It’s because I think about them in the context of the bigger phenomena. And maybe you’re wondering what this could possibly have to do with Linq to SQL but don’t worry we’re going to get there by the time we talk about unit of work. 
&lt;P&gt;&lt;STRONG&gt;Locality&lt;/STRONG&gt; 
&lt;P&gt;Even when we’re not talking about databases I often say “Locality is everything” and it’s possible that it is even more true in the database world (don’t nitpick me on “degrees of truth”, it’s only a figure of speech ^_^). When you’re talking about a data system, bad locality generally translates directly to more disk operations which in turn translate into torpedoed performance. 
&lt;P&gt;What do we do about it? Two big things: Schema and Indexes. Maybe that’s really two sides of the same thing but let me separate those concerns for a moment. 
&lt;P&gt;We create schema in a manner that is consistent with the data we intend to represent and so that logically related concerns will tend to be physically together. If we normalize well we also tend to find that a single logical fact tends to be represented in a single physical location. All of this bodes well for locality. 
&lt;P&gt;But now we’re faced with a problem. One organization of the data is frequently not enough to support all the operations that we intend to perform on the data. If we want to look up only by (e.g.) ID, one ordering is fine but the moment that we want to look up also by (e.g.) surname then we need to organize the data in more than one way. That is where indexes come into play. 
&lt;P&gt;Trust me, I’m going somewhere with all of this. 
&lt;P&gt;To get multiple orderings of the same data you could create multiple tables with the same data arranged differently. That would work but it would be highly inconvenient as you would always have to be choosing the flavor of the data to access when you queried the data and you would have to update multiple copies of the data whenever you made changes. 
&lt;P&gt;Indexes let you automate this. Indexes are your way of telling a database that you want multiple copies of your data, ordered differently, and any time you do an operation that updates the main copy you want all of the alternate copies also updated automatically in the same transaction. 
&lt;P&gt;This is actually a profound statement. Remember we added indexes because there were some questions we could not answer without looking at a lot of data (bad locality) and the price we pay is that now when we make changes to one spot in the data we actually have to propagate that change to many places atomically (reduced locality). 
&lt;P&gt;In the final analysis an index isn’t much more than a second, or third, etc. copy of the table with some of the columns reordered and some removed entirely. 
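&lt;P&gt;Here is a toy sketch of that idea in Python, with the usual caveats: the PeopleTable class, its columns, and the sample data are invented for illustration, and a real database maintains its indexes inside the storage engine and inside the transaction rather than in application code.&lt;/P&gt;
&lt;PRE&gt;
```python
import bisect

class PeopleTable:
    """Toy table keyed by id, with a hand-maintained 'index' on surname."""

    def __init__(self):
        self.rows = {}          # main copy: id -&gt; see note below
        self.by_surname = []    # index: sorted (surname, id) pairs, city dropped

    def insert(self, person_id, surname, city):
        # Both copies change together; a database automates this and makes
        # it atomic within the same transaction.
        self.rows[person_id] = (surname, city)
        bisect.insort(self.by_surname, (surname, person_id))

    def find_by_surname(self, surname):
        # Binary search in the index instead of scanning every row.
        start = bisect.bisect_left(self.by_surname, (surname,))
        ids = []
        for name, person_id in self.by_surname[start:]:
            if name != surname:
                break
            ids.append(person_id)
        return ids

table = PeopleTable()
table.insert(1, "Smith", "Leeds")
table.insert(2, "Jones", "Cardiff")
table.insert(3, "Smith", "York")
print(table.find_by_surname("Smith"))   # prints [1, 3]
```
&lt;/PRE&gt;
&lt;P&gt;(The main copy maps each id to its surname and city.) The lookup by surname touches only the neighboring index entries, which is the locality win; the price shows up in insert, which now has two places to update.&lt;/P&gt;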
&lt;P&gt;Locality means we get just the data we need, that it’s organized in such a way that we do not have to sift through vast volumes of data to get to what we want, and that we can make surgical changes to the data in support of the things our system needs to do as it runs. Good locality translates directly into good performance. 
&lt;P&gt;&lt;STRONG&gt;Isolation&lt;/STRONG&gt; 
&lt;P&gt;Isolation is a notion that is cloaked in mystery, perhaps unnecessarily. I’ve met many developers that know what a transaction is but far fewer that know about isolation levels and fewer still that understand how these things are entangled. In short, isolation is about giving every program using a data system the illusion that it is the only user. Levels of isolation, roughly, describe how imperfect that illusion is going to be. 
&lt;P&gt;OK if you’re still with me at this point you must be wondering what any of this could possibly have to do with performance and even more wondering why on earth I am now talking about Isolation, a concept that is understood even less well than Locality. 
&lt;P&gt;I’m glad you asked :) 
&lt;P&gt;The first thing you should know is that the better your data locality the easier it is to create the illusion of isolation. The simplest case is two clients looking at two completely unrelated sections of a database – they have no overlap in their operations whatsoever and so isolation is easy. Now of course the more data you access the greater the likelihood that you will overlap with someone else and some isolation technique is going to be necessary to preserve the illusion. If your data has great locality then you’ll be able to make nice tight queries minimizing your isolation needs. 
&lt;P&gt;The second thing you should know is that maintaining isolation has a cost, like everything else. Depending on the technique used it can have both direct costs and indirect costs. But that’s really abstract so let’s be more concrete to illustrate these costs with a specific example. 
&lt;P&gt;Let’s suppose I have a simple database with just one table in it. It’s a set of account numbers, names, and balances. Now just one table isn’t enough to really model a bank with necessary auditing and so forth but this example is already complicated enough to show some isolation concerns and how they turn into performance problems. Furthermore let’s say the isolation required is one of the most basic choices “READ_COMMITTED” meaning I’m never allowed to observe someone else’s data if they have not yet committed their transaction – so no intermediate results. 
&lt;P&gt;Now let’s suppose I’m a big customer of this bank and I have 100 accounts, conveniently numbered 1000 to 1099. Some of the accounts are in my name and some of them are in my wife’s name. I want to know the total amount of money in the accounts in my name. In SQL it would look like this: 
&lt;BLOCKQUOTE&gt;
&lt;P&gt;select sum(balance) &lt;BR&gt;from account &lt;BR&gt;where account.id &amp;gt;= 1000 and account.id &amp;lt;= 1099 and account.name = 'Rico'&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The situation doesn’t get much easier than that. We could assume there is a nice index on the account table by ID so that all those accounts are (nearly) contiguous and with one scan of just my slice of the data we can get the answer. That’s about the best locality we could hope for. 
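&lt;P&gt;If you want to poke at this query yourself, here is a small sketch using Python's built-in sqlite3 module. The data is invented (every account holds 100, even IDs are mine, odd IDs are my wife's), and the range test is written with BETWEEN, which is equivalent to the two comparisons above. Declaring id as the primary key gives us the index by ID that we were assuming.&lt;/P&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (id integer primary key, name text, balance integer)"
)
# Accounts 1000..1099; even ids belong to Rico, odd ids to his wife (made-up data).
rows = [(i, "Rico" if i % 2 == 0 else "Wife", 100) for i in range(1000, 1100)]
conn.executemany("insert into account values (?, ?, ?)", rows)

# Same query as in the text, with the name passed as a parameter.
total = conn.execute(
    "select sum(balance) from account "
    "where id between 1000 and 1099 and name = ?",
    ("Rico",),
).fetchone()[0]
```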
&lt;P&gt;Now you can imagine that you are the little database engine, you go and read account #1000, check the name, find that it’s Rico, and add the balance. Move on to #1001, check the name, add the balance. Chug chug chug. 
&lt;P&gt;Now let me complicate things just a little bit. While this is happening, and we’re on account #1050 (chug chug) another user comes along and deposits money into one of my accounts. Let’s say it’s account #1060. Is this a problem? Well no it isn’t. Since I haven’t yet read the contents of account #1060 when I get there I will find the new balance and all is well. Or is it? 
&lt;P&gt;Here’s a subtle point: the total that I get isn’t guaranteed to be the total that I would have gotten if the whole query had run at the instant that it started (you could imagine such a system but that’s unusual). The only guarantee is that we will get some total that represents the sum of the balances as they could have existed at some moment where all the data was committed. So here we’ve created a valid total, an especially interesting one because it shows the balance at the instant the query finishes. 
&lt;P&gt;Great, so we can write into areas we have not yet read without any trouble. 
&lt;P&gt;In fact we could even do something a little more complicated. Suppose we did a transfer of $100 from one of my accounts, #1060, to another account, #1070. We’re still in good shape because neither of those two accounts has yet been read and so the sum will be computed correctly. 
&lt;P&gt;Hey, this isolation thing doesn’t seem very hard so far! &amp;nbsp;I wanted to subtract money from #1060 and put in #1070 and everything is great. But what if I had wanted to put the money in account #1030? 
&lt;P&gt;Now I have a problem, #1030 has already been read. If I allow the money to be moved then the sum calculation will be off by $100 because the $100 is gone from #1060 which I have not yet read and it was missing from #1030 which I have read. Ooops. 
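&lt;P&gt;Here is the same arithmetic as a tiny Python simulation, assuming 100 accounts of $100 each and no isolation at all. The function name and its parameters are made up for this sketch; the writer simply fires at the moment the scan reaches a chosen account.&lt;/P&gt;

```python
def scan_with_transfer(balances, transfer_at, src, dst, amount):
    """Sum balances in id order; when the scan reaches `transfer_at`,
    an unisolated writer moves `amount` from `src` to `dst`."""
    balances = dict(balances)          # work on a copy
    total = 0
    for account_id in sorted(balances):
        if account_id == transfer_at:
            balances[src] -= amount
            balances[dst] += amount
        total += balances[account_id]
    return total


accounts = {i: 100 for i in range(1000, 1100)}   # true total is 10000

# Both rows are still ahead of the scan: the sum comes out right.
ahead = scan_with_transfer(accounts, 1050, 1060, 1070, 100)

# Destination row is already behind the scan: the sum is short by 100.
behind = scan_with_transfer(accounts, 1050, 1060, 1030, 100)
```

&lt;P&gt;The first scan sees 10000 and the second sees 9900, a total that never existed at any committed moment.&lt;/P&gt;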
&lt;P&gt;What do we do about this? Well there are many approaches, I will pick one for this example, the key thing to remember is that all the approaches have costs. 
&lt;P&gt;You might think that what&amp;nbsp;the database&amp;nbsp;must do when the query begins is lock everything&amp;nbsp;that is&amp;nbsp;going to be&amp;nbsp;read and prevent anyone/everyone from writing to those things. That could work, if only&amp;nbsp;it could predict, accurately, what it is that will be read. Or&amp;nbsp;it could lock more than will be&amp;nbsp;read, hopefully not much more, that could work too. But keep in mind these things: 
&lt;P&gt;#1 predicting the future is hard (e.g. in this example, which rows belong to me and which belong to my wife?) 
&lt;P&gt;#2 we want to lock as little as possible so that as much can continue running as possible (e.g. all updates to my wife’s accounts would be fine since they do not affect the sum) 
&lt;P&gt;So we might end up at a different scheme – rather than guess the future, let’s lock the past. When the transaction wrote to #1060, subtracting $100 that was just fine, so far. The contents of #1060 are still dirty so if the reader arrives there it must wait for the transaction to finish. Meanwhile, if the writer chooses to move the $100 to account #1070 all is well. #1070 has not been read yet, it isn’t locked, the writer moves the cash, commits the transaction and if the reader had been waiting for #1060 to finalize it will do so and the read operation can proceed. So perhaps the summing was paused for a moment but otherwise everything went well. 
&lt;P&gt;Notice that we already have taken a bunch of performance hits. We had to mark rows that were “dirty” and pending a transaction with some kind of write lock. We had to have readers checking the write locks and waiting (i.e. taking longer) if they encountered a lock. And we had to clean out the write locks whenever a transaction committed. But that was the easy case. 
&lt;P&gt;What if, after subtracting $100 from account #1060, we then attempted to add the $100 to account #1030? Lucky for us our new “lock the past” isolation scheme says that row #1030, since it has already been read, now has a read lock. That means we can’t write to it. But wait. #1060 has a write lock. So when the reader eventually gets to #1060 it will stop. It will not give up its read lock on #1030 so the write operation cannot complete, meanwhile the read operation on #1060 cannot complete and so the read lock on #1030 will never be released. 
&lt;P&gt;Deadlock. 
&lt;P&gt;You’ll notice I’ve managed to create a deadlock with just one writer in the system merely because of the need for isolation. 
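&lt;P&gt;One common way for an engine to notice this situation is to look for a cycle in a "who is waiting for whom" graph. The Python below is only a sketch of that idea under a simplifying assumption (each blocked transaction waits on exactly one other); find_deadlock and the dictionaries are invented names, not any real database's API.&lt;/P&gt;

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle in the wait-for graph,
    or None. `waits_for` maps each blocked transaction to the one it waits on."""
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:
            # We walked back onto something already visited: that is a cycle.
            return seen[seen.index(current):]
    return None


# Locks held: the reader has a read lock on 1030, the writer a write lock on 1060.
holders = {1030: "reader", 1060: "writer"}
# Locks wanted: the writer wants to write 1030, the reader wants to read 1060.
wants = {"writer": 1030, "reader": 1060}

waits_for = {txn: holders[row] for txn, row in wants.items()}
cycle = find_deadlock(waits_for)
```

&lt;P&gt;With the locks from the example, the reader waits on the writer and the writer waits on the reader, so the cycle contains both of them.&lt;/P&gt;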
&lt;P&gt;Now you can imagine other isolation methods that would not get a deadlock in this case but they have other costs and we’re not studying them. The point here is that there is a direct isolation cost, in any scheme, and now we’re about to see an example of an indirect cost. 
&lt;P&gt;Since the system is deadlocked, the database must now choose one of those two transactions (the reader or the writer) and abort it. 
&lt;P&gt;If it chooses the reader, then the write will be allowed to complete, the reader must retry the operation and then get the correct and accurate new sum based on the completed write transaction. 
&lt;P&gt;If it chooses the writer, then the reader may complete but first we must use the transaction log to restore the original contents of account #1060 (i.e. the transaction does not commit, it aborts) and we get the correct sum. Meanwhile the writer must retry its operation. 
&lt;P&gt;Can you see indirect costs? Probably the most significant is that either the read or the write must be retried. Now these were both tiny cases but you can imagine if many accounts were implicated and if there was complicated math, sorting, etc. that redoing that read query could be costly and likewise the write query with many indices and if it spanned many tables could be rather complex, to say nothing of the business logic that might be required to redo the math. In this case it’s a simple add $100, subtract $100 but in a complex system it might be a tricky reservation computation, mileage credit and commission distribution – with auditing. Anything you have to redo is just wasted work. 
&lt;P&gt;Importantly, the above is a normal and natural part of using a database and nothing has gone “wrong” here and we can reach this vital conclusion. &lt;B&gt;The more we read or write in one unit, the greater the amortized cost of isolation. &lt;/B&gt;The more we attempt to write in one unit the greater the chance that the entire write will have to be aborted (and then redone from scratch). 
&lt;P&gt;This leads us right into the next topic… 
&lt;P&gt;&lt;STRONG&gt;Unit of Work&lt;/STRONG&gt; 
&lt;P&gt;More than anything else how much work you do in one chunk drives everything else. Sure you need good locality – via a proper schema and indexes – and yes you need to choose the right isolation technique, where you have options, but you can destroy a system by trying to do too much at once – and people do. 
&lt;P&gt;If you’ve been waiting for me to make the connection to Linq here it is: every data layer has to make choices about how much to read and when. It may seem like a good idea to do large amounts of pre-reading, and in fact when measured in isolation it may seem like you get better performance when doing so, but, like the &lt;A href="http://en.wikipedia.org/wiki/Prisoner%27s_dilemma" mce_href="http://en.wikipedia.org/wiki/Prisoner%27s_dilemma"&gt;prisoners’ dilemma&lt;/A&gt;, when composed with the other operations that are happening on the server you may find that making the best looking choice independently results in a poor situation for the system as a whole. 
&lt;P&gt;Should you make the choice that is best for one part of the system at the expense of the system as a whole? Probably not. 
&lt;P&gt;Linq to SQL is faced with very real choices around how much to read in one chunk and how much isolation to guarantee. Creating a solution where we appear to offer excellent but non-composable performance would simply be leading people down a path to disaster – and you know I always advocate the Pit of Success. The obvious way should work out well. 
&lt;P&gt;&lt;STRONG&gt;Correctness&lt;/STRONG&gt; 
&lt;P&gt;Once you’ve thought about unit of work you soon realize that you cannot afford to submit large transactions to your database – they are too likely to not commit successfully. So what do you do? 
&lt;P&gt;A classic thing to do is to break up those transactions into smaller pieces, but doing so creates a new set of intermediate stored results which must be valid. It’s a designed “workflow” for your uber-transaction where for instance every line item in an order, or every 10 or something like that, is independently written and the order is flagged as “in flight.” In that world “in flight” orders are a normal healthy thing and your system has to be able to handle them appropriately – basically some intermediate states have become visible. 
&lt;P&gt;It sounds bad, but it’s essentially inescapable. The alternative is to *try* to commit ginormous transactions that really have no hope of *actually* committing with any kind of reliability. You may think that you’re going to get great performance by submitting all that work in one nice big chunk but you may find that all those savings are lost to the cost of all the extra retries you have to do to actually succeed. 
&lt;P&gt;And it gets worse. 
&lt;P&gt;If you have a client side component, like you do with Linq, then the data you have saved on the client might become “wrong” if something happens on the server that you don’t know about. You’ll need some kind of &lt;A href="http://blogs.msdn.com/ricom/archive/2004/06/24/165063.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2004/06/24/165063.aspx"&gt;locking mechanism&lt;/A&gt; like Optimistic (what Linq uses) or Pessimistic. What the optimistic approach says is that before you write data back to the database you first re-verify that the data now in the database is still what it was when you read it – if anything has changed out from under you then you throw an exception. 
&lt;P&gt;What does that mean? Well if you are writing a large amount of data and it has to be atomic that’s a bad thing because even if there wasn’t a database deadlock you still might find that some part of what you wrote has been altered. If you require that it all be as-it-was you may find you can never write anything, or that you often have to try 2, 3, 4, 5, 10, 50… who knows how many times to actually get the stuff to write. 
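&lt;P&gt;The optimistic check itself is tiny. Here is a minimal Python sketch in which a version counter stands in for whatever the real data layer compares (original column values, a timestamp, and so on). VersionedStore and ConflictError are invented for this illustration and are not Linq types.&lt;/P&gt;

```python
class ConflictError(Exception):
    """Raised when a row changed between our read and our write."""


class VersionedStore:
    def __init__(self):
        self.rows = {}                      # key -> (version, value)

    def read(self, key):
        # Rows that do not exist yet are reported as version 0.
        return self.rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            # Someone else got there first; nothing is written.
            raise ConflictError(key)
        self.rows[key] = (current_version + 1, value)
```

&lt;P&gt;Two clients that both read version 1 of a row can both try to write, but only the first one succeeds; the second gets the exception and has to re-read and decide what to do. Now imagine that check applied to every one of a few thousand rows in a single atomic write.&lt;/P&gt;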
&lt;P&gt;What do you do about this? 
&lt;P&gt;We’re right back to the unit of work discussion. If you break those writes down into smaller chunks and make it so that you can write back in pieces – including some markers to show what is in flight and what is not – then you can “partly succeed” in your writes, even “mostly succeed” and when things go wrong because of a conflict you only need resolve those few records that actually did conflict and write them back to the database in your retry operation. 
&lt;P&gt;It is not wise to expect to successfully write thousands and thousands of rows in one operation and actually succeed on a production database under load. 
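&lt;P&gt;As a rough sketch of "write back in pieces", assume a hypothetical write_chunk callable that commits one small chunk atomically and raises RuntimeError on a conflict. The helper below (an invented name, not part of any library) keeps going after a failure and hands back only the items that still need attention.&lt;/P&gt;

```python
def write_in_chunks(items, write_chunk, chunk_size=10):
    """Submit `items` in small units. `write_chunk` is a hypothetical callable
    that commits one chunk atomically and raises RuntimeError on conflict.
    Returns (number_written, items_still_pending)."""
    pending = []
    written = 0
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        try:
            write_chunk(chunk)
            written += len(chunk)
        except RuntimeError:
            pending.extend(chunk)      # only this chunk needs resolving
    return written, pending
```

&lt;P&gt;With 25 items in chunks of 10 and a conflict in the middle chunk, 15 items land and 10 come back for a retry, instead of all 25 being thrown away. The cost is that the database briefly holds a partly written batch, which is exactly why the “in flight” markers matter.&lt;/P&gt;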
&lt;P&gt;Yes breaking those operations down will result in more round trips to the server and it may seem that such a thing performs more poorly (it will if measured alone) but those operations are much more composable with other things the server is going to be doing. 
&lt;P&gt;Your overall performance could be, almost certainly will be,&amp;nbsp;A Lot Better (TM). 
&lt;P&gt;You’ll of course have to measure, in context. 
&lt;P&gt;&lt;STRONG&gt;One Last Warning&lt;/STRONG&gt; 
&lt;P&gt;If you consider what I said, about the natural occurrence of failures in a database, then you’ll soon realize that it is *normal*, using Linq parlance, for db.SubmitChanges() to throw an exception from time to time. If you are trying to write a robust application with high reliability you need to think about that. 
&lt;P&gt;In addition to obvious things like, “the network went down”, “the database went down”, there are less obvious things like, “there was a deadlock”, “there was an optimistic lock conflict” that can and do happen. Those latter two things should be appropriately retried because *nothing bad has happened*. The strategy you choose, especially for cases where the optimistic lock failed, can have a profound impact on your performance and certainly you can’t just let those exceptions flow to the user. I think I can safely say that my mom doesn’t want to hear about how table X on connection A deadlocked with table Y on connection B. 
&lt;P&gt;If you’ve been reading carefully then you’ll see that it’s also “normal” for a foreach operation over a Linq query to fail from time to time – you need a retry strategy for those too to be fully robust. 
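&lt;P&gt;A retry strategy does not need to be fancy. The Python sketch below shows one possible shape, assuming you can tell the "nothing bad has happened" failures apart from the real ones; TransientConflict and with_retries are made-up names, and the backoff numbers are placeholders you would want to tune by measuring.&lt;/P&gt;

```python
import random
import time


class TransientConflict(Exception):
    """Stand-in for a deadlock-victim or optimistic-conflict error."""


def with_retries(operation, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run `operation`, retrying only on TransientConflict."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientConflict:
            if attempt == attempts - 1:
                raise                      # out of retries: let the caller decide
            # Jittered exponential backoff so colliding clients spread out.
            sleep(base_delay * (2 ** attempt) * random.random())
```

&lt;P&gt;The same wrapper works for a read (the foreach case) or a write, as long as the operation you pass in starts again from fresh data each time rather than replaying stale values.&lt;/P&gt;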
&lt;P&gt;Don’t get down on Linq though, those problems exist with all data solutions, the productivity benefits you get from Linq will go a long way to helping you to add the robustness you need in the areas you need it. 
&lt;P&gt;Don’t read “too much” at once. Don’t write “too much” at once. Handle deadlocks, they’re normal. Handle optimistic lock failures, they’re also normal. You should land in the Pit of Success.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=4678887" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/databases/default.aspx">databases</category></item><item><title>Caching Redux</title><link>http://blogs.msdn.com/ricom/archive/2007/06/25/caching-redux.aspx</link><pubDate>Mon, 25 Jun 2007 18:47:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:3521960</guid><dc:creator>ricom</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ricom/comments/3521960.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=3521960</wfw:commentRss><description>&lt;p&gt;I got some interesting questions about how to build good middle-tier caches in my inbox last week.&amp;nbsp; I cleaned up the responses a little bit and I'm posting them here because they're actually pretty general.&amp;nbsp; I've written about this before but some things merit repeating :)&lt;/p&gt; &lt;p&gt;Here's what I wrote:&lt;/p&gt; &lt;p&gt;If I had a dime for every person who thought caching was the answer but then didn’t actually build a cache… &lt;p&gt;First, consider your cache &lt;b&gt;&lt;i&gt;policy&lt;/i&gt;&lt;/b&gt; carefully.&amp;nbsp; As I’ve often written, &lt;b&gt;&lt;i&gt;caching implies policy&lt;/i&gt;&lt;/b&gt; &lt;p&gt;&lt;a href="http://blogs.msdn.com/ricom/archive/2004/01/19/60280.aspx"&gt;http://blogs.msdn.com/ricom/archive/2004/01/19/60280.aspx&lt;/a&gt; &lt;p&gt;And as I told Raymond – a cache with bad policy is another name for a 
memory leak &lt;p&gt;&lt;a href="http://blogs.msdn.com/oldnewthing/archive/2006/05/02/588350.aspx"&gt;http://blogs.msdn.com/oldnewthing/archive/2006/05/02/588350.aspx&lt;/a&gt; &lt;p&gt;Raymond turns this into some excellent recommendations, including instrumentation and observation which result in cache design by a &lt;i&gt;quantitative&lt;/i&gt; approach. &lt;p&gt;If I had a dime for everyone who built a cache because they thought it was a good idea but then did not measure the efficacy of what they had built… &lt;p&gt;Explore the space, try rough experiments at different layers and try different policies.&amp;nbsp; Often very aggressive policies (fast retirement of cache data) are effective but you must understand not only how data gets in the cache (that is obvious) but how does it get OUT?&amp;nbsp; Actively or passively?&amp;nbsp; Based on limits or hit rate or? &lt;p&gt;Whatever you do, be sure you do it on the basis of measurements.&amp;nbsp; Any kind of automatic “magic” caching layer that somehow knows about new business objects immediately sounds like a disaster to me.&amp;nbsp; It’s not a question of knowing the business objects it’s a question of knowing usage patterns and policy.&amp;nbsp; I don’t know how to do that automatically -- but maybe your particular problem has patterns you can leverage.&amp;nbsp; I also know that (e.g.) SQL server already has a good cache at the data level and if you do your job right that is often all you need.&lt;/p&gt; &lt;p&gt;Policy is everything. 
&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=3521960" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/databases/default.aspx">databases</category></item><item><title>Krzysztof Cwalina on Framework Design</title><link>http://blogs.msdn.com/ricom/archive/2007/04/03/krzysztof-cwalina-on-framework-design.aspx</link><pubDate>Tue, 03 Apr 2007 21:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2019983</guid><dc:creator>ricom</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/ricom/comments/2019983.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=2019983</wfw:commentRss><description>&lt;P&gt;Krzysztof has been recorded for MS Research and gives an excellent presentation on framework design.&amp;nbsp; The details are on his &lt;A class="" href="http://blogs.msdn.com/kcwalina/archive/2007/03/29/1989896.aspx" mce_href="http://blogs.msdn.com/kcwalina/archive/2007/03/29/1989896.aspx"&gt;blog&lt;/A&gt; here.&amp;nbsp; Lots of great notes.&amp;nbsp;&amp;nbsp; Contributing annotations to his &lt;A href="http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756" mce_href="http://www.amazon.com/Framework-Design-Guidelines-Conventions-Development/dp/0321246756"&gt;book&lt;/A&gt; on the same subject was one of the more fun things I did last year.&amp;nbsp; And &lt;A href="http://blogs.msdn.com/brada" mce_href="http://blogs.msdn.com/brada"&gt;Brad's&lt;/A&gt; contributions were not too shabby either :)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=2019983" width="1" height="1"&gt;</description><category 
domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Memory leaks 101: Objects anchored by event generators</title><link>http://blogs.msdn.com/ricom/archive/2007/01/09/memory-leaks-101-objects-anchored-by-event-generators.aspx</link><pubDate>Tue, 09 Jan 2007 22:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1440078</guid><dc:creator>ricom</dc:creator><slash:comments>9</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1440078.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1440078</wfw:commentRss><description>&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This problem actually comes up pretty often so I thought I'd write a little article about it, and a couple of approaches to solving it.&amp;nbsp; 
&lt;P&gt;Basically any time you take an object "Your Object" whose life you want the GC to manage and then create a reference to it from a long lived object you’ve “anchored” the thing and now it will never go away unless you take special steps to do so. The most common case where this happens is where Your Object registers for events from some global event generator that never goes away. Now Your Object won’t go away and neither will anything that it references. 
&lt;P&gt;There are basically two ways to fix this, both of which are about&amp;nbsp;detaching your object from that anchor:&amp;nbsp; 
&lt;P&gt;The first approach is to make Your Object IDisposable. This works well if it has a well understood lifetime, such as a window class. When the window is (e.g.) closed the Dispose method unregisters the object from its various event sources and the whole thing goes away. This is a bit of a chore but it involves the least amount of additional object overhead. Since these leaks are easy to find (they jump right off the page using &lt;A href="http://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx"&gt;the usual techniques&lt;/A&gt;) you just target them for eradication and systematically fix all the window types until you don’t have a problem.&amp;nbsp; Note there are often other cleanups that you want to do when a window is closed because such things often correspond to &lt;A href="http://blogs.msdn.com/ricom/archive/2004/11/29/271829.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2004/11/29/271829.aspx"&gt;mass object extinctions&lt;/A&gt;. 
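&lt;P&gt;The shape of the first approach is easy to show. This sketch is in Python rather than C#, so close() stands in for Dispose, and EventSource and Window are invented classes, but the anchoring works the same way: the long-lived source holds a reference to the handler, and the handler holds the window.&lt;/P&gt;

```python
class EventSource:
    """Long-lived event generator (think: a global that never goes away)."""

    def __init__(self):
        self.handlers = []

    def subscribe(self, handler):
        self.handlers.append(handler)

    def unsubscribe(self, handler):
        self.handlers.remove(handler)

    def fire(self, payload):
        for handler in list(self.handlers):
            handler(payload)


class Window:
    def __init__(self, source):
        self.source = source
        self.seen = []
        source.subscribe(self.on_event)     # the source now anchors this window

    def on_event(self, payload):
        self.seen.append(payload)

    def close(self):
        # The analogue of Dispose: cut every link from long-lived objects.
        self.source.unsubscribe(self.on_event)
```

&lt;P&gt;Once close() has run, the source no longer refers to the window, so the collector is free to reclaim it and everything it references.&lt;/P&gt;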
&lt;P&gt;The second approach is to create some indirections. This has the advantage that it’s automatic but it has the disadvantage of adding some complexity and non-determinism to the picture. Basically you create this picture 
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Event Source ----&amp;gt; Courier Object ----&amp;gt; Weak Reference ----&amp;gt; Your Object &lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Now the way this works is that instead of registering your own object for the event, you make some kind of courier. Your Object can now go away as is normal, because the Weak Reference isn’t really holding on to it. When that happens you’ll have a dangling Courier Object. The courier in turn removes itself as a listener when it notices that the Weak Reference it is holding on to can no longer be resolved and is therefore moot. This forces you to have one Courier per object per event source. You can generalize this so that the Courier is in fact a broadcaster of sorts, which would then give you one courier per event source and you register with the courier. That’s basically the weak (broadcasting) delegate pattern. 
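&lt;P&gt;Here is a minimal sketch of the one-courier-per-object version, again in Python (using its standard weakref module) rather than C#. EventSource, Courier and Listener are illustrative names only; this is not Greg's code or any framework's implementation.&lt;/P&gt;

```python
import weakref


class EventSource:
    def __init__(self):
        self.handlers = []

    def fire(self, payload):
        for handler in list(self.handlers):
            handler(payload)


class Courier:
    """Sits between the source and the real listener, holding it weakly."""

    def __init__(self, source, target):
        self.source = source
        self.target_ref = weakref.ref(target)
        source.handlers.append(self.deliver)

    def deliver(self, payload):
        target = self.target_ref()
        if target is None:
            # The listener was collected: the courier is moot, so detach it.
            self.source.handlers.remove(self.deliver)
            return
        target.on_event(payload)


class Listener:
    def __init__(self):
        self.seen = []

    def on_event(self, payload):
        self.seen.append(payload)
```

&lt;P&gt;Notice that the dangling courier only cleans itself up the next time the event fires, which is the non-determinism mentioned above, and that every listener now costs you an extra courier and an extra weak reference.&lt;/P&gt;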
&lt;P&gt;Now as for limitations, well there are none per se but none of this stuff is free. In the second approach you allocate more memory, make more work for the collector, force more objects into generation 2 and so forth. My preference would be the first method if it is at all practical. 
&lt;P&gt;Sometimes folks think this is a CLR bug but it isn't really, per se: it’s a classic memory leak. If you read my blog on &lt;A href="http://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx"&gt;managed memory leaks&lt;/A&gt; you can see that tracking these down is basically shooting fish in a barrel. They can’t hide. There may be many of them but you just make a list and start killing them off.&amp;nbsp; 
&lt;P&gt;So it’s your choice. 
&lt;P&gt;Greg Schechter gives an example of the courier approach here.&amp;nbsp; In his notation&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Event Source = "Container" &lt;/LI&gt;
&lt;LI&gt;Your Object = "Containee" &lt;/LI&gt;
&lt;LI&gt;Courier Object = "WeakContainer"&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/greg_schechter/archive/2004/05/27/143605.aspx" mce_href="http://blogs.msdn.com/greg_schechter/archive/2004/05/27/143605.aspx"&gt;http://blogs.msdn.com/greg_schechter/archive/2004/05/27/143605.aspx&lt;/A&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1440078" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Avoiding Coding Pitfalls with Performance Signatures</title><link>http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx</link><pubDate>Mon, 11 Dec 2006 23:03:57 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1260762</guid><dc:creator>ricom</dc:creator><slash:comments>11</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1260762.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1260762</wfw:commentRss><description>&lt;p&gt;&lt;/p&gt; &lt;p&gt;On Friday of last week I gave this presentation at the Computer Measurements Group CMG2006 conference(&lt;a href="http://www.cmg.org"&gt;http://www.cmg.org&lt;/a&gt;).&amp;nbsp; I had previously alluded to it in &lt;a href="http://blogs.msdn.com/ricom/archive/2006/07/26/679621.aspx"&gt;this posting&lt;/a&gt;&amp;nbsp;and I have been waiting to write about it until after the conference. &lt;p&gt;So for the benefit of those of you who couldn't be there; here's the slides plus some high level points from the talk.&amp;nbsp; I hope you find it at least somewhat interesting. 
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Why Am I Talking About This?&lt;/strong&gt; &lt;p&gt;Performance Problems: Top Two: &lt;ol&gt; &lt;li&gt;Algorithms with a lot of waste&lt;/li&gt; &lt;li&gt;Dependencies we cannot afford&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;But how do we know we can’t afford them? &lt;p&gt;&amp;nbsp; &lt;p&gt;Speaking very broadly, most performance &lt;em&gt;disasters&lt;/em&gt; owe their origin to one of the above problems.&amp;nbsp; An algorithm fundamentally unsuitable for the problem, or a dependency that is fundamentally not affordable.&amp;nbsp; It is that second one that this set of slides is all about.&amp;nbsp; I realized just the day before that I had made the cosmic mistake of creating a slide deck whose sole purpose was to talk about "number two."&amp;nbsp; Har har :) &lt;p&gt;&lt;strong&gt;&lt;/strong&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;What’s an Affordable Dependency?&lt;/strong&gt; &lt;p&gt;Many people ask about the cost of certain methods &lt;ul&gt; &lt;li&gt;“Is this slow?”&lt;/li&gt; &lt;li&gt;“What’s the performance of System.Foo?”&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;A lot of times they have little in the way of details &lt;p&gt;Crystal balls are back-ordered until 2015 &lt;p&gt;&amp;nbsp; &lt;p&gt;I'm often approached by developers looking for guidance with regard to certain approaches.&amp;nbsp; They want to know if something is "fast" or "slow" or reasonable or what.&amp;nbsp; Of course it's very hard to answer these questions in any kind of general way -- what can you say about a method's suitability for use in someone else's system?&amp;nbsp; Yet you really want to help these people -- they're trying to do the right thing. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Context is Everything&lt;/strong&gt; &lt;p&gt;A better approach: Who Am I? 
&lt;ul&gt; &lt;li&gt;What context am I in?&lt;/li&gt; &lt;li&gt;What are reasonable costs in this context?&lt;/li&gt; &lt;li&gt;What do I need to say about my own performance?&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Who are they? &lt;ul&gt; &lt;li&gt;That code I’m using: What context was it written for?&lt;/li&gt; &lt;li&gt;Can I afford that cost in my context?&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&lt;strong&gt;&lt;/strong&gt;&amp;nbsp; &lt;p&gt;When I'm asked about particular methods I can often help without having taken a lot of measurements.&amp;nbsp; I do this by asking the customer something about their context.&amp;nbsp; What are they trying to use and where are they trying to use it?&amp;nbsp; Then you can make some basic conclusions about fitness for purpose.&amp;nbsp; Once you know where the code they are intending to write is going to sit in context, you can often give some general guidance.&amp;nbsp; It's not perfect but at least you can give some sense of reasonableness. &lt;p&gt;&lt;strong&gt;&lt;/strong&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;The Need For Qualitative Advice&lt;/strong&gt; &lt;p&gt;Quantitative advice is best &lt;ul&gt; &lt;li&gt;But there is a deep need for less formal advice&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;When costs are wrong they are very wrong &lt;ul&gt; &lt;li&gt;So we can prevent problems with a rough approach&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;While it's true that the very best advice you could give is precise and quantitative in nature it's often the case that qualitative advice is still useful.&amp;nbsp; The main reason for this is that in many situations if a developer gets within a few percent of their performance goal they will be having a large party celebrating their success.&amp;nbsp; Failure more often looks like some kind of calamity where the goals have been missed by a factor of 20 or something like that.&amp;nbsp;&amp;nbsp; So in order to prevent calamities we can at least start by giving very basic advice. 
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Durable and Consumable Advice&lt;/strong&gt; &lt;p&gt;We don’t want to have to change the advice every time the libraries change &lt;p&gt;We want the advice to be easily consumable &lt;ul&gt; &lt;li&gt;no complicated units, friendly to all programmers&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;Quantitative advice has the problem that it changes almost day to day.&amp;nbsp; Consider just CPU usage of an API.&amp;nbsp; Do you specify it in time, or cycles?&amp;nbsp; Which processor?&amp;nbsp; What if the microarchitecture changes, do you republish?&amp;nbsp; Does it really matter if the delta is only a few percent?&amp;nbsp; Qualitative descriptions on the other hand tend to last -- an API that is "medium" cost probably is medium forever (relative to other methods anyway) and if it ever did change to "high" that would be a very big deal, worthy of republishing.&amp;nbsp; Minor maintenance on an API isn't likely to change it from a "low" to a "medium" etc. 
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Introducing Performance Signatures&lt;/strong&gt; &lt;p&gt;To reach these goals we associate a &lt;em&gt;signature&lt;/em&gt; with every method &lt;p&gt;A signature characterizes some kind of &lt;em&gt;cost&lt;/em&gt; &lt;p&gt;You can do this as a mental exercise &lt;p&gt;You can do it with tools support (coming, I hope&amp;nbsp;&amp;lt;g&amp;gt;) &lt;p&gt;Signatures can be simple one-worders &lt;ul&gt; &lt;li&gt;As simple as “Heavy”, “Medium”, “Light”&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;If you publish a simple description of the cost of a method, even one as simple as "Heavy", "Medium", or "Light", you immediately give your customers some grasp of what is expensive and what isn't so much.&amp;nbsp; Today we have &lt;em&gt;nothing &lt;/em&gt;in the way of published costs.&amp;nbsp; Compare this to, say, a structural engineer who has all manner of information about his/her raw materials.&amp;nbsp; We have virtually nothing in the way of costs -- even very rough costs would be something.&amp;nbsp; As I said in the talk, "Just throw me a bone or something... anything!"&amp;nbsp; The costs can potentially be along any dimension that is interesting -- even things that are only measured approximately or estimated by static analysis. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;How Do We Use Them?&lt;/strong&gt; &lt;p&gt;You can do profound things even with dumb signatures &lt;ul&gt; &lt;li&gt;First, if writing a “Medium” don’t call any “Heavies”&lt;/li&gt; &lt;li&gt;Typical mistake&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Define best practices for “Medium” and check them in code reviews &lt;ul&gt; &lt;li&gt;e.g. 
Medium cannot do more than 1k of allocations and no long lived objects&lt;/li&gt; &lt;li&gt;Your costs&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;The most important thing to do is to think about the costs and apply "Rico's Rule" which is that if you're trying to write a method that is supposed to have a certain cost (low, medium, 12s, 42 iops, whatever units are meaningful to you) you can't succeed if you use any methods whose cost is typically greater than what you're supposed to be doing.&amp;nbsp;&amp;nbsp; This isn't an especially profound observation but the consequence really is profound.&amp;nbsp; Just think about this particular case:&amp;nbsp; If you decide that all hashing functions (every implementation of GetHashCode) should have "low" cost and that "low" cost means no memory allocations then you can give immediate guidance as to what parts of the framework are reasonable to call within the context of a GetHashCode implementation. &lt;p&gt;But more importantly, you can extend this idea at any level.&amp;nbsp; Suppose you're trying to use some web service.&amp;nbsp; Which methods are reasonable to call?&amp;nbsp; Can you characterize the rough costs?&amp;nbsp; Is it suitable to use these methods at your level of abstraction?&amp;nbsp; Many times very unfortunate cycles or backwards dependencies are introduced into systems because very high level methods are called from a very low level context.&amp;nbsp; A very rough scoring could prevent these in a lot of cases.&amp;nbsp;&amp;nbsp; It isn't perfect by any means but it's something. &lt;p&gt;Remember, we're not trying to find little grains of gold in the sand on a beach here.&amp;nbsp; Performance calamities look more like a WW2 tank in the surf.&amp;nbsp; With very little effort we can at least make sure we have none of those before we start. 
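&lt;p&gt;To make the GetHashCode case concrete, here is a minimal C# sketch of the rule. Everything named in it (CostLevel, CostAttribute, CostRules, Point3) is hypothetical and made up for illustration; nothing like it ships in the framework. It assumes that "low" means "no memory allocations".&lt;/p&gt;
&lt;blockquote&gt;&lt;pre&gt;
using System;

// Hypothetical cost levels, ordered from cheapest to most expensive.
public enum CostLevel { Low, Medium, High }

// Hypothetical attribute that records the intended cost of a method.
[AttributeUsage(AttributeTargets.Method)]
public sealed class CostAttribute : Attribute
{
    private readonly CostLevel level;
    public CostAttribute(CostLevel level) { this.level = level; }
    public CostLevel Level { get { return level; } }
}

public static class CostRules
{
    // The rule: a method may only call methods that cost no more than it does.
    public static bool MayCall(CostLevel caller, CostLevel callee)
    {
        return callee &amp;lt;= caller;
    }
}

public sealed class Point3
{
    private readonly int x, y, z;

    public Point3(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    // Integer arithmetic on fields only, so nothing is allocated.
    [Cost(CostLevel.Low)]
    public override int GetHashCode()
    {
        unchecked
        {
            int h = 17;
            h = h * 31 + x;
            h = h * 31 + y;
            h = h * 31 + z;
            return h;
        }
    }

    // Field-by-field comparison, again without allocating.
    [Cost(CostLevel.Low)]
    public override bool Equals(object obj)
    {
        Point3 other = obj as Point3;
        if (other == null) return false;
        if (x != other.x) return false;
        if (y != other.y) return false;
        return z == other.z;
    }
}
&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;With these definitions CostRules.MayCall(CostLevel.Low, CostLevel.High) is false, which is the kind of call a reviewer or a tool would flag inside a GetHashCode body.&lt;/p&gt;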
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;What about those Tools?&lt;/strong&gt; &lt;p&gt;Insane coolness is possible &lt;ul&gt; &lt;li&gt;I’m writing an “Inner Loop”&lt;/li&gt; &lt;ul&gt; &lt;li&gt;My Intellisense color codes methods that are not reasonable to use in that context by cross-checking their signature!&lt;/li&gt;&lt;/ul&gt; &lt;li&gt;I’m looking at a profile&lt;/li&gt; &lt;ul&gt; &lt;li&gt;The profiler notices that your intended costs are not matching the measured costs and points out this problem&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;Let's think about that GetHashCode example again.&amp;nbsp; Because you're using the .NET framework -- or any framework, certain conventions apply and can be readily detected.&amp;nbsp; One of these might be that all methods named "GetHashCode" should be of "low" cost.&amp;nbsp; Now these costs I'm considering in this example are just three states "low", "medium", and "high" so we need two lousy bits per method to hold them all.&amp;nbsp; But look at what coolness I can do:&amp;nbsp; while I'm in a method named GetHashCode I could put a little green squiggly underline underneath any method I tried to call that was not of "low" cost because it's probably a bad idea.&amp;nbsp; Sure it's syntactically valid, even semantically correct but is it wise?&amp;nbsp;&amp;nbsp; This is incredible performance guidance &lt;em&gt;as you type&lt;/em&gt;.&amp;nbsp; And these are real problems we're avoiding here&lt;em&gt; even with signatures that are very&amp;nbsp;simple-minded.&lt;/em&gt; &lt;p&gt;Likewise, when looking at actual measured results you can cross check the actuals against the intended and highlight cases where things didn't go as planned.&amp;nbsp; The fact that there is a plan, or that one can be at least approximately inferred is invaluable to the report.&amp;nbsp; "This function was intended to be 'medium' but the measured cost is 'hi' -- click here for details."&amp;nbsp; Again, any kind of 
hints in this space are invaluable, even if they are imperfect. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Static Analysis Based Tools&lt;/strong&gt; &lt;p&gt;We can auto-generate signatures based on code complexity and other heuristics &lt;ul&gt; &lt;li&gt;Now we have qualitative guidance (that we can amend) about any managed method&lt;/li&gt; &lt;li&gt;We can do this for old frameworks too!&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Tools like FXCOP can give warnings when methods take bad dependency &lt;p&gt;Can do the same for unmanaged code &lt;p&gt;&amp;nbsp; &lt;p&gt;Not only can we generate the signatures of methods for current and past frameworks based on heuristics like total allocations, we can publish them -- which I hope to do soon.&amp;nbsp; With managed code it's fairly easy to decompile and get these costs; with other code it's harder but still doable. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Does this Really Help?&lt;/strong&gt; &lt;p&gt;Absolutely! &lt;ul&gt; &lt;li&gt;Real problems like this happen all the time&lt;/li&gt; &lt;li&gt;Hashing and Comparison functions need to be “Inner Loop” quality&lt;/li&gt; &lt;li&gt;“Throughput” oriented functions cannot create long-lived objects&lt;/li&gt; &lt;li&gt;We see these mistakes in code all the time&lt;/li&gt; &lt;li&gt;Sometimes they’re baked into the design!&lt;/li&gt; &lt;li&gt;In principle, Intellisense can save you from performance mistakes as you type – a first!&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;I have some real results from my initial analysis of mscorlib to share as well. 
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Initial Results&lt;/strong&gt; &lt;p&gt;Applying the algorithm as described in the paper found 53 "violations" in mscorlib &lt;ul&gt; &lt;li&gt;The test is to make sure that functions named “Equals”, “CompareTo”, and “GetHashCode” result in no memory allocations&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Twenty of these were false positives, introduced because of methods that did one-time allocations &lt;p&gt;&amp;nbsp; &lt;p&gt;The most basic analysis was still pretty good, but we can make it better with just a little help. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;False Positive Pattern&lt;/strong&gt; &lt;p&gt;Typical false positives looked like this: &lt;blockquote&gt; &lt;p&gt;Something GetSomething()&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (savedSomething == null)&lt;br&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; savedSomething = CreateSomething();&lt;br&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return savedSomething;&lt;br&gt;}&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;Any function calling GetSomething() would appear to allocate (statically) which is problematic because in this case the costs are associated with allocations. &lt;p&gt;&lt;strong&gt;&lt;/strong&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Remedy&lt;/strong&gt; &lt;p&gt;Initially, I used a broad filter that gives zero cost to all functions of the form &lt;blockquote&gt; &lt;p&gt;… F()&lt;br&gt;{&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (…)&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; …&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; return …;&lt;br&gt;}&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;A more discriminating filter is clearly possible, but even this dumb filter gets the job done. 
&lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;Results to Date&lt;/strong&gt; &lt;p&gt;With the filter in place, runs over mscorlib resulted in 33 reports, all of which were confirmed to be actual problems by manual examination &lt;p&gt;However, some are not that important because the methods would not be considered performance critical&amp;nbsp;&lt;/p&gt; &lt;ul&gt; &lt;li&gt;e.g. hashing of security policy&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Some are known performance problems with comments to that effect  &lt;ul&gt; &lt;li&gt;e.g. System.ValueType.Equals()&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&amp;nbsp; &lt;p&gt;On the other hand, there were some I would consider to be real problems. &lt;p&gt;&amp;nbsp; &lt;p&gt;&lt;strong&gt;A Typical Error&lt;/strong&gt; &lt;p&gt;Problems like this one are easily addressed once they are found: &lt;blockquote&gt; &lt;p&gt;public override int System.Net.Cookie.GetHashCode() { &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; return string.Concat(&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; new object[] { &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; this.Name, "=", this.Value, ";", &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; this.Path, "; ", this.Domain, "; ", this.Version }&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;).GetHashCode(); &lt;br&gt;} &lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;Do we really need to make a string array, with 4 constant elements, concatenate them and then hash that string in order to hash a cookie?&amp;nbsp; I think not.&amp;nbsp; Now is this a calamity?&amp;nbsp; Not so much but then if I found any real calamities in code with as much inspection as mscorlib that would be pretty amazing -- what's important is that the technique holds water and customers writing code can get some immediate guidance either in real-time or in an FXCOP-like fashion for potential 
issues. &lt;p&gt;Addressing these kinds of issues, to give basic guidance, is, in my opinion, vastly superior to making a lot of ad-hoc rules of thumb about API usage which try to prevent the same kinds of problems one at a time. &lt;p&gt;I hope to keep making progress in this area.&amp;nbsp; I'd be happy to hear your thoughts.&lt;/p&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Taming the CLR: How to Write Real-Time Managed Code</title><link>http://blogs.msdn.com/ricom/archive/2006/08/22/713396.aspx</link><pubDate>Wed, 23 Aug 2006 03:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:713396</guid><dc:creator>ricom</dc:creator><slash:comments>37</slash:comments><comments>http://blogs.msdn.com/ricom/comments/713396.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=713396</wfw:commentRss><description>&lt;P&gt;I've actually been meaning to write about real time applications for ages so when I was asked to give a talk at MS Gamefest (&lt;A href="http://microsoftgamefest.com/"&gt;http://microsoftgamefest.com&lt;/A&gt;) I jumped at the opportunity to give myself a hard reason to do the homework.&amp;nbsp; Last Tuesday I gave&amp;nbsp;that talk&amp;nbsp;and below&amp;nbsp;are the slide contents plus my speaker notes.&amp;nbsp; The actual audio was recorded and will probably be available soonish; when that happens I'll post a link. &lt;/P&gt;
&lt;P&gt;I hope you find this somewhat interesting :)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Managed Code for Real Time?&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;It may seem like a crazy notion to some people but gamers probably aren’t nearly as surprised 
&lt;LI&gt;Managed Runtimes in the game world are common 
&lt;UL&gt;
&lt;LI&gt;My first AI code was written in LPC for LPMud&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;You can get great results out of a managed runtime 
&lt;LI&gt;Why?&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Good for What Ails You&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Garbage Collected memory model is the centerpiece 
&lt;UL&gt;
&lt;LI&gt;Amortized cost of allocations in this model can be excellent!&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Long term fragmentation is less of a problem 
&lt;UL&gt;
&lt;LI&gt;Who hasn’t played a game that degrades over 2 hours of play?&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Locality can be good due to compaction – we need all the cache hits we can get 
&lt;LI&gt;Easy to use model, immune to classic “leaks” and wild pointers&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Garbage collected memory models tend not to have long term degradations.&amp;nbsp; Many of the most deadly problems simply can’t happen in this kind of model.&amp;nbsp; When people talk about a “GC” leak what they mean is that they are holding on to a pointer that they should have nulled out.&amp;nbsp; This is much easier to track down than memory that was “lost” in the classic leak model – nothing points to it or it isn’t freed.&amp;nbsp; Wild pointers are totally impossible.&lt;/P&gt;
&lt;P&gt;All of this is great news for game developers.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Great, I’ll take a dozen!&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Wait, not so fast, you have to read the fine print… 
&lt;LI&gt;There are important practices to get these benefits 
&lt;LI&gt;The GC is like a BFG9000 
&lt;UL&gt;
&lt;LI&gt;Don’t shoot yourself in the foot with it!&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;People learned best practices for classic memory allocators.&amp;nbsp; In a word – they have &lt;EM&gt;awful&lt;/EM&gt; performance so you have to wrap them.&amp;nbsp; And that’s exactly what everyone does.&amp;nbsp; You get assorted different custom allocators for different purposes.&amp;nbsp; Ones that allocate and free a big chunk, or carve out lots of little objects, or meet some other important specialized requirement.&amp;nbsp; The main thing is you are careful to get to know your allocator(s) and use them accordingly.&lt;BR&gt;Likewise, you have to get to know the GC.&amp;nbsp; It’s usable directly – without wrapping – in a variety of cases, a great step forward.&amp;nbsp; But you might shoot yourself in the foot if you do unwise things.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Things you need to know&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;LI&gt;In .NET CF the GC is a compacting mark and sweep collector 
&lt;UL&gt;
&lt;LI&gt;Contrast with the regular .NET Collector which is generational&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;What does this mean? 
&lt;UL&gt;
&lt;LI&gt;You probably should understand both models because you can find yourself on the PC platform too&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;(You can find the picture that I used &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp"&gt;in this article here&lt;/A&gt;, this discussion is taken directly from that article)&lt;/P&gt;
&lt;P&gt;Collectors come in many flavors and today we’ll be talking about the flavor of a couple of different collectors that you might run into.&amp;nbsp; The one in the .NET CF is quite a bit different than the one in the desktop.&amp;nbsp; There are different rules for both – although if you follow the .NET CF rules you should get excellent performance out of the Desktop collector as well.&lt;/P&gt;
&lt;P&gt;“Compacting” refers to the fact that our collectors will squeeze out the free memory, kind of like a disk defrag, when they think it is wise to do so.&lt;/P&gt;
&lt;P&gt;“Generational” refers to the fact that the desktop collector can collect just some of the objects rather than all of them – notably it can collect “just the new stuff”&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Simplified GC Model&lt;/STRONG&gt;&lt;/P&gt;&lt;IMG src="http://msdn.microsoft.com/library/en-us/dndotnet/html/dotnetgcbasics_01.gif"&gt; 
&lt;UL&gt;
&lt;LI&gt;This shows generations which are not always present 
&lt;LI&gt;Some of the details in the following discussion are not exactly correct 
&lt;UL&gt;
&lt;LI&gt;The idea is to have a mental model that helps you understand costs, you need this more than you need the exact details&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;In the simplified model (only):&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;All garbage-collectable objects are allocated from one contiguous range of address space 
&lt;LI&gt;The heap can be divided into generations (more on this later) 
&lt;LI&gt;Objects within a generation are all roughly the same age. 
&lt;LI&gt;The oldest objects are at the lowest addresses, while new objects are created at increasing addresses. (In the drawing, addresses are increasing going down) 
&lt;LI&gt;The allocation pointer for new objects marks the boundary between the used (allocated) and unused (free) areas of memory. 
&lt;LI&gt;Periodically the heap is compacted by removing dead objects and sliding the live objects up toward the low-address end of the heap. 
&lt;LI&gt;The order of objects in memory remains the order in which they were created, for good locality.&amp;nbsp; 
&lt;LI&gt;There are never any gaps between objects in the heap. 
&lt;LI&gt;Write Barriers are used to note changes in pointers from old objects to new objects allowing the newer objects to be collected separately. 
&lt;LI&gt;See &lt;A href="http://blogs.msdn.com/ricom"&gt;http://blogs.msdn.com/ricom&lt;/A&gt;, &lt;A href="http://blogs.msdn.com/maoni"&gt;http://blogs.msdn.com/maoni&lt;/A&gt; for more articles and a longer discussion of this topic here &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp"&gt;http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Why do you care?&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Really there are just a few things that you need to keep top of mind 
&lt;LI&gt;Allocations are very cheap, basically you increment a pointer 
&lt;LI&gt;Objects that are allocated close together in time tend to be close together in space 
&lt;LI&gt;Full collections can be very expensive because the heap could be arbitrarily big&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;You need to know these things so that you get good locality of reference in your data structures.&amp;nbsp; Better locality means fewer cache misses which means fewer clocks per instruction and therefore more frames per second.&lt;/P&gt;
&lt;P&gt;You need to know how to prevent very expensive collection costs.&amp;nbsp; Why do these costs arise and what can you do to limit them?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Some good things to keep in mind&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The garbage collector concerns itself with the living – dead objects cost nearly nothing.&amp;nbsp; Even if we don’t compact and choose to thread a free list through the dead objects we still don’t have to visit every dead object to do that.&amp;nbsp; Costs of collection tend to be driven by live objects. 
&lt;LI&gt;The overall amortized cost of memory allocation in a garbage collected world tends to be linear in the number of bytes allocated.&amp;nbsp; i.e. allocation volume drives the total cost, which includes zeroing memory, actually doing the allocations (the cheapest part) and the eventual collections.&amp;nbsp; It’s not always the same linear factor because that depends on how much is living and how rich in pointers everything is but, generally, allocation cost is a “per byte” thing for any given application.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Stay in Control&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;You can get great performance in this kind of world if you stay in control 
&lt;LI&gt;In the regular CLR, for real-time or high throughput applications, I advise people to keep their eye on mid-life objects, make sure they have few of them 
&lt;UL&gt;
&lt;LI&gt;Mid life objects cause you to have to do full collections, and a lot more memory is implicated&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;But what if we don’t even have generations?&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;The classic thing you have to worry about in the desktop CLR is if you start leaking a lot of objects into generation 2 – the oldest generation.&amp;nbsp; So basically object-lifetime patterns drive performance – the allocation rate and the death rate in each generation.&lt;/P&gt;
&lt;P&gt;This is true because in the desktop world the GC heap could be very large (e.g. 1GB+) and full collects could therefore be very costly.&amp;nbsp; So the trick in this world is to make sure you’re only doing partial collections.&lt;/P&gt;
&lt;P&gt;There are many real-time applications in the desktop/server world that are successful, e.g. assorted financial companies have stock streaming services that do things with quotes and then facilitate order processing.&amp;nbsp; These are all done on a deadline.&amp;nbsp; How do they do it?&amp;nbsp; Strict control of allocation volume and promotions to the elder generations.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;.NET CF Considerations&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;A good way to think about .NET CF is that it’s like you only have generation 0 
&lt;LI&gt;Your total heap size should be roughly what your generation 0 size would have been – that means comparable to the CPU cache 
&lt;LI&gt;Collection times in that world will be excellent 
&lt;LI&gt;If your heap gets too big collect times will kill you&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;If you want to get the best performance your total heap sizes in .NET CF should be roughly what your generation 0 size would have been in a generational collector.&lt;/P&gt;
&lt;P&gt;If you keep accumulating old objects thereby letting your heap grow, the fixed cost of marking those objects during collections will start to hurt you.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;If you have large numbers of objects that you need to pre-create and that are going to survive, you can minimize the cost of managing these objects by keeping them devoid of pointers.&amp;nbsp; If no tracing needs to happen, huge swaths of memory can be marked as in-use (e.g. arrays) without even having to look at them.&lt;/P&gt;
&lt;P&gt;So a good tactic is to keep your very long lived data low in pointers (e.g. use handles) and things you churn rich in pointers so that they are easy to manage.&amp;nbsp; Controlling this blend is up to you.&lt;/P&gt;
&lt;P&gt;Remember that when you start using handles you’re back to managing your own object lifetimes and free lists and so forth but that doesn’t matter at all if you’re talking about objects that basically stay around for a whole level.&amp;nbsp; Use this wisely.&lt;/P&gt;
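&lt;P&gt;As a rough sketch of the handle idea, assuming a game that keeps a fixed pool of particles alive for a whole level: the ParticleStore class below is hypothetical and not part of any framework. The long-lived data sits in flat arrays of primitives, so the collector has no pointers to trace inside them, and callers hold plain integer handles instead of object references.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;
// Hypothetical store for long-lived, pointer-free data addressed by handle.
public sealed class ParticleStore
{
    private readonly float[] x;        // positions, stored as primitives
    private readonly float[] y;
    private readonly int[] nextFree;   // free list threaded through unused slots
    private int freeHead;              // equals x.Length when the store is full

    public ParticleStore(int capacity)
    {
        x = new float[capacity];
        y = new float[capacity];
        nextFree = new int[capacity];
        // Each slot initially points at the next one; the last points past the end.
        for (int i = 0; i != capacity; i++) nextFree[i] = i + 1;
        freeHead = 0;
    }

    // Returns an integer handle, or -1 when no slot is free.
    public int Allocate(float px, float py)
    {
        if (freeHead == x.Length) return -1;
        int handle = freeHead;
        freeHead = nextFree[handle];
        x[handle] = px;
        y[handle] = py;
        return handle;
    }

    // Lifetime is the caller's job: a freed handle must not be used again.
    public void Free(int handle)
    {
        nextFree[handle] = freeHead;
        freeHead = handle;
    }

    public float GetX(int handle) { return x[handle]; }
    public float GetY(int handle) { return y[handle]; }
}
&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The three arrays are allocated once and never grow, which is the trade described above: you take back lifetime management for this data in exchange for memory the collector never has to look inside.&lt;/P&gt;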
&lt;P&gt;The total heap should be about the size of the CPU cache or a small multiple of it so that you are going to get lots of hits when collecting and processing generally.&lt;/P&gt;
&lt;P&gt;Also, .NET CF needs no write barriers as it’s not generational, so one less cost right there.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Allocation Rate&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;In addition to typical heap size you also need to keep your eye on the allocation rate 
&lt;LI&gt;Freshly allocated memory has to be zeroed, and then “constructed” (by you) 
&lt;UL&gt;
&lt;LI&gt;Basically that’s a lot of memory traffic&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;At 20-40 MB/s you are going to be pleased with the memory management overhead 
&lt;UL&gt;
&lt;LI&gt;At 60 frames per second&amp;nbsp;that’s around 500k per frame, or say 20k objects of modest size&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Volume can kill you as well.&amp;nbsp; Zeroing out all the memory can be expensive – and of course volume drives collections.&lt;/P&gt;
&lt;P&gt;Collections are typically triggered when a certain number of bytes has been allocated – at that point the GC deems it wise to do a collection because there is likely to be enough junk that it’s worth bothering with.&lt;/P&gt;
&lt;P&gt;The GC gets its efficiency by being lazy – collect too aggressively and all those savings go away.&amp;nbsp; On the flip side, collect too rarely and memory usage skyrockets and locality is destroyed.&amp;nbsp; The GC keeps these two competing things in balance with allocation budgets.&lt;BR&gt;At the suggested volume of allocations and heap size you should be able to achieve garbage collection overheads in the low to mid single digits of percent.&amp;nbsp; That’s a great result for general memory management.&lt;/P&gt;
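&lt;P&gt;The per-frame figure from the slide can be turned into a rough check. This is only a sketch under stated assumptions: FrameBudget is a hypothetical helper, 30 MB/s is simply the midpoint of the 20-40 MB/s range, and the comparison only means something if no collection ran between the two heap readings.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;
// Hypothetical helper that restates the slide's arithmetic.
public static class FrameBudget
{
    public const long BytesPerSecond = 30L * 1000 * 1000;              // midpoint of 20-40 MB/s
    public const long FramesPerSecond = 60;
    public const long BytesPerFrame = BytesPerSecond / FramesPerSecond; // 500,000 bytes

    // True when the heap grew by more than the per-frame budget.
    public static bool IsOverBudget(long heapBefore, long heapAfter)
    {
        return heapAfter - heapBefore &amp;gt; BytesPerFrame;
    }
}
&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;One way to feed it is to read System.GC.GetTotalMemory(false) before and after a frame's update and pass both values in.&amp;nbsp; At 500,000 bytes per frame, 20,000 objects works out to about 25 bytes each, which is what "modest size" amounts to here.&lt;/P&gt;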
&lt;P&gt;&lt;STRONG&gt;What else?&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Jitted code optimization is not as good as a full optimizing compiler 
&lt;UL&gt;
&lt;LI&gt;So this is no place to do intense floating point math, that’s what Shaders are for&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;BUT! You’ll write simpler code 
&lt;UL&gt;
&lt;LI&gt;The fastest code is the code that isn’t there! 
&lt;LI&gt;Simplifications can easily dwarf code-gen taxes, and give you coding time elsewhere 
&lt;LI&gt;Cheap memory model can be ideal for AI logic and physics&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Jitting isn’t going to be your death… it’s like a fixed overhead.&lt;/P&gt;
&lt;P&gt;The idea is that you can more than win this back with simplified logic.&amp;nbsp; Lack of destructors.&amp;nbsp; No cleanup code to run (and who wants to write code to (e.g.) visit partly created trees and release them?)&lt;/P&gt;
&lt;P&gt;Less code to write means more time to focus on algorithmic gains in your code – that’s where the real money is.&amp;nbsp; But, full disclosure, you should know you are starting at a modest penalty.&lt;/P&gt;
&lt;P&gt;Use the coding simplifications to your advantage, build great algorithms that would have been too hard or too expensive in man-hours to code otherwise and win.&lt;/P&gt;
&lt;P&gt;Don’t try to write a Phong Shader in managed code for your game.&amp;nbsp; That’s what the GPU is for.&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Interop with Native&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Just a few words here, again you can shoot yourself in the foot&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Keep your interop under control&amp;nbsp; 
&lt;LI&gt;Use primitive types in the API whenever possible 
&lt;UL&gt;
&lt;LI&gt;Marshalling costs are what will kill you&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Simple calls can be very fast&amp;nbsp; 
&lt;UL&gt;
&lt;LI&gt;Desktop CLR can get many millions of calls per second on simple signatures&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Native Interop is not possible on the Xbox 360&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Not much to say here other than, if you must, on the desktop, then keep it as simple as possible.&lt;/P&gt;
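&lt;P&gt;For the desktop case, a simple signature looks something like the sketch below. It assumes Windows; GetTickCount is a real kernel32 export that takes no arguments and returns a 32-bit integer, while the NativeMethods class name is just a common convention rather than anything required.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;PRE&gt;
using System.Runtime.InteropServices;

internal static class NativeMethods
{
    // Only primitive types cross the boundary, so no strings, arrays or
    // structures have to be copied or converted on the way in or out.
    [DllImport("kernel32.dll")]
    internal static extern uint GetTickCount();
}
&lt;/PRE&gt;&lt;/BLOCKQUOTE&gt;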
&lt;P&gt;On 360 it isn’t even an option, therefore you can’t get it wrong :)&lt;/P&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Performance Lifecycle</title><link>http://blogs.msdn.com/ricom/archive/2005/10/05/477470.aspx</link><pubDate>Wed, 05 Oct 2005 22:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:477470</guid><dc:creator>ricom</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ricom/comments/477470.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=477470</wfw:commentRss><description>&lt;P&gt;I get many opportunities to review documents and processes in the course of my job, and sometimes they’re not even about performance. :) &lt;/P&gt;
&lt;P&gt;About 2 years ago I started seeing a goodly number of security related documents; and now increasingly I see things about the Security Development Lifecycle. The thing that struck me then, as now, is that in many of these documents you could basically do a mass substitution of “performance” where it says “security”, change a few metrics and the document would still read fairly well. I suppose this shouldn’t be too terribly surprising because after all performance and security are both quality attributes and many of the processes that you use for one are perfect for the other. &lt;/P&gt;
&lt;P&gt;If you have a moment, pick up your favorite security process literature – you do have some handy right? You don’t? Oops shame on you, better go get some; it’s important stuff – and have a look at how the security best practices you’re using compare to your performance practices. It’s interesting to see just how much cross fertilization is possible. &lt;/P&gt;
&lt;P&gt;I think one of the most important notions that performance engineers can borrow from the security lifecycle is that there are many different activities, that each stage of the project has appropriate activities, and that massive effort at any one stage is the wrong idea. &lt;/P&gt;
&lt;P&gt;If you’ve ever heard me speak you’ll probably remember me cautioning that the old cliché “Premature Optimization is the Root of All Evil” often leads to very bad thinking like “Just make it work the easy way first and worry about making it fast later.” Now of course the reason that this is bad is that the most egregious performance mistakes are generally not in execution but in design and those mistakes are made very early in the project – it’s important to have a design that is sound from a performance perspective. That means at least some consideration should be given to performance right away. &lt;/P&gt;
&lt;P&gt;I’m quite certain that Tony Hoare (who Don Knuth was quoting when he made the phrase popular) didn’t mean that all performance work should be done at the end. His caution, and it is a good one, is that doing a lot of micro-tuning in the early stages of a project, without good data to support it, is much more likely to introduce unneeded complexity than it is to actually help your performance. &lt;/P&gt;
&lt;P&gt;This brings us full circle. Hoare’s admonition is just one example of doing the wrong performance activity at the wrong time in your project’s lifecycle. There is a time to micro-tune and it isn’t on the whiteboard on the first day of your project. At that point you should be thinking about your overall goals, thinking about what you will measure and how, how you will track performance regressions, what the key resources will be, what dependencies you can afford, and other overarching issues. As you move along in the project you’ll find that different activities start to become appropriate and rewarding. &lt;/P&gt;
&lt;P&gt;If you expend all your performance effort at any one stage in your project’s life then you can expect a disaster. Instead balance the time you have available and invest in performance in appropriate ways throughout the cycle. &lt;/P&gt;
&lt;HR&gt;

&lt;P&gt;To see more of these similarities, have a look at the lifecycle approaches being taken by the Patterns and Practices team. Look at the &lt;A href="http://msdn.com/SecurityEngineering"&gt;Security Engineering&lt;/A&gt; documents and the by-design similarities between &lt;A href="http://msdn.com/SecNet"&gt;Improving Web App Security&lt;/A&gt; and &lt;A href="http://msdn.com/Perf"&gt;Improving .NET Performance and Scalability&lt;/A&gt;.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=477470" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>How To Do A Good Performance Investigation</title><link>http://blogs.msdn.com/ricom/archive/2005/05/23/421205.aspx</link><pubDate>Tue, 24 May 2005 02:54:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:421205</guid><dc:creator>ricom</dc:creator><slash:comments>11</slash:comments><comments>http://blogs.msdn.com/ricom/comments/421205.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=421205</wfw:commentRss><description>&lt;P&gt;I find that sometimes people have difficulty just getting started when doing a performance analysis – meaning they’re faced with a potentially big problem and don’t know where to begin.&amp;nbsp; Over the years many people have come to me under those circumstances and asked me how I would approach the problem.&amp;nbsp; So today I’m trying to distill those bits of advice – my&amp;nbsp;&lt;EM&gt;Modus Operandi &lt;/EM&gt;– into some simple steps in the hope that it might help others to get off to a good start.&lt;/P&gt;
&lt;P&gt;So here it is, Rico’s advice on how to do a good performance investigation.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Preliminaries&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The first thing to remember is not to try to do this in a rushed way.&amp;nbsp; The more of a hurry you are in to get to the bottom of the problem, the less you can afford to be hasty.&amp;nbsp; Be deliberate and careful.&amp;nbsp; Block out a good chunk of time to think about the problem.&amp;nbsp; Make sure you have the resources you need to succeed – that means enough access to the right hardware and the right people.&amp;nbsp; Prepare a log book – electronic if you like – so that you can keep notes and interim results at each step.&amp;nbsp; This will be the basis of your final report and it will be an invaluable reference to anyone who follows in your footsteps – even if (especially if?) that someone is you.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 1 – Get the Lay of the Land&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Before you look at any code, talk to the people involved.&amp;nbsp; You’ll want to get a feel for what they are trying to accomplish.&amp;nbsp; What are their key difficulties?&amp;nbsp; What is inherently hard about this problem?&amp;nbsp; What is the overall organization of the system?&amp;nbsp; What are the inherent limitations of those choices?&amp;nbsp; What is the chief complaint with the current system?&amp;nbsp; What would a successful system look like?&amp;nbsp; How do they measure success? &lt;/P&gt;
&lt;P&gt;In addition to a basic understanding of how the system is intended to work the key question you want to answer is this:&amp;nbsp; Which resource is the one that should be critical in this system?&amp;nbsp; Is the problem fundamentally CPU bound, disk bound, network bound?&amp;nbsp; If things were working perfectly what would be constraining the performance?&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 2 – Identify the current bottlenecks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In this step we cast a broad net to see what resource is the current limiting factor.&amp;nbsp; The tool I reach for first is PerfMon.&amp;nbsp; Look at key counters like CPU Usage, Memory Usage, Disk and Network I/O.&amp;nbsp; If this is a server problem, examine these on all the machines involved (i.e. check all the tiers).&amp;nbsp; Check for high levels of lock contention.&lt;/P&gt;
&lt;P&gt;At this point you should be able to identify which resource is currently the one that is limiting performance.&amp;nbsp; Often it is not the resource identified in Step 1.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;If it is the correct resource – the one that is supposed to be the limiting factor in this kind of computing – that’s a good preliminary sign that sensible algorithms have been selected.&amp;nbsp; If it’s the wrong resource it means you are going to be looking for design problems where a supposedly non-critical resource has been overused to the point that it has become the critical resource.&amp;nbsp;&amp;nbsp; The design will have to be altered such that it does not have this (fundamentally unnecessary) dependence.&lt;/P&gt;
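&lt;P&gt;As a rough illustration of that comparison, here is a minimal Python sketch. The utilization figures are invented and the helper names (limiting_resource, diagnose) are hypothetical; in practice the numbers would come from counters such as the PerfMon ones mentioned above, and “most heavily utilized” is only a crude stand-in for “limiting.”&lt;/P&gt;

```python
# Hypothetical utilization samples (fraction of capacity, 0.0 to 1.0),
# e.g. averages copied by hand from performance counters on each tier.
def limiting_resource(utilization):
    """Return the name of the most heavily utilized resource."""
    return max(utilization, key=utilization.get)


def diagnose(expected, utilization):
    """Compare the observed bottleneck with the one expected from Step 1."""
    observed = limiting_resource(utilization)
    if observed == expected:
        return observed, "expected bottleneck: look for overuse of " + observed
    return observed, "unexpected bottleneck: look for a design problem that overuses " + observed


# Assumption for this example: Step 1 concluded the system should be CPU bound.
samples = {"cpu": 0.35, "disk": 0.92, "network": 0.20, "memory": 0.55}
observed, advice = diagnose("cpu", samples)
print(observed, "-", advice)
```

&lt;P&gt;With these made-up numbers the disk, not the CPU, is the busiest resource, which is the “wrong resource” case described above.&lt;/P&gt;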
&lt;P&gt;&lt;STRONG&gt;Step 3 – Drill down on the current bottleneck&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;A common mistake in a performance analysis is to try to do step 3 before step 2.&amp;nbsp; This is going to lead to a good bit of waste because you could do a deep analysis of, say, CPU usage when CPU usage isn’t the problem.&amp;nbsp; Instead, choose tools that are good at measuring the problem resource and don’t worry so much about the others for now.&amp;nbsp; If it’s hard to measure the resource in question, add instrumentation for this resource if possible.&amp;nbsp; The goal is to find out as much as you can about what is causing the (over) use of the bottleneck resource.&lt;/P&gt;
&lt;P&gt;When doing this analysis, consider factors that control the resources at different system levels starting from the largest and going to the smallest.&amp;nbsp; Is there something about the overall system architecture that is causing overuse of this resource?&amp;nbsp; Is it something in the overall application design?&amp;nbsp; Or is it a local problem with a module or subcomponent?&amp;nbsp;&amp;nbsp; Most significant problems are larger in scope, so look at those first.&amp;nbsp; They are the easiest to diagnose and sometimes the easiest to correct.&amp;nbsp; E.g. if caching were disabled on a web server you could expect big problems in the back end servers.&amp;nbsp; You’ll want to make sure caching is working properly before you decide to (e.g.) add more indexes to make some query faster.&lt;/P&gt;
&lt;P&gt;Your approach will need to be tailored to the resource and the system. For CPU problems a good time profiler can be invaluable.&amp;nbsp; For SQL problems there’s of course SQL Profiler (find the key queries) and Query Analyzer (view the plans).&amp;nbsp; For memory issues there are abundant performance counters that can be helpful, including the raw memory counters and .NET memory management ones, and others.&amp;nbsp; Tracking virtual memory use over time can be helpful – sampling with vadump is handy.&amp;nbsp; Examination of key resources by breaking into the system with a debugger can also be useful.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 4 – Identify anomalies in the measurements&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The most interesting performance problems are almost always highlighted by anomalous observed costs.&amp;nbsp; Things are happening that shouldn’t be happening or need not happen.&amp;nbsp; If the critical resource isn’t the “correct” one as identified in Step 1 it’s almost certainly the case that your root-cause analysis will find an undesired use of the resource.&amp;nbsp; If the resource was the “correct” one then you’ll be looking for overuse to get an improvement.&lt;/P&gt;
&lt;P&gt;In both cases it is almost always helpful to look at the resource costs per unit-of-work.&amp;nbsp; What is the “transaction” in this system?&amp;nbsp; Is it a mouse click event?&amp;nbsp; Is it a business transaction of some kind?&amp;nbsp; Is it an HTML page delivered to the user?&amp;nbsp; A database query performed?&amp;nbsp; Whatever it is, look at your critical resource and consider the cost per unit of work.&amp;nbsp; E.g. consider CPU cycles per transaction, network bytes per transaction, disk i/o’s per transaction, etc.&amp;nbsp;&amp;nbsp; These metrics will often show you the source of the mistake – consider my recent analysis of Performance Quiz #6, where I looked at the number of string allocations per line of input and then bytes allocated per byte in the input file.&amp;nbsp; Those calculations were both easy and revealing.&lt;/P&gt;
&lt;P&gt;Expressing your costs in a per-unit-of-work fashion will help you to see which costs are reasonable and which are problems.&lt;/P&gt;
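&lt;P&gt;The arithmetic is simple enough to sketch in a few lines of Python. Everything here is an assumption made for illustration: the function name cost_per_unit is hypothetical, and the totals are made-up numbers for a test run that is presumed to have processed 2,000 transactions.&lt;/P&gt;

```python
def cost_per_unit(totals, units_of_work):
    """Divide each raw resource total by the number of units of work."""
    if units_of_work == 0:
        raise ValueError("need at least one unit of work")
    return {name: total / units_of_work for name, total in totals.items()}


# Made-up totals for one test run that processed 2,000 transactions.
totals = {"cpu_ms": 9000, "network_bytes": 4096000, "disk_ios": 50000}
per_txn = cost_per_unit(totals, 2000)

for name, cost in sorted(per_txn.items()):
    print(name, "per transaction:", cost)
```

&lt;P&gt;In this invented run, 25 disk i/o’s per transaction is the kind of figure that should prompt the question of whether a single transaction really needs to touch the disk that often.&lt;/P&gt;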
&lt;P&gt;&lt;STRONG&gt;Step 5 – Create a hypothesis and verify it&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Based on the analysis in step 4 you should be able to find a root cause for the current bottleneck.&amp;nbsp; Design an experiment to validate that this is the case.&amp;nbsp; This can be a very simple test such as “if we change the settings [like so], it should make the problem much worse.”&amp;nbsp; Take advantage of any kind of fairly quick validation that you can do… the worst thing to do is to plunge into some massive correction without being sure that you’ve really hit the problem.&amp;nbsp; If there is nothing obvious, consider doing only a small fraction of the corrective work.&amp;nbsp; Perhaps just enough for one test case to function – validate that before you proceed.&lt;/P&gt;
&lt;P&gt;The trick to doing great performance work is to be able to try various things and yet spend comparatively little time on the losing strategies while quickly finding and acting on the winners.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 6 – Apply the finished corrections, verify, and repeat as needed&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;After Step 5 you should be highly confident that you have a winning change on your hands.&amp;nbsp; Go ahead and finish it up to production quality and apply those changes.&amp;nbsp; Make sure things went as you expected and only then move on.&amp;nbsp; If your changes were not too sweeping you can probably resume at Step 2, or maybe even Step 3.&amp;nbsp; If they were very big changes you might have to go back to Step 1.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Step 7 – Write a brief report&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Summarize your method and findings for your teammates.&amp;nbsp; It’s invaluable as a learning exercise for yourself and as a long-term resource for your team.&amp;nbsp; &lt;BR&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Post Script&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;I wrote a followup article which offers more prescriptive advice about managed code specifically -- see &lt;A id=_ctl0__ctl0__ctl0__ctl0_RecentPosts__ctl0_postlist__ctl0_EntryItems__ctl1_PostTitle HREF="/ricom/archive/2005/05/25/421926.aspx"&gt;Narrowing Down Performance Problems in Managed Code&lt;/A&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=421205" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category><category domain="http://blogs.msdn.com/ricom/archive/tags/using+tools/default.aspx">using tools</category></item><item><title>Fat Free Bytes?  Not here!</title><link>http://blogs.msdn.com/ricom/archive/2005/04/06/406019.aspx</link><pubDate>Thu, 07 Apr 2005 03:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406019</guid><dc:creator>ricom</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ricom/comments/406019.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=406019</wfw:commentRss><description>&lt;P&gt;I'm going to have a little bit of fun with this one so bear with me.&amp;nbsp; :)&lt;/P&gt;
&lt;P&gt;Sometimes I talk to groups that have adopted managed code and they've gone a bit oopaholic.&amp;nbsp; Or maybe even a lot oopaholic.&amp;nbsp; (See this &lt;A href="http://msdn.microsoft.com/netframework/programming/classlibraries/clrperformancetips/"&gt;video&lt;/A&gt;&amp;nbsp;if you would like to hear a little more about the perils of oopaholism).&amp;nbsp; Oopaholics are often in denial.&amp;nbsp; Sure their data is big but no worries, they "can handle it", once things have matured "all will be well."&amp;nbsp; Everything is "under control."&lt;/P&gt;
&lt;P&gt;If you corner an oopaholic they sometimes say doozies like this one I heard not too long ago.&lt;/P&gt;
&lt;BLOCKQUOTE dir=ltr&gt;
&lt;P&gt;&lt;EM&gt;"The reason [edited] is so big is because of poor documentation."&lt;/EM&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Wow that one was surprising.&amp;nbsp; Big objects due to poor documentation.&amp;nbsp; But fear not!&amp;nbsp; Rico to the rescue!&amp;nbsp; I've done a very careful examination of the managed subsystem and I'm now prepared to *officially* document my results.&amp;nbsp; No disclaimers this time.&amp;nbsp; Here it is, the unadorned truth.&lt;/P&gt;
&lt;P&gt;In the managed world one byte still equals one byte.&amp;nbsp; If you&amp;nbsp;build a system that requires ten times as much managed storage per equivalent data structure compared to unmanaged code, you will in fact use ten times the memory and be ten times bigger than the equivalent unmanaged system.&amp;nbsp;Let the confusion end now!&amp;nbsp;&amp;nbsp;Managed bytes are not "90% fat free"; all the bytes actually count.&amp;nbsp; Really and no kidding around.&lt;/P&gt;
&lt;P&gt;I wish I could offer you "diet bytes" but I don't have those.&amp;nbsp; Just regular bytes.&lt;/P&gt;
&lt;P&gt;Yup the code too.&lt;/P&gt;
&lt;P&gt;Phew.&amp;nbsp; Now we can rest easy knowing that it's fully documented.&amp;nbsp; I sure feel better :) :)&lt;/P&gt;
&lt;P&gt;Count your managed bytes just as carefully as your unmanaged bytes and you'll be taking a good step (one of twelve?) to better performance.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=406019" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item><item><title>Giving your customers a good deal</title><link>http://blogs.msdn.com/ricom/archive/2005/03/21/399872.aspx</link><pubDate>Mon, 21 Mar 2005 18:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:399872</guid><dc:creator>ricom</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ricom/comments/399872.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=399872</wfw:commentRss><description>&lt;p&gt;Earlier today someone suggested that I read this &lt;A href="http://blogs.msdn.com/cyrusn/archive/2005/03/19/399222.aspx"&gt;entry&lt;/a&gt; from &lt;A href="http://blogs.msdn.com/cyrusn"&gt;Cyrus&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;As a performance guy people&amp;nbsp;basically expect me to veto every new idea that might grow the size of anything anywhere.&amp;nbsp; I guess I surprise them when I don't.&amp;nbsp; The fact is that it's very hard for me to help make any specific decisions about specific features because only rarely can I assess the value of those features against their cost.&amp;nbsp; Instead I admonish people closer to the problem to do that analysis with questions like:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Is this feature worth the cost?&amp;nbsp; &lt;/li&gt; &lt;li&gt;Do you even understand the cost? &lt;/li&gt; &lt;li&gt;In what dimensions will your &lt;em&gt;customer &lt;/em&gt;see the cost? 
&lt;/li&gt; &lt;li&gt;Will your customer think it's &lt;em&gt;a good deal&lt;/em&gt;?&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Sometimes I can help people understand the cost,&amp;nbsp;for instance in Cyrus' case I'd be a lot more concerned about the CPU usage and working set of the new additions when they are idle than I&amp;nbsp;would be&amp;nbsp;about the disk footprint.&amp;nbsp; But what I'd really be interested in is that cost/value assessment.&amp;nbsp; At what point would it have not been worth the cost?&amp;nbsp; Was a quantitative decision made?&lt;/p&gt; &lt;p&gt;Accidental decisions are rarely good ones.&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=399872" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/design+advice/default.aspx">design advice</category></item></channel></rss>