<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Rico Mariani's Performance Tidbits : signatures</title><link>http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx</link><description>Tags: signatures</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Synchronization Complexity in the .NET Framework, Part 2</title><link>http://blogs.msdn.com/ricom/archive/2007/03/22/synchronization-complexity-in-the-net-framework-part-2.aspx</link><pubDate>Thu, 22 Mar 2007 23:00:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1932956</guid><dc:creator>ricom</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1932956.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1932956</wfw:commentRss><description>&lt;p&gt;Well it seems like an eternity ago but at last I'm writing the follow-up to my &lt;a href="http://blogs.msdn.com/ricom/archive/2007/02/16/synchronization-complexity-in-the-net-framework.aspx"&gt;initial question about synchronization complexity&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;I'd like to start with &lt;a href="http://download.microsoft.com/download/9/2/a/92a2d637-d52c-4d6e-86cd-f4f79755d0b8/log10sync.zip"&gt;this link&lt;/a&gt; to a summary of the synchronization costs of nearly all of the framework.&amp;nbsp; And I say nearly all because I noticed that at least three methods had a synchronization cost that overflowed a 64-bit signed integer.&amp;nbsp; Assuming (and I think this is safe to do) that they only spilled into the negative regions, those three entries are&lt;/p&gt; &lt;p&gt;&amp;nbsp;&lt;/p&gt; 
&lt;p&gt;System.Windows.Forms.DataGridView.ProcessLeftKey(System.Windows.Forms.Keys)&lt;br&gt;count 10,823,468,784,974,500,000 &lt;br&gt;complexity 20.0&lt;/p&gt; &lt;p&gt;System.Windows.Forms.DataGridView.ProcessRightKey(System.Windows.Forms.Keys)&lt;br&gt;count 10,823,468,784,974,500,000 &lt;br&gt;complexity 20.0 &lt;p&gt;System.Windows.Forms.Design.DataGridViewDesigner.Initialize(System.ComponentModel.IComponent)&lt;br&gt;count 9,230,518,549,563,500,000&amp;nbsp;&lt;br&gt;complexity 20.0 &lt;p&gt;Those are in fact the answers to the problem&amp;nbsp;I posted in the original article and if you patch up those three lines in the .txt file I provided you'll have all the costs for the framework. &lt;p&gt;But what does it mean? &lt;p&gt;Well first let me remind you what the formula is.&amp;nbsp; If the count is 0 then the complexity is 0.&amp;nbsp; If the count is non-zero then the complexity is 1+log10(count) rounded to one decimal place.&amp;nbsp; So a complexity of 20 means that there were roughly 10^19 calls to Monitor.Enter in the complete calltree of the method in question.&amp;nbsp; 10^19 is a huge huge number! &lt;p&gt;OK, so what does &lt;em&gt;that &lt;/em&gt;mean? &lt;p&gt;Well I'll tell you first what it &lt;em&gt;doesn't &lt;/em&gt;mean.&amp;nbsp; It doesn't mean that if you call that method you can expect 10^19 calls to Monitor.Enter.&amp;nbsp; Remember this is looking at the complete call graph and on any given invocation you certainly don't take both sides of every branch, every branch of every switch and so on.&amp;nbsp; You'll get some narrow slice of that calltree.&amp;nbsp;  &lt;p&gt;OK great, so the static count isn't the actual observed count if you call it, so what &lt;em&gt;can &lt;/em&gt;we learn from this?&amp;nbsp; Well two things. 
&lt;ol&gt; &lt;li&gt;For methods with large complexity the reason that the complexity is so large is that the method has a very deep/wide call tree.&amp;nbsp; These costs accumulate pretty quickly in those cases and so the metric provides a way of seeing just how high level the method is from a synchronization perspective or &lt;a href="http://blogs.msdn.com/ricom/archive/2007/01/26/net-framework-allocation-complexity-graph.aspx"&gt;an allocation perspective&lt;/a&gt;.&amp;nbsp;&amp;nbsp; &lt;/li&gt; &lt;li&gt;For methods with small complexities, the call graph is probably not that busy and the reported complexity is much more likely to be very close to the actual number of synchronizations or allocations&amp;nbsp;you can expect to observe.&amp;nbsp; Certainly if the cost is zero there will be no synchronizations*.&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;So what can you do with this?&amp;nbsp; Well there are two realizations that I hope you agree with.&lt;/p&gt; &lt;ol&gt; &lt;li&gt;For these very high level methods you can basically have &lt;em&gt;no idea whatsoever &lt;/em&gt;what the actual cost you are going to pay is without doing experiments in context.&amp;nbsp; The amount of variability could be vast (10^19 demonstrated).&amp;nbsp; So clearly you must have a lot of tolerance in the context in which you use such methods, or else plenty of experimenting.&lt;/li&gt; &lt;li&gt;Secondly, by comparing the complexity (either kind) of a method you have just implemented to either guidelines (like GetHashCode should have a nil complexity) or to observed patterns (like FooBar methods all have a cost of 4 but yours has a cost of 12) you can make better decisions.&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;Food for thought.&lt;/p&gt; &lt;p&gt;&lt;font size="1"&gt;*For the synchronization counts I disregarded the one-time synchronizations present in the String class because they totally skew the truth and they do happen exactly once.&amp;nbsp; One-time costs are the bane of static 
analysis&lt;/font&gt;&lt;/p&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item><item><title>Synchronization Complexity in the .NET Framework</title><link>http://blogs.msdn.com/ricom/archive/2007/02/16/synchronization-complexity-in-the-net-framework.aspx</link><pubDate>Sat, 17 Feb 2007 00:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1691527</guid><dc:creator>ricom</dc:creator><slash:comments>11</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1691527.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1691527</wfw:commentRss><description>&lt;P&gt;Several people here (you know who you are)&amp;nbsp;have been nagging me to do an analysis similar to the one I did for allocations but to get an idea of which methods might do locking and how much.&amp;nbsp; So I repeated my experiment, this time counting any calls to Monitor.Enter in the subtree of any given method.&lt;/P&gt;
&lt;P&gt;The results were very surprising; you can see that the bulk of the methods (78%) do no synchronization at all.&amp;nbsp; However, once you go down the dark path, it gets bad fast.&amp;nbsp; The synchronization complexity of the biggest is over 10^19.&amp;nbsp; Remember, as with the &lt;A title="allocation complexity metric" href="http://blogs.msdn.com/ricom/archive/2007/01/26/net-framework-allocation-complexity-graph.aspx"&gt;allocation metric&lt;/A&gt;, the number is offset by 1 so that 0 -&amp;gt; 0,&amp;nbsp; 1 -&amp;gt; 1,&amp;nbsp; 10 -&amp;gt; 2,&amp;nbsp; 100 -&amp;gt; 3, etc.&lt;/P&gt;
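&lt;P&gt;As a rough sketch of that mapping, here it is in Python. The function name is hypothetical, and the base-ten logarithm rounded to one decimal place is an assumption inferred from the examples above rather than the actual tool's code.&lt;/P&gt;

```python
import math


def sync_complexity(count):
    """Map a static Monitor.Enter count to the offset log10 metric (assumed form)."""
    if count == 0:
        return 0.0  # no synchronization anywhere in the call tree
    # Offset by 1 so that a count of 1 maps to 1, 10 maps to 2, 100 maps to 3.
    return round(1 + math.log10(count), 1)


for count in (0, 1, 10, 100):
    print(count, sync_complexity(count))
```

&lt;P&gt;The offset keeps a single synchronization distinguishable from none at all, which a plain logarithm would not do.&lt;/P&gt;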
&lt;P&gt;Anyone care to guess what class has the methods with the greatest synchronization complexity?&lt;/P&gt;
&lt;P&gt;In a coming article I will write a few words about how to interpret these numbers.&amp;nbsp; Clearly a complexity of 10^19 does not mean that 10^19 synchronizations actually happen when you call the method.&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://blogs.msdn.com/photos/ricom/images/1691495/original.aspx"&gt;&lt;/P&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item><item><title>Performance Signatures CMG 2006 Paper</title><link>http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx</link><pubDate>Wed, 07 Feb 2007 22:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1621197</guid><dc:creator>ricom</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1621197.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1621197</wfw:commentRss><description>&lt;P&gt;I presented this paper at the&amp;nbsp;&lt;A href="http://www.cmg.org/"&gt;CMG&lt;/A&gt;&amp;nbsp;2006 conference.&amp;nbsp; I've previously posted the &lt;A href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx"&gt;content of my slides&lt;/A&gt; and some &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/26/net-framework-allocation-complexity-graph.aspx"&gt;interesting results&lt;/A&gt; from using this sort of analysis but I thought you might like to see the paper in full.&amp;nbsp; Comments are of course welcome.&lt;/P&gt;
&lt;P&gt;&lt;B&gt;PERFORMANCE SIGNATURES: A QUALITATIVE APPROACH TO DEPENDENCY GUIDANCE&lt;/B&gt; 
&lt;P&gt;Rico Mariani&lt;I&gt;, Microsoft Corporation&lt;/I&gt; 
&lt;P&gt;&lt;I&gt;This paper describes Performance Signatures, a simple qualitative approach to helping software developers improve their products’ performance. Performance Signatures allow development tools to give as-you-type prescriptive guidance and facilitate improved analysis and interpretation during traditional profiling sessions. Performance Signatures emphasize ease of adoption and prevention of common and/or large mistakes.&lt;/I&gt; 
&lt;P&gt;&lt;B&gt;Introduction&lt;/B&gt; 
&lt;P&gt;Experience teaches us that large-scale software development projects increasingly have difficulty meeting their performance goals. Individuals in the software development industry come from diverse backgrounds, many not directly related to computer science, and there is no one technique or family of techniques that is widely taught, much less practiced, to achieve performance success. 
&lt;P&gt;In the best of all worlds we could universally teach &lt;I&gt;Software Performance Engineering &lt;/I&gt;(SPE) [Smith and Williams 2002] with emphasis on certain core values of engineering: 
&lt;UL&gt;
&lt;LI&gt;That engineering is a &lt;I&gt;quantitative&lt;/I&gt; discipline 
&lt;LI&gt;That engineering is about achieving predictable results for predictable costs 
&lt;LI&gt;That engineers use techniques like planning, modeling, and prototyping to manage risks and achieve predictable results&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;The trouble with this approach is that many programmers are not in fact trained engineers and furthermore do not wish to be. What can we, as performance professionals, do to help these people to succeed? 
&lt;P&gt;I don’t know of any method, other than SPE, that can &lt;I&gt;guarantee&lt;/I&gt; predictable performance results. It’s hard to imagine anything other than careful goal-setting and methodical planning that could yield such a result. However, in many cases the level of control that is possible with SPE is not necessary for all aspects of a project. 
&lt;P&gt;While it is true that for some projects missing performance goals by as little as 1% would represent a disaster, there are many projects for which arriving within 10% of the desired goal would be an occasion to celebrate. Indeed, in many situations, failure is more like a performance calamity observed at the end of the development cycle. The “final” product is totally unacceptable because it is operating perhaps five, ten, or even twenty times slower than is desired. Any reasonable process that would help prevent these kinds of disasters would be valuable, even if less than perfect. 
&lt;P&gt;This paper describes one such approach, still in its early stages, with potential to improve the state of the art in these matters. 
&lt;H3&gt;Typical Questions from Concerned Programmers&lt;/H3&gt;
&lt;P&gt;Discussions with programmers working on systems where performance “matters” are, not surprisingly, rich with queries about specific practices. &lt;/P&gt;
&lt;P&gt;“Can I use this method in my program?” “If I use this class will it be fast?” “Does this technique give good performance?” You can fill in the blanks. Sometimes the discussions are about database access, sometimes about XML processing, sometimes about thread pools, string management, collection classes, enumeration, sorting, searching, you name it.&lt;/P&gt;
&lt;P&gt;The people asking these questions often have, or are in the process of building, a software solution that involves a complex framework, such as Microsoft&lt;SUP&gt;®&lt;/SUP&gt; .NET™. They are often overwhelmed with the available implementation choices and have little in the way of supporting information on which to make decisions. They are seeking any kind of guidance and the “blessing” of an expert is often a very desirable affirmation.&lt;/P&gt;
&lt;P&gt;The irony of this situation is that the expert is not likely to be able to help very much; lacking context, the performance expert can only provide the odd useful tidbit of advice. Sharpening the irony is the fact that the people asking these sorts of questions are, of course, the ones that care the most about performance. They are the ones we most want to help.&lt;/P&gt;
&lt;P&gt;What I invariably end up doing in these situations is asking some questions about the context and then trying to give some basic relevant guidelines, such as which services are reasonable to use, what is intended to be used in that context and what is not. It isn’t quantitative advice – usually there are no numbers available – but it can nonetheless be helpful in avoiding some of the biggest problems.&lt;/P&gt;
&lt;H3&gt;Context Specific Guidance&lt;/H3&gt;
&lt;P&gt;A colleague and I used to joke that all programmers should be forced to name their methods so that they end in a number – the floor of the base ten log of the number of processor cycles the method is expected to consume. Then, when writing a method that ends in a “5” you are simply not allowed to call any method that ends in “6.” Our thinking was that even this simple rule would prevent many performance calamities. It is perhaps the most basic example of somehow using the performance expectations of the caller to restrict the choice of callable methods to a subset consistent with that expectation. 
&lt;P&gt;However, even this simple approach is perilous. For starters, not every programmer is comfortable working with logarithms, much less predicting the likely base ten log of processor cycles that a method will execute when called. Moreover, the &lt;I&gt;cost&lt;/I&gt; in processor cycles for any given method is often not a fixed quantity – a spectrum is much more probable – in which case, the advice we must give is much more complicated, not to mention much more likely to change as the caller and callee evolve. 
&lt;P&gt;Notwithstanding this problem, I still find myself able to give reasonable advice under a wide variety of circumstances when put on the spot. This brings us at last to the main thrust of this paper, a qualitative approach to providing useful guidance to the developer about the performance consequences of calling a method. 
&lt;H3&gt;&lt;I&gt;Performance Signatures&lt;/I&gt;&lt;/H3&gt;
&lt;P&gt;To facilitate recommendations we begin by computing a qualitative &lt;I&gt;Performance Signature &lt;/I&gt;for every method in a system that reflects its relative &lt;I&gt;cost&lt;/I&gt; along some relevant performance dimension, such as a resource usage metric. Many potential formulations of signatures are possible, but to be useful, the computation must result in a formal partitioning with these desirable properties: 
&lt;OL&gt;
&lt;LI&gt;The number of signatures (partitions) is small enough to remember (e.g., fewer than seven unique signatures). 
&lt;LI&gt;The signature of any given method can be statically computed by analyzing the portions of that method which are typically executed (e.g., disregarding basic blocks which “throw”, and disregarding “catch” blocks) and any child methods (i.e., the transitive closure of the call graph). 
&lt;LI&gt;The signature implies reasonable prescriptive rules and/or limitations that a programmer could follow to create a method of a desired signature. 
&lt;LI&gt;The signatures are sufficiently broad that a method is unlikely to ever move from one signature to another.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;Ultimately, the utility of signatures is based on the notion that a method &lt;I&gt;M&lt;/I&gt; cannot meet its cost requirement &lt;I&gt;k&lt;/I&gt; if any method with cost requirement typically greater than &lt;I&gt;k&lt;/I&gt; appears in the call graph of &lt;I&gt;M, &lt;/I&gt;regardless of what cost dimension &lt;I&gt;k&lt;/I&gt; measures. 
&lt;P&gt;Importantly, this is true &lt;I&gt;even if &lt;/I&gt;k&lt;I&gt; is a discrete cost metric with a small number of easily-remembered values&lt;/I&gt;, as we’ll see below. 
&lt;P&gt;&lt;B&gt;Approximate Characterization Still Effective&lt;/B&gt; 
&lt;P&gt;Since the signatures are designed to represent fairly broad categories of cost, the rules to compute them do not have to be complicated formulations. For instance, one important cost dimension is whether or not a given method allocates memory, and, if so, is the method limited to a one-time allocation of only its return type or does it necessarily allocate and free temporary memory repeatedly. A fairly simple analysis (with transitive closure) could partition methods effectively into three large buckets: 
&lt;UL&gt;
&lt;LI&gt;Those that make no allocations or that only allocate their return type (“low” cost) 
&lt;LI&gt;Those that have fewer than 5 distinct memory allocation requests in their call graph (“medium” cost) 
&lt;LI&gt;Everything else (“high” cost)&lt;/LI&gt;&lt;/UL&gt;
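&lt;P&gt;A minimal Python sketch of these three buckets follows. The function and parameter names are hypothetical, and the inputs are assumed to come from some earlier analysis of the call graph.&lt;/P&gt;

```python
def allocation_bucket(distinct_alloc_sites, only_allocates_return_value=False):
    """Return "low", "medium", or "high" for a method's allocation signature.

    distinct_alloc_sites is the number of distinct allocation requests in the
    method's call graph (transitive closure); it is a hypothetical input here.
    """
    if distinct_alloc_sites == 0 or only_allocates_return_value:
        return "low"
    if distinct_alloc_sites in range(1, 5):  # fewer than 5 distinct sites
        return "medium"
    return "high"


for sites in (0, 3, 12):
    print(sites, allocation_bucket(sites))
```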
&lt;P&gt;It’s worth pointing out some facts about the partitioning proposed above: 
&lt;UL&gt;
&lt;LI&gt;“5” distinct allocation sources was chosen arbitrarily 
&lt;LI&gt;Classification along this dimension ignores the computational complexity of the method entirely 
&lt;LI&gt;Stack-based memory usage is entirely disregarded 
&lt;LI&gt;Cycles in the call graph are ignored (i.e. when a call that would cause a cycle in the call graph is encountered, it is ignored)&lt;/LI&gt;&lt;/UL&gt;
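&lt;P&gt;The cycle-skipping rule can be illustrated with a small depth-first walk. This is only a sketch under stated assumptions: the two dictionaries are hypothetical stand-ins for whatever an IL analyzer would produce, and a practical tool would also cache results per method.&lt;/P&gt;

```python
def distinct_alloc_sites(method, calls, alloc_sites, _on_path=None):
    """Collect the allocation sites reachable from `method`.

    calls maps a method name to the methods it calls.
    alloc_sites maps a method name to the set of its own allocation sites.
    Both structures are hypothetical illustrations.
    """
    on_path = _on_path if _on_path is not None else set()
    on_path.add(method)
    sites = set(alloc_sites.get(method, ()))
    for callee in calls.get(method, ()):
        if callee in on_path:
            continue  # this call would close a cycle, so it is ignored
        sites |= distinct_alloc_sites(callee, calls, alloc_sites, on_path)
    on_path.discard(method)
    return sites


calls = {"A": ["B", "C"], "B": ["A", "C"], "C": []}
alloc_sites = {"A": {"A:1"}, "C": {"C:1", "C:2"}}
print(len(distinct_alloc_sites("A", calls, alloc_sites)))  # prints 3
```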
&lt;P&gt;And yet, even this simple-minded partitioning scheme can provide valuable guidance! Consider this real life example: a programmer writing a hashing function wishes to know if it would be wise to call “String.Split” in that context. Can we answer that question with the lame partitioning proposed above? Yes! 
&lt;P&gt;A hashing function must be very inexpensive, and therefore should generally be in the lowest cost bucket. But the helper method String.Split needs to allocate an array of string pieces; it is therefore in the “medium” cost bucket. Thus, one should never call String.Split from inside a hashing function; a common programming error and a resulting performance problem have been prevented. 
&lt;P&gt;The resulting hashing function might still be unsuitable for other reasons, but at least now it has a better chance of delivering appropriate performance. 
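&lt;P&gt;The String.Split reasoning reduces to comparing two buckets. A minimal Python sketch, in which the names are hypothetical and the ordering of the three buckets is taken from the partitioning described earlier:&lt;/P&gt;

```python
BUCKETS = ["low", "medium", "high"]  # ordered from cheapest to most expensive


def call_is_consistent(caller_required, callee_computed):
    """True when the callee's bucket does not exceed the caller's budget."""
    worst = max(caller_required, callee_computed, key=BUCKETS.index)
    return worst == caller_required


# A hashing function ("low") calling String.Split ("medium") is flagged.
print(call_is_consistent("low", "medium"))  # prints False
```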
&lt;P&gt;Experience teaches us that many costly performance mistakes are not small ones that require subtle filters to locate. Rather, programmers commonly make large mistakes which even simple qualitative filters can catch. 
&lt;P&gt;&lt;B&gt;Analysis by Cross-Checking&lt;/B&gt; 
&lt;P&gt;In order to perform any meaningful analysis, we must be in a situation where we can cross-check two different appraisals for consistency – two different sources of the Performance Signature. Of course one source will be a computed signature such as the one described above. The second source is the desired or required signature: 
&lt;UL&gt;
&lt;LI&gt;Desired, if a particular method has been annotated by the user. 
&lt;LI&gt;Required, if the method is an implementation of a well known type or pattern that is expected to have a certain signature.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Common polymorphic methods like “GetHashCode” come with an implied promise of performance which can be translated into a constraint on the signature, facilitating analysis. Many required signatures emerge naturally in the context of particular frameworks. 
&lt;P&gt;Helpfully, these implied promises tend to occur in important areas because they are often central members of a framework. For instance, in the .NET framework, all implementations of primitive interfaces like IComparable, IEquatable, and IEnumerable, as well as overrides of GetHashCode, should be “low” cost. Other readily-identifiable methods such as UI event handlers, or portions of HTML page renderers, should be “medium”. 
&lt;P&gt;The upshot is that we can use a simple set of name pattern matching rules to infer many required signatures and facilitate a cross-check even with little or no explicit annotation by a developer. Naturally, the developer can further improve the quality of the analysis by annotating his/her code with specific desired signatures. 
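&lt;P&gt;One way such name pattern rules might look is sketched below in Python. The rule table is illustrative only: the “*.GetHashCode” entry comes from the text, while the other patterns (including “*_Click” as a stand-in for UI event handlers) are assumptions.&lt;/P&gt;

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (name pattern, required signature).
REQUIRED_SIGNATURE_RULES = [
    ("*.GetHashCode", "low"),
    ("*.CompareTo", "low"),
    ("*.Equals", "low"),
    ("*_Click", "medium"),  # assumed naming convention for UI event handlers
]


def required_signature(method_name, rules=REQUIRED_SIGNATURE_RULES):
    """Return the signature implied by the first matching rule, or None."""
    for pattern, signature in rules:
        if fnmatchcase(method_name, pattern):
            return signature
    return None


print(required_signature("MyApp.Customer.GetHashCode"))  # prints low
```

&lt;P&gt;A method that matches no rule simply has no required signature, so only an explicit annotation could constrain it.&lt;/P&gt;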
&lt;P&gt;&lt;B&gt;Polymorphism&lt;/B&gt; 
&lt;P&gt;The presence of polymorphic call sites is often troublesome for algorithms that are attempting to perform static code analysis. However, Performance Signatures offer some solace in the analysis phase: 
&lt;UL&gt;
&lt;LI&gt;The analysis is only intended to be approximately correct, so a worst-, average-, or even best-case approach can be used when computing transitive closures: whatever yields a reasonable partitioning. 
&lt;LI&gt;To avoid creating potential confusion for users it is desirable that polymorphic versions of the same method have similar performance characteristics, and therefore the same Performance Signature. Where the Performance Signature varies between implementations that fact is worthy of reporting in its own right. 
&lt;LI&gt;In many cases the above statement is so strong that it actually constitutes a&lt;I&gt; requirement&lt;/I&gt;. 
&lt;LI&gt;When visiting any potential child node (callee) whose signature is constrained, that constrained cost can be used in the parent (caller) rather than computing the cost of the child. This avoids the problem of having one poorly implemented polymorphic implementation causing a cascade of warnings further up the call tree.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Overall, these factors seem to indicate that the presence of polymorphism will not make the computation of signatures intractable. 
&lt;P&gt;&lt;B&gt;Cycles&lt;/B&gt; 
&lt;P&gt;Of course any algorithm requiring static analysis of call graphs faces the reality that call graphs have cycles. However, fairly straightforward techniques can still be used to compute a cost function even in the presence of cycles. 
&lt;P&gt;Preliminarily, many cycles are caused by polymorphic paths where, for instance, formatting methods call successively lower-level formatting methods, all of which conform to the same type signature. In those cases the rules for managing polymorphism and using constrained costs (described above) are invaluable for computing the signature of the cycle. 
&lt;P&gt;But, more generally, upon discovering a cycle, the cost computation can end and suitable pointers can be left in the graph nodes so that the final cost of all the nodes in the cycle becomes the greatest computed cost of any of the nodes in the cycle – which will perforce be the cost of the root of the spanning tree. Signature assignment can then be done on the basis of that cost, with all the nodes in the cycle gaining the same signature. 
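&lt;P&gt;The final step, giving every node in a cycle the greatest cost computed for any of them, can be sketched in a few lines of Python. The names are hypothetical, and the sketch assumes the cycle's members and their individual costs have already been found by the traversal.&lt;/P&gt;

```python
def unify_cycle_costs(costs, cycle):
    """Return a copy of `costs` where every node in `cycle` has the greatest cost."""
    greatest = max(costs[node] for node in cycle)
    return {**costs, **{node: greatest for node in cycle}}


costs = {"Format": 7, "FormatItem": 3, "FormatValue": 2, "Parse": 1}
print(unify_cycle_costs(costs, ["Format", "FormatItem", "FormatValue"]))
```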
&lt;P&gt;&lt;B&gt;Enriching Signatures With Other Data&lt;/B&gt; 
&lt;P&gt;The creation of good partitioning strategies is the primary area for future investigation in this space. For instance, the signature computation described earlier used only memory consumption for partitioning. This approach yields useful results because memory usage is often a good indicator of algorithmic complexity, but it is not the only important cost function to evaluate. 
&lt;P&gt;To achieve more differentiation in the types of signatures that can be created, other metrics may be mixed into the partitioning strategy, for instance: 
&lt;UL&gt;
&lt;LI&gt;Cyclomatic Complexity [McCabe 94], or Maintainability Index [VanDoren 2006] can be used to help forecast algorithmic cost – an approximately correct cost is still useful for partitioning. 
&lt;LI&gt;Live execution time in a set of benchmarks could be blended into the cost. 
&lt;LI&gt;Data or code volume (cache lines, pages, etc.) as measured in a set of benchmarks could be considered in the cost. 
&lt;LI&gt;Manual “override” of poorly computed signatures may be needed – methods with highly expensive but rarely executed code blocks, for instance, might get the wrong signature.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;In all cases the idea is to introduce readily computable results, plus overrides, to get reasonable signatures. Managed code environments are particularly friendly to these operations because both Java bytecodes and .NET IL instructions can be analyzed to compute the static costs described above. Source code is not required. 
&lt;P&gt;&lt;B&gt;Why Does This Work?&lt;/B&gt; 
&lt;P&gt;It is perhaps surprising that a small number of signatures, computed very simply, can be used to give effective guidance; however, that conclusion does resonate with experience. 
&lt;P&gt;Many sorts of applications, both client and server, have the bulk of their functionality placed in middle layers of logic. In typical managed applications on the .NET platform for instance, there are event handlers dedicated to responding to various user stimuli in the client or for creating various HTML page fragments on the server. These middle level pieces of code have abundant freedom to do temporary allocations and modest computation without causing any kind of performance disaster. The disasters tend to occur when middle layers reach beyond their scope and try to invoke more intensive processing services in their medium level context. Until and unless that happens, performance tends to degrade in a fairly natural way that corresponds roughly to intuition – twice as much processing takes twice as long. 
&lt;P&gt;In similar fashion, the very lowest level computations – comparison functions, hashing functions, and the like – must stay within their own family or an observable disaster is likely to occur. 
&lt;P&gt;This basic analysis is certainly not any kind of complete solution to every performance problem, but in terms of disaster prevention it does seem as though a small number of performance signatures could be helpful. And the solution is all the more embraceable for its simplicity. 
&lt;P&gt;&lt;B&gt;Deliverables For an Ecosystem&lt;/B&gt; 
&lt;P&gt;In order to create an ecosystem that facilitates adoption of Performance Signatures, library vendors should provide some additional supporting files with their libraries: 
&lt;UL&gt;
&lt;LI&gt;A list of methods in the libraries, each annotated with its pre-computed cost and resultant signature. 
&lt;LI&gt;A set of name-pattern rules from which signatures can be implied (e.g. *.GetHashCode → “low”).&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;The pre-computed costs are necessary, so that developers may use tools to compute the costs of their own methods without having to perform lengthy analysis on large framework libraries. Development tools may terminate cost and signature computations upon reaching any framework entry point with a known cost/signature. 
&lt;P&gt;The rules for implication are perhaps of even greater importance – they characterize some simple patterns that appear within the library. The knowledge that, for instance, methods named “*.GetHashCode” should be of “low” cost cannot be inferred by analysis in any easy way. A good set of default rules will save users from either building their own set of patterns or adding many repetitive per-method annotations to get good results. 
&lt;P&gt;As discussed previously, many forms in which libraries are delivered (e.g. Microsoft .NET assemblies) are amenable to analysis. A tool wishing to apply Performance Signature-based techniques could compute the needed signatures on the developer’s computer even if the library vendor either does not provide data for their libraries, or uses a different signature formulation. 
&lt;P&gt;Finally, Performance Signatures are programmer-friendly so they can appear in documentation as well as supporting various tools-based uses detailed below. 
&lt;P&gt;&lt;B&gt;Application: Intellisense™&lt;/B&gt; 
&lt;P&gt;Perhaps the most compelling application of Performance Signatures is during actual coding. Systems like Microsoft® Visual Studio.NET™ offer “Intellisense” where, as a programmer enters language statements, possible completions, notably lexically valid method calls, are offered in a drop down list. Given an existing or desired Performance Signature for the method that is being authored, it is a small matter to color-code the suggested method calls such that those with appropriate Signatures appear in, for instance, green and more dubious choices contrariwise appear in red: with Performance Signatures we could deliver &lt;I&gt;as-you-type &lt;/I&gt;guidance for the cost of only a few bits per method. 
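&lt;P&gt;The color-coding decision itself is small. A hedged Python sketch, assuming the three buckets described earlier and using hypothetical names (nothing here reflects an actual Visual Studio API):&lt;/P&gt;

```python
SIGNATURE_RANK = {"low": 0, "medium": 1, "high": 2}


def completion_color(authoring_signature, candidate_signature):
    """Pick a display color for a suggested call in the completion list."""
    overrun = SIGNATURE_RANK[candidate_signature] - SIGNATURE_RANK[authoring_signature]
    # A candidate costlier than the method being written is a dubious choice.
    return "red" if max(overrun, 0) else "green"


print(completion_color("low", "medium"))     # prints red
print(completion_color("medium", "medium"))  # prints green
```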
&lt;P&gt;Such a feature on its own would likely prevent some of the most egregious performance problems very early in the programming cycle, with no need to run a profiler or understand its results. 
&lt;P&gt;A user desiring more information could inquire &lt;I&gt;why&lt;/I&gt; a particular method call was deemed inappropriate and receive advice based on their context, the rules for their context, the Signature of the target, and the factors which affected the target’s Signature. 
&lt;P&gt;&lt;B&gt;Application: Reporting&lt;/B&gt; 
&lt;P&gt;The approach described above could also be used in a batch context. A user’s entire application could be scrubbed to ensure that the computed signature of every method cross-checks with the desired or required signature. As described above the desired/required signatures can be implicit (e.g. all implementations of *.GetHashCode should be “low” cost) or explicit (e.g. user places a suitable annotation in an external XML file, an inline comment, or custom attribute). 
&lt;P&gt;When combined with the ability to ignore warnings that are deemed spurious, a batch validation of signatures could be an invaluable tool to help keep performance issues in check during development or maintenance cycles. 
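&lt;P&gt;A batch check along these lines might be sketched as follows. This is a minimal Python sketch and not the actual tool: the level names, the rule table, the example method names, and the helper functions are all assumptions made for illustration.&lt;/P&gt;

```python
from fnmatch import fnmatchcase

# Hypothetical cost buckets, ordered from cheapest to most expensive.
LEVELS = {"low": 0, "medium": 1, "high": 2}

# Implicit rules (illustrative): wildcard pattern -> maximum allowed signature.
IMPLICIT_RULES = {"*.GetHashCode": "low", "*.Equals": "low"}


def required_signature(method, explicit):
    """Explicit annotations win; otherwise use the first matching implicit rule."""
    if method in explicit:
        return explicit[method]
    for pattern, level in IMPLICIT_RULES.items():
        if fnmatchcase(method, pattern):
            return level
    return None  # no constraint known for this method


def validate(computed, explicit=None, suppressed=()):
    """Return (method, computed, required) for each method exceeding its constraint."""
    explicit = explicit or {}
    violations = []
    for method, actual in sorted(computed.items()):
        required = required_signature(method, explicit)
        if required is None or method in suppressed:
            continue  # unconstrained, or a warning the user chose to ignore
        if LEVELS[actual] > LEVELS[required]:
            violations.append((method, actual, required))
    return violations


# Made-up computed signatures for a made-up application.
computed = {
    "Shapes.Point.GetHashCode": "low",
    "Shapes.Polygon.GetHashCode": "high",
    "Shapes.Polygon.Render": "high",
    "Shapes.Canvas.Equals": "medium",
}
report = validate(
    computed,
    explicit={"Shapes.Polygon.Render": "medium"},
    suppressed={"Shapes.Canvas.Equals"},
)
for method, actual, required in report:
    print(f"{method}: computed '{actual}' exceeds required '{required}'")
```

&lt;P&gt;Here the suppressed set plays the role of the ignored spurious warnings, and the explicit table stands in for annotations read from an XML file, comment, or attribute.&lt;/P&gt;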
&lt;P&gt;&lt;B&gt;Application: Troubleshooting&lt;/B&gt; 
&lt;P&gt;In the context of troubleshooting, Performance Signatures could enhance the value of measurements. For instance, observed memory patterns that do not agree with a desired or required signature can be highlighted if their cost significantly exceeds the target. These costs can be described in terms of Performance Signatures to help isolate problems. E.g., “Method M1 was to be of ‘medium’ cost; however, its inclusive cost is ‘high’. This seems to be because method M2 is of ‘high’ cost,” and, drilling down to the next level of detail, “Method M2 is of ‘high’ cost because it allocated 100 System.String objects.” 
&lt;P&gt;The Performance Signatures would help to identify trouble-spots and to provide simple, actionable guidance in understandable terms. 
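&lt;P&gt;The drill-down described above could look roughly like this. It is a Python sketch under stated assumptions: the thresholds of 10 and 100 allocations, the call graph, and every function name are hypothetical, and the walk follows only the single heaviest callee at each level.&lt;/P&gt;

```python
ORDER = ["low", "medium", "high"]


def bucket(allocations):
    """Map an allocation count to a coarse signature (thresholds are assumptions)."""
    if allocations >= 100:
        return "high"
    if allocations >= 10:
        return "medium"
    return "low"


def inclusive(method, own, calls):
    """Own allocations plus those of all callees (assumes an acyclic call graph)."""
    callees = calls.get(method, ())
    return own.get(method, 0) + sum(inclusive(c, own, calls) for c in callees)


def explain(method, target, own, calls):
    """Return human-readable messages when a method exceeds its target signature."""
    cost = bucket(inclusive(method, own, calls))
    if ORDER.index(target) >= ORDER.index(cost):
        return []  # within budget, nothing to report
    messages = [
        f"Method {method} was to be of '{target}' cost; "
        f"however, its inclusive cost is '{cost}'."
    ]
    current = method
    while True:
        callees = calls.get(current, ())
        heavy = max(callees, key=lambda c: inclusive(c, own, calls), default=None)
        if heavy is None or bucket(inclusive(heavy, own, calls)) != cost:
            # No single callee accounts for the cost; report the method's own work.
            messages.append(
                f"Method {current} allocates {own.get(current, 0)} objects itself."
            )
            return messages
        messages.append(f"This seems to be because method {heavy} is of '{cost}' cost.")
        current = heavy


# Made-up measurements: allocations made directly by each method, and who calls whom.
own = {"M1": 2, "M2": 100, "M3": 1}
calls = {"M1": ["M2", "M3"]}
messages = explain("M1", "medium", own, calls)
for line in messages:
    print(line)
```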
&lt;P&gt;&lt;B&gt;Status&lt;/B&gt; 
&lt;P&gt;Work on Performance Signatures is still in the early stages. A working system currently performs basic analysis and transitive closure over Microsoft .NET assemblies. We are now experimenting with partitioning choices and expect to be able to create meaningful reports (cross-checking computed vs. constrained-by-interface signatures) very soon. 
&lt;P&gt;While the &lt;I&gt;idea&lt;/I&gt; of providing this type of information has so far been received with universal enthusiasm, the coming experiments will tell us if the &lt;I&gt;reality&lt;/I&gt; lives up to the expectation. 
&lt;P&gt;&lt;B&gt;Conclusion&lt;/B&gt; 
&lt;P&gt;It is always desirable to prevent large performance problems as soon as possible. A tool that helps us to do this, even in only the “easy” cases, at the very least provides us with more time to invest in more difficult cases requiring more complex planning. 
&lt;P&gt;Programmers have few resources to turn to when it comes to characterizations of 3&lt;SUP&gt;rd&lt;/SUP&gt; party libraries that they intend to use or practices they should follow; Performance Signatures can provide both valuable data and vocabulary in these areas. A signature formulation that is readily computed and is easily digested by users can be a useful tool both to teach good usage and to describe the services offered. 
&lt;P&gt;The ability to provide assistance to programmers in &lt;I&gt;real time&lt;/I&gt;, in a natural and familiar context, is invaluable – if only to build awareness. 
&lt;P&gt;Even simple signature computations show promise of having real utility; further experimentation is likely to lead to a more robust system of partitioning which can serve developers in many contexts. 
&lt;H3&gt;Acknowledgements&lt;/H3&gt;
&lt;P&gt;I would like to thank my colleagues Hazim Shafi, Mark Friedman, and Gerardo Bermudez for their invaluable assistance in preparing this paper. 
&lt;H3&gt;References&lt;/H3&gt;
&lt;P&gt;[Smith and Williams 2002] C. U. Smith and L. G. Williams, &lt;I&gt;Performance Solutions: A Practical&lt;/I&gt; &lt;I&gt;Guide to Creating Responsive, Scalable Software&lt;/I&gt;, Boston, MA, Addison-Wesley, 2002. 
&lt;P&gt;[McCabe 94] McCabe, Thomas J. &amp;amp; Watson, Arthur H. "Software Complexity." &lt;I&gt;Crosstalk, Journal of Defense Software Engineering&lt;/I&gt; 7, 12 (December 1994): 5-9. 
&lt;P&gt;[VanDoren 2006] Edmond VanDoren, “Maintainability Index Technique for Measuring Program Maintainability”, &lt;A href="http://www.sei.cmu.edu/str/descriptions/mitmpm.html" mce_href="http://www.sei.cmu.edu/str/descriptions/mitmpm.html"&gt;http://www.sei.cmu.edu/str/descriptions/mitmpm.html&lt;/A&gt;, Carnegie Mellon Software Engineering Institute.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1621197" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item><item><title>.NET Framework Allocation Complexity Graph</title><link>http://blogs.msdn.com/ricom/archive/2007/01/26/net-framework-allocation-complexity-graph.aspx</link><pubDate>Fri, 26 Jan 2007 21:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1537648</guid><dc:creator>ricom</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1537648.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1537648</wfw:commentRss><description>&lt;P&gt;A quick graphical view of how the framework measures up.&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://blogs.msdn.com/photos/ricom/images/1537628/original.aspx" mce_src="http://blogs.msdn.com/photos/ricom/images/1537628/original.aspx"&gt; &lt;/P&gt;
&lt;P&gt;This graph shows the number of methods of any given allocation complexity on a logarithmic scale.&amp;nbsp; This allocation complexity is discussed in more detail in &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/26/performance-quiz-12-the-cost-of-a-good-hash-solution.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2007/01/26/performance-quiz-12-the-cost-of-a-good-hash-solution.aspx"&gt;performance quiz #12&lt;/A&gt;&amp;nbsp;and this summary of my &lt;A href="http://www.cmg.org/" mce_href="http://www.cmg.org/"&gt;CMG&lt;/A&gt; talk on &lt;A href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx"&gt;performance signatures&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Remember this is just a rough number, imperfect in many ways; it's only interesting to give you a general feeling about where particular methods are in the food-chain.&amp;nbsp; It's useful as a planning tool but not a replacement for &lt;EM&gt;actual measurement&lt;/EM&gt;.&amp;nbsp; And despite the fact that I think this metric has&amp;nbsp;several flaws, which I'll be discussing in coming postings, even this feeble metric is actually useful --&amp;nbsp;an amazing result in and of itself.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1537648" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item><item><title>Performance Quiz #12 -- The Cost of a Good Hash -- Some Help</title><link>http://blogs.msdn.com/ricom/archive/2007/01/24/performance-quiz-12-the-cost-of-a-good-hash-some-help.aspx</link><pubDate>Wed, 24 Jan 2007 22:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1523243</guid><dc:creator>ricom</dc:creator><slash:comments>32</slash:comments><comments>http://blogs.msdn.com/ricom/comments/1523243.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ricom/commentrss.aspx?PostID=1523243</wfw:commentRss><description>&lt;P&gt;I continue to be astounded by what my readers can come up with.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;As usual I had a purpose for posing &lt;A href="http://blogs.msdn.com/ricom/archive/2007/01/22/performance-quiz-12-the-cost-of-a-good-hash.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2007/01/22/performance-quiz-12-the-cost-of-a-good-hash.aspx"&gt;my last question&lt;/A&gt;,&amp;nbsp;and that purpose was to show that basically it's hard to get a handle on what things cost in any kind of omnibus way.&amp;nbsp; But imagine my surprise when Frank and&amp;nbsp;Shuggy turned this into an interesting discussion of hashing techniques and combining hashes.&amp;nbsp; To say nothing of some of the other regulars (you know who you are) who have insightful comments.&amp;nbsp; Those comments are definitely worth a read, and thanks very much for presenting them!&amp;nbsp; &lt;/P&gt;
&lt;P&gt;But I don't know that anyone has answered my question quite squarely (except maybe Alois) and I think this makes my point that it is hard to answer this question.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;But what if you were to download &lt;A href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip" mce_href="http://download.microsoft.com/download/8/d/a/8da59e54-0bbf-47c4-ab76-e38f478a44d2/log10costs.zip"&gt;this file&lt;/A&gt;...&amp;nbsp; I think you'd find it quite helpful.&lt;/P&gt;
&lt;P&gt;What you see is a compilation of one cost metric for the entire framework.&amp;nbsp; It's not the only cost metric by any means (you could imagine many others), but it is an interesting one.&amp;nbsp;&amp;nbsp; The cost is computed by doing a static analysis of all allocations made in the call tree of each method.&amp;nbsp; It's the metric that I discussed in this article on &lt;A href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2006/12/11/avoiding-coding-pitfalls-with-performance-signatures.aspx"&gt;performance signatures&lt;/A&gt;,&amp;nbsp;including the exception for methods&amp;nbsp;that look like they do a one-time allocation.&amp;nbsp; The cost is logarithmic, base 10, offset by 1, i.e.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;if (allocation_count == 0)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; cost = 0;&lt;BR&gt;else&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; cost = 1 + log10(allocation_count);&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;And I'm only reporting one digit after the decimal because I want to give the idea that this is just a rough cost. It's not supposed to be super-precise; it's just supposed to give you a rough idea of whether a method is very complex or not so much.&lt;/P&gt;
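&lt;P&gt;For anyone who wants to play with the formula, here is a small runnable version in Python. Treat it as a sketch: the function name is made up for illustration, and rounding to one decimal with round() is an assumption about how the reported digit is produced.&lt;/P&gt;

```python
import math


def allocation_complexity(allocation_count: int) -> float:
    """Return 0 for no allocations, else 1 + log10(count), to one decimal place."""
    if allocation_count == 0:
        return 0.0
    return round(1 + math.log10(allocation_count), 1)


# A few sample counts to show how slowly the number grows.
for count in (0, 1, 10, 250, 1000, 10**19):
    print(count, allocation_complexity(count))
```

&lt;P&gt;Each extra point of cost means roughly ten times as many allocation sites in the call tree, which is why a difference of 1 or 2 matters more than the digit after the decimal.&lt;/P&gt;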
&lt;P&gt;I'm trying to think of a good name for this metric; so far I'm thinking something like Allocation Complexity, but I'm open to better ideas.&lt;/P&gt;
&lt;P&gt;Anyway, snoop the file, then see if you can answer my question.&amp;nbsp; For bonus points, look at some other low level functions that might be interesting, like, say, Compare, Equals, or GetEnumerator.&amp;nbsp; It's very interesting to see that some methods that seem like they should be very low level are actually quite high level.&lt;/P&gt;
&lt;P&gt;I think costs like this one can be very useful in making decisions about which framework features to use in which contexts.&lt;/P&gt;
&lt;P&gt;Again, this is only one kind of cost and it's only approximate, but I think it's interesting nonetheless.&lt;/P&gt;
&lt;P&gt;Now would you like to try the original question again:&lt;/P&gt;
&lt;P&gt;Q: Name&amp;nbsp;five implementations of GetHashCode in the framework that do things that you wouldn't expect a hashing function to do.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1523243" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ricom/archive/tags/performance/default.aspx">performance</category><category domain="http://blogs.msdn.com/ricom/archive/tags/quiz/default.aspx">quiz</category><category domain="http://blogs.msdn.com/ricom/archive/tags/signatures/default.aspx">signatures</category></item></channel></rss>