I had an interesting email conversation here at work recently, and as usual I can't share it all, but I can share some of it because it's generally applicable and perhaps others could benefit from my response.  The particulars really aren't important in any case; the concept is what counts.

What happened was that one of my internal groups did a performance study and asked me for some feedback.  Now I think this was actually a great little study, but I did have an important criticism.  It's something that I've written about before.  Here's what I wrote them, almost word for word.

----

Thanks so much for sharing this with me.  This is really a great effort on your part, and I’m glad to see that you learned some things from it about how [your system] interacts with the CLR – I’m sure that knowledge will help you to make better decisions.

Sadly, that’s the good news.  The bad news is that I’m not sure you have the right data.  You might, but I’m not sure.  Let me explain why:

When doing performance tuning, and more importantly performance planning, context is everything.  It’s very difficult to interpret costs without context, and so it’s basically impossible to say whether something is “good enough,” “bad,” or anything like that.  It might not even be relevant, much less “good.”  It all depends on context.

In this particular case, these results are hard to interpret because they aren’t put in the context of representative customer scenarios.  The low-level cost analysis is great, but are these the right costs?  How do they play in a customer scenario?  When you add the customer side of the equation, are the costs better, worse, or the same?

You might want to take a quick look at these web pages for more background; I talked about this in the context of measuring the raw cost of exceptions a few months ago.

Problem: http://blogs.msdn.com/ricom/archive/2006/09/14/754661.aspx

Solution: http://blogs.msdn.com/ricom/archive/2006/09/25/771142.aspx

The comments on the “Problem” post are especially interesting.

So getting back to this particular case, here are some things I would do:

  • Put together a few representative use cases that show approximately how customers will use the feature you intend to test
  • Measure those test cases in context with an eye to how customers will see performance (a minimal sketch of this kind of measurement follows the list)
    • Startup time
    • Memory consumption
    • Throughput
  • See which costs are relevant in this context and which are not
  • Consider your possible solutions with regard to how they affect whatever costs are most important
  • When deciding which approach to use, consider the likely consequence in terms of how the customer sees it
  • Let’s suppose throughput turns out to be the hardest requirement to achieve
    • What is “world-class throughput,” the best you could possibly achieve?
    • How does each of your solutions compare to that?
    • What trade-offs in simplicity, semantics, etc. might you make? (e.g. a very small throughput loss traded for much greater ease of use could be a great choice for your customers)
    • Do you need to get an “A+” for throughput, or is a “C” good enough?  What about working set?  Startup time?
  • When you can, convert these goals, as the customer sees them, into raw low-level requirements like “no more than 1k of metadata” or whatever, but don’t be a slave to the low-level goals; the performance as the customer sees it should drive the analysis
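To make that measurement step concrete, here is a minimal sketch of timing a representative use case “in context.”  It is illustrative only: it’s written in Java rather than CLR code, and parseRecords is a hypothetical stand-in for whatever your customers actually do with the feature, but it touches all three dimensions from the list: first-use cost, steady-state throughput, and approximate memory consumption.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch of measuring a representative use case "in context":
// first-use cost, steady-state throughput, and rough memory consumption.
// parseRecords is a hypothetical stand-in for the feature under test.
public class ScenarioBenchmark {

    // Hypothetical representative workload: whatever the customer
    // actually does with the feature, end to end.
    static List<String[]> parseRecords(int count) {
        List<String[]> records = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            records.add(("id," + i + ",value," + (i * 3)).split(","));
        }
        return records;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // First-use cost: includes class loading, JIT warmup, and so on.
        long t0 = System.nanoTime();
        parseRecords(1);
        System.out.printf("first use: %.2f ms%n", (System.nanoTime() - t0) / 1e6);

        // Steady-state throughput over several iterations.
        int iterations = 50, recordsPerIteration = 100_000;
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            parseRecords(recordsPerIteration);
        }
        double seconds = (System.nanoTime() - t1) / 1e9;
        System.out.printf("throughput: %.0f records/sec%n",
                (double) iterations * recordsPerIteration / seconds);

        // Rough working set for one scenario's worth of live data.
        // (GC-based measurement is approximate; good enough for orientation.)
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();
        List<String[]> held = parseRecords(recordsPerIteration);
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.printf("approx. memory: %d KB for %d records%n",
                (after - before) / 1024, held.size());
    }
}
```

The absolute numbers mean very little on their own; the point is to collect them from a scenario shaped like real customer usage, so you can tell which of your low-level costs actually matter.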

To summarize:  You have great-looking raw data for some dimensions that look interesting, but no way to interpret them yet.  Calibrate according to some representative use cases.  Correlate your use cases to consumption metrics like the ones you have and see where you stand.  Hopefully the things you have already measured will turn out to be the dominant costs; if not, look into whatever seems most important.

Remember, as in the article, you need to measure in context so that you can see how your code affects a working system.  If your system used, for instance, a lot of L2 cache, you might not notice this when running it alone, but you would see that it was disproportionately degrading the performance of other code.  The sketch below illustrates the effect.
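A toy experiment along those lines might look like the following (again Java, purely as an illustration; the array sizes and the stride are arbitrary assumptions).  It times the same workload alone and then again while a neighbor thread churns through a large array, competing for shared cache:

```java
// A toy sketch of measuring cache contention: the same workload is timed
// alone and then while a neighbor thread competes for shared cache.
// All sizes and constants here are illustrative assumptions.
public class CacheContentionDemo {

    static volatile long sink; // defeats dead-code elimination

    // The workload under test: pseudo-random reads over a cache-sized array.
    static long workload(int[] data) {
        long sum = 0;
        int idx = 0;
        for (int i = 0; i < 20_000_000; i++) {
            idx = (idx * 1103515245 + 12345) & (data.length - 1); // power-of-two mask
            sum += data[idx];
        }
        return sum;
    }

    static long timeMillis(Runnable r) {
        long t0 = System.nanoTime();
        r.run();
        return (System.nanoTime() - t0) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1 << 20];      // ~4 MB: roughly cache-sized
        long[] scratch = new long[1 << 24]; // ~128 MB: cache-hostile

        sink = workload(data); // warm up the JIT before timing

        long alone = timeMillis(() -> sink = workload(data));

        // The "other code" in the system: a thread streaming through a big
        // array, touching every cache line and evicting the workload's data.
        Thread neighbor = new Thread(() -> {
            long s = 0;
            while (!Thread.currentThread().isInterrupted()) {
                for (int i = 0; i < scratch.length; i += 8) { // one read per 64-byte line
                    s += scratch[i];
                }
            }
            sink = s;
        });
        neighbor.start();
        long contended = timeMillis(() -> sink = workload(data));
        neighbor.interrupt();
        neighbor.join();

        System.out.printf("alone: %d ms, with cache-hungry neighbor: %d ms%n",
                alone, contended);
    }
}
```

If the contended time is much worse than the standalone time, the workload’s cache appetite is a real cost, and it’s one you would never have seen measuring it in isolation.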

----


The concept of context comes up a lot in performance work.  For instance, when I wrote a couple of weeks ago about Performance Signatures, that was another way to try to assess the wisdom of using certain methods based on the context in which they are going to be used.  Context is critical.