<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx</link><description>(Some additional remarks on this posting can be found here -- feel free to continue comments on that chain) Here are a few things that I often look for when reviewing code or APIs for performance issues. None of these are absolutes but they’re little</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Delegates vs. Virtual Methods</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#219787</link><pubDate>Tue, 24 Aug 2004 22:13:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:219787</guid><dc:creator>Diego Mijelshon</dc:creator><description>The problem with this is ease of use.&lt;br&gt;&lt;br&gt;While C#/.NET 2.0 introduces Anonymous methods, it doesn't do the same for Anonymous classes, which Java has had for some years now.
&lt;br&gt;&lt;br&gt;So, while in Java you can do this:&lt;br&gt;&lt;br&gt;class SomeClass&lt;br&gt;{&lt;br&gt;	abstract public void DoX();	&lt;br&gt;}&lt;br&gt;SomeClass instance = new SomeClass()&lt;br&gt;{&lt;br&gt;	public void DoX()&lt;br&gt;	{&lt;br&gt;		//implementation&lt;br&gt;	}&lt;br&gt;};&lt;br&gt;instance.DoX();&lt;br&gt;&lt;br&gt;&lt;br&gt;In C# you have to do this:&lt;br&gt;&lt;br&gt;&lt;br&gt;class SomeClass&lt;br&gt;{&lt;br&gt;	public void DoX(StrategyDelegate sd)&lt;br&gt;	{&lt;br&gt;		sd();&lt;br&gt;	}&lt;br&gt;}&lt;br&gt;SomeClass instance = new SomeClass();&lt;br&gt;instance.DoX(delegate&lt;br&gt;{&lt;br&gt;	//implementation&lt;br&gt;});</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#219829</link><pubDate>Tue, 24 Aug 2004 23:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:219829</guid><dc:creator>Sam</dc:creator><description>Hi Rico,&lt;br&gt;&lt;br&gt;&amp;gt;Don’t forget that each class and function has &amp;gt;a static overhead associated with it – all &amp;gt;things being equal less classes with less &amp;gt;functions gives better performance&lt;br&gt;&lt;br&gt;I have been wondering about this actually,&lt;br&gt;is there any chance you could expand on this point?&lt;br&gt;&lt;br&gt;Specifically, which parts of the CLR degrade in performance based on the number of classes I define in my application?
&lt;br&gt;&lt;br&gt;Thanks,&lt;br&gt;&lt;br&gt;Sam</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#219920</link><pubDate>Wed, 25 Aug 2004 02:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:219920</guid><dc:creator>Rico Mariani</dc:creator><description>It may be that there are lingering non-linearities in some of the CLR class management algorithms but that isn't something I see people hit at all.&lt;br&gt;&lt;br&gt;What I was referring to is the fact that there is space cost associated with each class -- metadata to load, method tables, method descriptions per method and so forth.  The situation is somewhat better if you ngen but nonetheless there is definately a space cost there.  Reducing the overhead is always a good thing.</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220003</link><pubDate>Wed, 25 Aug 2004 04:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220003</guid><dc:creator>Matthew W. Jackson</dc:creator><description>Just for fun, try counting the number of these performance &amp;quot;issues&amp;quot; that apply to System.Windows.Forms.  And I'm in no way knocking the SWF designers...  I know that UI doesn't need to be *that* performant, but it does have a big effect on memory usage.&lt;br&gt;&lt;br&gt;1) Delegates...Lots of delegates...unless you inherit from most of the controls and provide any custom functionality you need in overridden methods.  It's a fairly clean design, but it also leads to more pointers.&lt;br&gt;&lt;br&gt;2) Virtual Methods...lots of these too.  Even the core methods that handle the message loop are virtual.  Of course, this makes it easier to provide customized behavior.  But I should also mention that many properties are virtual as well.
&lt;br&gt;&lt;br&gt;3) Sealing...Not a big problem, since sealing most of these classes would really reduce extensibility.  And there aren't too many sealed classes I want unsealed here, except for a few big ones, such as the Common Dialogs.&lt;br&gt;&lt;br&gt;4) Type and API Inflation...lots of classes, but I don't know that I'd say too many.  I wish the superceded classes could be moved to another library (such as the old MainMenu, etc.), but that would break compatability.  Besides, the library is NGen'd.&lt;br&gt;&lt;br&gt;5) API Chunkiness...I'm not sure how this applies.&lt;br&gt;&lt;br&gt;6) Concurrency...Not much of an issue for the forms library, since you're not supposed to use the controls outside of their message-loop thread.  Invoke is provided, and most synchronization is left up to the consumer&lt;br&gt;&lt;br&gt;7) Fewer DLLs...Not a problem.  It's one huge DLL.  It uses System, System.Drawing, System.Data and maybe a few others, but those need to be separate for obvious reasons.&lt;br&gt;&lt;br&gt;8) Late Bound Semmantics...Quite a bit here.  Both data binding and automatic localization are late bound.  The first could be solved safely with delegates to properties (which I don't think is going to ever happen), while the second could be solved by doing your own localization code.&lt;br&gt;&lt;br&gt;9) Less Pointers...LOTS of pointers in SWF.  Not only does each Control maintain an object collection of all its child controls, but the designer adds ANOTHER pointer for each designed control, along with all of the parent pointers.  Luckily Whidbey lets you design controls without a member.&lt;br&gt;&lt;br&gt;10) Cache with Policy...Not sure if this is a problem in SWF.  I'd have to load it up in Reflector to see, but that's a lot of code to skim.&lt;br&gt;&lt;br&gt;And one more...Control deriving from MarshalByRefObject seems like a bit of an oversight.  
If I want to communicate with my app across domains, I will use a more abstract version of my app, not the form itself.  To make controls and forms perform well, the bulk of their code really needs to be moved to external classes, which leads to Type Inflation.</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220068</link><pubDate>Wed, 25 Aug 2004 07:44:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220068</guid><dc:creator>Luc Cluitmans</dc:creator><description>Thanks for the advice!&lt;br&gt;&lt;br&gt;I recently did some performance tests on an area you didn't mention: calculations. In my case, I was wondering what the actual performance differences are between doing calculations using Doubles and simulating fixed-point operations with Longs, and also what the effect is of having instances of Double.NaN in a data stream. Background: I am implementing some signal processing algorithms. These tests were of course simple, and not too many conclusions should be drawn from them, but I found a few surprises.&lt;br&gt;&lt;br&gt;- As soon as you do some multiplications (in addition to additions/compares), floating point is faster. This is more or less expected.&lt;br&gt;&lt;br&gt;- When checking algorithms that do nothing but simple operations (addition, subtraction, compare), longs are faster, but the margin by which they are faster was surprising: it is highly dependent on the processor! On the one Intel-based machine I tested, floating point arithmetic was about 1.5 times slower than long integer arithmetic, while for the AMD processor the difference was negligible...&lt;br&gt;&lt;br&gt;- Beware of NaNs: they severely slow down processing. I realize NaNs are not encountered in many programming scenarios, but in the signal processing scenarios I was analyzing they are used (to indicate missing data).
In a processing pipeline that only uses (floating point) additions, subtractions and compares, they slowed down the test by a factor of 100 if I used only NaNs as inputs (which is an unrealistic scenario, but still...)&lt;br&gt;&lt;br&gt;- Explicitly testing for NaNs (putting an 'if(Double.IsNaN(x)){}else{}' block around your core calculation code) barely improves the performance in the case where all processed samples actually are NaNs, and degrades performance in other cases quite significantly. It seems that the 'Double.IsNaN()' method itself has a similar influence on the pipeline as actually having NaNs in the data...&lt;br&gt;</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220178</link><pubDate>Wed, 25 Aug 2004 13:25:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220178</guid><dc:creator>Ryan Lamansky (Kardax)</dc:creator><description>A note on sealing:&lt;br&gt;&lt;br&gt;This only improves performance if you have virtual methods on that class or its parents, and you're calling them.  It helps because calling a virtual method that could not possibly be overridden will be changed into a direct call or inlined by the JIT.&lt;br&gt;&lt;br&gt;I still recommend sealing as much as you can, though.  You can always unseal it later without breaking anything.</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220512</link><pubDate>Wed, 25 Aug 2004 22:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220512</guid><dc:creator>Sam</dc:creator><description>Hi Rico,&lt;br&gt;&lt;br&gt;Thanks for the info.&lt;br&gt;&lt;br&gt;&amp;lt;Quote&amp;gt;&lt;br&gt;What I was referring to is the fact that there is space cost associated with each class -- metadata to load, method tables, method descriptions per method and so forth. 
The situation is somewhat better if you ngen but nonetheless there is definitely a space cost there. Reducing the overhead is always a good thing. &lt;br&gt;&amp;lt;/Quote&amp;gt;&lt;br&gt;&lt;br&gt;It would be interesting to know how much space a simple class (say, with one non-virtual method) takes in the CLR with all its associated data. Can you think of a good way to measure this?&lt;br&gt;&lt;br&gt;Thanks,&lt;br&gt;&lt;br&gt;Sam</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220589</link><pubDate>Wed, 25 Aug 2004 23:54:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220589</guid><dc:creator>Rico Mariani</dc:creator><description>Write a little program generator that makes something like this:&lt;br&gt;&lt;br&gt;using System;&lt;br&gt;&lt;br&gt;class Test&lt;br&gt;{&lt;br&gt;  public static void Main(String [] args)&lt;br&gt;  {&lt;br&gt;     Foo1.f();&lt;br&gt;     Foo2.f();&lt;br&gt;     ...&lt;br&gt;     Console.ReadLine(); // pause to allow measurement&lt;br&gt;  }&lt;br&gt;  class Foo1 { static public void f() {} }&lt;br&gt;  class Foo2 { static public void f() {} }&lt;br&gt;  ...&lt;br&gt;}&lt;br&gt;&lt;br&gt;Make it for 500 Foos and measure the size with your favorite tool.  Then make it for 1000 Foos, measure the size again, and divide the delta by 500 to get the per-class size.&lt;br&gt;&lt;br&gt;Consider various different measures, such as size of the assembly, working set size of the process, working set size using ngen, etc. etc.
&lt;br&gt;&lt;br&gt;Might make a nice Quiz #5 </description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#220850</link><pubDate>Thu, 26 Aug 2004 12:59:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220850</guid><dc:creator>Andy Sujdak</dc:creator><description>Luc&lt;br&gt;&lt;br&gt;Performance implication of evenly splitting code into 50% add 50% multiply (or any other ratio) is processor architecture specific, just so you know. I BELIEVE it's the Intel chips that like an even split better (they also do exceedingly well at vectorized SSE2 but have problems with branchy and scalar-y code).  Obviously the biggest place where you see a nice even split like that between adds and multiplies is in matrix multiply.&lt;br&gt;&lt;br&gt;AMD chips GENERALLY have better performance on code that wasn't written for low-level performance optimizations but Intel chips generally better if you've really killed yourself to maximize cache coherency and use of vectorized instructions.&lt;br&gt;&lt;br&gt;However, you probably want to worry just as much (if not more) about how you can:&lt;br&gt;1. maximize parallelism in your algorithms&lt;br&gt;2. distribute this parallel load best among multiple cores, chips, and computers. &lt;br&gt;&lt;br&gt;Most of signal processing work is pretty EP-ish and .NET offers some different ways besides the old MPI of exploiting this.</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#222116</link><pubDate>Sat, 28 Aug 2004 19:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:222116</guid><dc:creator>Sam</dc:creator><description>Rico,&lt;br&gt;&lt;br&gt;I thought I would give your suggestion a shot last night. 
&lt;br&gt;&lt;br&gt;If there is one thing I have learned from the experience, it's that accurately measuring the memory usage of a .net application is quite a tricky business, considering working sets and so forth.&lt;br&gt;&lt;br&gt;The best I have come up with so far is &amp;quot;somewhere around the 200 bytes mark&amp;quot;, but I have little confidence in that figure.&lt;br&gt;&lt;br&gt;Have you any suggestions on how I might go about getting an accurate memory reading?&lt;br&gt;&lt;br&gt;Sam&lt;br&gt;</description></item><item><title>Performance Code Review</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#222373</link><pubDate>Sun, 29 Aug 2004 22:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:222373</guid><dc:creator>ThoughtChain</dc:creator><description /></item><item><title>Blog link of the week 35</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#222445</link><pubDate>Mon, 30 Aug 2004 01:13:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:222445</guid><dc:creator>Daniel Moth</dc:creator><description>&lt;a target="_new" href="http://www.zen13120.zen.co.uk/Blog/2004/08/blog-link-of-week-35.html"&gt;http://www.zen13120.zen.co.uk/Blog/2004/08/blog-link-of-week-35.html&lt;/a&gt;</description></item><item><title>C# Performance Tips</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#224401</link><pubDate>Thu, 02 Sep 2004 04:46:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:224401</guid><dc:creator>Alex Dong's Weblog</dc:creator><description /></item><item><title>Performance Tidbits by Rico Mariani</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#224552</link><pubDate>Thu, 02 Sep 2004 10:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:224552</guid><dc:creator>Insomnia</dc:creator><description /></item><item><title>re: Performance 
Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#224589</link><pubDate>Thu, 02 Sep 2004 09:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:224589</guid><dc:creator>M</dc:creator><description>Rico&lt;br&gt;&lt;br&gt;Any thoughts on this? Java's advanced JIT has been inlining virtual method calls for many years.&lt;br&gt;&lt;a target="_new" href="http://java.sun.com/products/hotspot/whitepaper.html"&gt;http://java.sun.com/products/hotspot/whitepaper.html&lt;/a&gt;&lt;br&gt;&lt;br&gt;</description></item><item><title>Performance Tidbits by Rico</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#225100</link><pubDate>Fri, 03 Sep 2004 09:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:225100</guid><dc:creator>Dev Notes</dc:creator><description /></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#225290</link><pubDate>Fri, 03 Sep 2004 15:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:225290</guid><dc:creator>Rico Mariani</dc:creator><description>I haven't read the whitepaper -- it's usually a very bad idea for me to look at Sun's Intellectual Property in any way as sadly it's more likely to put me in a bad position than a good one.  But let me talk about dynamic inlining -- it's not a new idea anyway.&lt;br&gt;&lt;br&gt;There are cases where &amp;quot;inlining&amp;quot; a virtual function call works out well (i.e. 
guess the class it probably is, put a test for that class, and then either run the inlined code or else make the call if it turns out the class is wrong):&lt;br&gt;&lt;br&gt;1) You have to know what the call is probably going to be&lt;br&gt;&lt;br&gt;2) You have to be willing to eat the extra code size&lt;br&gt;&lt;br&gt;3) What's being inlined needs to be big enough that adding an extra test won't slow it down percentage-wise by much yet not so big that inlining is moot (keeping in mind processor caches are fixed size so copying the code to lots of places isn't such a great idea -- by default bigger is slower)&lt;br&gt;&lt;br&gt;4) It's best if you know not only which function to inline but also which of the many call sites is the important one to inline -- less bloat that way.&lt;br&gt;&lt;br&gt;5) Even if all of those things work out it's still not as good as not having the test and the fallback virtual dispatch at all and just inlining the code.  So still seal if you can :)&lt;br&gt;&lt;br&gt;</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#225848</link><pubDate>Sun, 05 Sep 2004 16:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:225848</guid><dc:creator>Nicholas Allen</dc:creator><description>Rico-&lt;br&gt;&lt;br&gt;The Sun inliner doesn't put a test into the code.  It just inserts the virtual method body directly.  So it ends up being the exact same native code (and performance) as sealing.  
They have a speculative optimizer/deoptimizer pair to compute when devirtualizing a method in this way is safe.</description></item><item><title>.NET Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#226316</link><pubDate>Tue, 07 Sep 2004 17:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:226316</guid><dc:creator>Matt's Blog</dc:creator><description /></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#229285</link><pubDate>Tue, 14 Sep 2004 10:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:229285</guid><dc:creator>M</dc:creator><description>Hey Rico&lt;br&gt;&lt;br&gt;Got anything to add in response to Nicholas' comments?&lt;br&gt;&lt;br&gt;Is this perhaps the reason why .NET methods were made non-virtual by default, while Java methods are virtual by default?&lt;br&gt;&lt;br&gt;M.</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#229364</link><pubDate>Tue, 14 Sep 2004 14:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:229364</guid><dc:creator>Rico Mariani</dc:creator><description>I wasn't party to either decision so it's hard for me to say why the default is what it is.  I surely can't speak to my competitor's choice.  For C# that's really a question for Anders though, not me.&lt;br&gt;&lt;br&gt;I don't think we'd make such a choice on the basis of what our inliner happens to do at any given moment.  They probably thought it best to align with C++ on the matter.
&lt;br&gt;&lt;br&gt;</description></item><item><title>re: Performance Tidbits</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#231062</link><pubDate>Fri, 17 Sep 2004 23:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:231062</guid><dc:creator>Nicholas Allen</dc:creator><description>M-&lt;br&gt;&lt;br&gt;It probably wasn't an impact on the decision in Java.  It was several years in before JIT compilers became a regular feature, and several more before these kinds of aggressive optimizations were being made.  Of course, the idea of a JIT is not recent so Sun may have had it in mind, although I personally don't think that was the case.  You might be able to find more information about that design decision by asking Gosling, or examining the prototype Oak language.&lt;br&gt;&lt;br&gt;The reverse statement also might be true: the decision to make methods virtual by default may have spurred Sun to invest more resources into their VM and optimize that case.
&lt;br&gt;&lt;br&gt;As for C#, I think Rico says it well.</description></item><item><title>A performance tidbits reference</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#244737</link><pubDate>Tue, 19 Oct 2004 23:46:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:244737</guid><dc:creator>Rico Mariani's WebLog</dc:creator><description /></item><item><title>A performance tidbits reference</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#244741</link><pubDate>Tue, 19 Oct 2004 23:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:244741</guid><dc:creator>Rico Mariani's WebLog</dc:creator><description /></item><item><title>A performance tidbits reference</title><link>http://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx#244754</link><pubDate>Wed, 20 Oct 2004 00:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:244754</guid><dc:creator>dotRob</dc:creator><description /></item></channel></rss>