Yesterday I started a series of posts on how to use the CLRProfiler for the .NET Compact Framework.  The first post contained the basic information you need to get started.  I described how to install the profiler, launch an application on the device, and collect profiling data.

To direct the discussion, I've written a sample application that exhibits a performance problem that is surprisingly easy to fall into.  Throughout these posts I'll show you how to use the profiler to diagnose the problem.  To refresh your memory, the sample application is a basic game and the performance problem is that the main window paints far too slowly.

After I stopped profiling the game in the first post, the following summary page was displayed:

In this post I'll use some of the histograms to begin diagnosing our performance problem.


The first thing that stands out to me when looking at the summary form is the amount of managed data I'm creating.  While profiling the painting portion of my application I generated over 6MB of managed objects.  That's clearly far too much for a relatively simple operation like painting my main window.  My first step in determining what's going on is to get some basic statistics about the objects my application is using.  For example, I'm interested in which objects I'm creating, how many of them there are, and how long they live.  This data can be obtained by looking at some of the histograms the profiler offers.

I can choose to view a histogram for all objects created as my application ran or only for those objects that were in the GC heap when my application exited.  In my scenario I need to look at all objects.  If I were to look only at the objects alive at the end of the run, I might miss an important trend that occurred earlier on.
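To make the difference between the two views concrete, here is a minimal Python sketch.  It is not how the profiler is implemented; the event log format, the function name and the sample data are all assumptions made up for illustration.  It counts every object allocated during a run and, separately, the objects still live at exit.

```python
from collections import Counter

def allocated_and_surviving(events):
    """Return (all objects allocated, objects still live at exit), per type.

    `events` is a hypothetical log of (kind, object_id, type_name) tuples,
    where kind is "alloc" or "free". This format is an assumption for
    illustration, not the profiler's real log format.
    """
    allocated = Counter()
    live = {}  # object_id -> type_name for objects not yet collected
    for kind, object_id, type_name in events:
        if kind == "alloc":
            allocated[type_name] += 1
            live[object_id] = type_name
        elif kind == "free":
            live.pop(object_id, None)
    return allocated, Counter(live.values())

# Made-up run: three short-lived blocks and one long-lived string.
events = [
    ("alloc", 1, "Box.Block"),
    ("alloc", 2, "System.String"),
    ("free", 1, "Box.Block"),
    ("alloc", 3, "Box.Block"),
    ("free", 3, "Box.Block"),
    ("alloc", 4, "Box.Block"),
    ("free", 4, "Box.Block"),
]
allocated, surviving = allocated_and_surviving(events)
print("allocated:", dict(allocated))
print("surviving:", dict(surviving))
```

In this made-up run the short-lived blocks dominate the "all objects" view but do not appear at all in the "alive at exit" view, which is exactly the kind of trend the second view can hide.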

Clicking the "Histogram" button next to the "Allocated Bytes" value displays the following graph: 

The histogram form has two panes. The pane on the right describes how many instances of each type of object were created and the total size of those instances.  The pane on the left graphs type instances by size.  The color coding next to the types in the right pane matches the bars in the left pane which show the relative amounts of objects created.

A quick glance at this form helps narrow my suspicions about what's causing my performance issue.  As you can see, about 97% of the objects I created were of type Box.Block, as indicated by the red box in the right-hand pane and the red bar in the left-hand pane.  I can also see that each instance of Box.Block is relatively small, at an average size of 136 bytes (see the right-hand pane).
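The numbers in the right-hand pane are simple per-type aggregates.  As a rough sketch of the idea (again in Python, with a hypothetical record format and made-up sample data chosen only to mirror the proportions above, not taken from my actual profile), the instance count, total bytes, average size and share for each type could be computed like this:

```python
from collections import defaultdict

def histogram_by_type(allocations):
    """Aggregate (type_name, size_in_bytes) records into per-type rows.

    The record format is an assumption for illustration only.
    Rows are sorted so the type with the most bytes comes first.
    """
    stats = defaultdict(lambda: [0, 0])  # type_name -> [count, total_bytes]
    for type_name, size in allocations:
        stats[type_name][0] += 1
        stats[type_name][1] += size

    total_count = sum(count for count, _ in stats.values())
    total_bytes = sum(nbytes for _, nbytes in stats.values())

    rows = []
    for type_name, (count, nbytes) in stats.items():
        rows.append({
            "type": type_name,
            "count": count,
            "bytes": nbytes,
            "avg_size": nbytes / count,
            "count_share": 100 * count / total_count,
            "bytes_share": 100 * nbytes / total_bytes,
        })
    rows.sort(key=lambda row: row["bytes"], reverse=True)
    return rows

# Made-up data: 97 blocks of 136 bytes each and 3 strings of 40 bytes each.
sample = [("Box.Block", 136)] * 97 + [("System.String", 40)] * 3
rows = histogram_by_type(sample)
for row in rows:
    print(row["type"], row["count"], row["bytes"], row["avg_size"])
```

Sorting by total bytes puts the biggest consumer first, which is usually the type worth investigating before any other.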


Who Allocated all those Objects?

Now that I know the majority of my objects are instances of Box.Block, I'd like to see where in my application those instances are getting created. 

To determine the source of my allocations I can right-click on the bar that represents Box.Block in the histogram and select "Show Who Allocated" (the bar turns black when selected):

Doing so brings up a window referred to as an Allocation Graph:

The Allocation Graph traces the flow of every call that allocated an instance of Box.Block.  I typically interpret this graph starting with the rightmost node.  This node represents all instances of Box.Block in the system.  Stepping back one level to the left, we see two nodes representing methods that created instances of Box.Block: Form1.RotateGameBlocks and Form1.InitializeGameBlocks.  The data in these nodes tells us that 75% of the Blocks were created in RotateGameBlocks and 25% were created in InitializeGameBlocks.  Notice that the width of the line connecting two nodes represents the percentage of instances that the call created.
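Conceptually, the level of the graph just to the left of the Box.Block node is a grouping of allocations by the method that performed them.  The following Python sketch shows that grouping under stated assumptions: the (type, call stack) record format is hypothetical, the "Main" frame is a placeholder, and the four sample records are made up so that the split matches the 75% and 25% figures above.

```python
from collections import Counter

def allocators_of(type_name, allocations):
    """Return {allocating_method: percentage} for one type.

    `allocations` is a hypothetical list of (type_name, call_stack) pairs,
    where call_stack lists frames outermost first, so the last frame is
    the method that performed the allocation.
    """
    callers = Counter(
        stack[-1] for allocated_type, stack in allocations
        if allocated_type == type_name
    )
    total = sum(callers.values())
    return {method: 100 * count / total for method, count in callers.items()}

# Made-up records; "Main" stands in for whatever actually calls these methods.
allocations = [
    ("Box.Block", ["Main", "Form1.InitializeGameBlocks"]),
    ("Box.Block", ["Main", "Form1.RotateGameBlocks"]),
    ("Box.Block", ["Main", "Form1.RotateGameBlocks"]),
    ("Box.Block", ["Main", "Form1.RotateGameBlocks"]),
    ("System.String", ["Main"]),
]
shares = allocators_of("Box.Block", allocations)
for method, percent in shares.items():
    print(method, percent)
```

Repeating the same grouping on the next frame up the stack would give the next level of nodes to the left.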

Now that I know where my objects are coming from I can dig into my code to see what's going on. 

In some scenarios, the information we've learned so far may be all that we need to fix the problem.  However, there are a few more pieces of data that may be required in some cases.  For example, it may be useful to know the times at which Blocks were created and destroyed.  Also, if RotateGameBlocks and InitializeGameBlocks are long, complicated methods, we may need to know the exact calls within those methods that caused the allocations.  I'll describe how to get this information in future posts.



This posting is provided "AS IS" with no warranties, and confers no rights