In my last entry, I suggested pre-rasterizing vector images as an optimization strategy. With the example presented, we realized a fairly substantial performance bump - a 1,802% increase in performance.

This technique has applications beyond hand-written vector code like the rendering code we wrote to create the navigation buttons. On Windows, the predominant file format for vector images is the metafile - either a WMF or an EMF. But what exactly is a metafile? Along with headers, handles, and a palette, a metafile is a series of metafile records that map directly to GDI calls. When you display a metafile, you are essentially "playing back" these GDI calls one after the other. So, in the same way that we were able to improve the performance of our C# GDI+ calls by pre-rasterizing, we could improve the performance of our metafile rendering by pre-rasterizing.

Have we found a magic bullet? Can we really promise a performance increase of ~2,000% across the board using this technique? Unfortunately, the answer is a definitive no. As with so many techniques for improving performance, the real answer is "it depends," and you need to validate any performance work you do with measurement.
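To make "validate with measurement" concrete, here is a minimal timing sketch in Python. It is not the harness used for the numbers in this post; `average_ms` and `placeholder_draw` are hypothetical names, and the placeholder simply stands in for whatever real-time or pre-rasterized draw call you want to compare.

```python
import time

def average_ms(draw, runs=1000):
    """Return the mean wall-clock time of draw() in milliseconds."""
    draw()  # warm-up call so one-time setup cost is not counted
    start = time.perf_counter()
    for _ in range(runs):
        draw()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs

def placeholder_draw():
    # Hypothetical stand-in for a real rendering call.
    sum(i * i for i in range(1000))

print(f"{average_ms(placeholder_draw, runs=200):.4f} ms per call")
```

Averaging over many runs matters here because a single draw call is often shorter than the noise in the timer.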

In the example we used in my last post, we were placing a series of alpha blended objects on top of each other. Since monitors don't have 256 layers of pixels with varying degrees of translucency, alpha blending must be accomplished by taking the original color and "mixing" it with the color you are placing on top of it, weighted by the alpha channel. With our various gradient alpha channels, this must be done pixel by pixel. Obviously, that's quite a bit of math, and that takes time. Radial gradient blends also tend to be computationally intensive, so pre-rasterizing our images was quite a bit of help. We were also rendering something fairly small, and moving pre-rendered images around takes more time as the size of the image grows.
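The "mixing" described above can be sketched with the standard source-over formula for 8-bit channels. This is only an illustration of the arithmetic; GDI+ may do it differently internally (for example with premultiplied alpha), and `blend_channel` and `blend_pixel` are hypothetical helper names.

```python
def blend_channel(src, dst, alpha):
    """Blend one 8-bit channel of src over dst; alpha is 0 (clear) to 255 (opaque)."""
    # Weighted average of the two colors, rounded to the nearest integer.
    return (src * alpha + dst * (255 - alpha) + 127) // 255

def blend_pixel(src_rgb, dst_rgb, alpha):
    """Blend an (r, g, b) source pixel over an opaque destination pixel."""
    return tuple(blend_channel(s, d, alpha) for s, d in zip(src_rgb, dst_rgb))

# Half-transparent red over opaque blue.
print(blend_pixel((255, 0, 0), (0, 0, 255), 128))
```

Every covered pixel needs a multiply-and-add like this per channel for each translucent layer, which is why stacks of gradient-alpha shapes are expensive to draw in real time.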

To try to understand these performance characteristics better, let's begin with a metafile. I used a metafile of a house from the Microsoft Office clip-art library, which looks relatively complex but doesn't appear to use alpha channels (although I did not dissect the metafile to verify this).

Rendering a metafile from Office Clip-Art Collection

Rendering this image at 178 x 153 pixels, real-time rasterization required an average of 2.47 milliseconds, while rendering a pre-rasterized bitmap required an average of 0.35 milliseconds. While not quite as significant as our navigation buttons, that is still a 605% increase in performance. Should we pat ourselves on the back and say job well done? Not just yet...

When we compare rendering times at 1073 x 918 for this image, we observe that our real-time rendering requires an average of 12.24 milliseconds, while our pre-rasterized rendering requires an average of 19.19 milliseconds. Wait ... what happened? Our rendering speed decreased by 36%?
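The percentages quoted for the two sizes come from comparing speeds (renders per unit of time), not raw times. A small sketch of that arithmetic, using the averages reported above; `speed_change_percent` is a hypothetical helper name:

```python
def speed_change_percent(baseline_ms, candidate_ms):
    """Percent change in rendering speed when moving from baseline to candidate.

    Speed is the inverse of time, so the ratio is baseline / candidate.
    A positive result means the candidate is faster.
    """
    return (baseline_ms / candidate_ms - 1.0) * 100.0

small = speed_change_percent(2.47, 0.35)    # 178 x 153: real-time vs. pre-rasterized
large = speed_change_percent(12.24, 19.19)  # 1073 x 918: real-time vs. pre-rasterized
print(f"small: {small:.1f}%  large: {large:.1f}%")
```

With the rounded timings this gives roughly a 606% gain at the small size and roughly a 36% loss at the large size, in line with the figures above.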


Well, this tells us that the performance characteristics of this technique are related to the size of the image we are trying to render. However, with only two data points, the only trend we could fit is a straight line, which we really wouldn't expect. Rather, we would expect rendering time to follow a second-order polynomial, given that we are varying both width and height, so the number of pixels grows with the square of the scale. This means that we need more measurements, which we can collect and graph. Let's take a look at the chart:

Real-Time vs. Pre-Rasterized Rendering Performance: Size (as proportion of original) vs. Rendering Speed (in ms)

As we predicted, with additional data points we can find a trend line for real-time rendering (with an R² value of 0.9908598) of:

y = 0.0698718 x² + 0.0569021 x + 2.6299394

Similarly, we can find a trend line for pre-rasterized rendering (with an R² value of 0.9995371) of:

y = 0.1357692 x² + 0.2795874 x - 0.2465455

Using this data, we can make an informed decision. We know that our pre-rasterized rendering is faster than our real-time rendering at small sizes, but as the size gets larger, the pre-rasterized implementation slows down more quickly. This data would be even better if we had samples from additional hardware - can we be completely sure that the hardware, drivers, and memory configuration of a single machine are representative of everyone's? With some additional machine data, we can then make an educated decision. If we will never need to render our image above a particular size, and if that size is going to be predominantly smaller than the crossover point in our chart, then pre-rasterizing may be a good decision.
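The crossover point can be estimated directly from the two trend lines by setting them equal and solving the resulting quadratic. The sketch below does that with the fitted coefficients; note that x is in whatever units the chart's horizontal axis uses, so the result is a position on that axis rather than a pixel size. The function names are hypothetical.

```python
import math

# (a, b, c) for y = a*x^2 + b*x + c, taken from the two trend lines.
REAL_TIME = (0.0698718, 0.0569021, 2.6299394)
PRE_RASTERIZED = (0.1357692, 0.2795874, -0.2465455)

def evaluate(coeffs, x):
    """Evaluate a quadratic trend line at x."""
    a, b, c = coeffs
    return a * x * x + b * x + c

def crossover(first, second):
    """Return the smallest positive x where the two quadratics meet, or None."""
    a = second[0] - first[0]
    b = second[1] - first[1]
    c = second[2] - first[2]
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return None
    roots = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (1, -1)]
    positive = [r for r in roots if r > 0]
    return min(positive) if positive else None

x = crossover(REAL_TIME, PRE_RASTERIZED)
print(f"trend lines cross at x = {x:.2f}")
```

For these coefficients the lines cross a little above x = 5. Below that point the pre-rasterized fit is the lower (faster) curve, and above it the real-time fit is. Because this comes from a fit on one machine, treat it as a rough guide rather than a hard threshold.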

So, we don't want to just arbitrarily pre-rasterize everything. We want to understand the performance characteristics of pre-rasterizing the particular image we are rendering at the particular sizes that we reasonably expect to be drawing to, hopefully on a fairly representative selection of hardware. In some cases, pre-rasterization will be the hands-down winner. In others, it will be much closer. In still others, it will be the wrong approach. The data can help us decide.

If it's really close, then there are other factors to consider. Rendering in real time tends to make for smaller and more maintainable code. That is almost always important. There are trade-offs everywhere.