OK, so it's been a long 'couple of days' since my last post, but I haven't disappeared completely. I'm back from vacation, caught up on email, and starting back into writing about Windows performance. One thing I've noticed over the past week or so is that, with the pending General Availability of Windows Vista, we're starting to see more benchmarking of our new OS.

One in particular, over at Tom's Hardware, caught my eye as a good example of how to measure the OS. Overall, Tom's found that the OS performed pretty well at an application level and that some of the intensive 3D graphics operations weren't quite there yet. What impressed me most was their methodology. They really understand that some of the changes that we made in Windows Vista to help the user can really wreak havoc with benchmarks.

To quote their article:

Knowing that Windows Vista has its SuperFetch feature, it is important to set up your test system to receive maximum performance that is reproducible... Vista learned about our preferred applications: Microsoft Office Outlook launched noticeably faster, and Skype launched almost instantly ...they are available much more quickly by relocating frequently accessed files from the slow hard drive into the quicker main memory.

Tom's understands that Windows Vista adapts to the user, so they followed a checklist to make sure that their results were reproducible. We've been working with several partners on how to accurately benchmark Windows Vista and, as a result, have developed a set of guidelines that we use for our own in-house methodology. These aren't necessarily recommended for everyday use but will ensure that your benchmarks run smoothly. We'll be publishing these guidelines as a whitepaper (which is currently in editing) but I'd like to give the blog readers a preview.
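
For the curious, here's one way to put a number on "reproducible". This is only a minimal Python sketch, not our in-house tooling: the function names (time_runs, summarize) are made up for illustration, and the workload is whatever callable you hand it. It times the same workload several times and reports the run-to-run spread.

```python
import statistics
import time


def time_runs(workload, runs=5, clock=time.perf_counter):
    """Run `workload` `runs` times and return the elapsed seconds of each run."""
    samples = []
    for _ in range(runs):
        start = clock()
        workload()
        samples.append(clock() - start)
    return samples


def summarize(samples):
    """Return the mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    cv = stdev / mean if mean else 0.0  # spread relative to the mean
    return {"mean": mean, "stdev": stdev, "cv": cv}
```

Calling summarize(time_runs(my_workload)) gives a quick read on spread: a coefficient of variation that keeps shrinking from one batch of runs to the next suggests the system is still adapting to the workload, which is the effect Tom's was guarding against.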

The easiest way to approach Microsoft's recommendations is to break them down into three groups: configuring the system, preparing the system, and preparing the workload. Today I'll discuss the bits involved in configuring the system:

Prepare graphics

We recommend turning off animations (of windows, menus, dialog boxes, etc.) that introduce artificial delays. I love watching a menu slide down into place instead of popping up onto the screen but, in a benchmarking scenario, these animations can get in the way of accurate measurement. This is especially true if you are timing a sequence of events, where each animation's delay adds to the total. Although they improve the user experience, animations may skew your responsiveness measurements.
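
To make the "sequence of events" point concrete, here's a rough Python sketch (again, an illustration and not something from our methodology) that times each step of a scripted sequence separately. The helper name time_sequence and the step names in the usage note are hypothetical.

```python
import time


def time_sequence(steps, clock=time.perf_counter):
    """Time each (name, action) step; return (per-step seconds, total seconds)."""
    per_step = {}
    for name, action in steps:
        start = clock()
        action()
        per_step[name] = clock() - start
    return per_step, sum(per_step.values())
```

If you pass it steps such as ("open_menu", ...) and ("open_dialog", ...), any fixed animation delay lands in every step that triggers one, so the total grows with the number of animated steps rather than with the work actually being measured.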

One place where Tom's differed from our methodology was in using Aero Glass. At Microsoft, we believe that glass is efficient, and we recommend leaving all Aero features (except animation) at the system default settings. If the system enables compositing, keep it on. The OS wants to enable transparency? Then by all means, you should run a benchmark with transparency turned on.

Prepare User Account Control

If your workload requires elevation, then UAC prompts can really create some hurdles. Put bluntly, UAC is not friendly to scripting (imagine if you could script clicking 'Accept' on an elevation dialog - shudder). This promotes good security but, again, makes for benchmarking headaches.

We don't recommend completely disabling the feature - just the prompts. We recommend going into the security policy manager and configuring the system to automatically accept UAC prompts for Admin group users. Thus, the user runs as a normal user most of the time; elevation still happens, but the process is automated. We want benchmarks to measure any perf impact of the feature (which we believe to be negligible :) ) but not be hampered by the prompts.
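
If you want a script to confirm the setting before a run, something like the following Python sketch could serve as a pre-flight check. It rests on an assumption the post doesn't spell out: that the policy in question is the one stored in the ConsentPromptBehaviorAdmin registry value, where 0 means "elevate without prompting". The function names are made up for illustration, and the check only reads the value; it doesn't change anything.

```python
import sys

# Assumed registry location of the policy (not named in the post itself).
UAC_POLICY_KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"
UAC_ADMIN_PROMPT_VALUE = "ConsentPromptBehaviorAdmin"
ELEVATE_WITHOUT_PROMPTING = 0


def read_admin_prompt_behavior():
    """Return the policy value, or None when not on Windows or it is not set."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, UAC_POLICY_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, UAC_ADMIN_PROMPT_VALUE)
    except OSError:
        return None
    return value


def prompts_are_suppressed(value):
    """True only when administrators are elevated without a prompt."""
    return value == ELEVATE_WITHOUT_PROMPTING
```

A benchmark script could call prompts_are_suppressed(read_admin_prompt_behavior()) up front and bail out early, rather than stalling halfway through a run on a dialog nobody is there to click.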

Turn off system restore

This one is really your call - Windows Vista has the ability to roll files back to previous versions, which has saved my clumsy bacon on more than one occasion. Although the impact on standard users is minimal, the work that system restore does in the background may affect the repeatability of benchmarks (this really depends on the amount of disk activity). The bottom line on this one is: if you disabled it for your Windows XP benchmarking, go ahead and do the same for Windows Vista.

 

Well, I hope that this has been helpful in understanding the steps that we take on the perf team to prepare a system for benchmarking; we recommend doing the same for your own measurements. My next post will look at how to prepare a system for actual benchmarking runs.

'Til next time,
-M