<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Windows Performance Blog : benchmarking</title><link>http://blogs.msdn.com/winperf/archive/tags/benchmarking/default.aspx</link><description>Tags: benchmarking</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Measuring Performance in Windows Vista (configuring the system)</title><link>http://blogs.msdn.com/winperf/archive/2007/01/30/measuring-perf-in-windows-vista-configuring-the-system.aspx</link><pubDate>Tue, 30 Jan 2007 22:58:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1558781</guid><dc:creator>mayers</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/winperf/comments/1558781.aspx</comments><wfw:commentRss>http://blogs.msdn.com/winperf/commentrss.aspx?PostID=1558781</wfw:commentRss><description>Ok, so it's been a long 'couple of days' since my last post but I haven't disappeared, completely. I'm back from vacation, caught up on email, and starting back into writing about Windows performance. One thing I've noticed over the past week or so is that with the pending General Availability of Windows Vista, we're starting to see more benchmarking of our new OS. 
&lt;P&gt;One in particular, &lt;A href="http://www.tomshardware.com/2007/01/29/xp-vs-vista/" mce_href="http://www.tomshardware.com/2007/01/29/xp-vs-vista/"&gt;over at Tom's Hardware&lt;/A&gt;, caught my eye as a good example of how to measure the OS. Overall, Tom's found that the OS performed pretty well at the application level and that some of the intensive 3D graphics operations weren't quite there yet. What impressed me most was their methodology. They understand that some of the changes we made in Windows Vista to help the user can wreak havoc with benchmarks.&lt;/P&gt;
&lt;P&gt;To quote their article: &lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Knowing that Windows Vista has its SuperFetch feature, it is important to set up your test system to receive maximum performance that is reproducible... Vista learned about our preferred applications: Microsoft Office Outlook launched noticeably faster, and Skype launched almost instantly ...they are available much more quickly by relocating frequently accessed files from the slow hard drive into the quicker main memory.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;A class="" title=benchmarking_checklist name=benchmarking_checklist&gt;&lt;/A&gt;Tom's understands that Windows Vista adapts to the user so, to make sure that their results were reproducible, they followed a checklist. We've been working with several partners&amp;nbsp;on how to accurately benchmark Windows Vista and, as a result, have developed a set of guidelines that we use for&amp;nbsp;our own in-house methodology. These aren't necessarily recommended for everyday use but will help ensure that your benchmarks run smoothly.&amp;nbsp;&amp;nbsp;We'll be publishing these guidelines as a whitepaper (which is currently in editing) but I'd like to give the blog readers a preview.&lt;/P&gt;
&lt;P mce_keep="true"&gt;The easiest way to approach Microsoft's recommendations is to break them down into three groups: configuring the system, preparing the system, and preparing the workload. Today I'll discuss the bits involved in configuring the system:&lt;/P&gt;
&lt;P&gt;&lt;B&gt;Prepare graphics&lt;/B&gt; &lt;/P&gt;
&lt;P&gt;We recommend turning off animations (of windows, menus, dialog boxes, etc.) that introduce artificial delays. I love watching a menu slide down into place instead of popping up onto the screen but, in a benchmarking scenario, these animations add time that has nothing to do with the work being measured. This is especially true if you are timing a sequence of events. Although they improve the user experience, animations can skew your responsiveness measurements. &lt;/P&gt;
&lt;P&gt;One place where Tom's differed from our methodology was in using Aero Glass. At Microsoft, we believe that glass is efficient and&amp;nbsp;recommend leaving all Aero features (except animation) at the system default settings. If the system enables compositing, keep it on. The OS wants to enable transparency? Then by all means, you should run a benchmark with transparency turned on.&lt;/P&gt;
&lt;P&gt;&lt;B&gt;Prepare User Account Control&lt;/B&gt;&lt;/P&gt;
&lt;P&gt;If your workload requires elevation, then UAC prompts can really create some hurdles. Put bluntly, UAC is not friendly to scripting (imagine if you could script clicking 'Accept' on an elevation dialog - shudder). This promotes good security but, again, makes for benchmarking headaches.&lt;/P&gt;
&lt;P&gt;We don't recommend completely disabling the feature - just the prompts. We recommend going into the security policy manager and configuring the system to automatically accept UAC prompts for Admin group users. That way, the user still runs as a normal user most of the time and elevation still happens, but the process is automated. We want benchmarks to measure any perf impact of the feature (which we believe to be negligible&amp;nbsp;:) ) but not be hampered by the prompts.&lt;/P&gt;
&lt;P&gt;&lt;B&gt;Turn off system restore&lt;/B&gt;&lt;/P&gt;
&lt;P&gt;This one is really your call - Windows Vista has the ability to roll files back to previous versions, which has saved my clumsy bacon on more than one occasion. Although the impact on standard users is minimal, the work that system restore does in the background may affect the repeatability of benchmarks (this really&amp;nbsp;depends on the amount of disk activity).&amp;nbsp;The bottom line on this one is: if you disabled it for your Windows XP benchmarking, go ahead and do the same for Windows Vista.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Well, I hope that this has been helpful in understanding the steps that we take on the perf team to prepare a system for benchmarking; we recommend doing the same for your own measurements. My next post will look at&amp;nbsp;how to prepare&amp;nbsp;a system&amp;nbsp;for actual benchmarking runs.&lt;/P&gt;
&lt;P mce_keep="true"&gt;‘till next time,&lt;BR&gt;-M&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1558781" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/winperf/archive/tags/benchmarking/default.aspx">benchmarking</category><category domain="http://blogs.msdn.com/winperf/archive/tags/vista/default.aspx">vista</category></item><item><title>The results are in</title><link>http://blogs.msdn.com/winperf/archive/2006/12/14/the-results-are-in.aspx</link><pubDate>Thu, 14 Dec 2006 12:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1287002</guid><dc:creator>mayers</dc:creator><slash:comments>9</slash:comments><comments>http://blogs.msdn.com/winperf/comments/1287002.aspx</comments><wfw:commentRss>http://blogs.msdn.com/winperf/commentrss.aspx?PostID=1287002</wfw:commentRss><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;As I said in yesterday’s introduction,&amp;nbsp;my job as an engineer on the Windows Vista team is to improve performance.&amp;nbsp;&amp;nbsp;I wanted to look at a study that measures a key area&amp;nbsp;that we&amp;nbsp;focused on&amp;nbsp;for Windows Vista – consistent responsiveness during the times that matter most to users (when starting up their machine, after being idle, when under the gun running tons of apps, etc.).&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;To objectively measure how we did, I’ve been working with a company named Principled Technologies.&amp;nbsp; If you’ve been involved in the (admittedly somewhat niche) specialty of perf testing over the last decade you likely know the &lt;A class="" href="http://principledtechnologies.com/about/about.htm" target=_blank mce_href="http://principledtechnologies.com/about/about.htm"&gt;people&lt;/A&gt; if not the company.&amp;nbsp; We commissioned Principled Technologies to develop, run, and document the results of a set of tests that compare the performance of Windows Vista RTM and Windows XP on common business tasks. &amp;nbsp;Today they published their findings &lt;A class="" href="http://principledtechnologies.com/clients/reports/Microsoft/VistaXPBusResp.pdf" target=_blank mce_href="http://principledtechnologies.com/clients/reports/Microsoft/VistaXPBusResp.pdf"&gt;here&lt;/A&gt;.&amp;nbsp;I am, of course, really excited by the results.&amp;nbsp; &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Now you should read the whole report, but I wanted to talk a bit about their key findings:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1; tab-stops: list .5in"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: navy; FONT-FAMILY: Wingdings; mso-fareast-font-family: Wingdings; mso-bidi-font-family: Wingdings"&gt;&lt;SPAN style="mso-list: Ignore"&gt;l&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;“Windows Vista was noticeably more responsive after rebooting than Windows XP on several common business operations.”&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;As I alluded to yesterday, SuperFetch is the key driver behind this. I want to point out, though, that 'after rebooting' can be seen as a proxy for lots of cold operations. Rebooting is just the easiest to reliably measure. The second bullet is:&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1; tab-stops: list .5in"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: navy; FONT-FAMILY: Wingdings; mso-fareast-font-family: Wingdings; mso-bidi-font-family: Wingdings"&gt;&lt;SPAN style="mso-list: Ignore"&gt;l&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;“Overall, Windows Vista and Windows XP were roughly equally responsive on most test operations. Windows XP was more responsive on some operations, and on those operations on which it was more responsive, Windows XP typically responded only a half a second or so faster.”&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;This is great, especially since Windows Vista is doing considerably more out of the box (e.g. UAC, Defender, search indexing, etc.). One of the most interesting bits for me was their 3rd highlight... you can run Aero without guilt!&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1; tab-stops: list .5in"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: navy; FONT-FAMILY: Wingdings; mso-fareast-font-family: Wingdings; mso-bidi-font-family: Wingdings"&gt;&lt;SPAN style="mso-list: Ignore"&gt;l&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;“Windows Vista Aero had little effect on the responsiveness of Windows Vista. Over 95 percent of the response-time differences between tests we ran with and without Vista Aero were under a tenth of a second, and all of the differences were under one second.”&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;We put quite a bit of effort into making sure that the new visuals were as efficient as possible and it really paid off. &lt;/FONT&gt;&lt;/P&gt;
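&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;If you want to summarize your own with/without-Aero timings the way that quote does, here is a rough sketch in Python. It is not how PT computed their numbers; the function name, the paired-list layout and the sample timings are all assumptions made up for illustration.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
from typing import Sequence


def fraction_within(with_aero: Sequence[float],
                    without_aero: Sequence[float],
                    threshold_s: float) -> float:
    """Return the share of paired response-time differences smaller than threshold_s."""
    if len(with_aero) != len(without_aero) or not with_aero:
        raise ValueError("need two non-empty, equal-length lists of timings")
    # Absolute difference per operation, in seconds.
    diffs = [abs(a - b) for a, b in zip(with_aero, without_aero)]
    within = sum(1 for d in diffs if threshold_s > d)
    return within / len(diffs)


if __name__ == "__main__":
    # Made-up timings in seconds, purely for illustration.
    aero_on = [1.20, 0.85, 2.10, 0.40]
    aero_off = [1.15, 0.80, 2.45, 0.42]
    print("share under 0.1 s:", fraction_within(aero_on, aero_off, 0.1))
    print("share under 1.0 s:", fraction_within(aero_on, aero_off, 1.0))
```
&lt;/PRE&gt;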
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt" mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;For the truly technical, as you would expect, the report lists exactly how PT developed and ran these tests.&amp;nbsp; The short answer is that they used a range of machines (laptops and desktops, 512 MB&amp;nbsp;to 2 GB, mix of graphics cards &amp;amp; processors, high-end and bare minimum, etc.). I encourage you to dig into the&amp;nbsp;report to learn more about the perf&amp;nbsp;of Windows Vista compared to Windows XP SP2.&amp;nbsp; All in all, we were more consistent - better on cold and still doing well on warm; users can come to their machine and begin working, regardless of what state the box is in.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Anyhow, in the next couple of posts I think that I'll focus on some of the things that you will run into when designing and running a performance evaluation like this one.&amp;nbsp;For example, one thing that PT did that I think is important is that they ran the same workload three times on each system before beginning their timed runs. &amp;nbsp;This put the system into a quiescent state by allowing SuperFetch to learn and tune itself for the work it would be facing - similar to what it does in the wild, for real users.&amp;nbsp; This is important to consider because otherwise you will get weird data that is much less repeatable.&lt;/FONT&gt;&lt;/P&gt;
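&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;To make the warm-up idea concrete, here is a minimal sketch in Python. This is not PT's harness: the function names, the placeholder workload and the number of timed runs are assumptions for illustration; only the three untimed passes come from the description above.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
import statistics
import time
from typing import Callable, List


def run_benchmark(workload: Callable[[], None],
                  warmup_runs: int = 3,
                  timed_runs: int = 5) -> List[float]:
    """Run the workload untimed first, then return the timed durations in seconds."""
    # Untimed training passes: give the OS a chance to adapt to the workload.
    for _ in range(warmup_runs):
        workload()

    durations = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def sample_workload() -> None:
    # Placeholder task; a real benchmark would launch applications or open files.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    results = run_benchmark(sample_workload)
    print("median seconds:", statistics.median(results))
    print("stdev seconds:", statistics.stdev(results))
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Reporting the spread alongside the median is a quick way to see whether the warm-up passes actually made the timed runs more repeatable.&lt;/FONT&gt;&lt;/P&gt;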
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Let me also take a second to note that a lot of the advice I am going to be giving in these next posts will be targeted specifically at performance &lt;EM&gt;benchmarking&lt;/EM&gt;.&amp;nbsp; I am &lt;STRONG&gt;not &lt;/STRONG&gt;talking about how to maximize the performance of the everyday system (we'll do that in later posts).&amp;nbsp; Perf tests are almost always automated to ensure consistency and repeatability.&amp;nbsp; So I’ll focus on the benchmarking impact of some key features in Windows Vista (such as SuperFetch and UAC)&amp;nbsp; which may not generalize to all situations.&amp;nbsp; &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Well, now that we've gotten all that straight, go read the report, and come back in a couple of days. In my next post I'll start talking about preparing a system for accurate, repeatable benchmarking.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;-M&lt;/FONT&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1287002" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/winperf/archive/tags/benchmarking/default.aspx">benchmarking</category></item></channel></rss>