Welcome to our blog dedicated to the engineering of Microsoft Windows 7
We’re busy going through tons of telemetry from the many people that have downloaded and installed the Windows 7 beta around the world. We’re super excited to see the excitement around kicking the tires. Since most folks on the beta are well-versed in the hardware they use and very tuned into the choices they make, we’ve received a few questions about the Windows Experience Index (WEI) in Windows 7 and how that has been changed and improved in Windows 7 to take into account new hardware available for each of the major classes in the metric. In this post Michael Fortin returns to dive into the engineering details of the WEI.
The WEI was introduced in Windows Vista to provide one means across PCs to measure the relative performance of key hardware components. Like any index or benchmark, it is best used as a relative measure and should not be used to compare one measure to another. Unlike many other measures, the WEI merely measures the relative capability of components. The WEI only runs for a short time and does not measure the interactions of components under a software load, but rather characteristics of your hardware. As such it does not (nor can it) measure how a system will perform under your own usage scenarios. Thus the WEI does not measure performance of a system, but merely the relative hardware capabilities when running Windows 7.
We do want to caution folks in trying to generalize an “absolute” WEI as necessary for a given individual. We each have different tolerances or more importantly expectations for how a PC should perform and the same WEI might mean very different things to different individuals. To personalize this, I do about 90% of my work on a PC with a WEI of 2.0, primarily driven by the relatively low score for the gaming graphics component on my very low cost laptop. I run Outlook (with ~2GB of email), Internet Explorer (with a dozen tabs), Excel (with long lists of people on the development team), PowerPoint, Messenger (with video), and often I am running one of several LOB applications written in .NET. I feel with this type of workload and a PC with Windows 7 and that WEI my own brain and fingers continue to be my “bottleneck”. At the other end of the spectrum is my holiday gift machine which is a 25” all-in-one with a WEI of 5.1 (though still limited by gaming graphics, with subscores of 7.2, 7.2, 6.2, 5.1, 5.9). This machine runs Windows 7 64-bit and I definitely don’t keep it very busy even though I run MediaCenter in a window all the time, have a bunch of desktop gadgets, and run the PC as our print server (I use about 25% of available RAM and the CPU almost never gets above 10%).
The overall Windows Experience Index (WEI) is defined to be the lowest of the five top-level WEI subscores, where each subscore is computed using a set of rules and a suite of system assessment tests. The five areas scored in Windows 7 are the same as they were in Vista: processor, memory (RAM), graphics, gaming graphics, and primary hard disk.
Though the scoring areas are the same, the ranges have changed. In Vista, the WEI scores ranged from 1.0 to 5.9. In Windows 7, the range has been extended upward to 7.9. The scoring rules for devices have also changed from Vista to reflect experience and feedback comparing closely rated devices with differing quality of actual use (i.e. to make the rating more indicative of actual use.) We know during the beta some folks have noticed that the score changed (relative to Vista) for one or more components in their system and this tuning, which we will describe here, is responsible for the change.
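As a rough illustration of the "lowest subscore" rule and the extended range described above, here is a minimal Python sketch. The function name, the dictionary keys, and the mapping of numbers to components (other than gaming graphics) are assumptions made for the example and are not taken from the actual assessment tool. The numbers are the subscores quoted earlier for the all-in-one machine.

```python
# Score bounds taken from the post: Windows 7 scores range from 1.0 to 7.9.
WEI_MIN = 1.0
WEI_MAX_WINDOWS7 = 7.9


def overall_wei(subscores):
    """Return the overall score, defined as the lowest of the subscores."""
    if not subscores:
        raise ValueError("at least one subscore is required")
    for name, value in subscores.items():
        if not WEI_MIN <= value <= WEI_MAX_WINDOWS7:
            raise ValueError(f"{name} subscore {value} is out of range")
    return min(subscores.values())


# Hypothetical labels for the five subscores mentioned in the post.
example = {
    "processor": 7.2,
    "memory": 7.2,
    "graphics": 6.2,
    "gaming_graphics": 5.1,
    "disk": 5.9,
}

print(overall_wei(example))  # 5.1
```

Because the overall score is a minimum rather than an average, a single weak component sets the headline number no matter how strong the others are.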
For a given score range, we hope our customers will be able to utilize some general guidelines to help understand the experiences a particular PC can be expected to deliver well, relatively speaking. These Vista-era general guidelines for systems in the 1.0, 2.0, 3.0, 4.0 and 5.0 ranges still apply to Windows 7. But, as noted above, Windows 7 has added levels 6.0 and 7.0; meaning 7.9 is the maximum score possible. These new levels were designed to capture the rather substantial improvements we are seeing in key technologies as they enter the mainstream, such as solid state disks, multi-core processors, and higher end graphics adapters. Additionally, the amount of memory in a system is a determining factor.
For these new levels, we’re working to add guidelines for each level. As an example for gaming users, we expect systems with gaming graphics scores in the 6.0 to 6.9 range to support DX10 graphics and deliver good frame rates at typical screen resolutions (like 40-50 frames per second at 1280x1024). In the range of 7.0 to 7.9, we would expect higher frame rates at even higher screen resolutions. Obviously, the specifics of each game have much to do with this and the WEI scores are also meant to help game developers decide how best to scale their experience on a given system. Graphics is an area where there is both the widest variety of scores readily available in hardware and also the widest breadth of expectations. The extremes to which CAD, HD video, photography, and gaming push graphics, compared to the average business user or a consumer (doing many of these same things as an avocation rather than vocation), are significant.
Of course, adding new levels doesn’t explain why a Vista system or component that used to score 4.0 or higher is now obtaining a score of 2.9. In most cases, large score drops will be due to the addition of some new disk tests in Windows 7 as that is where we’ve seen both interesting real world learning and substantial changes in the hardware landscape.
With respect to disk scores, as discussed in our recent post on Windows Performance, we’ve been developing a comprehensive performance feedback loop for quite some time. With that loop, we’ve been able to capture thousands of detailed traces covering periods of time where the computer’s current user indicated an application, or Windows, was experiencing severe responsiveness problems. In analyzing these traces we saw a connection to disk I/O and we often found typical 4KB disk reads to take longer than expected, much, much longer in fact (10x to 30x). Instead of taking 10s of milliseconds to complete, we’d often find sequences where individual disk reads took many hundreds of milliseconds to finish. When sequences of these accumulate, higher level application responsiveness can suffer dramatically.
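To make the kind of trace analysis described above a little more concrete, here is a minimal Python sketch that flags individual reads that took far longer than expected. Everything in it is an assumption for illustration: the function names, the 20 ms "expected" latency, the 10x factor (the low end of the 10x to 30x range mentioned above), and the sample trace are all made up, and it is not the analysis actually used on the collected traces.

```python
def slow_reads(latencies_ms, expected_ms=20.0, factor=10.0):
    """Return (index, latency) pairs for reads at least `factor` times slower
    than the expected latency. Both parameters are illustrative defaults."""
    threshold = expected_ms * factor
    return [(i, ms) for i, ms in enumerate(latencies_ms) if ms >= threshold]


def stalled_time_ms(latencies_ms, expected_ms=20.0, factor=10.0):
    """Total time spent in the slow reads found by slow_reads()."""
    return sum(ms for _, ms in slow_reads(latencies_ms, expected_ms, factor))


# Hypothetical latencies (in milliseconds) for a sequence of 4KB reads.
trace = [12.0, 18.0, 350.0, 420.0, 15.0, 610.0, 9.0]

print(slow_reads(trace))       # [(2, 350.0), (3, 420.0), (5, 610.0)]
print(stalled_time_ms(trace))  # 1380.0
```

In this made-up trace only three of seven reads are slow, yet they account for well over a second of waiting, which is how a handful of long reads can dominate perceived responsiveness.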
With the problem recognized, we synthesized many of the I/O sequences and undertook a large study on many, many disk drives, including solid state drives. While we did find a good number of drives to be excellent, we unfortunately also found many to have significant challenges under this type of load, which based on telemetry is rather common. In particular, we found the first generation of solid state drives to be broadly challenged when confronted with these commonly seen client I/O sequences.
An example problematic sequence consists of a series of sequential and random I/Os intermixed with one or more flushes. During these sequences, many of the random writes complete in unrealistically short periods of time (say 500 microseconds). Very short I/O completion times indicate caching; the actual work of moving the bits to spinning media, or to flash cells, is postponed. After a period of returning success very quickly, a backlog of deferred work is built up. What happens next is different from drive to drive. Some drives continue to consistently respond to reads as expected, no matter the earlier issued and postponed writes/flushes, which yields good performance and no perceived problems for the person using the PC. On some drives, however, reads are often held off for very lengthy periods as the drives apparently attempt to clear their backlog of work and this results in a perceived “blocking” state or almost a “locked system”. To validate this, on some systems, we replaced poor performing disks with known good disks and observed dramatically improved performance. In a few cases, updating the drive’s firmware was sufficient to very noticeably improve responsiveness.
To reflect this real world learning, in the Windows 7 Beta code, we have capped scores for drives which appear to exhibit the problematic behavior (during the scoring) and are using our feedback system to send back information to us to further evaluate these results. Scores of 1.9, 2.0, 2.9 and 3.0 for the system disk are possible because of our current capping rules. Internally, we feel confident in the beta disk assessment and these caps based on the data we have observed so far. Of course, we expect to learn from data coming from the broader beta population and from feedback and conversations we have with drive manufacturers.
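The effect of a cap can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post lists the cap values that can result (1.9, 2.0, 2.9 and 3.0) but does not say which cap applies to which behavior, so the function name, the `problematic` flag, and the default cap of 2.9 are hypothetical.

```python
# Cap values mentioned in the post. Which one applies to a given drive
# is not stated there, so the choice is left to the caller.
POSSIBLE_CAPS = (1.9, 2.0, 2.9, 3.0)


def capped_disk_score(raw_score, problematic, cap=2.9):
    """Return the disk score after applying a cap to a problematic drive.

    `raw_score` is the score the drive would otherwise receive and
    `problematic` says whether the problematic behavior was seen
    during scoring. Both names are hypothetical.
    """
    if cap not in POSSIBLE_CAPS:
        raise ValueError(f"unexpected cap value: {cap}")
    return min(raw_score, cap) if problematic else raw_score


print(capped_disk_score(5.3, problematic=True))   # 2.9
print(capped_disk_score(5.3, problematic=False))  # 5.3
```

A cap only ever lowers a score; a drive that already scores below the cap is left unchanged.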
For those who obtain low disk scores but are otherwise satisfied with the performance, we aren’t recommending any action (of course, the WEI is not a tool to recommend hardware changes of any kind). It is entirely possible that the sequence of I/Os being issued for your common workload and applications isn’t encountering the issues we are noting. As we’ve said, the WEI is a metric but only you can apply that metric to your computing needs.
Earlier, I made note of the fact that our new levels, 6 and 7, were added to recognize the improved experiences one might have with newer hardware, particularly SSDs, graphics adapters, and multi-core processors. With respect to SSDs, the focus of the newer tests is on random I/O rates and their avoidance of the long latency issues noted above. As a note, the tests don’t specifically check to see if the underlying storage device is an SSD or not. We run them no matter the device type and any device capable of sustaining very high random I/O rates will score well.
For graphics adapters, both DX9 and DX10 assessments can be run now. In Vista, the tests were specific to DX9. To obtain scores in the 6 or 7 ranges, a graphics adapter must obtain very good performance scores, support DX10 and the driver must be a WDDM 1.1 driver (which you might have noticed are being downloaded during the Windows 7 beta). For WDDM 1.0 drivers, only the DX9 assessments will be run, thus capping the overall score at 5.9.
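The eligibility rule above can be expressed as a small Python sketch. The function names and the `performance_score` input (the score an adapter would earn on measured performance alone) are hypothetical, and treating driver versions later than WDDM 1.1 as eligible is an assumption; the post only mentions 1.0 and 1.1.

```python
def max_graphics_score(supports_dx10, wddm_version):
    """Upper bound on the score under the rule described in the post.

    Scores in the 6 and 7 ranges require DX10 support and a WDDM 1.1
    driver; otherwise the score is capped at 5.9.
    `wddm_version` is a (major, minor) tuple, e.g. (1, 0) or (1, 1).
    """
    # Assumption: versions newer than 1.1 also qualify.
    if supports_dx10 and wddm_version >= (1, 1):
        return 7.9
    return 5.9


def graphics_score(performance_score, supports_dx10, wddm_version):
    """Apply the eligibility cap to a hypothetical performance-based score."""
    return min(performance_score, max_graphics_score(supports_dx10, wddm_version))


print(graphics_score(6.8, supports_dx10=True, wddm_version=(1, 1)))  # 6.8
print(graphics_score(6.8, supports_dx10=True, wddm_version=(1, 0)))  # 5.9
```

The second call shows why the same hardware can land in a different range depending only on which driver is installed.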
For multi-core processors, both single threaded and multi-threaded scenarios are run. With levels 6 and 7, we aim to indicate that these systems will rarely be CPU bound for typical use and quite suitable for demanding processing tasks and multi-tasking. As examples, we anticipate many quad core processors will be able to score in the high 6 to low 7 ranges, and 8 core systems to be able to approach 7.9. The scoring has taken into account the very latest micro-processors available.
For many key hardware partners, we’ve of course made available additional details on the changes and why they were made. We continue to actively work with them to incorporate appropriate feedback.
@david1948, So, you don't condone it in Windows 7, but you'll condone it in Vista and XP? You do realize WMP is set to update media information by default when you use the Express Setup, right? It's not just happening on 7.
If you're so protective of your files, perhaps you should make them all "read-only".
The disk test caps are entirely counterproductive to the overall WEI score - my system can still easily handle 5.0 WEI score applications, but the arbitrary disk test caps rate my RAID 0 array at 3.0, which sets my score (as displayed in the Games Explorer) at 3.0.
I know full well to ignore the WEI, but this is going to be more than a little confusing to the intended target of the WEI system.
I'm glad to see transparency is forthcoming; I assume the details are not all here because this is still in the works.
If and when the details are revealed, I highly suggest it be done within windows through help and support or some other internal method, rather than a blog like this. That kind of information should be a click away from the actual score.
I think it would be far more interesting if there was more detailed information given rather than a general explanation of how scores are reached. For instance, if both memory capacity and speed have an effect on the score, rather than just saying that, it would be nice to see on some sort of advanced details dialog the measured read/write speeds, latency, etc.
At first, I wondered why would you use a scale that ends in 5.9? Just reading people complain about having their HDD score drop helps me to understand the purpose of this scale. The idea was to increase the top end according to hardware available at the time. If they made it on a scale to ten, then every update to WEI would result in a drop in score for the same hardware, and otherwise rational people would be crying.
I am particularly excited about the adjustment in HDD scores to reflect real world results (besides some oddness with write caching). People might be helped to realize that the biggest upgrade you can get for a relatively modern system is an SSD of good repute.
I do not think I have come across any software that depicts minimum WEI values for proper operation besides Microsoft's general guidelines. I wonder if there are any plans to help get this introduced in the minimum/recommended requirements specifications section for new software.
"Some drives, however, reads are often held off for very lengthy periods as the drives apparently attempt to clear their backlog of work and this results in a perceived “blocking” state or almost a “locked system”"
I can only confirm this. Currently I'm using a Vista SP1 system with two Seagate 400 GB drives and huge amounts of Outlook mail (4-5 GB). These drives get a WEI rating of 5.3 (very good).
When I work inside Outlook deleting, moving, etc. mails, the program regularly locks up completely and the HDD LED flashes constantly. This makes the whole computer seem to be very slow.
"we synthesized many of the I/O sequences and undertook a large study on many, many disk drives, including solid state drives"
It would be great if you published this study :-)
Gaming graphics: 4.9
HDD with Write Caching turned off: 5.4
:O Please 'correct' this. Or should I turn off write caching? Will have to re-test it now using Iometer and Sandra.
Also for HDD score, do you look at the drive RPM while rating it?
On my Seagate 500GB (7200.10) I would get a Vista WEI score of 5.3. However, I knew from daily use that my HDD was rather sluggish and I couldn't understand why it had such a high WEI score.
I was pleasantly disappointed that Windows 7 rated my HDD as a measly 2.9.
I'll now feel more confident in purchasing an HDD based on the Windows 7 WEI score rather than the Vista WEI score.
I also want to agree with an earlier commenter who said it would be more beneficial to make the top WEI score a 10 (or 9.9). This will be more intuitive to the casual user who happens across his WEI score. It should also be required that future software depict both a Vista WEI score and a Windows 7 WEI score. That will help reinforce the idea that the scores are calculated differently.
Toshiba satellite with MK1637GSX
Vista WEI: 4.8
Windows 7 WEI: 2.0
No difference in this rating based on write caching.
Of course why are we even worrying about this? The Windows Experience Index can be altered to show any wonderful number.
Here have a look: http://weblogs.asp.net/mikedopp/archive/2008/12/25/is-your-windows-experience-index-lying-to-you-windows-experience-index-editing.aspx
This works for all versions of Vista, including Windows 7, and, if you get really creative, Windows XP, as the app can be added to XP.
Don't forget the Buffer overflow error when using this tool. http://www.nabble.com/Windows-Vista-winsat.exe-Integer-Overflow-td16363210.html
All good times.
Q9300@3.4, 4 GB RAM@1066, ATI 4870
Gaming graphics: 6.8
When I first saw those scores I thought, ah, it seems they already considered faster systems, good to know. Especially on RAM and HDD with a score of 5.9 it seems that DDR3 and SSD drives are already considered. Hence the WEI makes quite a lot of sense to me.
I have been testing Windows 7 on a partitioned hard drive of 25GB. When I run the test it gives me a 1.0 for my data transfer rate; on my Vista OS with the same hard drive (dual boot) it gives me a 5.3.
Now the only thing I could come up with was that it was a partitioned drive, and it was only 25GB out of 120GB, which might have caused it to think the data rate is slow. But it is a question that I cannot find a solution to.
Anybody have an idea?
Well lets say that vista WEI is incorrect for the hard drives score, then why a lot slower and older hard drives based on IDE interface gain much higher score in Windows 7 WEI than SATAII based drives, when the performance of the IDE ones is clearly lower?
Also I can agree to lowering the scores for hard disk drives in favor of SSDs, but only on the condition that, when Windows 7 is released to manufacturers, we can buy, let's say, a 500 GB SSD for $150-250, which is far from possible.
IMO there are few bugs in Windows 7 WEI scoring system which will be worked on and fixed in future builds.
Gigabyte MA78GM-S2H mobo (ATI HD3200 DX10 WDDM 1.1) with AMD 4850e processor and 2 GB DDR 800 memory gets a Windows Experience Index score of 3.5 for Aero, but my Asus P5KPL-MA (Intel GMA3100 DX9) mobo with Intel 5200E processor and 2GB DDR 800 memory gets a score of 4.1 for Aero.
Hello there, the experience I most hope to see improved in Windows 7 is this: please don't jump any window or dialog to the front and grab the input focus when I didn't allow it!
This is really an evil feature, especially when I'm typing and a window with keyboard shortcuts jumps out: one of these shortcuts is instantly triggered and I don't even have time to see what happened! It really causes misoperations.
And again, please DON'T jump any window to the front and grab the input focus when I didn't allow it!
Is this the correct place to post this?
"Well lets say that vista WEI is incorrect for the hard drives score, then why a lot slower and older hard drives based on IDE interface gain much higher score in Windows 7 WEI than SATAII based drives, when the performance of the IDE ones is clearly lower?"
Because those slower IDE hard drives don't suffer the same crippling performance issue, and are therefore actually faster in some situations. Besides, IDE hard drives aren't generally all that slow as the bottleneck is in the platter, not the interface.