Welcome to our blog dedicated to the engineering of Microsoft Windows 7.
We've talked some about performance in this blog, and recently many folks have been blogging and writing about the topic as well. We thought it would be a good time to offer some more behind-the-scenes views on how we have been working on and thinking about performance, because it is such an interesting topic for the folks reading this blog. Of course I've been using some pretty low-powered machines lately, so performance is top of mind for me as well. But for fun I am writing this on my early holiday present--my new home machine is a 64-bit all-in-one desktop with a quad-core CPU, discrete graphics, 8GB of memory, and hardware RAID, all running a pretty new build of Windows 7 that I upgraded to as soon as I finished the out-of-box experience. Michael Fortin and I authored this post. --Steven
Our beta isn’t even out the door yet, and many are already dusting off their benchmarks and giving them a whirl. As a reminder, we are encouraging folks to hold off on benchmarking our pre-production builds. Yet we’ve come to expect it will happen, and we realize it will lead many to conclude one thing or another; at the same time, we appreciate those of you who take the time to remind folks of the pre-ship status of the code. Nevertheless, we’re happy that many are seeing good results thus far. We're not yet as happy as we believe we will be when we finish the product, as we continue to work on all the fundamental capabilities of Windows 7 as well as all the new features folks are excited about.
Writing about performance in this blog is nearly as tricky as measuring it. As we've seen, directional statements are taken further than we might intend, and at the same time there are seemingly infinite ways to measure performance and just as many ways to perceive the same data. Ultimately, performance is something each individual feels is right--whether that means adequate or stellar might vary scenario to scenario, individual to individual. Some of the mail we've received has been clear about performance:
You can also see through some of these quotes that performance means something different to different people. As user-interface folks know, perceived performance and actual performance can often be different things. I [Steven] remember when I was writing a portion of the Windows UI for Visual C++: when I benchmarked against Borland C++ at the time, we were definitely faster (measured in seconds). However, the reviews consistently mentioned Borland as being faster and providing feedback in the form of counts of compiled lines flying by. So I coded up a line-count display that flashed a lot of numbers at you while compiling (literally flashy, so it looked like it couldn't keep up). In clock time it actually consumed a non-zero amount of time, so we got "slower", but the reviewers then started giving us credit for being faster. So in this case, slower actually got faster.
There's another story from the past that is the flip side of this: the scrolling speed in Microsoft Word for DOS (and also Excel for Windows--same dynamic). BillG always pushed hard on visible performance in the "early" days, and scrolling speed was one of those things that never seemed to be fast enough. Well, clever folks worked hard on the problem and subsequently made scrolling too fast--literally to the point that we had to slow it down so you didn't always end up going from page 1 to the end of the document just because you held down the Page Down key. It is great to be fast, but sometimes there is "too much speed".
We have seen the feedback about what to turn off or adjust for better performance. In many ways what we're seeing are folks hoping to find the things that cause the performance to be less than they would like. I had an email conversation with someone recently trying to pinpoint the performance issues on a new laptop. Just by talking it through, it became clear the laptop was pretty "clean" (~40 processes, half of the 1GB of RAM free, <5% CPU at idle, etc.), and after a few back-and-forths it became clear that the internet connection (dial-up) was actually the biggest bottleneck in the system. Many encourage us to turn off animations, graphics, or even color, as there is a belief that these can be the root of performance problems. We've talked about the registry, disk space utilization, and even color depth as topics where folks see potential performance issues.
It is important to consider that performance is inherently a time/space tradeoff (computer science sense, not science fiction sense), and on laptops there is the added dimension of power consumption (or CPU utilization). Given infinite memory, of course many algorithms would be very different than the ones we use. In finite memory, performance is impacted greatly by the overall working set of a scenario. So in many cases when we talk about performance we are just as much talking about reducing the amount of memory consumed as we are talking about the clock time. Some parts of the OS are much more tunable in terms of the memory they use, which then improves the overall performance of the system (because there is less paging). Other parts of the system are much more about the number of instructions executed (because perhaps every operation goes through that code path). We work a great deal on both!
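The time/space tradeoff described above is easy to sketch in miniature. The example below is purely illustrative (not Windows code): both functions compute the same value, but one recomputes every subproblem with almost no memory, while the other caches every intermediate result, spending memory to cut clock time--the same shape of tradeoff the OS makes between working set and instruction counts.

```python
from functools import lru_cache

def slow_fib(n):
    # Recomputes every subproblem: minimal memory, exponential time.
    return n if n < 2 else slow_fib(n - 1) + slow_fib(n - 2)

@lru_cache(maxsize=None)
def cached_fib(n):
    # Remembers every subproblem: O(n) memory, linear time.
    return n if n < 2 else cached_fib(n - 1) + cached_fib(n - 2)
```

The cached version is dramatically faster but holds every intermediate value in memory--exactly the kind of choice that looks different on a machine with plenty of free RAM than on one that is paging.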
The reality of measuring and improving performance is one where we are focused at several "levels" in Windows 7: micro-benchmarks, specific scenarios, system tuning. Each of these plays a critical role in how we are engineering Windows 7 and while any single one can be measured it is not the case that one can easily conclude the performance of the system from a measurement.
Micro-benchmarks. Micro-benchmarks are the sort of tests that stress a specific subsystem at extreme levels. Often these are areas of the code whose performance is hard to see during normal usage, as they go by very fast or account for a small percentage of time during overall execution. So tests are designed to stress part of the system. Many parts of the system are subjected to micro-benchmarking, such as the file system, networking, memory management, and 2D and 3D graphics. A good example here is the work we do to enable fast file copying. There is a lot of low-level code that accounts for a (very significant) number of conditions when copying files around, and that code is most directly executed through XCOPY in a command window (or an API). Of course, the majority of copy operations take place through the Explorer, and along with that comes a progress indicator, a cancellable operation, counting up bytes to copy, etc. All of those have some cost along with the benefit. The goal of micro-benchmarks is to enable us to understand the best possible case and then compare it to the most usable case. Advanced folks always have access to the command line for more power, control, and flexibility. It is tempting to measure the performance of the system by looking at improvements in micro-benchmarks, but time and time again this proves to be inadequate, as routine usage covers a much broader code path and time is spent in many places. For Internet Explorer 8 we did a blog post on performance that went into this type of issue relative to script performance. At the other end of the spectrum, we definitely understand that the performance of micro-benchmarks on some subsystems will be, and should be, carefully measured--the performance of DirectX graphics is an area that gamers rely on, for example. It is worth noting that many micro-benchmarks also depend heavily on the combination of Windows OS, hardware, and specific drivers.
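The gap between the "best possible case" and the "most usable case" can be sketched in miniature. The following is an illustrative Python example, not the actual XCOPY or Explorer code: a raw copy versus a copy that also reports progress per chunk--a stand-in for the extra work a progress UI, byte counting, and cancellation support entail.

```python
import shutil

def raw_copy(src, dst):
    # The "best possible case": a straight copy with no UI overhead.
    shutil.copyfile(src, dst)

def copy_with_progress(src, dst, chunk=64 * 1024, progress=lambda done: None):
    # The "most usable case": the same copy, plus a per-chunk progress
    # callback (stand-in for a progress bar and cancellation checks).
    done = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
            done += len(buf)
            progress(done)
```

Timing each path with a micro-benchmark shows the raw path's ceiling; the progress path trades a little throughput for responsiveness and feedback.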
Specific scenarios. Most people experience the performance of a PC through high-level actions such as booting, standby/resume, and launching common applications. These are topics we have covered in previous posts to some degree. In engineering Windows 7, each team has focused on a set of specific scenarios that we wanted to make better. This type of work should be demonstrable without any elaborate setup or additional tools. It often involves tuning the code path for the number of instructions executed, looking at the data allocated for the common case, or understanding all the OS APIs called (for example, registry lookups). One example that comes to mind is the work we have going on to reduce the time to reinsert a USB device. This is particularly noticeable for UFDs (USB flash drives) or memory cards. Windows of course allows the whole subsystem to be plumbed by unique drivers for a specific card reader or UFD; even if most of the time they are the same, we still have to account for the variety in the ecosystem. At the start of the project we looked at a full profile of the code executed when inserting a UFD and worked this scenario end to end. Then each of the "hot spots" was systematically worked through. Another example along these lines was playback of DVD movies, which involves not only the storage subsystem but the graphics subsystem as well. The neat thing about this scenario is that you also want to optimize for CPU utilization (which you might not even notice while playing back the movie), as that dictates the power consumption.
System tuning. A significant amount of performance work falls under the umbrella of system tuning. To ascertain what work to do in this area, we routinely look at the overall performance of the system relative to the same tests on previous builds and previous releases of Windows. We're looking for things we can do to remove operations that take a lot of time/space/power, or things that have "grown" in one of those dimensions. We have build-to-build testing to make sure we do not regress, and of course every developer is responsible for making sure their area improves as well. We have left no stone unturned in terms of investigating opportunities to improve. One of the areas many will notice immediately when looking at the pre-beta or beta of Windows 7 is the memory usage (as measured by Task Manager, itself a measurement that can be misunderstood) of the Desktop Window Manager. For Windows 7, a substantial amount of architectural work went into reducing the amount of memory consumed by that subsystem. We did this work while also maintaining compatibility with Windows Vista drivers. We did similar work on the desktop search engine, where we reduced not just the memory footprint but the I/O footprint as well. One of the most complex areas to work on was the improvements in the taskbar and Start menu. These involved substantial work on critical sections ("blocking" areas of the code), registry I/O, and overall code paths. The goal of this work is to make sure these UI elements are always available and feel snappy.
It is worth noting that there are broad-based measures of performance as well, ones that drive the user interface of a selection of applications. These too have their place--they are best used to compare different underlying hardware or drivers with the same version of Windows. The reason is that the automation itself is often version-dependent, and because automation happens in a less than natural manner, there can be a tendency to measure these variances rather than any actually perceptible performance changes. The classic example is the code path for drawing a menu drop-down: adding some instructions that make the menu more accessible or more appealing would be impossible for a human to perceive, but an automated system that drives the menu at superhuman speed would see a change in "performance". In this type of situation, the effect of a micro-benchmark is magnified in a manner inconsistent with actual usage patterns. This is just a word of caution on how to consider such measurements.
Given this focus across different types of measurement it is important to understand that the overall goal we have for Windows 7 is for you to experience a system that is as good as you expect it to be. The perception of performance is just as important as specific benchmarks and so we have to look to a broad set of tools as above to make sure we are operating with a complete picture of performance.
In addition to these broad strategies there are some specific tools we've put in place. One of these tools, PerfTrack, takes the role of data to the next level with regard to performance and so will play a significant role in the beta. In addition, it is worth reminding folks about the broad set of efforts that go into engineering for performance:
PerfTrack is a very flexible, low-overhead, dynamically configurable telemetry system. For key scenarios throughout Windows 7, there exist “Start” and “Stop” events that bracket the scenario. Scenarios can be pretty much anything, including common things like opening a file, browsing to a web page, opening the control panel, searching for a document, or booting the computer. In all, there are over 500 instrumented scenarios in Windows 7 for Beta.
Obviously, the time between the Start and Stop events is meant to represent the responsiveness of the scenario, and clearly we’re using our telemetry infrastructure to send these metrics back to us for analysis. PerfTrack’s uniqueness comes not just from what it measures but from its ability to go beyond merely observing the occurrence of problematic response times. PerfTrack allows us to “dial up” requests for more information, in the form of traces.
Let’s consider the distribution below and, for fun, let's pretend the scenario is opening XYZ. For this scenario, the feature team chose to set some goals for responsiveness. With their chosen goals, green depicts times they considered acceptable, yellow represents times they deemed marginal, and red denotes the poor times. The times are in milliseconds and shown along the X axis. The Hit Count is shown on the Y axis.
As can be seen, there are many instances where this scenario took more than 5 seconds to complete. With this kind of distribution, the performance team would recommend that we “dial up” a request for 100+ traces from systems that have experienced a lengthy open in the past. In our “dialed up” request, we would set a “threshold” time that we thought was interesting. Additionally, we may opt to filter on machines with a certain amount of RAM, a certain class of processor, the presence of a specific driver, or any number of other things. Clients meeting the criteria would then, upon hitting the “Start” event, quickly configure and enable tracing, and potentially send the trace back to us if the “Stop” event occurred after our specified “threshold” of time.
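The "dial up" decision might be sketched like this. All field names here (`threshold_ms`, `min_ram_mb`, `required_driver`) are invented for illustration; the real system's filters and wire format are not described in the post.

```python
def should_capture_trace(request, machine, elapsed_ms):
    # Only trace scenarios slower than the requested threshold.
    if elapsed_ms < request["threshold_ms"]:
        return False
    # Optional filters: RAM, presence of a specific driver, etc.
    if machine["ram_mb"] < request.get("min_ram_mb", 0):
        return False
    if "required_driver" in request and request["required_driver"] not in machine["drivers"]:
        return False
    return True
```

Only clients matching every filter, and only for runs past the threshold, would go to the expense of capturing and uploading a trace.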
As you might imagine, a good deal of engineering work went into making this end to end telemetry and feedback system work. Teams all across the Windows division have contributed to make this system a reality and I can assure you we’ll never approach performance the same now that we have these capabilities.
As a result of focusing on traces and fixing the very real issues revealed by them, we’ve seen significant improvements in actual responsiveness and have received numerous accolades on Windows 7. Additionally, I’d like to point out that these traces have served to further confirm what we’ve long believed to be the case.
This post provides an overview of the ways we have thought about performance with some specifics about how we measure it throughout the engineering of Windows 7. We believe that throughout the beta we will continue to have great telemetry to help make sure we are achieving our goals and that people perceive Windows 7 to perform well relative to their expectations.
We know many folks will continue to use stop watches, micro-benchmarks, or to drive automated tests. These each have their place in your own analysis and also in our engineering. We thought given all the interest we would talk more about how we measure things and how we're engineering the product.
--Steven and Michael
One thing that has often struck me as "bad", performance-wise, in Windows is the time it takes to install things.
To compare with the competition, in OSX, installing an application only takes a few seconds depending on its size.
And if you compare Firefox to I.E. under Windows,
Firefox takes under a minute, while new versions of I.E. can take as much as 5-10 minutes to install.
Is this a joke, or trolling for free?
Perhaps you have a PC from 1982 and a quad-core Mac Pro?
But this post seems to be saying "don't get your hopes up about Windows 7 performance".
The fact is, lightweight code along with more disciplined, tighter coding practices = performance, PERIOD! A topic I didn’t see mentioned at all in this post. Windows has needed a fat trimming for quite some time now; do that and Windows will fly!
Steven and Michael
While many of us appreciate your efforts to check yourselves and create performance standards, let's be honest here. All of the tech bloggers, tech journalists, critics, and the lot are going to use benchmark testing to praise or slam Windows 7. Instead of saying you have your own methods, why not work with some of these benchmark testers to make sure Windows 7 blows away the competition? Everyone is going to compare Windows 7 with XP, Vista, OS X Leopard, and the latest Ubuntu build. If anything, the data accumulated could help you make sure that both the perception and the reality are of a much quicker operating system.
You guys should know from Vista that perception becomes reality, unfortunately. OS X boots in 30 to 40 seconds. Versions of Linux boot in 15 to 20 seconds.
Using my cell phone as a stopwatch, I tested the startup time on my desktop using Windows Vista Ultimate. From the instant I hit the power switch to the log in screen was 43.7 seconds. Password to desktop adds another 13.5 seconds.
I've done this several times to verify. My desktop system has an AMD Athlon 64 X2 processor, 2 GB of DDR2 800 MHz memory, a 300 GB Seagate SATA drive, and an ATI HD 2600 Pro video card with 512 MB of GDDR2 memory.
Not the fanciest, but pretty well built to handle what Vista asks of it. Hopefully if a netbook using an Atom processor can run Windows Seven well, my system should run Seven very well.
However, the expectation is that Windows Seven smokes both Leopard and Snow Leopard. The harder challenge and expectation is that it smokes any version of Linux. I think it's possible if you guys take the time. We live in the age of GHz processors, multicore processors, GBs of RAM, and now TBs of hard drive space. Making a speedy version of Windows Seven shouldn't be that difficult--especially if you're not trying to boot over 70 processes at once, you work with Intel and AMD to get every last bit of performance out of the processor, you efficiently use the least amount of memory with better allocation, and you have Windows run on the least possible amount of resources.
I don't envy the monumental task, but I know you guys can do it. I am dying to beta test it, as I have a second hard drive in my desktop that has Windows 7's name on it.
Best Wishes and Happy Holidays.
In Vista you can use Hibernate,
or you can use Sleep on your desktop (OS X offers that only for notebooks), and the desktop is ready (no 13-second wait).
Have you noticed the million problems with the 10.5.6 update?
Or perhaps you believe that the problems are only with Windows.
See here http://discussions.apple.com/category.jspa?categoryID=235
Remember that the range of configurations for these Apple "PCs" is miserably small.
I was talking to Brad Weed, who leads our design and research team, and he texted me this from vacation. We were emailing about the response to a startup animation that some of you might have seen on the web :-)
By way of introduction, I work in Steven’s org on the User Experience team.
Let me start by saying that I firmly believe performance is the number one UX killer. No matter what user interface gets presented to the user, if it’s slow or perceived as slow it’s an uphill battle. But perception is a tricky thing. The field of visual arts is full of illusions and user interfaces are now squarely in the realm of the visual arts.
When it comes to performance and the visual arts, the field of animation comes to mind. Animators use tons of little illusionary tricks to alter the perception of speed and motion. Ironically they end up taking more time. Take anticipation, for example. Often times before a character or object moves, it will first move in the opposite direction ever so slightly to enhance the perception of acceleration – as if it was wound up before it is let go. This takes time away from the overall sequence, but it can make the movement seem faster, more believable or play up a particular personality. Believability and personality are a big part of why we incorporate animation in the UI. Ease in and out or cross fades are other examples of techniques that enhance the believability of motion.
Now if a given frame happens to stutter, freeze, jerk or suspend, then we’re back to performance as the number one UX killer. Imagine how you’d feel about an animated movie or a company’s flying logo on TV if they were jerky and stuttered a bunch. We don’t gnash our teeth over the performance of the TV or the time it takes for things to animate, because the performance is so good. We’re working in earnest to track down as many cases of visual and animation performance issues in Windows as we can, and are getting great responses from developers to fix them. It’s not the animation that is necessarily the problem, but the fact it may be interrupted or slowed for one reason or another. We all agree animation needs to enhance the experience of the product, not degrade it. And everyone should also know we’ve managed to squash a few gratuitous animations that tried to make it into the product!
Animation, like any visual effect, has to have a reason for existing. Look at movies. Sometimes a straight cut between scenes is necessary and sometimes a cross fade is best. But every movie is selective about which one to use and when. I suppose the movie would be faster if they had straight cuts, but that probably wouldn’t make for a very interesting movie. The same can be said for software.
I wanted to wait for Beta 1, but some comments here mean that I need to say a few words... Let's start with the Windows 7 components.
The Windows system - for years, millions of users have been reporting that an intensively used system gets slower and slower. Now some of you are writing that this happens because application vendors or driver creators did their jobs badly. I agree that this can be part of the problem... and I don't expect impossible things from Microsoft... but the role of the operating system creator is to make such degradation as difficult as possible. What will be changed in this area in Windows 7? Almost nothing?
What else could convince users to move to the new system? A Vista-looking interface, which was already criticized by many people?
Or maybe DRM? BTW, why are you avoiding answering the question of how many CPU cycles it uses and when they're taken from the user?
Or maybe the new version, IE8? tgdaily.com published an IE8 RC1 test. Software prepared by a big company with big financial resources. It cost many, many dollars. ACID3: 12/100; speed: slower than other browsers (see tgdaily.com). But... it will still be offered as the window for watching the world, and it will have to be used by some people (for example in some companies). What compatibility problems will they see? And how will you explain them?
I know that many of you (Microsoft employees) are working very hard. But I'm still not convinced that your new product will be better than the previous one. Or, put differently: it will be better, but it will not give enough benefit in many daily tasks.
I want to say here that you should start development by thinking about stable roots, not by building castles on sand. Start by saying that black is black and white is white. Look at what customers like. When you have that, you can return to computers and code.
Currently many people will maybe give you some credit and will use Windows 7... until they see disadvantages (like those specified above). You will not have a second chance and will lose them forever.
I wanted to show in my previous post that Microsoft has weak arguments when speaking about some aspects of the new system. This is the result of some decisions, which should be changed...
I will add at the same time that I don't have anything against the company or its people.
I obviously missed the post by David Fie, so thank you for informing me. My eyes obviously were not working.
However, I must agree with marcinw on this particular issue. I think any technology professional who has been working with Windows for a long time knows that third-party programs and drivers installed after the base OS installation impact the overall performance of the system. We know that iTunes was definitely not written by a group of geniuses who, in their spare time of course, sell iPhones at the Apple Store. No. We understand that the registry, poorly written drivers, WinSxS, the Windows code, third-party programs, programs at startup, etc. have all been blamed for this degradation in performance. Some of these are valid, depending on each individual system. However, even without this array of problems, I've seen PCs that are relatively clean from an application standpoint become unacceptably slow over time. I believe Mark Russinovich spoke to Windows slowing over time in a video session quite some time ago; I thought it was interesting but forget the point he made in the video.

With that said, I guess my real question from the customer base is: what happens at a system level to contribute to this degradation? I ask this question because fixing such issues and communicating them helps increase the interest "buzz" around Windows 7. In addition, since we know the problems that cause this "performance degradation over time", what can MS do from the OS perspective to help protect the non-technical end user from them? If the answer is Windows Defender... well, yeah... umm... let's talk about that. I know MS is working with hardware OEMs on new drivers, and I applaud you for that, because it must be a monstrous undertaking. Efficient and well-written drivers are one reason Apple has been so successful, but they are not a true integrator and do not have the problem you have, given the scale of Windows.

Also, jumping on the marcinw perspective, will the Win 7 team be doing anything to the interface to make it look less like Vista and more like something new?
I agree that, on top of the taskbar work, this is a necessity to create buzz. My fear for your team, for whom I have a great deal of respect, is that you create a Ferrari engine (code base/features) and then cover it with the chassis of a Dodge Shadow (GUI). No offense to owners of Dodge Shadows on this board, but I think you get my point. We did see a change from 6801 to 6956 in the UI, so hopefully from Beta to RC to RTM we'll see an improved UI. If not, would a "please" help to change your mind? ;) Thanks again.
I think MS did a pretty good job bringing a composited desktop to Windows with Vista's DWM. It's nice to be able to drag windows around without the tearing and artifacts of previous versions of Windows.
But I remain disappointed with how Vista handles window redrawing while resizing. In most cases I find the artifacts and flickering while resizing a window to be worse aesthetically than resizing a non-composited window. This is especially bad with windows that have client-area glass, like WMP.
I understand some of this is related to backwards compatibility with GDI style rendering and a desire to make it seem like Windows is responding quickly to the user's mouse movement. But even with more modern apps like WPF apps, I find the lag between the resize rectangle and the actual window contents to be disconcerting.
I remember when Mac OS X first came out, it had no hardware acceleration of its composited desktop. Resizing windows was really slow and often lagged behind the mouse cursor by a noticeable distance. But the windows always rendered completely with no artifacts or gaps between the window border and content. To me this gave apps a more solid feel despite the lag, like the app window was a physical thing.
I'd be interested to hear whether this responsiveness vs. "solidity" issue has been considered by the Win7 team, and what arguments, if any, have been made for each side of the coin.
One thing that drives me crazy is when I am installing a program (MS or another vendor) and the estimated time remaining is 10 seconds, but 5 minutes later I am still installing. Or when I have an estimated time of 0 seconds. If I have 0 seconds left, shouldn't the install be done? And several installers seem to use only about 3-5% of the CPU. If I am not running any other program, why isn't it maxing out the CPU and installing faster?
The other thing about performance is the hung-up idea people get. If program A doesn't finish what it is doing in X seconds, then it must be hung (especially if the system's resources are not maxed out). Then they kill the program and try again, only to continue to wait. I personally HATE visual indicators that just repeat the same pattern over and over (i.e., a status bar that goes to the end and restarts every 2-3 seconds). I never know if the program is still working or had an internal error and got stuck. Why not just show me how much of the task is remaining, or in cases of unknown length, show me what has completed?
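What the commenter asks for--reporting actual remaining work and admitting when the estimate is unknown rather than showing "0 seconds"--can be sketched as a simple rate-based estimate. This is illustrative only; `estimate_remaining_s` is an invented helper, not any installer's actual code.

```python
def estimate_remaining_s(done_bytes, total_bytes, elapsed_s):
    # No data yet: say "unknown" instead of a misleading "0 seconds".
    if done_bytes == 0 or elapsed_s == 0:
        return None
    rate = done_bytes / elapsed_s             # observed average throughput
    return (total_bytes - done_bytes) / rate  # seconds left at that rate
```

Real installers would also smooth the rate over a recent window so the estimate doesn't jump around, but even this simple version never claims zero time remaining while work is still outstanding.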
UI Performance can be a killer -- take Outlook 2007 when I try and type a search. If I type too slow it starts the search prematurely, and then locks up on the IMAP server for 3-10 seconds. Clicking on the window makes it gray out (good idea, btw). Just because I didn't type fast enough, it goes and responds in kind. I don't know why Outlook isn't asynchronous all the time. Drives me nuts.
Very good article, and it's nice to see you innovating again (as far as I know) in the area of OS engineering.
I would really like to see a better task manager in Windows, one that would let me see which apps use the most disk I/O or network, and maybe even dig deeper into which files they have open, etc. Currently, to do that I need extra tools, and it's not easy to do -- and I'm a programmer. I'm quite sure my mom wouldn't have any idea how to go about finding that out. Keep in mind that this is directly related to the issue of perceived system performance.
I like Vista a lot, and I think Windows 7 is exactly what is needed at this point -- no big API-level changes, but improvements in UI, reliability and performance.
Keep up the good work :)
One more thing: currently my total network I/O is shown on a scale of 0 to 100 Mbit. That 100 Mbit comes from my DSL modem being connected to the computer with an Ethernet cable, so technically it _is_ a 100 Mbit connection, but for all practical purposes it's a 2 Mbit connection. So the current usage displayed as a graph by Task Manager is never above 2% -- which renders the whole display useless (2% is around 2 pixels high).
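The commenter's point can be shown numerically: scaling the graph to the effective link speed instead of the nominal NIC speed makes the same traffic actually visible. This is an illustrative helper, not the Task Manager code; the 2 Mbit figure is the commenter's own.

```python
def graph_height_px(current_mbit, scale_mbit, graph_px=100):
    # Height of the plotted point, clamped to the graph's pixel height.
    return min(graph_px, round(current_mbit / scale_mbit * graph_px))
```

At 2 Mbit of traffic, a 100 Mbit scale plots 2 pixels, while a 2 Mbit scale uses the full graph.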
Just wondering what you guys think of this ongoing debate about the performance of Windows 7 between Randall C. Kennedy and Thom Holwerda.
It is making quite a lot of noise; surely you must have heard of it and have some thoughts?