Notes on comments.
Welcome to our blog dedicated to the engineering of Microsoft Windows 7.
Just about every email we receive and every comment we get comes with feedback—something to change, something to do more of, something to do less of, and so on. As we’ve talked about in this blog, acting on each one in an affirmative manner is easier said than done. What we can say for certain is that we are listening to each and every comment, blog post, news story, MS Connect report, Send Feedback item, and of course all the data and telemetry. This post kicks off the discussion of changes made to the product with an overview of the feedback process. We'll get into specific changes shortly, and we'll continue to return to the theme of changes in the Release Candidate (RC) over the next weeks. Yesterday on the IE Blog, you saw that we'll be updating IE 8 on Windows 7, and there we also talked about the feedback process in general.
Feedback about Windows 7 of course starts before we've written any code, and by the time we've got running code, thousands of people outside of Microsoft have provided input and influenced the feature set and design of Windows 7. As we've seen, the input from even a small set of customers can often represent a wide variety of choices--often in alignment, but just as often in opposition. As we're developing the features for Windows 7, we work closely with PC makers, enterprise customers, and all types of customers across small business, education, enthusiasts, product reviewers and industry "thought leaders", and so on. We shape the overall "blueprint" of the release based on this wide variety of input. Once we have design prototypes or code running, we gather much more targeted and specific feedback using tools such as usability tests, concept tests, benchmark studies, and other techniques to validate the implementation of this blueprint. Our goal with this level of feedback is for it to be representative of the broad set of Windows customers, even if we don't have a 1:1 interaction with each and every customer. Hopefully this post will offer some insights into this process overall--the tools and techniques, and the scope of feedback.
In the first few weeks of the Windows 7 beta we had over one million people install and use Windows 7. That's an astounding number for any beta test and while we know it has been fun for many folks, it has been a lot of work for us--but work that helps to raise the quality of Windows 7. When you use the beta you are automatically enrolled in our Customer Experience Improvement Program (anonymous feedback and telemetry, which is voluntary and opt-in in the RTM release). Just by using Windows 7 as a beta tester you are helping to improve the product--you are providing feedback that we are acting on in a systematic manner. Here is a sense of the scale of feedback we are talking about:
We have a variety of tools we draw on to help inform the decision-making process. A key element we have focused on quite a bit in Windows 7 is the role of data in making decisions. Everything we do is a judgment call, as ultimately product development is about deciding what to get done from an infinite set of possibilities, but the role of data is essential and has become far more routine and critical. It is important to be super clear—data is not a substitute for good judgment or an excuse to make a decision one way or another, but it most definitely informs the decision. This is especially true in an era where the data is not only a survey or focus group, but often includes a “sampling” of millions of people using Windows over the course of an extended time period.
A quick story from my years working on Office: before telemetry and the internet, deciding what features to put in a release of Office could best be described as a battle. The battle took place in conference rooms where people would basically debate until one or more parties gave up from fatigue (mental or otherwise)—essentially adrenaline-based product development. The last person standing, the one with the most endurance, or the one who pulled an all-nighter to write the code pretty much determined how features ended up or what features ended up in a product. Sort of like turning feature design over to a Survivor-like process. I’m sure many of you are familiar with this sort of process. The challenges with this approach are numerous, but inevitably features do not hold together well (in terms of scenarios or architecture), the product lacks coherency, and most importantly, unless you happen to have a good match between the “winner” and the target customers, features will often miss the mark.
In the early 1990s we started instrumenting Word and learning about how people actually used the software (this was before the internet, so this was a special version of the product we solicited volunteers to run, and then we would collect the data via lots of floppies). We would compile the data and learn which features people used and how much they used them. We learned things such as how much more people used tables than we thought, but for very different things than we expected. We learned that a significant fraction of the time the first suggestion in the spelling dictionary was the right correction (hence autocorrect). We learned that no one ever read the tip of the day (“Don’t run with scissors”). This data enabled us to make real decisions about what to fix, the impact of changes, and, when we looked at the goals (the resulting documents), what direction to take word processing.
Fast forward to the development of Windows 7 and we’re focused on using data to help inform decisions we make. This data takes many forms and helps in many ways. I know a lot of folks have questions about the data – is it representative, how does it help fix things people should be using but don’t, what about doing new things, and so on. Data is an important element of making decisions, but not a substitute for clear product goals, meaningful customer engagement, and working across the ecosystem to bring Windows 7 to customers.
Let’s talk a bit about “bugs”. Up front it is worth making sure we’re on the same page when we use the much overloaded term bug. For us a bug is any time the software does something that someone wasn’t expecting it to do. A bug can be a cosmetic issue, a consistency issue, a crash, a hang, a failure to succeed, a confusing user experience, a compatibility issue, a missing feature, or any one of dozens of different ways that the software can behave in a way that isn’t expected. A bug for us is not an emotional term, but just shorthand for an entry in our database representing feedback on the product. Bugs can be reported by a human or by the various forms of telemetry built into Windows 7. This broad definition allows us to track and catalog everything experienced in the product, and to do so in a uniform manner.
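To give a sense of what this uniform cataloging looks like, here is a toy sketch. The field names, categories, and example entries are hypothetical illustrations, not our actual database schema:

```python
from dataclasses import dataclass
from enum import Enum

class BugSource(Enum):
    HUMAN = "human"            # e.g. a Connect report, Send Feedback, or email
    TELEMETRY = "telemetry"    # automatic crash/hang reporting

@dataclass
class Bug:
    """One entry in the bug database: any behavior someone wasn't expecting."""
    id: int
    source: BugSource
    category: str              # e.g. "crash", "hang", "cosmetic", "compat"
    description: str

# Both a telemetry-reported crash and a human-reported cosmetic issue
# become the same kind of record, so they can be triaged uniformly.
triage_queue = [
    Bug(1, BugSource.TELEMETRY, "crash", "explorer.exe access violation"),
    Bug(2, BugSource.HUMAN, "cosmetic", "misaligned icon in Control Panel"),
]
print(len(triage_queue))  # 2
```

The point of the uniform record is that a cosmetic nit and a crash flow through the same pipeline and can be counted, categorized, and prioritized together.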
Briefly, it is worth considering a few examples of the types of data that help inform decisions.
This all represents structured feedback, in that the data is collected through a systematic study and usually has a hypothesis associated with it. We also have unstructured feedback, which represents the vast array of bug reports, comments, questions, and points of view expressed in blogs, newsgroups, and the Send Feedback button—unstructured because it is not collected in a systematic manner, but rather aggressively gathered by any and all means. A special form of this input is the bug reporting done through the Connect program—the technical beta—which represents bug reports, feature suggestions, and comments from this set of participants.
The Windows 7 beta represents a new level of feedback in this regard in terms of the overall volume, as we talked about above. If you consider the size of the development team and the time it would take just to read the reports, you can imagine that digesting (categorizing, understanding, flagging) issues, let alone responding to them, is a massive undertaking (about 40 Send Feedback reports per developer during that one week, though as you can imagine they are not evenly distributed across teams).
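To make the arithmetic concrete, here's a quick back-of-envelope sketch. The weekly report count, team size, and per-report reading time below are illustrative assumptions chosen to match the "about 40 per developer" figure, not actual staffing numbers:

```python
# Back-of-envelope triage math: illustrative numbers only.
reports_per_week = 40_000   # hypothetical Send Feedback reports in one week
developers = 1_000          # hypothetical size of the development team
minutes_per_report = 3      # time just to read and categorize one report

reports_per_dev = reports_per_week / developers
hours_per_dev = reports_per_dev * minutes_per_report / 60

print(f"{reports_per_dev:.0f} reports per developer")  # 40 reports per developer
print(f"{hours_per_dev:.1f} hours of triage each")     # 2.0 hours of triage each
```

Even at a few minutes per report, simply reading one week's feedback costs every developer a couple of hours before any issue is actually acted on.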
The challenge of how to incorporate all the feedback at this stage in the cycle is significant. It is emotional for us at Microsoft and the source of both considerable pride and also some consternation. We often say “no matter what happens, someone always said it would.” By that we mean, on any given issue you can be assured that all sides will be represented by passionate and informed views of how to resolve it, often in direct opposition to each other, plus every view in the middle. That means for the vast majority of issues there is no right or wrong in an absolute sense, only a good decision within the context of a given situation. We see this quite a bit in the debates about how features should work—multiple solutions proposed and debated in comments on a blog (people even do whole blogs about how things should work). But ultimately on the Windows development team we have to make a call; as we're seeing, a lot of people are looking forward to us finishing Windows 7, which means we need to stop changing the product and ship it. We might not always make the right call, and we'll admit it when we don't, even if we find changing the behavior is not possible.
Making these decisions is the job of program management (PM). PMs don’t lock themselves in their offices and issue opinions, but more realistically they gather all the facts, data, points of view, and work to synthesize the best approach for a given situation. Program management’s role is making sure all the voices are heard, including beta testers, development, testing, sales, marketing, design, customer support, other teams, ISVs, IHVs, and on and on. Their role is to synthesize and represent these points of view systematically.
There are many factors that go into understanding a given choice:
These are just a few of the factors that go into considering a product change. As you can see, this is not something we take lightly, and a lot goes into each and every change. We consider all the inputs we have and all the data we can gather. In some ways it is easy to freeze when thinking about the decisions we must make to release Windows 7—if you think too hard about a decision, you might start to worry about a billion people relying on something, and it gets very tricky. So we use data to keep ourselves objective and to keep the decision process informed and repeatable. We are always humbled by the responsibility we have.
While writing this post, I received a “bug report” email with the explicit statement “is Microsoft going to side step this issue despite the magnitude of the problem” along with the inevitable “Microsoft never listens to feedback”. Receiving mail like this is tough—we’re in the doghouse before we even start. The sender has decided that this report is symbolic of Microsoft’s inability or lack of desire to incorporate critical feedback and to fix must-fix bugs during development. Microsoft is too focused on shipping to do the right thing. I feel stuck because the only acceptable answer is the fix, and anything less is a problem or further proof of our failure. And in the back of my mind is the reality that this is just one person with one issue I just happen to be talking to in email. There are a couple of million people using the beta, and if each one, or for that matter just one out of ten, had some unique change, bug fix, or must-do work item, we would have literally years of work just to make our way through the list. And if you think about the numbers and consider that we might easily get 1,000,000 submitted new “work items” for a product cycle, even if we do 100,000 of them it means we have 900,000 folks who feel we don’t listen compared to the 100,000 folks who feel listened to. Perhaps that puts the challenge in context.
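The arithmetic in that last point is simple but sobering. Using the post's own illustrative numbers:

```python
# Illustrative numbers from the post: 1,000,000 submitted work items,
# capacity to act on 100,000 of them in a product cycle.
submitted = 1_000_000
acted_on = 100_000

unaddressed = submitted - acted_on
fraction_heard = acted_on / submitted

print(unaddressed)     # 900000 people whose item wasn't acted on
print(fraction_heard)  # 0.1 -- only 10% could feel "listened to"
```

Even an enormous amount of work on feedback still leaves nine out of ten submitters without the specific change they asked for, which is why "listening" cannot mean "acting on everything".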
With this post we tried to look at some of the ways we think about the feedback we’re getting and how we evaluate it in the course of developing Windows 7. No area is more complex than balancing the needs (and desires) of such a large and diverse population—end-users, developers, IT professionals, hardware makers, PC manufacturers, silicon partners, software vendors, PC enthusiasts, sysadmins, and so on. A key reason we augment our approach with data and studies that deliberately select for representative groups of “users” is that it is important to avoid the “tyranny of the majority” or “rule by the crowd”. In a sense, the lesson we learned from adrenaline-based development was the value of being systematic, representative, and as scientific as possible in the use of data.
The work of acting on feedback responsibly and managing the development of Windows through all phases of the process is something we are very sincere about. Internally we’ve talked a lot about being a learning organization and how we’re always learning how to do a better job, improve the work we do, and in the process make Windows even better. We take this approach as individuals and in how we view building Windows. We know we will continue to have tough choices to make, as everyone who builds products understands, and what you have is our commitment to continue to use all the tools available to make sure we are building the best Windows 7 we can build.
It's actually funny that there is nothing to report in the bug department.
I am using Windows 7 in production now, yet from the same machine I can perform my hobbies.
Very Cool when you can push the limits and not have a crash, I am glad that your team has had such a hands on approach to building this OS.
Mamma mia GREAT POST!!
Congratulations, Mr. Steven
Great post, however there are a few changes that both break away from the de facto standard that has developed over the years and that make no sense.
My greatest bone to pick is the lack of Software Explorer in Windows Defender. What's the rationale behind that?
a good post, an essay in how it should be done and i agree and applaud it all. yet when i see so many applications in windows 7 with different user interface schemes and menus i wonder how that could have escaped through this wonderful process that ought to have caught it. i won't accept that it's unimportant and i know paul thurrott feels the same about this. i suspect the fragmentation of microsoft departments creating these different apps causes an integration problem but for me it should have been one of the cornerstones of w7 - integrate, stabilize and make consistent the out of box experience.
I think that documenting the rationale for changes and communicating effectively about the *why* of changes/additions/removals would go a long way toward reducing frustration with the apparent "not caring".
Part of that is, of course, what you are doing now with a blog - but there are many teams that don't blog, or communicate much outside of putting a few API outlines on MSDN. There are so many changes and the scope of the product is so large that even a simple changelog would be good.
I don't have all the answers but I think that communicating and *responding* would make a huge difference.
PS. Connect, as a website, sucks. (good feedback I know;)
Thanks for the interesting, in-depth info.
I wanted to let you know about the issue I currently regularly experience with Windows 7 beta 1. This issue (of high priority IMO) has been described in detail in this blog post ("System Fatal Program Crashes"):
Hope you fix it.
And BTW: get yourself some more servers/bandwidth/anything because all blogs @msdn.com have become sooo sloow recently.
I've been submitting feedback using various tools about one noticeably broken bitmap in system file leading to Windows Explorer Classic interface back/forward arrows display problem since Vista beta 2.
Many things have changed, bitmaps were moved into new file, but it's still broken. :(
The frustrating thing about all my feedback on Microsoft Connect is that most of the time I got one reply: "You should create a DCR. This is not a bug.". My response: "How can I create a DCR?". And I got no answer...
And there are sites like the Windows 7 Taskforce. Where users discuss and most of the time come to a consensus. There is 1 issue that "will be fixed" and 4 things that are fixed. One feature request was marked as "fixed" when in fact it's not. Out of over 500 entries. Most of these things wouldn't be hard to implement: Adding 1px borders around elements, changing bitmaps, changing colors and so on have NO impact on stability and don't have to be localized. I don't get why MSFT doesn't change those things. If only to make Windows look more polished.
But in the end you guys are still on the right track. Windows 7 feels and looks better than any version before. If you continue to make improvements at this pace Windows 10 will be almost-perfect.
Btw: I need three hours on average per 100 bug-reports.
Very informative article (if a bit long;)
As Google's Marissa Mayer said in her IO'08 keynote (http://www.youtube.com/watch?v=6x0cAzQ7PVs), user testing finally allows us to transform usability and UI design from an art into a science.
While a great designer can intuitively resolve most obvious problems, user testing and bucket testing, of which telemetry is an example, allow us to really measure, in hard numbers, what the best solutions are.
That won't put designers out of their jobs, since there are classes of problems that user feedback rarely solves (e.g. innovation).
You da man Steven!! Fantastic article on feedback and bug reporting. Now I know why Windows 7 is so great even for a beta! I'm looking forward to the RC, can't wait.
PS. My Connect submissions have all been adequately responded to. Tell the folks reporting to you they are doing a super job!
Of those 7.5M that didn't need drivers, how many were VMs?
In every walk of life, there are people who will always find something to argue about, and who look at the tree, not the forest.
We, as beta testers and tech enthusiasts, believe that our feedback and our work will be considered, even if nothing comes of our suggestions.
This is because we are just customers, not operating system designers, and we trust your decisions.
The "thing" you did with the feedback on the Windows 7 Beta is absolutely positive, and it will change the way later products from Microsoft are developed.
Keep up the good work, try to focus on the actual feedback, and don't let any "bashing" frustrate you, because trolls won't appreciate anything either way, but we will!
Best Regards, Manos, Greece
first of all: great post to an interesting topic.
what I see is this: many people consider Apple as being innovative while Microsoft as being not innovative.
I don't think that this is true but in my opinion Apple is way better in communicating changes and advances of new products. And they have a great sense for taking PC downsides and make them their upsides.
Microsoft should adopt that. Instead many areas where Mac OS X shines feel abandoned: File Previews, File Visualisation (Coverflow), Gadgets, Style (the superbar looks like the dock in Tiger) etc.
But the Windows experience relies very much on third-party applications. Too much software is low quality, slows down my PC even when I'm not using it, doesn't fit in with the interface, and too many codecs cause problems, and so on. Windows needs an app store as a quality control and an easy way to find, install and update software.
Microsoft can create the best Windows, it's useless if it is installed on a crappy computer with crappy software.
Great discussion of what is clearly a passionate topic for you.
The beta is probably thick with skilled IT hobbyists. If you ask me how I like W7, I'd say it's great. If my boss asks me if we can roll it out, I'd say no way. We have this application that requires IE6, and vendors X, Y and Z don't support Vista/W7. In the workplace, Vista/W7 breaks too much stuff; I don't see how W7 solves that.
Telemetry measures quantity, but not quality. For example, the new Explorer library feature has greatly improved Media Player's usability. The library in Media Player itself is completely dysfunctional. But the combination of Explorer's libraries and Media Player's "Now Playing" mode is workable.
So Media Player usage is up, but it's not because Media Player has improved. That's the kind of thing that would be hard to extract from telemetry.