Welcome to our blog dedicated to the engineering of Microsoft Windows 7
Aka: A developer's view of the Windows 7 Engineering process
This post is by Larry Osterman. Larry is one of the most “experienced” developers on the Windows team and has been at Microsoft since the mid-1980s. There are only three other folks on the entire Windows team who have worked at Microsoft longer! Personally, I remember knowing about Larry when I started at Microsoft back in 1989. I remember he worked on “multimedia” (back when we used to host the Microsoft CD-ROM Conference) and he was one of those people that stood up and received a “5 Year” award from Bill Gates at the first company meeting I went to, which seemed amazing back then! For Windows 7, Larry is a developer on the Devices and Media team, which is where we work on audio, video, Bluetooth, and all sorts of cool features for connecting up devices to Windows.
Larry wrote this post without any prodding, and given his experience on so many Windows releases, these thoughts seemed really worthwhile to share with folks. This post goes into “how” we work as a team, which might prove pretty interesting for anyone who is part of a software team. While this is compared and contrasted with Vista, everyone knows that there is no perfect way to do things and this is just a little well-informed perspective.
So thank you Larry! --Steven
Thanks to Steven and Jon for letting me borrow their soapbox :-).
I wanted to discuss my experiences working on building Windows 7 (as opposed to the other technical stuff that you’ve read on this blog so far), and to contrast that with my experiences building Windows Vista. Please note that these are MY experiences. Others will have had different experiences; hopefully they will also share their stories here.
The experience of building Windows 7 is dramatically different from the experience of building Vista. The rough outlines of the product development process haven’t changed, but organizationally, the Windows 7 process is dramatically better.
For Windows Vista, I was a part of the WAVE (Windows Audio Video Excellence) group. The group was led by a general manager who was ultimately responsible for the deliverables. There was a test lead, a development lead and a program management lead who reported to the general manager. The process of building a feature roughly worked like this: the lead program managers decided (based on criteria which aren’t relevant to the post) which features would be built for Windows and which program managers would be responsible for which feature. The development leads decided which developers on the team would be responsible for the feature. The program manager for the feature wrote a functional specification (which described the feature and how it should work) in conjunction with development. Note that the testers weren’t necessarily involved in this part of the process. The developer(s) responsible for the feature wrote the design specification (which described how the feature was going to be implemented). The testers associated with the feature then wrote a test plan which described how to test the feature. The program manager or the developer also wrote the threat model for the feature.
The developer then went off to code the feature, the PM spent their time making sure that the feature was on track, and when the developer was done, the tester started writing test cases.
Once the feature was coded and checked into the source tree, it moved its way up to the “winmain” branch. Aside: The Windows source code has been arranged into “branches” – the root is “winmain”, which is the code base that would ultimately become Windows Vista. Each developer works in what are called “feature branches”, which merge changes into “aggregation branches”; the aggregation branches in turn move into winmain.
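To make the branch arrangement in the aside concrete, here is a minimal sketch in Python. It is only an illustration of the shape of the tree: every branch name other than "winmain" is invented, and nothing here reflects the actual Windows source control tooling.

```python
# Hypothetical model of the branch hierarchy described above.
# Only "winmain" is a real name from the post; the others are made up.
PARENT = {
    "feature/audio-ui": "aggregation/media",
    "feature/audio-core": "aggregation/media",
    "aggregation/media": "winmain",
    "winmain": None,  # the root has no parent
}


def merge_path(branch):
    """Return the branches a change passes through on its way to the root."""
    path = [branch]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


if __name__ == "__main__":
    # Walk a change from a feature branch up to winmain.
    print(" -> ".join(merge_path("feature/audio-ui")))
```

A change made in a feature branch reaches winmain only after passing through its aggregation branch, which is the path the function walks.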
After the feature was coded, the testers tested, the developers fixed bugs and the program managers managed the program :-). As the product moved further along, it got harder and harder to get bug fixes checked into winmain (every bug fix carries with it a chance that the fix will introduce a regression, so the risk associated with each bug fix needs to be measured and the tolerance for risk decreases incrementally). The team responsible for managing this process met in the “ship room” where they made decisions every single day about which changes went into the product and which ones were left out. There could be a huge amount of acrimony associated with that – oftentimes there were debates that lasted for hours as the various teams responsible for quality discussed the merits associated with a particular fix.
All-in-all, this wasn’t too different from the way that features have been developed at Microsoft for decades (and is basically consistent with what I was taught in my software engineering class back in college).
For Windows 7, management decided to alter the engineering structure of the Windows organization, especially in the WEX [Windows Experience] division where I work. Instead of being fairly hierarchical, Steven has 3 direct reports, each representing a particular discipline: Development, Test and Program Management. Under each of the discipline leads, there are 6 development/test/program management managers, one for each of the major groups in WEX. Those 2nd level managers in turn have a half a dozen or so leads, each one with between 5 and 15 direct reports. This reporting structure has been somewhat controversial, but so far IMHO it’s been remarkably successful.
The other major change is the introduction of the concept of a “triad”. A “triad” is a collection of representatives from each of the disciplines – Dev, Test and PM. Essentially all work is now organized by triads. If there’s ever a need for a group to concentrate on a particular area, a triad is spun off to manage that process. That means that all three disciplines provide input into the process. Every level of management is represented by a triad – there’s a triad at the top of each of the major groups in WEX, each of the second level leads forms a triad, etc. So in my group (Devices and Media) there’s a triad at the top (known as DKCW for the initials of the various managers). Within the sound team (where I work), there’s another triad (known as SNN for the initials of the various leads). There are also triads for security, performance, appcompat, etc.
Similar to Windows Vista, the leads of all three disciplines get together and decide on a set of features that go in each release. They then create “feature crews” to implement each of the features. Typically a feature crew consists of one or two developers, a program manager and one or two testers.
This is where one of the big differences between Vista and Windows 7 occurs: In Windows 7, the feature crew is responsible for the entire feature. The crew together works on the design, the program manager(s) then writes down the functional specification, the developer(s) write the design specification and the tester(s) write the test specification. The feature crew collaborates together on the threat model and other random documents. Unlike Windows Vista where senior management continually gave “input” to the feature crew, for Windows 7, management has pretty much kept their hands off of the development process. When the feature crew decided that it was ready to start coding (and had signed off on the 3 main documents), the feature crew met with the second level triad (in my case with DKCW) to sanity check the feature – this part of the process is critical because the second level triad gets an opportunity to provide detailed feedback to the feature crew about the viability of their plans.
And then the crew finally gets to start coding. Sort-of. There are still additional reviews that need to be done before the crew can be considered “ready”. For instance, the feature’s threat model needs to be reviewed by one of the members of the security triad. There are other parts of the document that need to be reviewed by other triads as well.
A feature is not permitted to be checked into the winmain branch until it is complete. And I do mean complete: the feature has to be capable of being shipped before it hits winmain – the UI has to be finished, the feature has to be fully functional, etc. In addition, when a feature team takes a dependency on another Windows 7 feature, the feature teams for the two features MUST sign a service level agreement to ensure that each team knows about the inter-dependencies. This SLA is especially critical because it ensures that teams know about their dependents – that way when they change the design or have to cut parts of the feature, the dependent teams aren’t surprised (they may be disappointed but they’re not surprised). It also helps to ensure tighter integration between the components – because one team knows the other team, they can ensure that both teams are more closely in alignment.
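One way to picture what the SLA buys a team is as a simple registry of who depends on whom. The sketch below is purely illustrative: the class, method and feature names are all hypothetical, and the post does not say how the agreements were actually recorded.

```python
from collections import defaultdict


class SlaRegistry:
    """Hypothetical record of which feature teams depend on which features."""

    def __init__(self):
        # Maps a provider feature to the set of features that depend on it.
        self._dependents = defaultdict(set)

    def sign(self, dependent, provider):
        """Record that `dependent` has taken a dependency on `provider`."""
        self._dependents[provider].add(dependent)

    def dependents_of(self, provider):
        """List the teams to tell before `provider` changes or is cut."""
        return sorted(self._dependents.get(provider, set()))


if __name__ == "__main__":
    registry = SlaRegistry()
    # Feature names below are invented for the example.
    registry.sign("volume-mixer-ui", "audio-endpoint-api")
    registry.sign("device-stage", "audio-endpoint-api")
    for team in registry.dependents_of("audio-endpoint-api"):
        print("notify", team)
```

The point is only that the provider can enumerate its dependents before changing a design, so nobody downstream is surprised.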
Back in the Vista day, it was not uncommon for feature development to be spread over multiple milestones – stuff was checked into the tree that really didn’t work completely. During Win7, the feature crews were forced to produce coherent features that were functionally complete – we were told to operate under the assumption that each milestone was the last milestone in the product and not schedule work to be done later on. That meant that teams had to focus on ensuring that their features could actually be implemented within the milestone as opposed to pushing them out.
For the nuts and bolts: the Windows 7 development process is scheduled over several 3-month-long milestones. Each milestone allowed for 6 weeks of development and 6 weeks of integration – essentially time to fine-tune the feature and ensure that most of the interoperability problems were shaken out.
Ok, that’s enough background (it’s bad when over half a post on Windows 7 is actually about Windows Vista, but a baseline needed to be established). As I said at the beginning, this post is intended to describe my experiences as a developer on Windows 7. During Windows 7, I worked on three separate feature crews. The first crew delivered two features, the second crew delivered about 8 different features, all relatively minor, and the third crew delivered three major features and a couple of minor features. I also worked as the development part of the WEX Devices and Media security team (which is where my series of posts on Threat Modeling came from – I wrote them while I was working with the members of D&M on threat modeling). And I worked as the development part of an end-to-end scenario triad that was charged with ensuring that scenarios that the Sound team defined at the start of the Windows 7 planning process were actually delivered in a coherent and discoverable way.
In addition, because the test team was brought into the planning process very early on, the test team provided valuable input and we were able to ensure that we built features that were not only code complete but also test complete by the end of the milestone (something that didn’t always happen in Vista). And it ensured that the features we built were actually testable (it sounds stupid I know, but you’d be surprised at how hard it can be to test some features). As a concrete example, we realized during the planning process that some aspect of one of the features we were working on in M2 couldn’t be completed during the milestone. So before the milestone was completed, we ripped the feature out (to be more accurate, we changed the system so that the new code was no longer being built as a part of the product). During the next milestone, after the test team had finished writing their tests, we re-enabled the feature. But we remained true to the design philosophy – at the end of the milestone everything that was checked into the “main” branch was complete – it was code AND test complete, so that even if we had to ship Windows 7 without M3 there was no test work that was not complete. This is a massive change from Vista – in Vista, since the code was complete we’d have simply checked in the code and let the test team deal with the fallout. By integrating the test teams into the planning process at the beginning we were able to ensure that we never put the test organization into that bind. This in turn helped to ensure that the development process never spiraled out of control. Please note that features can and do stretch across multiple milestones. In fact one of the features on the Sound team is scheduled to be delivered across three milestones – the feature crews involved in that feature carefully scheduled the work to ensure that they would have something worth delivering whenever Windows 7 development was complete.
Each of the feature crews I’ve worked on so far has had dramatically different focuses – some of the features I worked on were focused on core audio infrastructure, some were focused almost entirely on UX (user experience) changes, and some features involved much higher level components. Because each of the milestones was separate, I was able to work on a series of dramatically different pieces of the system, something I’ve really never had a chance to do before.
In Windows 7, senior management has been extremely supportive of the various development teams that have had to make the hard decisions to scale back features that were not going to be able to make the quality bar associated with a Windows release – and there absolutely are major features that have gone all the way through planning only to discover that there was too much work associated with the feature to complete it in the time available. In Vista it would have been much harder to convince senior management to abandon features. In Win7 senior management has stood behind the feature teams when they’ve had to make the tough decisions. One of the messages that management has consistently driven home to the teams is “cutting is shipping”, and they’re right. If a feature isn’t coming together, it’s usually far better to decide NOT to deliver a particular feature than to have that feature jeopardize the ability to ship the whole system. In a typical Windows release there are thousands of features and it would be a real shame if one or two of those features ended up delaying the entire system because they really weren’t ready.
The process of building 7 has also been dramatically more transparent – even sitting at the bottom of the stack, I feel that I’ve got a good idea about how decisions are being made. And that increased transparency in turn means that as an individual contributor I’m able to make better decisions about scheduling. This transparency is actually a direct fallout of management’s decision to let the various feature teams make their own decisions – by letting the feature teams deeper inside the planning process, the teams naturally make better decisions.
Of course that transparency works both ways. Not only were teams allowed to see more about what was happening in the planning process, but because management introduced standardized reporting mechanisms across the product, the leads at every level of the hierarchy were able to track progress against plan at a level that we’ve never had before. From an individual developer’s standpoint, the overhead wasn’t too onerous – basically once a week, you were asked to update your progress against plan on each of your work items. That status was then rolled up into a series of spreadsheets and web pages that allowed each manager to track all the teams’ progress against plan. This allowed management to easily and quickly identify which teams were having issues and take appropriate action to ensure that the schedules were met (either by simplifying designs, assigning more developers, or whatever).
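The weekly roll-up described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the actual reporting system: the field names, team names, numbers and the 75% threshold are all invented.

```python
from collections import defaultdict


def roll_up(work_items):
    """Aggregate per-work-item progress into a completion ratio per team."""
    planned = defaultdict(float)
    done = defaultdict(float)
    for item in work_items:
        planned[item["team"]] += item["planned_days"]
        done[item["team"]] += item["done_days"]
    # A team with nothing planned is treated as fully on plan.
    return {
        team: (done[team] / planned[team]) if planned[team] else 1.0
        for team in planned
    }


def teams_behind(ratios, threshold=0.75):
    """Return the teams whose progress against plan is below the threshold."""
    return sorted(team for team, ratio in ratios.items() if ratio < threshold)


if __name__ == "__main__":
    # Made-up weekly status updates from individual developers.
    updates = [
        {"team": "Sound", "planned_days": 10, "done_days": 8},
        {"team": "Sound", "planned_days": 10, "done_days": 10},
        {"team": "Video", "planned_days": 10, "done_days": 4},
    ]
    ratios = roll_up(updates)
    print(ratios)
    print("needs attention:", teams_behind(ratios))
```

Each developer only supplies the per-item numbers; the aggregation is what lets a manager see at a glance which teams need simpler designs or more people.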
In general, it’s been a total blast building 7. We’ve built some truly awesome features into the operating system and we’ve managed to keep the system remarkably stable during that entire process.
NT systems have been stable from the beginning. If you don't play with strange drivers or apps, you can make even NT 4.0, 2000, XP, 2003 or Vista rock solid. I'm not saying otherwise.
Nor am I saying that Microsoft needs to rewrite the whole system, or that "Linux is great, Windows is wrong". No.
1. I'm tired of the Vista system (resolving many issues among friends and coworkers), the word "Vista" and the Vista interface.
2. NT-based systems have some disadvantages. If Microsoft wants to make a revolution, it needs to work on them. A shared Registry, or giving 3rd-party apps access to the Windows directory, is a very bad idea. When I see many workarounds for it (System Restore, protection of system DLLs, etc.) while the main problem remains unsolved, I cannot say that it's resolved.
3. What about DRM? What will be implemented in the main system (how many CPU cycles will be used for it?) and what in the add-on apps?
4. I'm not saying that Steven is doing a bad job. I'm only saying that at this moment even his manager is saying that we will not see a revolution.
Generally speaking: I hope that Steven will answer all these difficult questions (yes, he can read them now, so this is not futile) after PDC. This is "faith". I'm not saying that Windows is phenomenal, and I'm not saying that it's totally wrong. That's all.
I would like to ask about some things in Win 7:
1. Would it be possible to have "XP-style" dialogs/controls? I mean being able to bypass the "connection center", for example. When I double-click a network connection (a specific device) in XP, I get its status, and from there I get its properties. In Vista there is another layer/window to get past. Can we get the old way back?
2. The ability to completely disable any DRM services? (Maybe you got rid of them, but still...) And PatchGuard and such, if the admin wants. (It might not be a good thing, but there are still cases where this can be necessary, and using 3rd-party tools to disable them is not good.)
3. Get back the ability to choose which parts to install, as in 9x, or at least get the size on the HDD back to what it was in XP. (Vista is too space-hungry from what I have seen.)
4. Better built-in tools. As Windows became more and more advanced, some tools were losing configurability. See defrag: in 9x it had some options, in XP almost none.
5. Optimization. Since Win 7 will surely require an advanced CPU, why not use some extended instruction sets properly? (Like in FFmpeg.)
6. Will it be possible to use unsigned drivers with no limitations in the x64 version?
7. Since HD is surely coming, will there be no limits on watching HD movies, be it on Blu-ray or aired on DTV?
That is all so far. Thank you for your answers.
P.S.: I have Win Vista on only one notebook, as part of an offer, and based on that experience of setting it up for the company network, Vista is not a good replacement for XP.
Nice post about working as a developer on Win7 and how MS works inside...
But I have some questions about Seven:
1. I read about changing the Registry to a SQL database... is there something in the air?
2. Direct3D 11/GPGPU and .NET: will there be an easy way to work with GPGPU from the .NET Framework?
3. Also about the .NET Framework... when I work with Aero in my apps I need to use DLLs from Windows itself... will you provide an easier way to work with Aero in the next .NET Framework?
4. Steve Ballmer talked about Seven, and when asked about support for multicore CPUs he said that MS is thinking about it... it's a little hard to hear that, because multicore CPUs are big right now and Vista cannot support too many cores (on my 4-core system it's great, but I don't know about 8-core Nehalem and Phenom (45nm) systems)...
Interesting article. Vista's favoring security over backwards compatibility makes it impossible to deploy for many organizations. All you need is a single accounting package that doesn't work with Vista and you're screwed.
I hope you're making W7 more compatible than Vista! If W7 can run more XP programs than Vista, that could be a big thing.
A number of folks have specifically asked about the DirectX APIs. We have 3 sessions at the PDC on DirectX and of course more at the WinHEC conference.
All the sessions will be available via the PDC site as per the information on the site.
"As the product moved further along, it got harder and harder to get bug fixes checked into winmain (every bug fix carries with it a chance that the fix will introduce a regression, so the risk associated with each bug fix needs to be measured and the tolerance for risk decreases incrementally)."
I sense contradiction. As the product moved further along, known regressions were perpetuated rather than take the risk of an unknown degree of fixing, and the tolerance for bugs increased to 100%.
"the feature has to be capable of being shipped before it hits winmain"
Where capable of being shipped includes having known bugs. How does this differ from any other policy?
"stuff was checked into the tree that really didn’t work completely"
No ship-it. I assure you, customers are already reminded of this fact, every day.
Asesh: "I have heard that Windows 7 will only be available to some beta testers. Why? It should be open to all the users who would want to test it. That would help Microsoft to fix more bugs and security holes and result in an initial release of a more stable operating system."
No it would not. Here's why: "As the product moved further along, it got harder and harder to get bug fixes checked into winmain..." Allowing more beta testers would increase the number of status changes from "unknown bug" to "known bug", from "not discovered" to "won't fix", from "not yet reproduced" to "reproducible only outside of Microsoft". It would not increase the number of fixes, or stability, or anything like that.
@ndiamond -- @Asesh had a good question worthy of a good answer.
Early in the product stages if you have too many testers you end up finding the same issues over and over again, and you can't "break through" and see the functionality of the product. The classic for Windows would be setup and device installation. Even with user interface changes, if the features aren't in a stable and known state, then everyone just hits the same issues.
Thus you don't broaden the test coverage and improve the product quality. Rather you frustrate everyone and create a large volume of bug activity without really any progress.
We announced a pre-beta for attendees to the PDC. The beta will be available broadly as we have promised.
I think your analysis mixes regressions with new bugs. As a product moves further along we do reduce the number of code changes--that's just good software engineering. If something is a regression relative to existing and known functionality / behavior then we address it. However, if new functionality has "bugs" (a bug is defined as "any time anyone experiences anything they did not expect") then you always weigh the risk of new bugs (or new regressions) with the benefit of the code change. There's nothing out of the ordinary about how we approach this.
As pointed out by GRiNSER, I would like to know why User Experience and Design people are not part of the triads. I think UX should be integrated in every feature and not just UX intensive features.
I am not a programmer, (just Delphi code copier) but just wondering if young people would like to know what coding language you use for Windows? I'm supposing mostly C++, but you have so many available.
Your C# is a 'knock-off' of Delphi, right?
You'd be more modular with self contained Delphi wouldn't you? Bigger exe but you could use a 'packer' (to reduce code 50%) and this would provide more security?
I have never used an Apple, but I understand from what I read that you just drag a package to install - then drag it to the bin to uninstall. This is very Delphi like. (Delphi appears to be dying?)
Do you use the same throughout, or mix and match languages?
Interesting that the team is now responsible for the final GUI as well, this leads to the build looking like the final. Normally I would think you would try and hide the final 'look & feel' behind the previous O/S GUI. It's good to know what it looks like a bit, stops critics saying it's just the same as the old one, and doesn't lead to over expectations. But it still leaves room for a few little 'surprises' to be included in the RTM.
As each team is now implementing the UI for their project, you must get a 'blueprint' of the 'look & feel' down to forms, dialogs and glyphs? (No, I've got a big round purple button - Yeah, well I've got a small square yellow one!)
It's good to know where you are all going and not living in isolation - next you'll be having a once-a-month Friday afternoon BBQ together (rain permitting :-) - no sanity please!
Thanks for reading through all those comments. Regarding the small beta:
Betas for enthusiasts would be a good idea, however. First and foremost because you'd make them happy - and some of them are influential. I'm a strong supporter of Vista primarily because I went through all the betas. And reported non-duplicate bugs. I wasn't using legal ways to obtain ISO images, however. And that's frustrating as well...
If you don't want too many bug reports on the same subject, just introduce some form of punishment for duplicates. Like: Every tester can report 2 bugs. For every ACKd non-duplicate bug, a user may report another bug.
While it sounds really good, and might do it for Windows, I really do keep my fingers crossed for Windows 7 to be dramatically better in its performance than Windows Vista.
If you, in your triads, are constantly thinking about how to improve the performance of your deliverables, then thank God, you are on the right track. If you get triple the speed in milestone 3 compared to what you had in milestone 1, then I believe it should be good.
Then again, if it's only about stability and security, then it's not good enough. Performance is also a must now.
I hope you can pull this off with your new working style.
I am very excited to see how this all goes. MS has always made a more efficient and powerful OS, and bar none Vista has been the best so far.
I look forward to getting my hands on the pre-beta.
I have been reviewing a lot of MS Research projects; keep on track with what I can clearly see as the future of computing! Imaging Technology/IR is so important to this future. For the doubting lot... look at SideSight: why boink your screen when you can wiggle your finger?
Windows 7 is way better than Vista...
@steven your team is doing a great job, thanks for what will be a great product.
Interesting about the SLA between feature teams that share dependencies.
It would be good to see this process extend to all feature teams across product lines - this means that the changes to the Remote Desktop Client for example would have been less likely to break console access to an SBS 2003 box using RWW.
Only just catching up on this blog after a couple of weeks being unable to read it, but I just wanted to say that this post is single-handedly the most reassuring post of the entire lot. I like the sound of your management's new perspective. The fact that you ('you' in the plural sense) are not letting rushed, unfinished features into the OS just to boost the list of things Win7 can do is great. I'd much rather know that any feature that wants to get into the OS needs to be of sufficient quality first than know that there were a few more features overall.
I can also see the idea of triads being very helpful. If we had that kind of co-operation between the different teams at my workplace, I can already think of a handful of projects that would have run into far fewer problems.
Thanks Larry, for making me aware of why I should have more faith in Windows 7.
You shouldn't be so negative :)