Welcome to our blog dedicated to the engineering of Microsoft Windows 7
Aka: A developer’s view of the Windows 7 Engineering process
This post is by Larry Osterman. Larry is one of the most “experienced” developers on the Windows team and has been at Microsoft since the mid-1980s. There are only three other folks on the entire Windows team who have worked at Microsoft longer! Personally, I remember knowing about Larry when I started at Microsoft back in 1989. I remember he worked on “multimedia” (back when we used to host the Microsoft CD-ROM Conference) and he was one of those people who stood up and received a “5 Year” award from Bill Gates at the first company meeting I went to. That seemed amazing back then! For Windows 7, Larry is a developer on the Devices and Media team, which is where we work on audio, video, Bluetooth, and all sorts of cool features for connecting up devices to Windows.
Larry wrote this post without any prodding, and given his experience on so many Windows releases, these thoughts seemed really worthwhile to share with folks. This post goes into “how” we work as a team, which might prove pretty interesting for anyone who is part of a software team. While this is compared and contrasted with Vista, everyone knows that there is no perfect way to do things and this is just a little well-informed perspective.
So thank you Larry! --Steven
Thanks to Steven and Jon for letting me borrow their soapbox :-).
I wanted to discuss my experiences working on building Windows 7 (as opposed to the other technical stuff that you’ve read on this blog so far), and to contrast that with my experiences building Windows Vista. Please note that these are MY experiences. Others will have had different experiences; hopefully they will also share their stories here.
The experience of building Windows 7 is dramatically different from the experience of building Vista. The rough outlines of the product development process haven’t changed, but organizationally, the Windows 7 process is dramatically better.
For Windows Vista, I was a part of the WAVE (Windows Audio Video Excellence) group. The group was led by a general manager who was ultimately responsible for the deliverables. There was a test lead, a development lead and a program management lead who reported to the general manager. The process of building a feature roughly worked like this: the lead program managers decided (based on criteria which aren’t relevant to the post) which features would be built for Windows and which program managers would be responsible for which feature. The development leads decided which developers on the team would be responsible for the feature. The program manager for the feature wrote a functional specification (which described the feature and how it should work) in conjunction with development. Note that the testers weren’t necessarily involved in this part of the process. The developer(s) responsible for the feature wrote the design specification (which described how the feature was going to be implemented). The testers associated with the feature then wrote a test plan which described how to test the feature. The program manager or the developer also wrote the threat model for the feature.
The developer then went off to code the feature, the PM spent their time making sure that the feature was on track, and when the developer was done, the tester started writing test cases.
Once the feature was coded and checked into the source tree, it moved its way up to the “winmain” branch. Aside: The Windows source code has been arranged into “branches” – the root is “winmain”, which is the code base that would ultimately become Windows Vista. Each developer works in what are called “feature branches”, which merge changes into “aggregation branches”; the aggregation branches in turn merge into winmain.
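The branch arrangement in the aside can be pictured as a small tree. The sketch below is purely illustrative: only “winmain” is a name from the post, every other branch name is made up, and nothing here reflects the actual Windows source-control tooling.

```python
# Hypothetical model of the branch hierarchy described in the aside.
# Only "winmain" comes from the post; the other names are invented.
PARENT = {
    "feature_audio_ui": "aggregation_media",
    "feature_audio_core": "aggregation_media",
    "aggregation_media": "winmain",
    "winmain": None,  # the root has no parent
}


def merge_path(branch):
    """Return the chain of branches a change passes through on its way to the root."""
    path = [branch]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


print(" -> ".join(merge_path("feature_audio_ui")))
```

A change checked into a feature branch therefore reaches winmain only after passing through its aggregation branch.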
After the feature was coded, the testers tested, the developers fixed bugs and the program managers managed the program :-). As the product moved further along, it got harder and harder to get bug fixes checked into winmain (every bug fix carries with it a chance that the fix will introduce a regression, so the risk associated with each bug fix needs to be measured, and the tolerance for risk decreases incrementally). The team responsible for managing this process met in the “ship room” where they made decisions every single day about which changes went into the product and which ones were left out. There could be a huge amount of acrimony associated with that – oftentimes there were debates that lasted for hours as the various teams responsible for quality discussed the merits associated with a particular fix.
All in all, this wasn’t too different from the way that features have been developed at Microsoft for decades (and is basically consistent with what I was taught in my software engineering class back in college).
For Windows 7, management decided to alter the engineering structure of the Windows organization, especially in the WEX [Windows Experience] division where I work. Instead of a fairly hierarchical structure, Steven has 3 direct reports, each representing a particular discipline: Development, Test and Program Management. Under each of the discipline leads, there are 6 development/test/program management managers, one for each of the major groups in WEX. Those 2nd level managers in turn have half a dozen or so leads, each one with between 5 and 15 direct reports. This reporting structure has been somewhat controversial, but so far IMHO it’s been remarkably successful.
The other major change is the introduction of the concept of a “triad”. A “triad” is a collection of representatives from each of the disciplines – Dev, Test and PM. Essentially all work is now organized by triads. If there’s ever a need for a group to concentrate on a particular area, a triad is spun off to manage that process. That means that all three disciplines provide input into the process. Every level of management is represented by a triad – there’s a triad at the top of each of the major groups in WEX, each of the second level leads forms a triad, etc. So in my group (Devices and Media) there’s a triad at the top (known as DKCW for the initials of the various managers). Within the sound team (where I work), there’s another triad (known as SNN for the initials of the various leads). There are also triads for security, performance, appcompat, etc.
Similar to Windows Vista, the leads of all three disciplines get together and decide a set of features that go in each release. They then create “feature crews” to implement each of the features. Typically a feature crew consists of one or two developers, a program manager and one or two testers.
This is where one of the big differences between Vista and Windows 7 occurs: In Windows 7, the feature crew is responsible for the entire feature. The crew together works on the design, the program manager(s) then write down the functional specification, the developer(s) write the design specification and the tester(s) write the test specification. The feature crew collaborates on the threat model and other random documents. Unlike Windows Vista where senior management continually gave “input” to the feature crew, for Windows 7, management has pretty much kept their hands off of the development process. When the feature crew decides that it is ready to start coding (and has signed off on the 3 main documents), the feature crew meets with the second level triad (in my case with DKCW) to sanity check the feature – this part of the process is critical because the second level triad gets an opportunity to provide detailed feedback to the feature crew about the viability of their plans.
And then the crew finally gets to start coding. Sort-of. There are still additional reviews that need to be done before the crew can be considered “ready”. For instance, the feature’s threat model needs to be reviewed by one of the members of the security triad. There are other parts of the document that need to be reviewed by other triads as well.
A feature is not permitted to be checked into the winmain branch until it is complete. And I do mean complete: the feature has to be capable of being shipped before it hits winmain – the UI has to be finished, the feature has to be fully functional, etc. In addition, when a feature team takes a dependency on another Windows 7 feature, the feature teams for the two features MUST sign a service level agreement to ensure that each team knows about the inter-dependencies. This SLA is especially critical because it ensures that teams know about their dependents – that way when they change the design or have to cut parts of the feature, the dependent teams aren’t surprised (they may be disappointed but they’re not surprised). It also helps to ensure tighter integration between the components – because one team knows the other team, they can ensure that both teams are more closely in alignment.
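One way to picture what the SLA buys a team is a simple lookup from a feature to the features that depend on it. This is only a sketch under invented assumptions: the feature names and the `DEPENDS_ON` table are hypothetical and do not describe any real Windows 7 tooling.

```python
# Hypothetical record of declared dependencies between features.
# All feature names are invented for illustration.
DEPENDS_ON = {
    "volume_mixer_ui": ["audio_engine"],
    "device_pairing": ["audio_engine", "bluetooth_stack"],
}


def dependents_of(feature):
    """Return the features that have declared a dependency on `feature`."""
    return sorted(f for f, deps in DEPENDS_ON.items() if feature in deps)


# If the (hypothetical) audio_engine crew changes its design or cuts scope,
# these are the crews that should hear about it first.
print(dependents_of("audio_engine"))
```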
Back in the Vista day, it was not uncommon for feature development to be spread over multiple milestones – stuff was checked into the tree that really didn’t work completely. During Win7, the feature crews were forced to produce coherent features that were functionally complete – we were told to operate under the assumption that each milestone was the last milestone in the product and not schedule work to be done later on. That meant that teams had to focus on ensuring that their features could actually be implemented within the milestone as opposed to pushing them out.
For the nuts and bolts: the Windows 7 development process is scheduled over several 3-month-long milestones. Each milestone allowed for 6 weeks of development and 6 weeks of integration – essentially time to fine-tune the feature and ensure that most of the interoperability problems were shaken out.
Ok, that’s enough background (it’s bad when over half a post on Windows 7 is actually about Windows Vista, but a baseline needed to be established). As I said at the beginning, this post is intended to describe my experiences as a developer on Windows 7. During Windows 7, I worked on three separate feature crews. The first crew delivered two features, the second crew delivered about 8 different features, all relatively minor, and the third crew delivered three major features and a couple of minor features. I also worked as the development part of the WEX Devices and Media security team (which is where my series of posts on Threat Modeling came from – I wrote them while I was working with the members of D&M on threat modeling). And I worked as the development part of an end-to-end scenario triad that was charged with ensuring that scenarios that the Sound team defined at the start of the Windows 7 planning process were actually delivered in a coherent and discoverable way.
In addition, because the test team was brought into the planning process very early on, the test team provided valuable input and we were able to ensure that we built features that were not only code complete but also test complete by the end of the milestone (something that didn’t always happen in Vista). And it ensured that the features we built were actually testable (it sounds stupid, I know, but you’d be surprised at how hard it can be to test some features).
As a concrete example, we realized during the planning process that some aspect of one of the features we were working on in M2 couldn’t be completed during the milestone. So before the milestone was completed, we ripped the feature out (to be more accurate, we changed the system so that the new code was no longer being built as a part of the product). During the next milestone, after the test team had finished writing their tests, we re-enabled the feature. But we remained true to the design philosophy – at the end of the milestone everything that was checked into the “main” branch was complete – it was code AND test complete, so that even if we had to ship Windows 7 without M3 there was no test work that was not complete. This is a massive change from Vista – in Vista, since the code was complete we’d have simply checked in the code and let the test team deal with the fallout. By integrating the test teams into the planning process at the beginning we were able to ensure that we never put the test organization into that bind. This in turn helped to ensure that the development process never spiraled out of control.
Please note that features can and do stretch across multiple milestones. In fact one of the features on the Sound team is scheduled to be delivered across three milestones – the feature crews involved in that feature carefully scheduled the work to ensure that they would have something worth delivering whenever Windows 7 development was complete.
Each of the feature crews I’ve worked on so far has had dramatically different focuses – some of the features I worked on were focused on core audio infrastructure, some were focused almost entirely on UX (user experience) changes, and some features involved much higher level components. Because each of the milestones was separate, I was able to work on a series of dramatically different pieces of the system, something I’ve really never had a chance to do before.
In Windows 7, senior management has been extremely supportive of the various development teams that have had to make the hard decisions to scale back features that were not going to be able to make the quality bar associated with a Windows release – and there absolutely are major features that have gone all the way through planning only to discover that there was too much work associated with the feature to complete it in the time available. In Vista it would have been much harder to convince senior management to abandon features. In Win7 senior management has stood behind the feature teams when they’ve had to make the tough decisions. One of the messages that management has consistently driven home to the teams is “cutting is shipping”, and they’re right. If a feature isn’t coming together, it’s usually far better to decide NOT to deliver a particular feature than to have that feature jeopardize the ability to ship the whole system. In a typical Windows release there are thousands of features and it would be a real shame if one or two of those features ended up delaying the entire system because they really weren’t ready.
The process of building 7 has also been dramatically more transparent – even sitting at the bottom of the stack, I feel that I’ve got a good idea about how decisions are being made. And that increased transparency in turn means that as an individual contributor I’m able to make better decisions about scheduling. This transparency is actually a direct fallout of management’s decision to let the various feature teams make their own decisions – by letting the feature teams deeper inside the planning process, the teams naturally make better decisions.
Of course that transparency works both ways. Not only were teams allowed to see more about what was happening in the planning process, but because management introduced standardized reporting mechanisms across the product, the leads at every level of the hierarchy were able to track progress against plan at a level that we’ve never had before. From an individual developer’s standpoint, the overhead wasn’t too onerous – basically once a week, you were asked to update your progress against plan on each of your work items. That status was then rolled up into a series of spreadsheets and web pages that allowed each manager to track all the teams’ progress against plan. This allowed management to easily and quickly identify which teams were having issues and take appropriate action to ensure that the schedules were met (either by simplifying designs, assigning more developers, or whatever).
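The weekly roll-up described above amounts to aggregating per-item status into a per-team view. The sketch below is a toy illustration only: the team names, work items, numbers, and the 50% threshold are all invented, and the real reports were spreadsheets and web pages rather than a script like this.

```python
from collections import defaultdict

# Hypothetical weekly status entries:
# (team, work item, planned days, days completed so far).
work_items = [
    ("sound", "item A", 10, 10),
    ("sound", "item B", 8, 4),
    ("video", "item C", 12, 3),
]


def roll_up(items):
    """Aggregate per-item progress into a per-team fraction of plan completed."""
    planned = defaultdict(int)
    done = defaultdict(int)
    for team, _item, plan_days, done_days in items:
        planned[team] += plan_days
        done[team] += done_days
    return {team: done[team] / planned[team] for team in planned}


def teams_at_risk(items, threshold=0.5):
    """List teams whose completed fraction is below the (assumed) threshold."""
    return sorted(t for t, frac in roll_up(items).items() if frac < threshold)


print(teams_at_risk(work_items))
```

With these made-up numbers, only the team that has completed a quarter of its plan is flagged, which is the kind of signal a manager could act on by simplifying a design or adding developers.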
In general, it’s been a total blast building 7. We’ve built some truly awesome features into the operating system and we’ve managed to keep the system remarkably stable during that entire process.
Yes guys, make Windows 7 the best OS out there. Let it be a Mac OS X killer!! :)
I have heard that Windows 7 will only be available to some beta testers. Why? It should be open to all the users who would want to test it. That would help Microsoft to fix more bugs and security holes and result in an initial release of a more stable operating system. As usual the initial release will be full of annoying bugs and many just won't upgrade to Windows 7 until its SP1 is released. SO THE BETA VERSION OF WINDOWS 7 SHOULD BE AVAILABLE TO ALL THE USERS!!
>>"cutting is shipping"
THEY'VE TAKEN THE SCISSORS TO W7 ALREADY!!!!!111!1
All kidding aside, it looks like management of the operating system has gotten much more efficient this time around. Windows, prior to W7, sounds like a mess from what I gather in that post! It's good to hear things have gotten better... and that you're having so much fun, Mr. Osterman. :)
Thanks Larry - a great post and pick-me-up for a Windows user. As born-again XP user, I was really depressed, thinking that my experience with Vista signalled the beginning of the end of my use of Windows, that XP was the best I'd ever see from MS. You've helped me understand some of the reasons why Vista was so poor out of the gate. From what you write it seems MS will ship Win 7 thoroughly tested and more robust than Vista.
My only plea AS AN XP USER now to MS management and programmers and all is that you make the move from XP to Windows 7 as pain-free as possible, or even (dare I say) a joy!
Good to know the process is more transparent internally. Even though the difficulty of finishing WPF early was a big problem, the development process also needed to change.
I wonder what Win7 will be like. Could it be filled with tons of WPF based goody apps?
With earlier Windows versions I got the impression that at MS the right hand does not know what the left does. And the head does not know anything. There were so many inconsistencies and duplicates. Let's hope that this time "transparency" made a difference.
Do you know what will make Windows Seven 7.0 and not 6.2?
Nice post Mr. Larry
Many, many thanks!!
Thanks for the very interesting read.
You wrote that input for feature no longer comes from senior management but from the feature crews themselves (=developers, testers). Isn't there a danger that devs will focus on new features and bugfixes and don't invest time on "boring" things like consistency and fit & finish?
But considering all things I have to say: Wow! You changed the whole process completely since the first years of Longhorn. Pretty impressive stuff! I wish I could work on Windows 7 too.
Very interesting post indeed...
But reading and realizing it makes my hair rise.
I dunno, you refer to developers like they are robots or monkeys. Maybe I feel that because you don't describe any particular change specifically, or this "6 weeks of development and 6 weeks of integration" really frightens me.
Who's the creator there? Where is the generator of the ideas? How deeply can a feature be cut if time runs out? Where are the GUIDELINES?!
What, you don't have time and leave only a dummy UI window in some temporary library just because you can't fulfill specifications and test everything? That smells like Vista, full of unfinished taste.
Do you have to fall back to the previous milestone and 'polish' it for 6 weeks? Or is the team broken apart?
That bad taste is especially because you don't get specific... sorry.
That was very interesting... great to hear in more detail the development process...
You said it's been a blast building Windows 7 and I imagine it would be, especially compared to the complications around Longhorn/Vista whatever...
However surely it's also been a little stressful as well? Considering the fuss over Vista (justified initially, totally unjustified now) there must be a fair amount of pressure to improve Windows' image and remove the stigma Apple has given Windows etc?
As I said earlier. If Windows 7 is gonna be at least half as good as this Engineering blog, it will be the best OS ever!
Keep it up, guys!
"ASUS readying touchscreen Eee PC and laptops for 2009 Windows 7 launch?"
Does this mean that Win7 will have much lower system requirements than Vista?
What can you tell us about system requirements?
I'm a user not a programmer, as such I have no right to criticize the process.
Having read your post I'm pleased that a shift in the management process puts the emphasis on good code. We can all dream up the killer feature, but if you can't get it to work then what is the point? If management were to focus on the unattainable feature at the expense of good code, all it would achieve is delays and a poor customer experience.
From a user's point of view, Windows has historically always been buggy, with most geek gurus saying "wait for SP1" whenever talking about new versions of Windows. If this new way of working reduces that then great. If it keeps the people working on the project happy then that's also good. We hear some odd things from Redmond about what it is like to work for MS. Having happy workers also means better code...
The feature cutting process is always iffy. Some guy usually "has it in" for some other technology, feature, or idea because he thought of it first or someone else gets to work on it besides him. If that jealous person is in a position of power, they will often axe at will.
Perhaps this has become easier in Windows 7? Hopefully the backing from management is on the developers' and users' side. They both want the best thing in the end. It's the management that sometimes only thinks about the green and shipping.
The end result should be what was originally envisioned IMO and that means getting some floating teams involved to provide additional push on the fronts that need a boost.
Can't wait for the new Win7 release for public :P
Wiser behaviors of the new core, a smaller OS, new program rules for program architectures, better support for old and new drivers, and speed like WinXP when it's at least running on lowered settings are some of many dreams :(
Here are some suggestions and problems -->>
Make programs work, and give out popups and settings that actually can't be used by guest or non-administrative accounts!
During installation, system files should not fragment the pagefile and MFT placements on disk, and should be optimized on disk from the very first start for optimal experience! (The PerfectDisk 2008 website says it uses the info that Microsoft gave for optimal speed with MFT and pagefile with correct disk placement.)
System restore points should be excluded from defragmenter programs and be placed on the innermost part of the disk so they would not cause confusion or unnecessary movements during defragmentation.
These are just problems and I don't know what things are possible to make or change :(