Engineering Windows 7

Welcome to our blog dedicated to the engineering of Microsoft Windows 7

Engineering 7: A view from the bottom

Aka: A developer’s view of the Windows 7 Engineering process

This post is by Larry Osterman.  Larry is one of the most “experienced” developers on the Windows team and has been at Microsoft since the mid 1980’s.  There are only three other folks who have worked at Microsoft longer on the entire Windows team!  Personally, I remember knowing about Larry when I started at Microsoft back in 1989—I remember he worked on “multimedia” (back when we used to host the Microsoft CD-ROM Conference) and he was one of those people that stood up and received a “5 Year” award from Bill Gates at the first company meeting I went to—that seemed amazing back then!  For Windows 7, Larry is a developer on the Devices and Media team which is where we work on audio, video, bluetooth, and all sorts of cool features for connecting up devices to Windows. 

Larry wrote this post without any prodding and given his experience on so many Windows releases these thoughts seemed really worthwhile in terms of sharing with folks.  This post goes into “how” we work as a team, which for anyone part of a software team might prove pretty interesting.  While this is compared and contrasted with Vista, everyone knows that there is no perfect way to do things and this is just a little well-informed perspective.

So thank you Larry!  --Steven

Thanks to Steven and Jon for letting me borrow their soapbox :-).

I wanted to discuss my experiences working on building Windows 7 (as opposed to the other technical stuff that you’ve read on this blog so far), and to contrast that with my experiences building Windows Vista. Please note that these are MY experiences. Others will have had different experiences; hopefully they will also share their stories here.

The experience of building Windows 7 is dramatically different from the experience of building Vista. The rough outlines of the product development process haven’t changed, but organizationally, the Windows 7 process is dramatically better.

For Windows Vista, I was a part of the WAVE (Windows Audio Video Excellence) group. The group was led by a general manager who was ultimately responsible for the deliverables. There was a test lead, a development lead and a program management lead who reported to the general manager. The process of building a feature roughly worked like this: the lead program managers decided (based on criteria which aren’t relevant to the post) which features would be built for Windows and which program managers would be responsible for which feature. The development leads decided which developers on the team would be responsible for the feature. The program manager for the feature wrote a functional specification (which described the feature and how it should work) in conjunction with development. Note that the testers weren’t necessarily involved in this part of the process. The developer(s) responsible for the feature wrote the design specification (which described how the feature was going to be implemented). The testers associated with the feature then wrote a test plan which described how to test the feature. The program manager or the developer also wrote the threat model for the feature.

The developer then went off to code the feature, the PM spent their time making sure that the feature was on track, and when the developer was done, the tester started writing test cases.

Once the feature was coded and checked into the source tree, it made its way up to the “winmain” branch. Aside: The Windows source code has been arranged into “branches” – the root is “winmain”, which is the code base that would ultimately become Windows Vista. Each developer works in what are called “feature branches”, which merge changes into “aggregation branches”; the aggregation branches in turn merge into winmain.
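To make the branch arrangement concrete, here is a minimal sketch in Python of how a change travels from a feature branch up to the root. Only the name “winmain” comes from the post; the feature and aggregation branch names are hypothetical, and the real Windows source tree is certainly more involved than a three-level lookup table.

```python
# Hypothetical branch layout: only "winmain" is named in the post.
# Each branch maps to the branch it merges into; the root maps to None.
PARENT = {
    "feature/audio-ui": "agg/devices-media",
    "feature/audio-core": "agg/devices-media",
    "agg/devices-media": "winmain",
    "winmain": None,
}


def path_to_root(branch, parent=PARENT):
    """Return the chain of branches a change passes through on its way to the root."""
    path = [branch]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path
```

Calling `path_to_root("feature/audio-ui")` walks feature branch, then aggregation branch, then winmain, which is the route described above.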

After the feature was coded, the testers tested, the developers fixed bugs and the program managers managed the program :-). As the product moved further along, it got harder and harder to get bug fixes checked into winmain (every bug fix carries with it a chance that the fix will introduce a regression, so the risk associated with each bug fix needs to be measured and the tolerance for risk decreases incrementally). The team responsible for managing this process met in the “ship room” where they made decisions every single day about which changes went into the product and which ones were left out. There could be a huge amount of acrimony associated with that – often times there were debates that lasted for hours as the various teams responsible for quality discussed the merits associated with a particular fix.

All-in-all, this wasn’t too different from the way that features have been developed at Microsoft for decades (and is basically consistent with what I was taught in my software engineering class back in college).

For Windows 7, management decided to alter the engineering structure of the Windows organization, especially in the WEX [Windows Experience] division where I work. Instead of being fairly hierarchical, Steven has 3 direct reports, each representing a particular discipline: Development, Test and Program Management. Under each of the discipline leads, there are 6 development/test/program management managers, one for each of the major groups in WEX. Those 2nd level managers in turn have a half a dozen or so leads, each one with between 5 and 15 direct reports. This reporting structure has been somewhat controversial, but so far IMHO it’s been remarkably successful.

The other major change is the introduction of the concept of a “triad”. A “triad” is a collection of representatives from each of the disciplines – Dev, Test and PM. Essentially all work is now organized by triads. If there’s ever a need for a group to concentrate on a particular area, a triad is spun off to manage that process. That means that all three disciplines provide input into the process. Every level of management is represented by a triad – there’s a triad at the top of each of the major groups in WEX, each of the second level leads forms a triad, etc. So in my group (Devices and Media) there’s a triad at the top (known as DKCW for the initials of the various managers). Within the sound team (where I work), there’s another triad (known as SNN for the initials of the various leads). There are also triads for security, performance, appcompat, etc.

Similar to Windows Vista, the leads of all three disciplines get together and decide a set of features that go in each release. They then create “feature crews” to implement each of the features. Typically a feature crew consists of one or two developers, a program manager and one or two testers.

This is where one of the big differences between Vista and Windows 7 occurs: In Windows 7, the feature crew is responsible for the entire feature. The crew together works on the design, the program manager(s) then writes down the functional specification, the developer(s) write the design specification and the tester(s) write the test specification. The feature crew collaborates together on the threat model and other random documents. Unlike Windows Vista where senior management continually gave “input” to the feature crew, for Windows 7, management has pretty much kept their hands off of the development process. When the feature crew decided that it was ready to start coding (and had signed off on the 3 main documents), the feature crew met with the second level triad (in my case with DKCW) to sanity check the feature – this part of the process is critical because the second level triad gets an opportunity to provide detailed feedback to the feature crew about the viability of their plans.

And then the crew finally gets to start coding. Sort-of. There are still additional reviews that need to be done before the crew can be considered “ready”. For instance, the feature’s threat model needs to be reviewed by one of the members of the security triad. There are other parts of the document that need to be reviewed by other triads as well.

A feature is not permitted to be checked into the winmain branch until it is complete. And I do mean complete: the feature has to be capable of being shipped before it hits winmain – the UI has to be finished, the feature has to be fully functional, etc. In addition, when a feature team takes a dependency on another Windows 7 feature, the feature teams for the two features MUST sign a service level agreement to ensure that each team knows about the inter-dependencies. This SLA is especially critical because it ensures that teams know about their dependents – that way when they change the design or have to cut parts of the feature, the dependent teams aren’t surprised (they may be disappointed but they’re not surprised). It also helps to ensure tighter integration between the components – because one team knows the other team, they can ensure that both teams are more closely in alignment.
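The point of the SLA is that a team can always answer the question “who depends on us?”. A rough sketch of that bookkeeping, in Python, might look like the following. The class, its methods and the team names are all hypothetical illustrations; the post does not describe how the agreements were actually recorded.

```python
class SlaRegistry:
    """Hypothetical record of which feature teams depend on which others."""

    def __init__(self):
        # provider team -> set of consumer teams that signed an SLA with it
        self._dependents = {}

    def sign(self, consumer, provider):
        """Record that `consumer` takes a dependency on `provider`."""
        self._dependents.setdefault(provider, set()).add(consumer)

    def teams_to_notify(self, provider):
        """Teams that must hear about a design change or cut in `provider`."""
        return sorted(self._dependents.get(provider, set()))
```

With something like this in place, a team about to cut part of a feature can look up `teams_to_notify(...)` first, so its dependents may be disappointed but are not surprised.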

Back in the Vista day, it was not uncommon for feature development to be spread over multiple milestones – stuff was checked into the tree that really didn’t work completely. During Win7, the feature crews were forced to produce coherent features that were functionally complete – we were told to operate under the assumption that each milestone was the last milestone in the product and not schedule work to be done later on. That meant that teams had to focus on ensuring that their features could actually be implemented within the milestone as opposed to pushing them out.

As for the nuts and bolts, the Windows 7 development process is scheduled over several 3-month long milestones. Each milestone allowed for 6 weeks of development and 6 weeks of integration – essentially time to fine-tune the feature and ensure that most of the interoperability problems were shaken out.

Ok, that’s enough background (it’s bad when over half a post on Windows 7 is actually about Windows Vista, but a baseline needed to be established). As I said at the beginning, this post is intended to describe my experiences as a developer on Windows 7. During Windows 7, I worked on three separate feature crews. The first crew delivered two features, the second crew delivered about 8 different features all relatively minor and the third crew delivered three major features and a couple of minor features. I also worked as the development part of the WEX Devices and Media security team (which is where my series of posts on Threat Modeling came from – I wrote them while I was working with the members of D&M on threat modeling). And I worked as the development part of an end-to-end scenario triad that was charged with ensuring that scenarios that the Sound team defined at the start of the Windows 7 planning process were actually delivered in a coherent and discoverable way.

In addition, because the test team was brought into the planning process very early on, the test team provided valuable input and we were able to ensure that we built features that were not only code complete but also test complete by the end of the milestone (something that didn’t always happen in Vista). And it ensured that the features we built were actually testable (it sounds stupid I know, but you’d be surprised at how hard it can be to test some features). As a concrete example, we realized during the planning process that some aspect of one of the features we were working on in M2 couldn’t be completed during the milestone. So before the milestone was completed, we ripped the feature out (to be more accurate, we changed the system so that the new code was no longer being built as a part of the product). During the next milestone, after the test team had finished writing their tests, we re-enabled the feature. But we remained true to the design philosophy – at the end of the milestone everything that was checked into the “main” branch was complete – it was code AND test complete, so that even if we had to ship Windows 7 without M3 there was no test work that was not complete. This is a massive change from Vista – in Vista, since the code was complete we’d have simply checked in the code and let the test team deal with the fallout. By integrating the test teams into the planning process at the beginning we were able to ensure that we never put the test organization into that bind. This in turn helped to ensure that the development process never spiraled out of control. Please note that features can and do stretch across multiple milestones. In fact one of the features on the Sound team is scheduled to be delivered across three milestones – the feature crews involved in that feature carefully scheduled the work to ensure that they would have something worth delivering whenever Windows 7 development was complete.

Each of the feature crews I’ve worked on so far has had dramatically different focuses – some of the features I worked on were focused on core audio infrastructure, some were focused almost entirely on UX (user experience) changes, and some features involved much higher level components. Because each of the milestones was separate, I was able to work on a series of dramatically different pieces of the system, something I’ve really never had a chance to do before.

In Windows 7, senior management has been extremely supportive of the various development teams that have had to make the hard decisions to scale back features that were not going to be able to make the quality bar associated with a Windows release – and there absolutely are major features that have gone all the way through planning only to discover that there was too much work associated with the feature to complete it in the time available. In Vista it would have been much harder to convince senior management to abandon features. In Win7 senior management has stood behind the feature teams when they’ve had to make the tough decisions. One of the messages that management has consistently driven home to the teams is “cutting is shipping”, and they’re right. If a feature isn’t coming together, it’s usually far better to decide NOT to deliver a particular feature than to have that feature jeopardize the ability to ship the whole system. In a typical Windows release there are thousands of features and it would be a real shame if one or two of those features ended up delaying the entire system because they really weren’t ready.

The process of building 7 has also been dramatically more transparent – even sitting at the bottom of the stack, I feel that I’ve got a good idea about how decisions are being made. And that increased transparency in turn means that as an individual contributor I’m able to make better decisions about scheduling. This transparency is actually a direct fallout of management’s decision to let the various feature teams make their own decisions – by letting the feature teams deeper inside the planning process, the teams naturally make better decisions.

Of course that transparency works both ways. Not only were teams allowed to see more about what was happening in the planning process, but because management introduced standardized reporting mechanisms across the product, the leads at every level of the hierarchy were able to track progress against plan at a level that we’ve never had before. From an individual developer’s standpoint, the overhead wasn’t too onerous – basically once a week, you were asked to update your progress against plan on each of your work items. That status was then rolled up into a series of spreadsheets and web pages that allowed each manager to track all the teams’ progress against plan. This allowed management to easily and quickly identify which teams were having issues and take appropriate action to ensure that the schedules were met (either by simplifying designs, assigning more developers, or whatever).
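As an illustration of the weekly roll-up described above, here is a small Python sketch. The record layout (team, planned days, completed days) and the threshold are assumptions made for the example; the post only says that per-work-item status was rolled up into spreadsheets and web pages.

```python
from collections import defaultdict


def roll_up(work_items):
    """Aggregate (team, planned_days, done_days) records into a per-team
    fraction of planned work completed. The record shape is hypothetical."""
    totals = defaultdict(lambda: [0, 0])
    for team, planned, done in work_items:
        totals[team][0] += planned
        totals[team][1] += done
    # Skip teams with no planned work to avoid dividing by zero.
    return {team: done / planned
            for team, (planned, done) in totals.items() if planned}


def behind_plan(report, threshold):
    """Teams whose completed fraction falls below the given threshold."""
    return sorted(team for team, frac in report.items() if frac < threshold)
```

A manager-level view is then just `behind_plan(roll_up(items), threshold)` for whatever threshold the team considers worrying.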

In general, it’s been a total blast building 7. We’ve built some truly awesome features into the operating system and we’ve managed to keep the system remarkably stable during that entire process.

--Larry Osterman

  • it sounds to me that you are finally cutting out the right code from win7. with vista all that ever seemed to get cut was the stuff we actually wanted, whereas you left in all the bloat. this new way of doing things appears much more logical

  • I'm not sure but how do the User Interface- and User Interaction Designers fit into this triad grouping?

  • Windows 7 is a make or break OS for Microsoft. If Microsoft does not deliver big time, then it is the end of Microsoft. This is the last opportunity for Microsoft to prove to the world that it is not a has-been company. If Microsoft does not deliver big with Windows 7, then I will shift to Mac OS.

  • Steven,

    All great stuff!

    I'd really like to see you guys post a little more info on the tools you use internally to manage all of this as well.  I know historically Microsoft has been a big 'eat our own dog food' kind of company, so are you managing all of this with Sharepoint?  Project Server?  Analytics?  What?

    There are more tools in a PM's toolbox than just one, and I'm curious as to how you've used technology to help those groups interact with one another and keep track of all those interface dependencies.

    Thanks and keep going!

  • Thanks for this very informative "how the guts of MS Development has worked and is working now" post!  I'm glad the re-organization is working out well, and that this guarantees a LEAN MEAN OS this time around, right!?  I have three questions I hope you can answer:

    1. I've heard that parts (all?) of the GDI interface weren't hardware accelerated in Vista. Will GDI (gdiplus?) be hardware accelerated in Windows 7? Or is WPF going to be the new standard?

    2. Will there ever be a DirectX10 for Windows XP?

    3. Except for the Direct3D interface, are all the other DirectX interfaces completely dead now?  I keep reading that they are deprecated, but are they truly dead, never to be updated again?  I personally hope not!

  • @Steven,

    Friday, 17 October 2008

    Steve Ballmer promises Windows 7 will be better than Vista: "Windows Vista is good, Windows 7 is Windows Vista with clean-up in user interface [and] improvements in performance,"

    Windows Vista according to some estimations has got 3 x more entries in Registry. What like what, but it can make even the most powerful PC slow. Add DRM, Indexer, UAC and you have "wow" effect.

    I really hope, that Microsoft will think about it and I really hope, that recession will make, that people will think about wasting hardware resources by wrong architecture.

  • 3 x more entries in Registry than XP...

  • Very nice reading, actually every blog was pleasant to read, good work there.

    I know that there people are working who u will never hear here, but for u, keep up that good work!

    Although, would be nice if there's a credit page in W7 :)

    Making an OS that's more costly than an average Hollywood movie and no credits shown? U should be more proud of your delivered work!

  • "Windows 7 is Windows Vista ..."

    This is very disappointing news. I was hoping for fundamental changes in kernel, registry, system requirements, file system etc. Windows 7 will be the same old stuff with more lipstick. :(.

    As for home use this will put Windows from 80% below 50% in 5 years.

  • "Windows 7 is Windows Vista ..."

    Agree, it did sound like a change of lipstick. Sry guys, but that's how Mr. Ballmer made it sound.

    When reading here about the effort u guys are making and then the comment of Ballmer, it sounds like a counterproductive comment from Ballmer to me.

    The reason for choosing the name W7 is of course very bold to me. No more fancy names like XP, Vista, Longhorn. But again a plain number 7. To me it sounds like a distance from the existing product name is meant to be made, because this is so much more than another lipstick.....  i hope.

  • Clarify one thing , Vista is (for now) the Best OS. 360 degrees

    People should learn to speak when the final product in our hands ,we can understand from where the Windows 7

    but none of us can imagine where arrives.

    Sinofsky was clear regarding the compatibility, Imagine having to face new problems  Driver, Software, Hardware etc.

    Let lose any comment PREFUD, Remember only that there are  2500 engineers to work for us.

    9 day for PDC

    @Windows TEAM ... Unleash Hell

    GO!

  • Well, I guess Ballmer has let the cat out of the bag officially now. "Windows 7 is Windows Vista" and the rest of his comments show that he doesn't understand how his company works and what its core processes are. He's a sales and marketing guy - he just wants to sell code and get $$$. He doesn't understand the development process, which is THE key business process at MS; he also doesn't understand why his customers do or don't use his product, which therefore means he's simply a sales guy. I think it would be very instructive for MS to review how Apple make their products: you can tell, they've thought everything through from an integration perspective, and what you have in your hands when it's done shows that - it's simple, it all fits together. I think it's incredible (as an engineer) that the old MS way was to develop code and then toss it over the wall to the testers. However, the current model also has its issues, and primarily it's about having everyone on a single team and getting into 'group-think'. I'm going to hazard a guess that a developer likes the current structure because he is, in fact, the center of attention. The testers are now part of 'his team', and they're probably going to knuckle under when he tells them what to worry about or not worry about. Testers need to be the voice of the customer. There are still numerous issues about how they decide what features should be in the product or not, and then how the overall design comes together as a coherent whole. You knew Vista was in real trouble when they started dropping stuff that should have been foundational like WinFS. At that point, it just becomes a scramble to get something together so you've got something to ship and get revenue against all the hours you spent on it. As they talk about features here, they are very small elements within a wider framework: if the framework is badly conceived much time can be wasted polishing the little elements that then aren't going to get used.

    There's clearly a hierarchy here of how less important stuff is layered on top of more important stuff, without which the less important stuff (which may be in front of an actual user) won't be delivered. At least there's mention of a 'service agreement' between teams and a clear definition of the interface, beneath which the actual implementation should be allowed to change. This is a core engineering element of 'modularity'. I'm suspecting there's still a lot of legacy stuff in Windows which will carry over and which doesn't follow modular rules, and so that runs the risk of creating massive inter-dependencies at a lower level that can unravel the entire edifice. Until that's addressed, the developers can't make the assumption that what they've done won't be broken by some other change. It's clear MS have issues deciding whether to implement a fix because it requires a need for regression testing, which shows the level of inter-dependence in the code. A simple bug fix shouldn't need this; there's actually clearly a change in behavior/specification so the problem is more profound. It's not the fact that the code doesn't work as designed, it's that the design wasn't complete or there are side-effects of the implementation that weren't appreciated. Vista had huge issues with 3rd party graphics drivers at launch because they kept changing the underlying driver model on their partners - there's nothing in this blog entry that suggests that still isn't the case. Given the fact that MS has stated "Windows 7 is Windows Vista" there are fundamental building blocks that are still exposed, the whole thing is built on a shaky foundation.

    Add to this that entire features are being 'retired' (no more Windows Mail, you've got to go to the 'cloud' of Windows Live to get this functionality - a dreadful decision because people want to work disconnected, like on a plane) or not touched at all from version to version (Notepad, Paint, Movie Maker) shows a rampant disregard for a good design/feature selection process for a coherent product. We get a bunch of stuff that's half-baked and then left to wither, but meanwhile, it's a drag on everything else. MS has virtually unlimited resources and unlimited cash, so it's a management issue of how to harness all that to deliver significantly new products on a regular basis. With Windows, you're kind of getting the impression MS thinks it's largely good as-is and just needs a bit of spit and polish. Ballmer then says the goal post-7 is to get closer to future processor capabilities from AMD and Intel; meanwhile, the existing product still isn't using the capabilities in current generation hardware. It's only just getting around to 64-bit versions regularly shipping on hardware that's been 64-bit capable for about 4 years, and there's still no sign of a 64-bit Office suite. Too slow, too ponderous. And the reason it's too slow is because the development process fundamentally doesn't support being fast and innovative, and neither do their sales and marketing practices. Ballmer again: buy and implement Windows Vista today even though 7 is only 12 months away... and that's still 2.5 yrs between releases and they wasted much effort in fixing a botched release. And everyone is conditioned to wait for SP1 (another year) before things are really working as expected.

  • I didn't want to make FUDs here, but when I hear, that "Windows 7" (which should be revolution), will be Windows 6.1 and more and more people are speaking about it as "Vista" with some improvements only, I have a little bad feelings...

    All current Windows systems are based on NT and have some disadvantages (like shared Registry). After PDC we will see some new facts. But I will read about thousands of new API only, it will be important sign for me, that I should start thinking more about other platforms than Windows for ALL my tasks.

    The truth is too, that people will think much more now, before they will buy new hardware. Wrong architecture will mean, that resources are wasted. The more people will notify it, the more will think about Linux, MacOS or even about leaving x86 for some other solutions...

  • in previous post I wanted to say of course:

    But, when I will read about thousands of new API only, it will be important sign for me, that I should start thinking more about other platforms than Windows for ALL my tasks.

  • @marcinw

    I see more and more and more person declare Windows Server 2008 an operating system phenomenal!

    Windows Server 2008 is Vista Sp1 +different Service  STOP.

    Now think about this system already perfectly stable , 2500 engineers continue to work for 3 years  direction By Steven Sinofsky!

    You just have faith and do not add anything in speeches futile
