
How Microsoft Uses VSTS 2008

In the last couple of weeks, Steven Borg, Jeff Atwood and I did some one-day seminars on VSTS in California and Washington.  I keynoted the day and they did the heavy lifting of detailed demos.  In my talk, I demo’d some examples of how we inside Microsoft had dogfooded VSTS 2008 to manage the Orcas release.  There was a lot of interest in this, audible in the questions.  It’s also a hot topic every time customers come to campus and look for information on VSTS.  Obviously, I can’t provide access to our intranet, but I posted a video from a webcam that walks through a few of the sites and a PDF with the slides from the talk, including some screenshots.

Thanks to Laura Smith, Raj Selvaraj and Master Chief (shown below) for organizing the events!

 

Credo

Last week I went on a Leadership Challenge Workshop.  It was a combined review of 360˚ feedback from customers, peers and managers, and exercises based on the work of Kouzes & Posner.   The feedback was largely positive, although I learned that I haven’t been communicating enough, something that I’ve heard from blog readers too, and I agree.  So I have resolved to resume blogging regularly. 

The best of the exercises was an assignment to write a personal credo.  It starts with a thought experiment that you will be on a six-month sabbatical, incommunicado, and need to provide operating principles to guide colleagues’ decisions while you’re gone.  Here’s what I wrote:

We win by delivering the best, most obvious, most approachable customer value.  We need to focus first on growing our market, by making the economic value of Application Lifecycle Management – our product category – obvious to all customers.  Only second should we worry about our explicit competitors. 

Don’t confuse competitors’ claims with customers’ needs.  Measure customer needs in customers’ own words and make sure they see our vision and participate in the choice of what we do.

Our reach should exceed our grasp.  We need to envision more than we can do, and we need to carefully choose which parts to do. 

Work iteratively.  We won’t ever know everything.  We’ll do what we’re most certain about first.  At the same time, we’ll keep a stack rank of priorities, and a list of hypotheses, and regularly revisit, revise and rerank them. 

Nothing succeeds like success.  Successful execution both proves our vision and prevents our digression into less valuable activities. 

Our success depends on our partners’ success.  We cannot possibly satisfy all of our customers’ needs, therefore we need to focus on the core services that only we can provide, and enable our partners to extend and complete our joint offering.

We are a team.  We learn from each other’s wisdom and insight across every level.  Everyone on our team has something to say and contribute.  That’s why they’re here.  We are also a team of teams.  As we scale up, we need to make extra effort to draw on the same strengths of our colleagues who are one or two steps removed.

Look out the window, not in the mirror.  No matter how successful we are, the world changes faster than our awareness of the change.  Let’s keep our eyes on the horizon and act on what we see.

So thanks to everyone who has pushed me to write.  In my next piece, I’ll share some thoughts on requirements, a pretty hot subject at the moment. 

Illuminated by STAR?

I recently spoke at the VSLive and STAR conferences in Orlando. (You can watch the VSLive talk here and I’ve attached the STAR slides to this post.)  The contrast between the conferences was a good reminder that different audiences live at very different points on the Technology Adoption Lifecycle.

Looking for the Revolution

I spent the bulk of the week at STAR, where I had more personal history than my Microsoft colleagues.  If you don’t know it, STAR is the major semiannual testing conference in North America.  I’d attended and spoken there perhaps ten times previously, but not in the last four years.  I came with a fantasy that, like Rip Van Winkle returning to town, I would see signs of a revolution that had happened while I was away.  There were lots of talks about different testing techniques or measurements, but very little new.  People are still speaking in single dimensions – do exploratory testing or test against requirements or test against code or drive tests from keywords or model requirements to create tests or apply attack patterns or test against risks or use these metrics or automate your testing or … It’s amazing to me that in such a small field there can be so many silos, because there really isn’t enough grain for them all to store.

All of these ors really should be ands.  I made a point in my talk, as I do in the book, about the need for multidimensional descriptive metrics and diverse techniques applied together.  It’s about requirements, code, risks, fault models, qualities of service, data, configurations, and discovery.  They’re all relevant and complementary.  The most memorable talk I ever heard at STAR before was Cem Kaner’s Paradigms of black box software testing.  Cem surveyed all of these silos as separate paradigms and postulated, along the lines of Thomas Kuhn’s Structure of Scientific Revolutions, that the paradigm diversity was an indicator of a unifying revolution to come.  Unfortunately for my fantasy, at this STAR, George III was still on the road sign.

Helping the Revolution

(Now I’ll stretch the metaphor to the breaking point.)  If you’re also waiting for the revolution and are interested in the continental congress, you might be interested in joining our “Enterprise IT 9-to-5” program.  Somasegar, my divisional VP, recently blogged about it here.    The goal of the program is to understand our IT customers’ needs at a much deeper level than can be typically achieved with our current programs.  It is also about taking what we learn back to our product teams in rich detail to help them think more deeply about our customers’ diversity of contexts, needs, and goals when considering the direction we take our products.  

In the 9-5 program, 4-5 key members of our product teams ask to spend 2-3 days on a customer site during normal working hours.    The team is led by one of our Product Unit Managers.   All the team members are highly influential in defining functionality of our lifecycle tools.  

This is a unique opportunity to influence the future of Microsoft’s products to support the software lifecycle.  Your teams’ processes, problems, work styles, and needs become reference examples that inform Microsoft’s lifecycle product teams when they make a broad range of decisions.  We firmly believe that with a strong understanding of both your broad and your day-to-day needs, we can make our tools more useful across your software lifecycle.  If you’re interested in participating, please reply to me and I’ll connect you to the team privately.  Thanks!

Applying Value Up at Microsoft

Let me start with thanks for Rob Caron’s persistence in encouraging me to blog.  I haven’t blogged since the initial Team System announcement at Tech∙Ed in 2004 (Announcing Visual Studio 2005 Team System).  I promised him that, as soon as my book went to press, I’d start a regular column.  The book comes out this week, and I’ll be speaking at two conferences – VSLive and STAR East – so now seems an auspicious time.

In my book, I discuss what I call the value-up paradigm.  In short, this is the team’s focus on flow of customer value as the driver of a project, as opposed to the work-down through a list of planned tasks.  In the book, I list seven characteristics of this paradigm, describe its implications for each of the disciplines on the team, and the implementation with Team System.  Here’s the table from the book.  (If you’ve read either of the current versions of MSF, the value-up concepts should be very familiar.)

Each core assumption below is followed by the work-down attitude and the value-up attitude toward it.

Core assumption: Planning and change process
  • Work-down attitude: Planning and design are the most important activities to get right.  You need to do these initially, establish accountability to plan, and monitor against the plan, and carefully prevent change from creeping in.
  • Value-up attitude: Change happens, embrace it.  Planning and design will continue through the project.  Therefore you should invest in just enough planning and design to understand risk and to manage the next small increment.

Core assumption: Primary measurement
  • Work-down attitude: Task completion.  Because we know the steps to achieve the end goal, we can measure every intermediate deliverable and compute earned value running as the % of hours planned to be spent by now vs. the hours planned to be spent to completion.
  • Value-up attitude: Only deliverables that the customer values (working software, completed documentation, etc.) count.  You need to measure the flow of the work streams by managing queues that deliver customer value and treat all interim measures skeptically.

Core assumption: Definition of quality
  • Work-down attitude: Conformance to specification.  That’s why you need to get the specs right at the beginning.
  • Value-up attitude: Value to the customer.  This perception can (and probably will) change.  The customer may not be able to articulate how to deliver the value until working software is initially delivered.  Therefore, keep options open, optimize for continual delivery and don’t specify too much too soon.

Core assumption: Acceptance of variance
  • Work-down attitude: Tasks can be identified and estimated in a deterministic way.  You don’t need to pay attention to variance.
  • Value-up attitude: Variance is part of all process flows, natural and man-made.  To achieve predictability, you need to understand and reduce the variance.

Core assumption: Intermediate work products
  • Work-down attitude: Documents, models, and other intermediate artifacts are necessary to decompose the design and plan tasks, and they provide the necessary way to measure intermediate progress.
  • Value-up attitude: Intermediate documentation should minimize the uncertainty and variation in order to improve flow.  Beyond that, it is unnecessary.

Core assumption: Troubleshooting approach
  • Work-down attitude: The constraints of time, resource, functionality and quality determine what you can achieve.  If you adjust one, you need to adjust the others.  Control change carefully to make sure that there are no unmanaged changes to the plan.
  • Value-up attitude: The constraints may or may not be related to time, resource, functionality, or quality.  Rather, identify the primary bottleneck in the flow of value, work it until it is no longer the primary one, and then attack the next one.  Keep reducing variance to ensure smoother flow.

Core assumption: Approach to trust
  • Work-down attitude: People need to be monitored and measured to standards.  Incentives should be used by management to reward individuals for their performance relative to plan.
  • Value-up attitude: Pride of workmanship and teamwork are more effective than individual incentives.  Trustworthy transparency, where the whole team can see all the team’s performance data, works better than management directive.

 

When I joined Microsoft in 2003, I began driving the value-up approach to planning, managing and implementing Visual Studio Team System.  At the time, it was a big change for most of my 200 or so Team System colleagues, but we were a new project with strong leadership and a clear charter to focus on our customers’ requirements, breaking through a decade of stagnation in the market. 

Over the last several months, I’ve been heads down mentoring Developer Division on value-up planning for the Orcas release.  We’ve been trying to repeat the cycle on a scale ten times larger.  Along the way, there have been interesting issues of scale and culture change that I’d like to share.

Scenarios, Value Props, Experiences, Features

Dev Div is an organization conditioned over two decades to think in terms of features.  Define the features, break them down into tasks, work through the tasks, etc.  The first step in shifting to the value-up paradigm was to take a holistic and consistent approach to product planning.  We introduced a taxonomy of functional product definition that covers end-to-end scenarios, value propositions, experiences and features.  For each level, we used a canonical question to frame the granularity.  We rolled out training for teams, similar to Chapter 3 (“Requirements”) in my book.

Conceptually, the taxonomy looks like this. 

End-to-end Scenarios

Each end-to-end scenario is targeted at a particular customer profile and is designed to capture a vision of enough business value for a customer to decide to purchase or upgrade to the new version.

Value Propositions

In an end-to-end scenario, we start by considering the value propositions that motivate customers (teams or individuals) to work with our platform and tools. We consider the complete customer experience during development, and we follow through to examine what it will take to make customers satisfied enough to want to buy more, renew, upgrade, and/or recommend our software to others.

A value proposition is a way of defining tangible customer value with our products. It addresses a problem that customers face, stated in terms that a customer will relate to. A value proposition is represented in the following statement that a customer might make: We would work with your product if it helped us to [value proposition].

Experiences

Value propositions translate into one or more experiences. Experiences are stories that describe how we envision users doing work with our product: what user tasks are required to deliver on a value proposition?

Features

Experiences in turn drive features. As we flesh out what experiences look like, we spec the features that we need to support the experience. A feature can support more than one experience. (In fact, this is common.)

We also created two value props that didn’t really belong to scenarios, called “Legacy Qualities of Service” and “Remove Customer Dissatisfiers”.  To manage this data, we set up a team project in our Team Foundation Server that we call (inappropriately, but for historical reasons) the “feature directory”.  We have separate work item types for value propositions, experiences, and features.
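To make the hierarchy concrete, here is a minimal Python sketch of the four levels.  The class names, field names and sample data are all hypothetical and invented for illustration; this is not the work item schema in the “feature directory”.  It mainly shows that one feature can sit under more than one experience.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    title: str


@dataclass
class Experience:
    title: str
    # A feature may appear under several experiences.
    features: list = field(default_factory=list)


@dataclass
class ValueProposition:
    # Completes "We would work with your product if it helped us to ..."
    statement: str
    experiences: list = field(default_factory=list)


@dataclass
class Scenario:
    customer_profile: str
    value_props: list = field(default_factory=list)


def experiences_supported_by(feature, scenario):
    """Return titles of all experiences in the scenario that rely on the feature."""
    return [
        exp.title
        for vp in scenario.value_props
        for exp in vp.experiences
        if feature in exp.features
    ]


# Hypothetical sample data, invented for illustration only.
shared = Feature("Shared work item query")
triage = Experience("Triage incoming bugs", [shared])
status = Experience("Review iteration status", [shared, Feature("Burndown report")])
scenario = Scenario(
    "Mid-size development team",
    [ValueProposition("see project status without chasing people", [triage, status])],
)

print(experiences_supported_by(shared, scenario))
```

Keeping features as shared objects rather than copies is what lets a single feature be counted once even when several experiences depend on it.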

Loading the Train

Key issues in every software project are scheduling and managing the backlog and release scope.  We had to develop rules that we could apply across a product of this breadth for prioritizing the envisioned functionality at successive levels of granularity.  We used three categories, Critical, Value Add and Incubate, to prioritize the scenarios, value propositions, experiences and features.

  • Critical value props are ones around which we would build the schedule.  In other words, these are value props that we cannot ship without and we will adjust schedule and resources to make them fit.
  • Value Add value props (awkwardly named, I agree) are ones we want to deliver in the release, but for which we won’t adjust the schedule.  They get resources after the critical ones.
  • Incubate value props are ones that we plan for subsequent releases from the beginning. 

We stack-ranked the value props by scenario, applied this rating, proceeded to elaborate the experiences within each value prop, stack-ranked and rated the experiences by value prop, and repeated the process for features within experiences.  We vetted these heavily with customers at conferences and special meetings, and with non-customers in focus groups.

Next, we addressed the risk and cost of the features (to 3-day granularity) and segregated them into high- and low-confidence buckets.  Then we reassessed the experiences, so that only those experiences that still hold together with high confidence, based on the costing of the features, achieve high confidence.  Along the way, we’ve continued to vet these experiences with customers, which included their review of the experience specs.
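The reassessment step can be sketched roughly as follows.  The rule used here, that an experience is high confidence only if every one of its features is, is a simplifying assumption of the sketch; in practice, whether an experience “holds together” is a judgment call.  The identifiers and sample data are hypothetical.

```python
def experience_confidence(feature_confidence, experience_features):
    """Rate each experience 'high' only if all of its features are 'high'.

    This all-or-nothing rule is an assumption made for the sketch.
    """
    result = {}
    for experience, features in experience_features.items():
        all_high = all(feature_confidence[f] == "high" for f in features)
        result[experience] = "high" if all_high else "low"
    return result


# Hypothetical sample data: feature buckets after costing, and which
# features each experience depends on.
feature_confidence = {"F1": "high", "F2": "high", "F3": "low"}
experience_features = {"E1": ["F1", "F2"], "E2": ["F2", "F3"]}

print(experience_confidence(feature_confidence, experience_features))
```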

This gives us a release backlog, stack-ranked within value prop and by contributing team, of critical and high confidence experiences and features.  We use these to lay out a planned iteration schedule, and we can measure its completion on a daily basis at each level: in terms of features, experiences or value props.  (In practice, there are a few other scheduling factors, notably dependencies across teams and resource availability, but I’m ignoring them here for simplicity.)
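As a rough illustration of building and measuring such a backlog, the Python sketch below filters and orders a handful of feature records and then rolls completion up to the value prop level.  The record layout, the reading of “critical and high confidence” as a filter on both attributes, and the sample data are assumptions of the sketch, not a description of our actual TFS queries or reports.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FeatureItem:
    name: str
    value_prop: str
    rank: int        # stack rank within its value prop (1 = highest)
    category: str    # "Critical", "Value Add", or "Incubate"
    confidence: str  # "high" or "low"
    done: bool = False


def release_backlog(items):
    """Keep critical, high-confidence items, ordered by value prop, then stack rank."""
    kept = [i for i in items if i.category == "Critical" and i.confidence == "high"]
    return sorted(kept, key=lambda i: (i.value_prop, i.rank))


def completion_by_value_prop(backlog):
    """Fraction of backlog features finished, per value prop."""
    done, total = defaultdict(int), defaultdict(int)
    for item in backlog:
        total[item.value_prop] += 1
        done[item.value_prop] += int(item.done)
    return {vp: done[vp] / total[vp] for vp in total}


# Hypothetical sample data, invented for illustration only.
items = [
    FeatureItem("F2", "VP1", 2, "Critical", "high", done=False),
    FeatureItem("F1", "VP1", 1, "Critical", "high", done=True),
    FeatureItem("F3", "VP1", 3, "Value Add", "high"),
    FeatureItem("F4", "VP2", 1, "Critical", "low"),
    FeatureItem("F5", "VP2", 2, "Critical", "high", done=True),
]

backlog = release_backlog(items)
print([i.name for i in backlog])
print(completion_by_value_prop(backlog))
```

The same grouping could be keyed on experience or contributing team instead of value prop to get the other views.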

At this point you might be screaming, “How waterfallian!  What BDUF!”  Actually, no.  We’re managing the release in a series of 5-week iterations and in most cases, we’re doing detailed design only an iteration ahead.  Each iteration will produce a Community Technology Preview (CTP) as a deliverable increment of working software.  The learning from each iteration will feed into the design of the next.

At the same time, we do have a clear target, a backlog stack-ranked and understood in customer value, with a clear delineation of critical and value-add functionality.  We have rules for revising the ranking at iteration boundaries.  We also have transparent, daily assessment of progress. 

Quality Gates

In a key shift to value-up management, we have abandoned any measure of being Code Complete in favor of being Feature Complete.  Feature Complete gets measured by passage of Quality Gates, which capture the quality-first practices of processes like MSF and XP.  Given the amount of literature around Code Complete, you’ll recognize the significance of abandoning it in favor of the Feature Complete measure.

Feature Complete attempts to measure incremental customer value and keeps the whole product in working order as it evolves.  In addition to executing tests (unit and integration) with code coverage requirements, Quality Gates check for key qualities of service, such as security, performance, localizability, and usability.  The project is too big for a “common code base” (i.e., the use of a single source branch).  Rather, we have one Main branch for integration, and each “feature crew” has its own working branch.  Although each feature crew has full control over its private branch, the Quality Gates are applied stringently on delivery from the feature branch into Main.
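A gate check on delivery into Main might look something like the Python sketch below.  The gate names, the report fields and the 70% coverage threshold are all hypothetical placeholders; the real Quality Gates and their criteria are not listed in this post.

```python
# Hypothetical gates: each maps a name to a check over a branch report.
GATES = {
    "unit tests pass": lambda r: r["unit_tests_failed"] == 0,
    "integration tests pass": lambda r: r["integration_tests_failed"] == 0,
    "code coverage": lambda r: r["coverage_percent"] >= 70,  # assumed threshold
    "security review": lambda r: r["security_review_done"],
}


def failed_gates(branch_report):
    """Return the names of all gates the feature branch does not pass."""
    return [name for name, check in GATES.items() if not check(branch_report)]


def can_deliver_to_main(branch_report):
    """A feature branch may be delivered into Main only if every gate passes."""
    return not failed_gates(branch_report)


# Hypothetical report for one feature crew's branch.
report = {
    "unit_tests_failed": 0,
    "integration_tests_failed": 0,
    "coverage_percent": 64,
    "security_review_done": True,
}

print(failed_gates(report))
print(can_deliver_to_main(report))
```

Reporting the list of failed gates, rather than a bare yes or no, tells the feature crew exactly what stands between their branch and Main.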

Reporting

Because we are using TFS, everyone can access daily reports that measure the forward progress and status of the release.  AFAIK, it’s the first time at the scale of Dev Div that we’ve been able to see progress of customer value as it’s been implemented.  We’ll use this to communicate the CTP contents as well, so that you know what to look for in each increment. 

What’s Next

I realize that I’m making everything seem easy and seamless, and of course, it hasn’t been.  One of the key learnings echoes DeMarco & Lister – If you don’t have enough time, start earlier.  We began planning Orcas very late, and because the product teams were heads down finishing the 2005 release (“Whidbey”), we did not engage them broadly until after a great deal of envisioning had been done.  We made this choice knowing the risk, but underestimated the amount of time it would take to reset everyone’s thinking to a common level.

We started the first iteration of building the product, so we’ll be managing change, measuring value and velocity using Team Foundation Server.  We’ll prove the Quality Gate model over the next iterations.  I’ll keep you posted.   

 