Managing Quality (part 2) - Automated Testing

As I described in my last article, our first layer of quality assessment is build "scouting", designed to do a shallow but broad pass across the product to determine if it's worthwhile to proceed with deeper testing.

Our next layer is a set of more thorough automated testing runs.  Many would call this "functional" testing, but words in the testing space get so overloaded.  It includes API-level tests (which some may call unit tests), UI automation tests, and some broader "system integration tests".

We divide our automation tests at this level into two categories - Nightly Automation Runs (NARs) and Full Automation Runs (FARs).

As their name suggests, NARs are designed to be run every day, on every build that is not SelfToast (see part 1 for a description of this term).  Today we have on the order of 3,000 NAR tests.  In our test planning exercise (when we design the test cases we want to run and choose what will be automated) we prioritize our test cases (1, 2, 3).  The NARs are generally selected from the Pri 1 test cases and chosen so that they can be run, and the results analyzed, within a few hours (important if you are going to do this every day).
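To make the selection idea concrete, here is a minimal sketch of picking a nightly set from prioritized test cases under a time budget. This is not our actual tooling: the `PlannedTest` type, the `select_nightly` function, and the greedy shortest-first rule are all assumptions made for illustration, and a real selection would also weigh feature coverage and analysis time.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PlannedTest:
    """Hypothetical record for one automated test case."""
    name: str
    priority: int      # 1, 2 or 3, as assigned during test planning
    minutes: float     # estimated run time


def select_nightly(cases: Iterable[PlannedTest], budget_minutes: float) -> List[PlannedTest]:
    """Pick Pri 1 tests, shortest first, until the time budget is used up.

    The shortest-first rule is an illustrative assumption, not the team's method.
    """
    candidates = sorted(
        (c for c in cases if c.priority == 1),
        key=lambda c: c.minutes,
    )
    chosen: List[PlannedTest] = []
    used = 0.0
    for case in candidates:
        if used + case.minutes > budget_minutes:
            break  # every remaining candidate is at least this long
        chosen.append(case)
        used += case.minutes
    return chosen
```

Anything that does not fit the budget, along with the Pri 2 and Pri 3 cases, would be left for the fuller runs.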

FARs are the sum of all automated tests that we have.  For TFS, this amounts to something on the order of 20,000 tests today.  A FAR run takes about a week to run and analyze the results.  As a result, we don't start doing them until later in the product cycle and we run them less frequently - the closer to the end we get, the more frequently we run them.  Right now, I think we are running them every 2 or 3 weeks.

For completeness, beyond FARs, we have what we call a Full Test Pass (FTP), which is a period of time where we run multiple NAR and FAR runs on a cross section of our test matrix (the subject of a future Managing Quality post) and run our manual test cases.  Last I checked, we had about 10,000 manual test cases on top of the 20,000 FAR cases.  A Full Test Pass takes somewhere from 2 to 4 weeks.

So, with that background, on to the reports...  Here's a recent NAR trend report:

As you can see, we break down each result by outcome and, for failures, by the cause of the failure.  These warrant a little discussion:

  • Initial Pass Rate - The test passed the first time it was run.
  • Final Pass Rate - When tests fail, we "analyze" the run, and for some of the failures we are able to tweak something about the test and re-run it.  Those that pass the second time are counted under "Final Pass Rate".  Over time this should go to zero, since all tests should pass on the first run, but when the code is churning it's not uncommon to need to tweak tests to keep them up to date.
  • Product issue - This is what you'd expect - it's a failure in the product and results in a "product bug report".
  • Test Issue - Believe it or not, test code can have bugs too.  These are failures in tests that can't be fixed by tweaks and require significant work in the test itself.  They can result from changes in the product or just improperly written test code.
  • Other Issue - Anything else.  These might be test infrastructure issues, lab network issues, etc.
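As a rough illustration of how the categories above turn into the percentages on a report, here is a small sketch. The `Outcome` enum and the `breakdown` and `pass_rate` functions are hypothetical names, not part of our reporting system, and treating the overall pass rate as initial plus final passes is an assumption based on the descriptions above.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class Outcome(Enum):
    """The five result buckets described in the list above."""
    INITIAL_PASS = "Initial Pass"
    FINAL_PASS = "Final Pass"       # passed only after a tweak and re-run
    PRODUCT_ISSUE = "Product Issue"
    TEST_ISSUE = "Test Issue"
    OTHER_ISSUE = "Other Issue"


def breakdown(outcomes: Iterable[Outcome]) -> Dict[Outcome, float]:
    """Return each bucket's share of the run, as a percentage."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no test results to summarize")
    return {o: 100.0 * counts[o] / total for o in Outcome}


def pass_rate(outcomes: Iterable[Outcome]) -> float:
    """Overall pass rate: first-run passes plus passes after a re-run (assumed)."""
    shares = breakdown(list(outcomes))
    return shares[Outcome.INITIAL_PASS] + shares[Outcome.FINAL_PASS]
```

Running `breakdown` on the results of each build and plotting the shares side by side would give a trend view along the lines of the reports shown here.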

Sidebar - I think I've described this before but since I'm showing a bunch of build numbers, let me tell you about them again.  20108.00 is a build number.  The format is YMMDD.NN.  Y is the year, which is somewhat arbitrary but increases each year that the project is underway - we are using 2 for Orcas because this is the second calendar year (we started on it in '06).  MMDD is the date, so 0108 is January 8th.  NN represents the number of rebuilds of this "build".  Mostly this is used when we branch a build for a release: we freeze the main part of the build number and only increment NN.  During the main phase of development, it's pretty much always 00.
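For anyone who wants to pull these numbers apart programmatically, here is a minimal sketch of a parser for the YMMDD.NN format. The `BuildNumber` type and `parse_build_number` function are hypothetical names, and the sketch assumes a single-digit year and a two-digit rebuild count, as in the example above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildNumber:
    project_year: int  # Y: year of the project (2 = second year), not a calendar year
    month: int         # MM
    day: int           # DD
    rebuild: int       # NN: number of rebuilds of this build


# One year digit, two month digits, two day digits, a dot, two rebuild digits.
_BUILD_RE = re.compile(r"(\d)(\d{2})(\d{2})\.(\d{2})")


def parse_build_number(text: str) -> BuildNumber:
    """Split a YMMDD.NN build number into its parts."""
    match = _BUILD_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a YMMDD.NN build number: {text!r}")
    year, month, day, rebuild = (int(group) for group in match.groups())
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError(f"impossible date in build number: {text!r}")
    return BuildNumber(year, month, day, rebuild)
```

With this, `parse_build_number("20108.00")` gives project year 2, month 1, day 8, rebuild 0. Mapping the year digit back to a calendar year needs the project's start year, which is not encoded in the number itself.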

Here's a FAR report.  It looks pretty much the same:

This report is only showing 2 FAR runs because we've really just started getting going with FAR runs.  We are dusting them off and getting them running again on the Orcas code base.  You'll notice the gap between these two runs is a little over 2 weeks.

 

We also produce more detailed test result reports to drill into specific feature areas like this:

 

The wide variations in the number of scenarios tend to come from how granularly different feature teams break up their tests.  As you'll see in a future post, the code coverage metrics for all feature areas are about the same.

Looking at all of this data from a project management perspective...  In addition to the build quality data in part 1, these reports form a very important part of telling how the quality of the product is progressing.  In my experience, these numbers rise and fall throughout the product cycle and tend to hit their low point right around our "code complete" deadline - that's when developers are feeling pressure to get their functionality done and quality tends to suffer a bit.  Then we move into a stabilization phase and the numbers trend up.  Pass rates need to be in the mid-to-high 90s (percent) to have a Beta quality release.  We generally never achieve more than about a 99% pass rate because there are inevitably some tests which fail for very minor reasons and we decide they do not materially impact the quality of the product.

This product cycle we've made some pretty significant changes to our macro level project management.  After I get through this quality series, perhaps I'll talk a bit about that and the impact it's had on how we manage quality.

Hopefully this continuing thread is useful to you.  Let me know if not, or if you would like me to focus on certain aspects.

Brian

Published Sunday, February 04, 2007 9:01 AM by bharry

Comments

# re: Managing Quality (part 2) - Automated Testing

Fascinating!  Keep 'em coming!  It's great to see how a real-world team working on a large project works.

Sunday, February 04, 2007 11:25 AM by Carl Daniel

# re: Managing Quality (part 2) - Automated Testing

Indeed, keep them coming.  I would be interested in knowing some of the non-Visual Studio tools you're using for automated testing.  Even if it's just "internal".

Sunday, February 04, 2007 12:15 PM by Peter Ritchie

# An Inside Look at the TFS Product Team's Quality Process

Brian Harry has been blogging over the past few days about the Team Foundation Server team's product

Tuesday, February 06, 2007 1:40 PM by Jason Barile - Microsoft in Raleigh, NC

# VSTS Links - 02/07/2007

Rob Caron on Visual Studio and Daylight Saving Time Change. Willy-Peter Schaub on What "main" features...

Wednesday, February 07, 2007 9:28 AM by Team System News

# 'Grep'ing Groups

Man have I been busy. What have I been busy doing? Well, see there's.... actually, I'll let Brian explan

Friday, February 09, 2007 12:09 PM by Adam Singer

# Managing Quality (part 2) - Automated Testing

Lacking trackbacks I'm abusing the comment mechanism. This post was featured in the most recent Carnival of the Agilists: http://www.notesfromatooluser.com/2007/02/carnival_of_the.html.

Monday, February 19, 2007 3:42 PM by mlevison

# Managing Orcas Beta "exit criteria" by dogfooding TFS

I've written a few times about our usage of TFS while we develop newer versions of Visual Studio and

Monday, March 05, 2007 1:54 PM by Jeff Beehler's Blog

# Managing Orcas Beta exit criteria by dogfooding TFS

I've written a few times about our usage of TFS while we develop newer versions of Visual Studio and

Monday, March 05, 2007 2:38 PM by Jeff Beehler's Blog

# The many faces of quality

Brian's been compiling a great series of posts regarding managing quality which I would highly recommend

Saturday, April 21, 2007 9:59 AM by Jeff Beehler's Blog
