Tools we used to test C# during VS2005

During this last product cycle the C# test team faced some interesting testing challenges.  Below is a recap of some of the tools written by the C# test team for Visual C# and Visual Studio 2005.


CompilerFX:
One of the big challenges of testing a compiler and other language-related features of the IDE is the sheer number of ways that syntax can be combined.  To help solve this problem, we created CompilerFX.  CompilerFX is an internal test tool that enables programmatic access to the C# AST (abstract syntax tree).  It is written in C# and exposes nodes, the symbol table and the annotated parse tree through a rich object model.  It has various helper APIs that make it easy to query the code context.  The AST information is dumped by the C# compiler itself, so the tool has access to the latest language features as soon as they are available.
Having the C# AST opened up the opportunity to generate tests automatically.  By querying the AST for the code context, we can recognize interesting code points, such as a method invocation and its return type, and then generate test cases based on them.  This is powerful when it can be done across an existing test suite of 20,000 compiler tests.  Since these new compiler tests are compiled and run, we expect them to continue to pass, because the generated changes should not alter the semantics of the code.  During the VS2005 product cycle CompilerFX-based tools found close to 100 bugs across features including the compiler, the refactoring engine, edit and continue, and code formatting.  Testing using the AST has a lot of potential and is a place we’ll continue to invest.
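
As a rough illustration of the idea (not CompilerFX itself, which is internal and works on the C# AST), the sketch below uses Python's standard `ast` module.  It finds every call in an existing test, wraps it in an identity function so the semantics should be unchanged, and checks that the generated test still produces the same result.  The sample test source and the `WrapCalls` and `run` names are made up for this example.

```python
import ast

# A stand-in for one existing test; its observable outcome is `result`.
ORIGINAL_TEST = """
def area(w, h):
    return w * h

result = area(3, 4) + area(1, 2)
"""


class WrapCalls(ast.NodeTransformer):
    """Rewrite every call f(x) as (lambda v: v)(f(x))."""

    def __init__(self):
        self.rewritten = 0

    def visit_Call(self, node):
        self.generic_visit(node)  # handle calls nested in the arguments first
        wrapper = ast.parse("(lambda v: v)(None)", mode="eval").body
        wrapper.args = [node]  # swap the placeholder argument for the original call
        self.rewritten += 1
        return wrapper


def run(source):
    """Compile and execute a test, returning its observable result."""
    namespace = {}
    exec(compile(source, "<generated test>", "exec"), namespace)
    return namespace["result"]


transformer = WrapCalls()
tree = transformer.visit(ast.parse(ORIGINAL_TEST))
generated_test = ast.unparse(tree)  # requires Python 3.9+

# The transformation should not change semantics, so both must agree.
assert run(generated_test) == run(ORIGINAL_TEST)
```

If the generated test ever disagreed with the original, the transformation or the tool consuming the code would be the first suspect, which is the kind of signal this style of testing is after.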


TAO: Test Automation Object
An important part of what a QA team does is automating test scenarios, many of which require manipulating the user interface.  Over time we have created internal tools, based mostly on Active Accessibility, to do so.  A common and significant problem with these tools is synchronizing the tests with the target application; writing very robust tests using this method has not always been easy.  Changes to the product UI often break tests, and simple focus issues on the test machine can cause false positives.
To circumvent this problem, the C# test team created TAO (Test Automation Object).  TAO is basically a test object in the language services that is only instantiated in a test environment.  TAO differs from previous efforts in the following ways:

  • Separation between tests and execution engine:  TAO tests are written in XML, making them independent of the engine that executes them.  In effect, these are data-driven tests.  This abstraction also enables us to use the same tests to verify a feature through the UI (with a more traditional UI manipulation technique) or through an API.
  • Bypass the UI:  TAO does not interact with the product through the traditional approach of using Windows messages or Active Accessibility to talk to the UI controls.  Instead it drives the product through a lower-level test interface that enables it to reach functionality without having to mimic a user’s actions.  Through these APIs tests can also obtain more detailed information on the product state, including events and internal data structures.
  • Performance:  You might not consider performance a critical feature of a test; however, we are finding more and more that the usefulness of a test depends a lot on how quickly it can run.  TAO tests execute about 4 times faster than the traditional UI-based tests.
  • Efficient to write:  TAO tests take about half as long to write as a traditional UI test.
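
To make the separation between test and engine concrete, here is a minimal sketch in Python.  It is not TAO's actual XML schema or API (neither is public); the element names, the `ApiEngine` class and the `run_test` function are all invented for illustration.  The point is only that the XML describes steps and checks, and any engine exposing matching operations can execute it.

```python
import xml.etree.ElementTree as ET

# Hypothetical test description: steps and checks only, no engine details.
TEST_XML = """
<test name="rename-local">
  <action name="open" file="Program.cs"/>
  <action name="rename" old="x" new="count"/>
  <verify name="contains" text="int count = 1;"/>
</test>
"""


class ApiEngine:
    """Toy engine that drives an in-memory 'product' directly, with no UI."""

    def __init__(self, files):
        self.files = files
        self.buffer = ""

    def open(self, file):
        self.buffer = self.files[file]

    def rename(self, old, new):
        # Naive textual rename; a real engine would call the language service.
        self.buffer = self.buffer.replace(old, new)

    def contains(self, text):
        return text in self.buffer


def run_test(xml_text, engine):
    """Run each step against whichever engine is supplied; True means pass."""
    for step in ET.fromstring(xml_text):
        arguments = {k: v for k, v in step.attrib.items() if k != "name"}
        outcome = getattr(engine, step.get("name"))(**arguments)
        if step.tag == "verify" and not outcome:
            return False
    return True


engine = ApiEngine({"Program.cs": "int x = 1; return x;"})
passed = run_test(TEST_XML, engine)
```

A second engine with the same three methods, implemented by driving the UI, could run the identical XML, which is the reuse the first bullet describes.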


DELTA: Debugger Engine Level Test Automation
During VS2002, VS2003 and most of VS2005 the Debugger was tested mainly using a UI-based approach, driving the product through the UI to exercise the test scenarios.  DELTA is an engine-level test library developed in C# that communicates with and drives the lower-level debugger components through a set of APIs exposed by Microsoft.VisualStudio.Debugger.Interop.dll.  DELTA creates an instance of the Session Debug Manager (“SDM”) component of the debugger, which is a dispatcher between the UI and the various lower-level components called “debug engines”.  DELTA uses this instance of the SDM to perform various debugging actions such as launching an executable or attaching to an already running one, controlling the execution flow through stepping or breakpoints, or manipulating the current state through locals, call stack or thread information.
DELTA tests have all the benefits of non-UI automation, such as very fast execution and significantly increased robustness.  We found that the cost of writing a DELTA test was about the same as with the UI approach; however, the DELTA test was about 30 times faster to execute.  This makes sense when you consider that it loads only a small fraction of the DLLs loaded in VS2005.  By driving the lower-level debugger components directly, DELTA also enables us to target specific components in our testing.  One of the biggest benefits is that these tests are very robust and fast, which makes them useful as check-in tests in addition to running as part of distributed test automation runs.
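
For readers who have not seen engine-level debugger testing, here is a small Python analogue.  It does not use the SDM or the Debugger Interop APIs; it uses Python's own `sys.settrace` hook to play the role of the engine, and the `sample` and `locals_at_line` names are invented for the example.  The test sets a "breakpoint" on a line, runs the target, and inspects the locals at that point, all without any UI.

```python
import sys


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total  # breakpoint target: 4 lines below the `def` line


def locals_at_line(func, line_offset, *args):
    """Run func(*args) and capture its locals each time the target line is reached."""
    target = func.__code__.co_firstlineno + line_offset
    snapshots = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # do not trace any other function
        if event == "line" and frame.f_lineno == target:
            snapshots.append(dict(frame.f_locals))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, snapshots


result, snapshots = locals_at_line(sample, 4, 3)
```

Because nothing here waits on windows, focus or repainting, a check like this runs in milliseconds, which is the same property that makes engine-level tests attractive as check-in tests.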


If you are passionate about writing tools and implementing creative ways to test, check out the C# job openings (external).  You can send your résumé to vcsjobs@microsoft.com or directly to me.  I would love to hear your feedback.

Customer bugs - list of fixes

To follow up from the previous post, here are all the fixed C# and Visual Studio Debugger bugs reported through MSDN Product Feedback for Beta1 and the CTP's (this doesn't include bugs entered on Beta2).

The top bug reporters for C# and the Debugger are:

   TAG 7
   Werdna 4
   Bob Cohen 3
   Keith Hill 3
   Jouko Kynsijärvi 3

Thanks to everyone that reported a bug!

- Rusty

Customer bugs - results

Almost 9 months after shipping Beta1, Beta2 has shipped and is in your hands.  One big difference between VS2005 Beta1 and betas of our previous products has been the amount of community involvement.  Beta1 was the debut of the MSDN Product Feedback Center, which improved upon the previous bug reporting mechanism in a few key ways:

  • It improved transparency through the way it synchronizes data back and forth between our internal bug database and the MSDN Product Feedback site.  In particular it enabled MS employees to add comments to bug reports, which get pushed out to you.
  • It (finally) provided a more objective way to rank issues through voting and the importance rating.  This gave us a better picture of how many users were affected by a particular issue.
  • Now that it was opened to the public (as opposed to you only being able to see your own bugs), it encouraged sharing workarounds, searching for duplicates before entering bugs and discussing/debating a particular issue through discussion threads.

This helped remove some of the barriers between the internal product team and customers.  The results so far have been great!  Here is a summary of the resolutions of C# and the Visual Studio Debugger feedback.  I’ve also included resolutions for bugs entered by MS employees for the same period for comparison.

Resolution of bugs reported since Beta1:

Issue Type         Source          Total   Fixed   Postponed   Won't Fix   By Design   Not Repro
Code Defects       MSDN Feedback   555     240     59          8           101         142
                   MSDN Feedback           43%     11%         1%          18%         26%
                   Internal                45%     11%         4%          10%         10%
Suggestions/DCRs   MSDN Feedback   567     51      286         39          131         45
                   MSDN Feedback           9%      50%         7%          23%         8%
                   Internal                23%     45%         13%         8%          3%
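
If you want to check the MSDN Feedback percentages yourself, they follow directly from the counts in the table.  The short Python sketch below recomputes them; the dictionary layout and the `resolution_rates` name are just for this example.

```python
# Resolution counts for MSDN Feedback reports, copied from the table above.
COUNTS = {
    "Code Defects": {"Total": 555, "Fixed": 240, "Postponed": 59,
                     "Won't Fix": 8, "By Design": 101, "Not Repro": 142},
    "Suggestions/DCRs": {"Total": 567, "Fixed": 51, "Postponed": 286,
                         "Won't Fix": 39, "By Design": 131, "Not Repro": 45},
}


def resolution_rates(row):
    """Return each resolution as a whole-number percentage of the row total."""
    total = row["Total"]
    return {name: round(100 * count / total)
            for name, count in row.items() if name != "Total"}


rates = {issue_type: resolution_rates(row) for issue_type, row in COUNTS.items()}
for issue_type, row in rates.items():
    print(issue_type, row)
```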


For code defects, the fixed and postponed rates are almost the same between bugs reported through MSDN Feedback and those reported internally.  I think the higher “Not Repro” rate for MSDN Feedback code defects is due to customers having older builds than internal teams.  Getting your bug reports improves the product in two key ways.  First, obviously, it identifies specific issues that we missed through our normal testing but need to fix.  Test teams employ a variety of test activities in the course of their testing, but even then it’s hard to find all the issues (or know which ones to fix) before the product goes out in Beta.  The second way it improves the product is by identifying deficiencies in our test approach, which we can then address in our processes so that the next milestone doesn’t suffer the same class of problems.  We are currently doing a test hole analysis for these MSDN Feedback bugs to see if there are trends or patterns in the bugs we missed.  Getting beta and early adopter coverage is a really important part of our strategy for testing the product and improving how we develop software.

Suggestions and Design Change Requests (DCRs) are less of an apples-to-apples comparison with internally entered suggestions.  The first observation is that a lot more suggestions are entered through the MSDN Feedback Center, both as a percentage and in total, than MS employees enter internally.  Over half of MSDN Feedback reports are suggestions, but internal suggestions account for less than 5% of the total.  I think a lot of the “suggestions” internally are expressed through email and hallway conversations instead of bug reports.  I think the best forum for suggestions is really the community anyway.  The MSDN Feedback site enables us to objectively rank what is important to you through voting and the importance rating.  This has been useful in weighing the different suggestions, and as a result we’ve been able to act on your feedback.  The two notable examples are C# EnC (399 votes) and Update Icon sets (813 votes), the #1 and #2 voted items.

Overall, I’ve been impressed with the quality of the bug reports, and the volume of reports has been manageable.  In my next post I'll include the list of fixed C# and Debugger bugs.

- Rusty Miller
Visual C# Test Manager

 

Recruiting

I love going out to college campuses and meeting students, chatting about their projects and giving them a coding problem or two.  Every college has its own flavor and emphasis: some focus on Java, others on C or C++, and some on C#.  The programming language doesn’t really matter; they all give us a chance to talk about a coding problem, tradeoffs with optimization and interesting things to test.  There are hands-on schools where coding skills are quite good, while others tend to be more theoretical and their students have less coding experience.  In general, coding skills aren’t as important as problem solving, raw smarts, and of course a passion for software.  Technologies and even languages come and go, so your ability to learn and ramp up on the next technology or language is more important than deep skills in the technology of the day.

This past winter I made a visit to the University of Waterloo, Ontario.  (We have several Waterloo alums on the C# team.)  Waterloo has a really cool co-op program where students take 6-8 co-ops over the course of their degree.  I never did an internship or co-op during college, but wish I had; it’s a great learning experience.  I also got a chance to visit Virginia Tech, another one of my favorite schools.  While recruiting at colleges, we’re looking for great hires for any discipline (Developer, Developer in Test and Program Management) and any team at Microsoft.

My team (C# and the VS Debugger) is looking for Developers in Test (aka SDETs), which is probably the least understood discipline outside of Microsoft.  Santosh wrote a nice blog entry on what it’s like to be a tester at MS.  Scott Guthrie, Product Unit Manager of ASP.NET, also has a thorough description of the whole testing process.  Check out the jobsblog for a couple of other good descriptions (Are you a good enough developer to be a Microsoft SDET? and 3 reasons to consider SDET).  You can also listen to Peter Hauge's podcast.

The biggest misconception about testing at Microsoft is that it’s not technical.  I think a lot of this perception comes from people’s experience working for software companies where testers primarily do manual testing and the ratio is on the order of 1 tester for every 10 developers (or even fewer).  In those cases, testers really don't have the time to dedicate to the technical side of testing.  Depending on the team, the ratio at MS is roughly one SDET to one SDE.  SDETs are essentially developers who work on the testing problem.  One of the coolest things about the SDET role, in my opinion, is the mix of creativity, technical skill and problem solving you get to apply to most testing problems.  Many times the design and implementation is a known quantity, but the testing can be approached in many different ways, limited only by your imagination really.

If you are interested in an SDET position send me your resume at rustym@removeme.microsoft.com.  Our C# and Debugger job descriptions can be found here.

Rusty Miller
Visual C# Test Manager

Test styles and predicting bugs

I recently had some flight/airport time to catch up on a couple articles I've been meaning to read.

What is a good test case by Cem Kaner
This is a good summary for people new to software testing.  I'm a strong believer in using multiple "test styles" and test activities as part of an overall testing strategy.  This paper breaks black box testing into Function, Domain, Specification, Risk-based, Stress, Regression, User, Scenario, State-model based, High volume automated and Exploratory testing.  Internally we may use different terms, but we hit most of these categories in some form.  For example, in addition to functional testing, which is probably the dominant style, we also do stress, capacity, security, specification (feature specs as well as others such as Logo requirements and Accessibility/Section 508), scenario and exploratory testing.  We do “User” testing both by dogfooding our product and through betas and early adopter programs.  My team has a few pilot projects with State-model based testing, but it’s limited at the moment.  We have recently done a lot more high volume automated testing on our IDE features, where we take arbitrary code and apply generic tests to it.  This has been pretty successful.  We’ve taken our compiler test suite of 20,000+ language tests, run it through this engine and found a number of bugs we didn’t find with traditional methods.

A Critique of Software Defect Prediction Models by Norman E. Fenton and Martin Neil
A Decision-Analytic Stopping Rule for Validation of Commercial Software Systems by Tom Chávez
Both papers relate to predicting bugs and take interesting approaches, which I want to look into more when I get time.  I appreciate the point in Fenton and Neil’s paper about testing effort and testability needing to be part of the equation.  They make a good point that variables such as developer skill are very important and can matter more than module size or complexity in bug rates.  If those variables are held constant, size-based predictions are more valid, but a magic number like the “Goldilocks’s Conjecture” that says there’s an ideal module size is suspect.

Chávez’s stopping rule paper also sounds promising, although in general I think we are still a long way from being able to rely on numbers alone.  I think we will still rely a lot on people’s experience and gut feel for setting schedules and determining when to ship.  We do something internally (and I’m sure many other software projects out there do the same), which we call “bake-time”, where we might have zero ship stoppers at a given point in time, but we still want to wait before we ship because of the possibility of unfound bugs.  Historically last minute ship stoppers have popped up due to hard to predict activities (if we knew what they were we’d do those tests ahead of time).   Typically it’s a result of someone intentionally doing creative testing or performing a real world scenario people hadn’t thought of yet.  It’s a challenge to anticipate what these are, but it’s this hunt that makes for some very exciting and creative testing.

Internally, we use a low tech but relatively accurate method of predicting bug rates with historical data.  We look at the shape of active and incoming bug graphs from previous product cycles, along with incoming rates from test passes and fix rates, to help predict our dates.  One interesting observation is that while we are driving to hit zero bugs for a given milestone, the total number of bugs stays roughly constant if you add the current milestone’s bugs to the bugs pushed off to the next milestone.  However, the make-up of these bugs shifts toward the lower priority and lower severity variety over time.
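
To give a feel for the simplest possible version of this kind of estimate, here is a Python sketch.  It is not the method we use internally, which relies on the shape of historical graphs rather than a formula; it assumes constant weekly incoming and fix rates, and the function name and the numbers are made up.

```python
import math


def weeks_to_zero(active, incoming_per_week, fixed_per_week):
    """Estimate weeks until the active count reaches zero, assuming constant rates.

    Returns None when bugs arrive at least as fast as they are fixed.
    """
    net_burn = fixed_per_week - incoming_per_week
    if net_burn <= 0:
        return None
    return math.ceil(active / net_burn)


# Made-up numbers for illustration only.
estimate = weeks_to_zero(active=120, incoming_per_week=25, fixed_per_week=40)
```

Real incoming rates are not constant (they spike during test passes, for example), which is why comparing against the curves from earlier product cycles tends to work better than a straight-line projection.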

I’d be interested to know if other people have implemented any bug prediction or stopping rules.

Introduction & Application Verifier

Hello, I'm the Test Manager for Visual C# and the Visual Studio Debugger.   During my career at MS, I've also worked on Win95, Visual Test, Visual J++ and Visual C++.

In addition to blogging about what my team is up to, I'll touch on some of the other stuff I have my fingers in:
- Various aspects of software quality
- Side-by-Side (for example, VS2003 SxS with VS2005 on the same machine)
- MSDN Product Feedback
- Application Verifier, one of my favorite test tools

Which brings me to my first testing suggestion: AppVerifier.  If you haven't already, try running your app under AppVerifier.  If you are running on Windows XP or later (the later the better, because more verifications are added to the OS in later versions), you can almost immediately find bugs by running your app with this turned on.  For the “core” set of verifications, the tool causes your app to fail fast when it does something wrong, by AV'ing (raising an access violation) immediately instead of later down the line when the bad code is long gone from the scene.  These are the core verifications:

PageHeap:  Checks the heap for corruption and adds guard pages to the end of each allocation. This causes access violations when there are buffer overruns.
Locks:  Checks for errors in lock usage. This might cause access violations when errors are located.
Handles: Checks for handle errors. This might cause access violations when errors are located.

It can also help you make your app more secure.  See Mike Howard's article.

To get started:
1) Install the Application Compatibility Toolkit and run “appverif” from Start > Run.
2) Set the verifications you want to turn on for your EXE.
3) You should run it under a debugger, so open your EXE in the Visual Studio Debugger (of course! :-) ) and set up your symbol and source paths.
4) Press F5.
5) Test your app.

Tips:
1)  With PageHeap on you will need lots of extra memory.  For VS2005 we turn on all the verifications and have found that 1GB is needed, although I'm sure you can get by with less depending on your app.
2)  As I mentioned, use the newest version of the OS.  I'd suggest Windows Server 2003 if you have it.  At a minimum you need Windows XP.
3)  If you turn on verifications that output to the log file, try “Break on Error”.  This will give you a call stack so you can immediately see who is causing the problem.

Also see the AppVerifier FAQ.

We use it internally and find a fair number of bugs, but it's the quality, not the quantity, that's important, and AppVerifier can help identify those bad bugs that would otherwise reduce the stability of your app.

- Rusty Miller
