Visual C++ Libraries Development Regression Tests


Hi, my name is Pat Brenner and I’m a software design engineer on the Visual C++ libraries team.  I’d like to spend some time talking about our process for preventing regressions in our libraries code.

 

After I joined the libraries team about a year ago, I was told about the set of sniff tests that we needed to run before checking in new features or bug fixes.  As it turned out, I worked on some of the new MFC code for Visual C++ 2008, so I didn’t need to run these tests for my first five or six months on the team.  When it came time for me to start running the tests, because I was fixing bugs for Visual C++ 2008, I found that some of the tests did not pass for me, and I was not getting consistent results.  I (and the rest of the development team) got by with this up until Beta 1 or thereabouts, because our QA team performs scheduled runs of all their tests, which include the sniff tests, and we were able to rely on their results.  But then I found that, in addition to the sniff tests, we also had a large set of regression tests, written by the development team, which were largely being ignored.

 

So I decided to take some time early this summer to accomplish this goal: the sniff tests and the regression tests needed to run well enough that I could schedule my development machine to run a build and the sniff and regression tests overnight, and have results in the morning.

 

I first needed to get the sniff tests running and passing consistently on my machine.  So I worked with QA to stabilize the broken tests, remove the tests that were obsolete, and clear out any other blocking problems so that the sniff tests would run start to end, and pass 100%.  This took several weeks (as a background task) and we found several bugs in tests that needed to be fixed (mostly platform issues, where a test would pass on Windows XP but not on Windows Vista, for instance).

 

Then, with help from a member of our QA team, I got our regression tests ported over to a QA test harness similar to the one which runs our sniff tests.  This meant that our regression tests now got run in many more configurations (static library vs. DLL, ANSI vs. Unicode, Debug vs. Release) than they had been before.  This flushed out a number of test bugs, as well as several actual libraries code bugs, all of which we have fixed for Visual C++ 2008.

 

Our policy now is that for almost every bug we fix, we write a regression test which fails without the fix and succeeds with the fix.  This helps us keep our regression rate down when fixing bugs, especially in cases where a particular method has been changed several times.  Using the test harness, it’s very easy to add a regression test.
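
As a minimal sketch of what such a test can look like (a hypothetical routine and failure case, not one of our actual tests), here is the general shape, assuming a harness convention where exit code zero means pass and anything else means fail (the exit-code convention is described in the comments below):

    #include <cstdio>
    #include <string>

    // Hypothetical stand-in for the library routine a bug fix touched.
    static std::string trim_leading(const std::string& s) {
       std::string::size_type i = s.find_first_not_of(" \t");
       return i == std::string::npos ? std::string() : s.substr(i);
    }

    int main() {
       // Exercise exactly the case that failed before the fix:
       // all-whitespace input must yield an empty string, and leading
       // whitespace must be stripped from everything else.
       if (trim_leading("   ") != "" || trim_leading("  x") != "x") {
          std::puts("FAIL: trim_leading regression");
          return 1;   // harness records a failure
       }
       return 0;      // harness records a pass
    }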

 

The set of sniff tests (owned by our QA team) consists of 6542 tests, of which over 6000 exercise the C++ runtimes.  We have about 75 sniff tests for ATL and about 400 for MFC.  The set of development regression tests (owned by the development team) consists of another 2536 tests, of which about 1500 test the C++ runtimes and 1000 test ATL and MFC.  I have all of these 9000+ tests running on my development machine on a nightly basis, which is really convenient when it comes time to fix a bug.  I can make the fix, check that it does indeed fix the bug, and then let my overnight process build it and fully test it.  When I arrive in the morning, I can check my test results, and if all passed 100%, I know my fix has not caused any problems.  I’ve also written scripts that the rest of my team can use, so they can have the same productivity increase that I’ve been enjoying with the automation of these tests.

 

Thanks, and I welcome any questions you might have.

 

Pat Brenner

Visual C++ Libraries Team

  • I hope to see the VC team include fixes in the next VC++/VS 2005 service pack and not delay them until VS 2008.  My current production environment has about 20 applications spread between VS 6.0, VS 2003 and VS 2005.  It is unlikely we will get the chance to port more than a handful of them to the latest version of Visual Studio, since re-testing an application is a nontrivial multiple-month effort.  We cannot make the business case to spend 3 or more months retesting each of those 20 applications (3 * 20 = 60 months).  We will replace/rewrite the 2 or 3 oldest legacy applications each year, as it is much cheaper than spending most of our time just porting and retesting for each new VS version.  For example, our target for the next 12 months is to eliminate all VB6 code (including 3rd party code or commercial applications using VB6 components - this is a big deal to us, since we've experienced a flood of end-of-life notices with no viable upgrade/replacement path for commercial products based on VB6 components) as well as rewrite/replace all VS 6.0 systems.  I hope to see service packs for VS 2003, 2005, Windows XP, and SQL Server 2000, 2005 in the next year so that the older environments can live on for 5+ more years.

  • Pat,

    I'd be interested in what kind of testing system you use. How do you collect and analyze the results of the nightly runs? What information does the system provide in case of a failing test - does it save a log, a minidump or something?

    Thanks, Jo

  • "Generally, fixes for Connect bugs go into the next full release. Service Packs consist of accumulated hotfixes, and hotfixes are made in response to support requests from corporate customers."

    Personally I think this is *stupid*. Again I cannot speak for Microsoft policies. Someone in the upper management might have decided to take care of the corporate customers first.

    Another thing I noticed is that for every issue we complain about, the answer is that the version after the next version (Orcas + 1) will fix it. Somehow it seems the "version after next version" is a standard reply.

    -Arun

  • [AntonioCS]

    > Can you give me an example of script or one of the
    > tests that you use to make sure there are no bugs in VS.

    Sure, here's an example.

    The Standard specifies that 9 iterators must derive from std::iterator. In VC8 SP1, only 6 did. This was further broken in early builds of VC9, where 0 did. In VC9 RTM, all 9 will derive from std::iterator. (More importantly, this fix makes these iterators smaller and faster.)

    We wrote a regression test to prevent this from ever breaking again. It looks like this:

    C:\Temp>type iterator.cpp
    #include <cstddef>
    #include <ios>
    #include <iostream>
    #include <iterator>
    #include <memory>
    #include <ostream>
    #include <streambuf>
    #include <string>
    #include <vector>

    using namespace std;

    // Declared up front so that ordinary (non-ADL) name lookup finds it
    // inside test(); some compilers accept the call without this, but
    // the declaration is required by the language.
    template <bool B> void test_helper(const string& d);

    template <typename B, typename D> void test(const string& d) {
       test_helper<__is_base_of(B, D)>(d);
    }

    template <bool B> void test_helper(const string& d) {
       cout << "std::iterator is a base of " << d << "." << endl;
    }

    template <> void test_helper<false>(const string& d) {
       cout << "std::iterator is NOT a base of " << d << "." << endl;
    }

    int main() {
       typedef vector<int> vec_t;
       typedef iterator<output_iterator_tag, void, void, void, void> outit_t;
       typedef iterator<input_iterator_tag, int, ptrdiff_t, const int *, const int&> istrit_t;
       typedef iterator<input_iterator_tag, char, streamoff, char *, char&> istrbufit_t;
       typedef iterator<random_access_iterator_tag, int, ptrdiff_t, int *, int&> revit_t;

       test<    outit_t,  back_insert_iterator<vec_t>      >( "back_insert_iterator");
       test<    outit_t, front_insert_iterator<vec_t>      >("front_insert_iterator");
       test<    outit_t,       insert_iterator<vec_t>      >(      "insert_iterator");
       test<   istrit_t,      istream_iterator<int>        >(     "istream_iterator");
       test<    outit_t,      ostream_iterator<int>        >(     "ostream_iterator");
       test<    outit_t,  raw_storage_iterator<int *, int> >( "raw_storage_iterator");
       test<istrbufit_t,   istreambuf_iterator<char>       >(  "istreambuf_iterator");
       test<    outit_t,   ostreambuf_iterator<char>       >(  "ostreambuf_iterator");
       test<    revit_t,      reverse_iterator<int *>      >(     "reverse_iterator");
    }

    C:\Temp>cl /EHsc /nologo /W4 /MT iterator.cpp
    iterator.cpp

    With VC8 SP1:

    C:\Temp>iterator
    std::iterator is a base of back_insert_iterator.
    std::iterator is a base of front_insert_iterator.
    std::iterator is a base of insert_iterator.
    std::iterator is a base of istream_iterator.
    std::iterator is a base of ostream_iterator.
    std::iterator is a base of raw_storage_iterator.
    std::iterator is NOT a base of istreambuf_iterator.
    std::iterator is NOT a base of ostreambuf_iterator.
    std::iterator is NOT a base of reverse_iterator.

    With VC9 20917:

    C:\Temp>iterator
    std::iterator is a base of back_insert_iterator.
    std::iterator is a base of front_insert_iterator.
    std::iterator is a base of insert_iterator.
    std::iterator is a base of istream_iterator.
    std::iterator is a base of ostream_iterator.
    std::iterator is a base of raw_storage_iterator.
    std::iterator is a base of istreambuf_iterator.
    std::iterator is a base of ostreambuf_iterator.
    std::iterator is a base of reverse_iterator.

    (If you have VC9 Beta 2, that will say "NOT a base" for all 9 iterators.)

    This is a slightly modified version (for human consumption) of the actual regression test we use. The actual test uses exit codes to report success or failure to our test harness, and upon failure, it prints the name of the offending iterator to be captured in our logs.
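
    For illustration, that exit-code variant might be shaped roughly like this (a guess at the form, assuming the harness keys off the process exit code; it is not the actual checked-in test):

       #include <cstdlib>
       #include <iostream>
       #include <iterator>
       #include <ostream>
       #include <string>
       #include <vector>

       using namespace std;

       static int failures = 0;

       template <bool B> void test_helper(const string&) {
          // Pass: stay silent; the harness only needs the exit code.
       }

       template <> void test_helper<false>(const string& d) {
          // Failure: name the offending iterator so it lands in the logs.
          cout << "FAIL: std::iterator is NOT a base of " << d << "." << endl;
          ++failures;
       }

       template <typename B, typename D> void test(const string& d) {
          test_helper<__is_base_of(B, D)>(d);
       }

       int main() {
          typedef vector<int> vec_t;
          typedef iterator<output_iterator_tag, void, void, void, void> outit_t;
          test<outit_t, back_insert_iterator<vec_t> >("back_insert_iterator");
          test<outit_t, front_insert_iterator<vec_t> >("front_insert_iterator");
          test<outit_t, insert_iterator<vec_t> >("insert_iterator");
          // ...the remaining six checks as in the program above...
          return failures == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
       }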

    [Anonymous]

    > Are you running larger C++ test suites besides your
    > own tests?

    As Andre said, Pat was describing only our developer-written library tests (which we call "regression tests", as we check them in alongside bugfixes) and our tester-written library tests (which we call "sniff tests").  We additionally have third-party library conformance tests from several companies.  And then the compiler (front-end, back-end) and IDE teams also have their own tests.

    [Phaeron], [Andrew McDonald], [jmm]

    That was my understanding of the Connect and hotfix processes.  I've been with VC for only 9 months, so my understanding is limited.  I'll talk to our PMs and see if I'm missing something.

    Please understand this, though: Fixing bugs in the next full release is easier than creating hotfixes. To fix something in the next full release, we just need to write the fix, verify that it works (do builds, run tests, etc.), and check it in. There's plenty of time (and CTPs, and Betas, etc.) to be sure that we aren't horking anything. Also, we aren't restricted by binary compatibility, since the CRT gets totally renamed between each full release, and we don't support things like passing STL objects between DLLs compiled with different full releases.

    The hotfix process is much more heavyweight (as it should be - you don't want us breaking stuff in hotfixes).  We must be careful to keep binary compatibility, and hotfixes get less testing than the months and years of testing that full releases endure.  Shipping a hotfix is also a big deal, since we have to package it and verify that the packaging works (from my perspective, this is an ordeal).

    Service packs get additional testing, but the fixes in them generally start life as hotfixes.  (There can be exceptions to this, but that's the usual way it works.)

    I've never personally observed hotfixes being produced in response to Connect bugs, nor have I seen Connect fixes backported to VC8. As I said, I'll check and see if we ever actually do that.

    [Bob]

    Your business can certainly request hotfixes (through the proper support channels, about which I know precisely nothing, sitting at the other end of them). My comments were for Microsoft Connect bugs submitted by individuals.

    We ship hotfixes both for the latest service pack, and the previous service pack.  (Currently, that would be VC8 SP1 and VC8 RTM.)

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • I would like to encourage the VC++ team to include more bug fixes in the core libraries, CRT, etc. in addition to customer-reported hotfixes and security patches.  These bugs would be high priority, found inside Microsoft or outside (via Connect), and would affect some core functionality (e.g., the CRT atoi() function).

    This would work well, as our shop tries to keep existing internally developed applications alive for as long as possible without doing a port to a new environment or a major functionality change.  This is the 'consumer electronics' approach to IT: we 'buy a new one' every 5 years and, while it is in production, avoid upgrading it if possible (to minimize costs).  The side effect is that we preserve the development environment for each system in a virtual machine and try not to upgrade the development tools (to minimize risk).  New code, a rewrite of an existing system, or a major functionality enhancement to an existing system will include upgrading the development environment to the latest one available.

    Thank you for the hard work in upgrading the security of the C++ libraries and CRT in the last few years.  That was much needed.

  • Re including fixes in Service Packs, and Bob's request that we include fixes in SPs:

    Actually, for VS2005 SP1 (before Stephan's time), we put in significant effort to fix almost every single customer reported bug up to a certain date (excluding those bugs which would break binary compatibility or which were erroneously reported).  I think we hit over 90% fix rate on these bugs.  We had a cutoff date of course to stabilize and test the service pack and if a bug was fixed but didn't make it, it was probably because it was fixed after that date.  I think the libraries team alone fixed roughly 500 bugs (don't hold me to that, I'm going by memory here), many of which were very minor, but it was nice to see things cleaned up.

    Re: the anonymous posting about the small number of tests.  Those 9000 tests were just the "sniff" tests and VC9 regression tests for the libraries.  To give you a rough comparison, the CRT sniff tests take about an hour to run, while the entire set of CRT suites takes about two days to run on modern hardware.  (We have lots of tests.)  As Marina said, our sniff tests are meant to be quick running while still hitting a lot of areas.  Doing a full test pass against VC takes a couple of weeks and hundreds of machines.

    Regarding whether we run something like Plumhall - we actually run a number of licensed suites like Plumhall (not sure whether I'm allowed to say which ones).  Somewhere between 5 and 10 licensed or conformance C++ suites.  This is actually a very small part of our testing, though.  We have a lot more internal tests to cover things in more depth.

    In fact, at this point, the sheer number of tests is one of our biggest challenges.  Because of the number of tests, machines and configurations we run under, both flaky tests and simply random hardware or network failures introduce a lot of noise into our results, which takes manual effort to isolate.  It's an especially troubling scenario, since it can be very time consuming to determine the root cause of an isolated and non-reproducible failure when you need to determine whether it was due to an intermittent product issue, an intermittent test issue, or maybe just random EMI flipping a bit in memory (since we run probably millions of tests over the course of a product cycle on a lot of hardware, this is actually a more likely occurrence than it would seem).

    In other words, don't worry that we have too few tests :).  I don't, although it is legitimate to wonder whether we have all the right tests ;).  Unfortunately, we will probably never be complete, and we will probably always have redundancies.  I think the challenge in the coming years will be to remove redundancies while still maintaining adequate testing levels to avoid missing things.

  • Hi all,

    Let me try to answer a couple of questions, or clarify my statements:

    [Sohail]

    You're a bit behind in using the "before you fix, write a test that exposes the bug" idea...

    Actually, we've been writing tests that expose bugs before fixing them for a while now.  We just did not have an automated way of running the set of regression tests as they expanded.  Now we do.

    [Norman Diamond]

    For your ANSI vs. Unicode tests, don't you need to run the sniff tests on around 10 machines overnight?

    We're just building the tests using both the ANSI and Unicode versions of the libraries (ATL and MFC).  We're not running the tests on machines with different code pages set.  That's not regression-level testing--it gets taken care of in our full test passes (which Ben mentioned above).
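
    For example (an illustrative fragment, not one of the actual tests), TCHAR-based source compiles against either flavor of the libraries, so each test is simply built twice, once per configuration:

       #include <tchar.h>

       // TCHAR is char in an ANSI/MBCS build and wchar_t when _UNICODE
       // is defined, so the same source exercises both library flavors.
       int _tmain() {
          const TCHAR* greeting = _T("hello");
          return _tcslen(greeting) == 5 ? 0 : 1;  // 0 = pass for the harness
       }

    The same file would be compiled once plain (ANSI/MBCS) and once with /D_UNICODE /DUNICODE, linking the matching flavor of the libraries.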

    [Alex]

    I was a bit confused by one thing, though: What is sniff testing?

    As Marina mentioned above, sniff tests are a smaller set of tests that are used to make sure that nothing obvious is broken.  I believe the term comes from the days of fixing hardware bugs--if you fixed something, powered it up and didn't smell smoke, it had passed the "sniff" test.

    [Jo Siffert]

    How do you collect and analyze the results of the nightly runs?

    The nightly run both dumps output to the console (so I can look it over quickly when I arrive in the morning) and dumps summary and detailed run information to a couple of log files.  The results are summarized with a percentage passing, so I look for 100% when I come in.  If I see anything other than 100%, I dig into the logs and find out what failed.
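
    As a sketch of the summary step (hypothetical; the actual scripts and log format aren't shown here), assuming one result line per test beginning with "PASS" or "FAIL", something like this computes the percentage:

       #include <fstream>
       #include <iostream>
       #include <string>

       int main(int argc, char* argv[]) {
          if (argc != 2) { std::cerr << "usage: summarize <logfile>\n"; return 2; }
          std::ifstream log(argv[1]);
          unsigned passed = 0, failed = 0;
          std::string line;
          while (std::getline(log, line)) {
             if (line.compare(0, 4, "PASS") == 0) ++passed;
             else if (line.compare(0, 4, "FAIL") == 0) ++failed;
          }
          const unsigned total = passed + failed;
          std::cout << passed << "/" << total << " passed ("
                    << (total ? 100.0 * passed / total : 0.0) << "%)" << std::endl;
          return failed == 0 ? 0 : 1;  // anything below 100% means digging into the logs
       }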

  • > We're just building the tests using both the
    > ANSI and Unicode versions of the libraries
    > (ATL and MFC).  We're not running the tests
    > on machines with different code pages set.

    Up to a point, I understand this.  If you have no character strings and maybe no comments in your source code, then it doesn't matter what code page your development machine is running under when it's building.

    When Visual Studio 2005 gives a warning about one of Microsoft's own files in an SDK, it might be because of characters in comments.  When the DDK's compiler gave errors about one of Microsoft's own files in the DDK, it was because of characters in a string.  Notice that this paragraph is abusing the word "character", since the reason for diagnostics in the first place was that certain entities didn't constitute valid characters where characters were needed.

    When you have character strings and comments, I still think you need to repeat the builds under multiple code pages.
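
    To make that concrete (an illustration, not one of the product tests): when a source file has no BOM, the compiler decodes it using the build machine's ANSI code page, so the same file can produce different string bytes, or a diagnostic, depending on where it is built:

       #include <cstdio>
       #include <cstring>

       int main() {
          // With no BOM, the 'é' below is decoded using the build
          // machine's code page: one byte (0xE9) on CP1252, something
          // different (or an error) under other code pages.
          const char* s = "café";
          std::printf("%u bytes\n", static_cast<unsigned>(std::strlen(s)));
          return 0;
       }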

  • I do not know what URL is.  But I am running Windows XP updated to SP2, and I have a problem with the Microsoft Visual C++ Runtime Library.  I keep getting the message Runtime Error!

    Program: C:\Program Files\Internet Explorer\iexplore.exe

    R6025

    -pure virtual function call

    I have no idea what I should do but can follow directions if given in layman's terms.  I have looked in many places and have found suggestions about rewriting code and I have no idea what that means nor how to do it.  I am a real novice.

    Please give me some suggestions of how to cure the problem.  My email address is bgchast@earthlink.net.  The problem seemed to start after downloading updates put out by Microsoft.  Thank you.

  • Just add a Visual Basic and Visual C++ Learning guide for free of cost. It'd help weak programmers expand their IT knowledge.

  • > Just add a Visual Basic and Visual C++
    > Learning guide for free of cost.

    Actually there used to be Visual Basic and Visual C++ Learning Editions which came with printed books, in packages that were larger and heavier than the ordinary editions, but cheaper than the ordinary editions.
