
We've Hit ZBB

Great, I hear you say, but what's ZBB and what does it mean for the product?

Well, ZBB means Zero-Bug-Bounce: the compiler team has managed to drive the total number of active compiler bugs to zero. This doesn't mean that we are done, but it does mean that we have gotten rid of all the old bugs that have been hanging around for a while, and now all we have to deal with is the incoming bugs. It also means that our bug triage bar has been raised again, so we are now looking at each incoming bug with a more critical eye. For each bug we now ask: is this a bug that we really need to fix in the product before we ship? For many of the bugs we see the answer is still yes (especially if it is a bug reported by a customer, so keep them coming in), but bugs which do not, in our opinion, have a serious impact on the product we will postpone until the next release. So while ZBB isn't the end of the Whidbey development cycle, it is definitely the beginning of the end.

 

 

Posted by joncaves | 0 Comments

What's New in Visual C++ 8.0

First I would like to apologize for my silence over the last few months - I really have been heads down trying to finish off the compiler for the next release of Visual C++. Finally we are starting to see the light at the end of the tunnel, so I can relax a bit and find the time to share with you some of the upcoming features in the next release of Visual C++.

I know that Brandon has been doing a great job talking about the new features of C++/CLI so I thought I would focus on some of the changes we have been making outside of the C++/CLI space. I'm going to focus on changes we have made to the native C++ compiler - both features we have added and bugs we have fixed.

Access to Friend Functions:

One of our many observant users discovered that the compiler was allowing code like the following:

class A {
   void mf();
};

class B {
   friend void A::mf();   // error: A::mf is private, so B has no access to it
};

According to the C++ Standard this is illegal because in order for B to make A::mf() a friend it needs to have access to A::mf(), and in this case it doesn't, as A::mf() is a private member function. Now this particular example mightn't look that "dangerous", and you could probably argue that there is no harm in allowing B to have access to A::mf(), but the following example shows a less clear-cut case:

class A {
   class N {
   };

   void mf(N*);
};

class B {
   friend void A::mf(A::N*);   // error: needs access to both A::mf and A::N
};

Now if the compiler allowed B access to A::mf(A::N*) it would also need to allow B access to the nested private type A::N. This is probably more than most users would want, as the author of A probably had a good reason for making N private. So the best (and simplest) rule is that if you want to make a member function of another class a friend of your class then you need to have access to that member function.

There are several ways to get around this problem: the easiest is to make B a friend of A instead of just A::mf. Another possibility is to change the access of A::mf to public - though this is a more invasive change and it may not be possible in all cases as the author of B may not have access to the definition of A.
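One reading of the first workaround can be sketched as follows: B grants friendship to the whole of class A, which needs no access to the private A::mf at all. This is only an illustration; the names run, secret and the int return type are hypothetical and are not part of the examples above.

```cpp
class B;  // forward declaration so A::mf can mention B

class A {
    int mf(const B& b);            // private member function, as in the examples above
public:
    int run(const B& b) { return mf(b); }   // hypothetical public entry point
};

class B {
    friend class A;                // whole-class friendship: no access to A::mf needed
    int secret;                    // hypothetical private data that A::mf wants to read
public:
    B() : secret(42) {}
};

// Defined after B is complete so that B::secret can be named.
inline int A::mf(const B& b) { return b.secret; }
```

The cost of this approach is that every member of A, not just A::mf, can now reach B's private members.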

Using '>>' To Terminate Template Arguments

How many of you have ever written code like the following in C++?

std::list<std::vector<std::string>> strings;

Only for the compiler to tell you that a template argument list must be terminated with a '>' and not a '>>'? Have you then shouted "Stupid compiler! Why can't you treat '>>' as '>' '>'?" Well, the C++ Committee has listened to your complaints and has decided to allow '>>' to terminate nested template argument lists. This change to the C++ language will be in the next version of the Standard (which is expected towards the end of this decade), but the Visual C++ team decided that this was such a small change to the compiler, and one that would help novice C++ programmers, that we would include it in the next version of the compiler.

The one downside of this change is that it does break some existing currently valid C++ code. The code in question is stuff like the following:

template<int value>
class X {
};

X<0xffff >> 2> x;

With the proposed change to the C++ language the compiler will interpret this as:

X < 0xffff > > 2 > x;

The fix is to wrap the right-shift expression in parentheses:

X<(0xffff >> 2)> x;

I think most users will agree that breaking code like this is a small price to pay in order to be able to use '>>' to terminate template argument lists.
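Both halves of the change can be put side by side in a small sketch. It assumes a compiler that implements the new '>>' rule; the member v and the typedef names are hypothetical additions used only to make the result observable.

```cpp
#include <list>
#include <string>
#include <vector>

template<int value>
struct X {
    static const int v = value;   // hypothetical member exposing the argument
};

// With the new rule, '>>' closes both template argument lists.
typedef std::list<std::vector<std::string>> StringLists;

// Parentheses keep '>>' as a right shift inside the argument list.
typedef X<(0xffff >> 2)> Shifted;
```

Here Shifted::v is 0x3fff, the value of the shift expression, which shows that the parenthesized form still means what the original code meant.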

Posted by joncaves | 0 Comments

Answers

I've got some feedback to my recent blogs ... mostly in the form of questions: so here are some (hopefully useful) answers.


Andrew asked about my thoughts on general compiler design. I'd sum it up in one phrase - Keep It Simple - break the compilation process into a series of small discrete steps. Don't have multi-thousand line functions that try to do 50 different tasks. It is much better to have 50 small functions, each of which does one very specific task. My ideal compiler would be a parser that builds a very rough AST and then runs a multitude of visitors over the tree, gradually checking for errors and converting the tree to the final fully resolved form.

Of course the one “problem” with this multi-pass approach is performance: you may end up with an extremely robust, correct, easily maintainable compiler, but if it takes 30 minutes to compile Hello World then no one will use it. So as with lots of software problems there is a trade-off here: in this case it is the number of passes vs the complexity of each pass. But at the very least you should never have one pass that tries to do two tasks that have a strong chance of interfering with each other.
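To make the "many small visitors" idea concrete, here is a minimal sketch. It is not how the Visual C++ compiler is structured; the Node, Pass, CheckPass and FoldPass types are all hypothetical, and the AST is deliberately tiny. One pass only checks for errors, the other only rewrites the tree.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, very rough AST node.
struct Node {
    enum Kind { Literal, Add, Unknown } kind;
    int value;                                  // meaningful for Literal only
    std::vector<std::unique_ptr<Node>> kids;
    Node(Kind k, int v = 0) : kind(k), value(v) {}
};

// Every pass does one job and sees one node at a time.
struct Pass {
    virtual ~Pass() {}
    virtual void visit(Node& n) = 0;
};

// Post-order walk: children first, so a pass sees already-processed operands.
inline void run_pass(Node& n, Pass& p) {
    for (std::size_t i = 0; i < n.kids.size(); ++i) run_pass(*n.kids[i], p);
    p.visit(n);
}

// Pass 1: error checking only; it never modifies the tree.
struct CheckPass : Pass {
    int errors;
    CheckPass() : errors(0) {}
    void visit(Node& n) override { if (n.kind == Node::Unknown) ++errors; }
};

// Pass 2: rewriting only; folds an Add of two literals into one literal.
struct FoldPass : Pass {
    void visit(Node& n) override {
        if (n.kind != Node::Add || n.kids.size() != 2) return;
        if (n.kids[0]->kind != Node::Literal) return;
        if (n.kids[1]->kind != Node::Literal) return;
        n.value = n.kids[0]->value + n.kids[1]->value;
        n.kind = Node::Literal;
        n.kids.clear();
    }
};

// Helper that builds the tree for "a + b".
inline std::unique_ptr<Node> make_add(int a, int b) {
    std::unique_ptr<Node> n(new Node(Node::Add));
    n->kids.push_back(std::unique_ptr<Node>(new Node(Node::Literal, a)));
    n->kids.push_back(std::unique_ptr<Node>(new Node(Node::Literal, b)));
    return n;
}
```

Because each pass has a single responsibility, adding a new check or transformation means adding a new Pass rather than editing an existing one; the price, as noted above, is one more walk over the tree.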


Matt and Phil asked what source code control system we use. We don't use Visual SourceSafe, though some teams at Microsoft do use it: I know that the MFC/ATL guys used it for quite a while and were very happy with it. A couple of years ago it was decided that Visual Studio should follow the NT model and create a unified source repository for all the source code that made up the product (up until this point each sub-team had looked after its own source code). Given that this includes tens of thousands of files and millions of lines of code, VSS was not really up to the task. So we use a source code control system called Source Depot.

Korby Parnell from the VSS team has a great blog http://weblogs.asp.net/korbyp in which he covers source code control issues.

 

Posted by joncaves | 2 Comments

Developing the Compiler

Working on a compiler can be difficult: you can make what you think is a minor change to fix a bug and check it in to the source tree. If you get the fix wrong and you are lucky, you’ll have your QA team telling you that you just broke a whole series of tests. If you are unlucky the bug will lurk for a few weeks or months and the first you’ll hear of it is when some build-lab owner in NT is ‘phoning you up demanding to know why you broke their build and could you get it fixed already. If you are really unlucky their VP will call up your VP and next thing you know you are trying to explain why you broke the NT build to some really senior (and annoyed) people.

So to try and avoid problems like these each developer has to run a series of test suites before they can make a check-in. When I first joined the Microsoft C++ compiler team there were 3 test suites that had to be run:

·        Sniff – this took 20 minutes to run and just validated that the most obvious tests like “Hello World” and scribble.exe still built.

·        2Hr BVT & 6Hr BVT – these were larger Build Validation Test suites and as the names suggest they took 2 hours and 6 hours to run respectively

Today we still have these 3 suites and while the Sniff suite still takes around 20 minutes to run both the 2Hr and 6Hr suites now finish in less than 30 minutes – not because we have reduced the number of tests but because machines are now so much faster.

Over time we came to realize that these 3 suites were not enough and so over the years we have added more suites: we now have test suites that target conformance to the C++ Standard, suites that target code that uses attributes, suites that build and execute 3rd Party Libraries like Boost, suites that test the parser we use to provide Intellisense and, most recently, suites that target managed code. Currently before any check-in is made to the compiler source tree a developer will run approximately 14 different test suites.

But even with all these suites we still found we were running into issues: a lot of these issues were of the form where a change to the compiler parser would break the linker; or a change to the optimizer would break the parser; or a change to the C runtime would break everyone (Sorry Martyn :-)). There were also issues where a change to the IA-32 compiler would break either or both of the IA-64 and the AMD-64 compilers (and vice-versa). On top of these desktop platforms the compilers are now used to target all the chips that are used by the Windows CE team. On all platforms there were issues where a retail build would work but a debug build would fail. So it was suggested that each time a change is ready to be checked in each developer should run every other team’s suites as well as their own team’s and that they should run these suites on all platforms and for all builds (retail/debug/test).

While this may sound like a great idea in theory it is not remotely practical: the combinatorics of suites, builds and platforms is huge and also not every developer has access to all the different machines necessary to run all the tests. There was always the problem of a developer “forgetting” to run a suite before they checked in – “I know this change cannot possibly break anything on IA-64” – wrong! So it was clear that we needed a process that was fast, required little or no developer intervention, and could handle running multiple test suites on multiple platforms: welcome to Gauntlet.

“Running the Gauntlet” is a term for a form of medieval punishment in which the miscreant would have to run between 2 lines of knights who would attempt to hit him (or her) with their gauntlets. There is an image of a rather tamer version in the Pieter Brueghel painting “Young Folks at Play” (Note: Pieter Brueghel is the eldest son of the famous Flemish painter Pieter Bruegel).

What is Gauntlet? It is a program that runs on a server and serializes all check-ins. It works as follows:

When a developer is ready to check in they open up a web-page on the Gauntlet machine and fill out some information about what tree they are checking into (Parser, Optimizer, Linker, Runtime) and what files they are changing. They then submit the check-in to Gauntlet. The Gauntlet machine will then take the diffs from the developer’s machine, apply them to its own copy of the source code and then run a whole series of builds and tests on different platforms. Currently for a check-in to the parser the Gauntlet machine will build about 12 different variations of the compiler and it will then run about 35 suites from all areas and on all platforms.

“Doesn’t this take forever?” I hear you ask. No: Gauntlet is not just one machine, it is a cluster of about 30 machines (most IA-32 but also some IA-64 and AMD-64). Once Gauntlet has built a particular flavor of the compiler it farms out the suites for that flavor to other machines, so all the testing can be done in parallel. A check-in to the parser takes Gauntlet just over 1 hour, but if we serialized all the building and testing it would take closer to 12 hours. This means we get a maximum amount of testing in a minimum amount of time.
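The gap between the serial and the parallel figure can be illustrated with a small scheduling sketch. This is not how Gauntlet is implemented; the function names are hypothetical and the policy (hand the longest remaining suite to whichever machine frees up first) is simply an assumption chosen to show why farming out suites shortens the wall-clock time.

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <queue>
#include <vector>

// Total time if every suite runs back to back on one machine.
inline int serial_minutes(const std::vector<int>& suites) {
    return std::accumulate(suites.begin(), suites.end(), 0);
}

// Approximate wall-clock time when suites are farmed out to `machines` workers:
// the longest remaining suite goes to whichever machine becomes free first.
inline int parallel_minutes(std::vector<int> suites, int machines) {
    if (suites.empty() || machines <= 0) return 0;
    std::sort(suites.begin(), suites.end(), std::greater<int>());
    // Min-heap holding the time at which each machine becomes free.
    std::priority_queue<int, std::vector<int>, std::greater<int> > free_at;
    for (int i = 0; i < machines; ++i) free_at.push(0);
    int finish = 0;
    for (std::size_t i = 0; i < suites.size(); ++i) {
        int start = free_at.top();
        free_at.pop();
        free_at.push(start + suites[i]);
        finish = std::max(finish, start + suites[i]);
    }
    return finish;
}
```

With made-up suite durations of 30, 20, 20, 10 and 10 minutes, one machine needs 90 minutes while three machines finish in 30, because the parallel time is bounded below by the longest single suite rather than by the sum.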

Having Gauntlet has really helped us to improve the quality of the whole compiler toolset: it’s not perfect (it can take a while to get your turn) but it is much better than leaving all the testing up to individual developers.

I’ll probably come back to our development process again in the future, but if you have any questions/comments please feel free to leave me some comments and I’ll try to address them in a future blog.

---------------------------------------------------------------------------------------------------------

One question I have gotten is why doesn’t your blog have RSS – it’s a long story. Basically www.gotdotnet.com decided to stop accepting any more new bloggers (at least temporarily), so I decided to use blogger.com (and blogspot.com), both of which are now owned by Google. Unfortunately to get RSS I need to upgrade to the professional version, but at the moment they are not accepting any more upgrades :-( so for now I am stuck without RSS. Sorry.

Posted by joncaves | 7 Comments

Welcome

Welcome: everyone else seems to be blogging so I thought I'd join them.

A little bit about myself. I've been working on compilers for longer than I care to remember (Pascal, Fortran, Modula-2, C and most recently C++). For the last 10 years I've been a developer on the Visual C++ compiler team: currently I am also the Microsoft representative on the ISO C++ committee WG21/X3J16.

The main focus of my current activities is C++ for the .NET platform.

I am hoping that I can use this blog to provide insights into the decision making and development process behind Visual C++: basically what we're doing and why we're doing it.

Posted by joncaves | 3 Comments

What Compiler Do You Use?

This is a common question I get asked at a lot of conferences: users are always interested in what tools we use internally.

The answer is complicated.

First: there is no one single version of the compiler which is used by every team at Microsoft. Each team is free to choose whichever version of the compiler best suits their needs. For a lot of teams this is a previously released version of the product: some teams use the compiler that shipped with Visual C++ .NET 2003 (AKA Visual C++ 7.1), other teams use the compiler that shipped with Visual C++ .NET (AKA Visual C++ 7.0), some teams still use Visual C++ 6.0 and I have even heard reports of a team that still uses Visual C++ 5.0 (Why? I have no idea).

But for some teams, like the Visual Studio team and the Windows team, a previously shipped product is not new enough - they need access to features that are not yet ready to ship: so these teams need to engage in that great Microsoft tradition, dogfooding.

Dogfooding is not only a great way for these teams to get early access to the features that they need but it is also a great way for the compiler team to get our latest compiler heavily tested on some real world code: running compiler test suites is one thing (and a very important part of our testing strategy) but there is no test like being able to compile and boot NT.

How does dogfooding work? Every few months the Visual C++ team decides that they want to start an LKG push - LKG stands for Last Known Good: an LKG build is a complete compiler tool set that the Visual C++ team feels is good enough to be used outside of the immediate team. The LKG process involves first picking a daily build that we feel is of a high enough quality - basically a build that passes all our front-line testing. Once we have such a build we create a branch off our main development source code control depot and this branch becomes the LKG branch. This compiler is then subjected to a full test pass and any issues that are found are fixed in the LKG branch (and, of course, in the main branch). Once our QA feels that the LKG has reached an appropriate level of quality we start dropping early releases to those teams that are interested (different teams pick up releases at different times: if a team is just about to ship the last thing they need is a new compiler). These teams will build their product and run their own internal tests. Any issues they find are reported back to the Visual C++ team and if necessary the LKG build is updated. Finally, whenever all the teams agree that the quality of the compiler is high enough the tool set is released and the other teams can pick it up and use it to do their daily builds.

But the story does not end there: no matter how much testing we do issues may slip through: so it is possible for a team like NT to find an issue several months after we have released an LKG: in these cases we have to track down the issue, fix it, patch the LKG and release the updated toolset.

As you would expect the most extreme dogfooders of the compiler tools are the compiler team themselves: we use tools that are only days old. Yes, this can have its drawbacks as the tools can sometimes become unstable, but these problems are minor when compared to the benefits we get from a lot of people using the latest version of the tools.

In my next blog I'll talk some more about our internal development processes and how we try to guarantee that every build of the compiler is a high quality build.

Posted by joncaves | 7 Comments

I've been Relocated!

It was decided that the “official” Microsoft Blog site would be on asp.net - so I've moved my blog here.

I'll repost my previous blog entries so that they are all located in one place.

Posted by joncaves | 4 Comments
 