This is the second time I’ve experienced the “end game” of a product cycle in VC++. It is always a wonderful time: looking back at all the accomplishments, the satisfaction of seeing all the cool new features working smoothly together, the eagerness to find and fix every single bug! The pressure is still on while we stabilize our product, but we clearly have time now to sit down and analyze pretty much anything (from internal processes to customer interaction) as well as propose any crazy little thing that we can prove is valuable for the next release. This year we are calling this brainstorming effort the “Summer of Love”. As part of this, I’ve been involved lately in simplifying our internal builds and in improving our servicing experience. This post will focus on the servicing effort.
The QFE (quick fix engineering) experience has always been painful. The process is complex and lengthy, and many groups must be involved to ensure the complete success of a fix. It all starts with a DTS (days to solution) bug describing the problem faced by the customer. The bug is triaged and assigned to be reproduced and investigated. At this point, we must determine whether the outcome is “by design”, customer misuse or a real product bug. In the first two cases, the process stops after we explain all the details and provide suggestions. Otherwise, we commit to fixing the problem in a service pack as well as in future product releases. If there is a workaround and the customer finds it acceptable, the process ends here. If there is no workaround, or the proposed workaround is not acceptable, and the problem causes serious damage, then the QFE process begins in earnest.
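The triage decision described above can be sketched as a small function. This is only an illustration: the function name, parameters and return strings are made up for this post, and the last branch (a real bug with no accepted workaround that is not serious) is my reading of the text rather than something the process spells out.

```python
def triage(outcome, workaround_accepted=False, serious=False):
    """Return the next step for a DTS bug, following the description in this post.

    All names and return strings here are hypothetical labels, not real tooling.
    """
    if outcome in ("by design", "customer misuse"):
        # The process stops once the details and suggestions are explained.
        return "explain and close"
    if outcome != "product bug":
        raise ValueError(f"unknown triage outcome: {outcome!r}")
    # A real product bug is always fixed in a service pack and future releases.
    if workaround_accepted:
        return "fix in service pack"
    if serious:
        # No acceptable workaround and serious damage: start a QFE.
        return "start QFE"
    # Assumption: a non-serious bug simply waits for the service pack.
    return "fix in service pack"
```

For example, `triage("product bug", serious=True)` returns `"start QFE"`, while `triage("by design")` returns `"explain and close"`.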
The first stage of a QFE is “repro”, which is usually very short since the problem was already reproduced during DTS time. VCQA sends the QFE to the assigned dev and the process moves into the “private coding” stage. This may take quite a long time if the problem to be addressed is difficult. Once the fix is implemented, the dev moves the QFE to the “QA private testing” phase. At this time, the dev prepares a “private drop”, which is an archived collection of build outputs (binaries, MSMs, manifests, policies, possibly shippable source files) representing the complete impact of the fix. This set of files varies with every QFE scenario and it may be quite big: a complete CRT QFE contains no fewer than 82 files, as opposed to a back end QFE which has only 5 (c2.dll for x86, plus the native and cross versions for amd64 and ia64). You can imagine how time consuming it is for the dev to create a private drop and write detailed install instructions when normally a dev’s job should end when the fix is finished!
During QA private testing, the test dev verifies the functionality of the private drop. If there are still problems, the bug is sent back to “private coding”. After tuning the fix, the dev must create another private drop and send it back to QA. After QA signs off on the fix, the test dev moves the QFE to the “checkin” phase. The dev checks in the fix, sends instructions to the build team about how the private build should be installed and moves the bug to the “customer private testing” stage. Checking in the QFE fix before the customer has had a chance to take a first look is very inconvenient. It may happen that the fix does not satisfy all customer scenarios, and then the fix must be backed out for more tuning. This creates confusion in the branch. This step is necessary, however, as we need to provide signed binaries, which only the build lab is able to produce.
At this point, the customer installs the private drop according to the received instructions. These instructions specify which files to replace and where they are located in the product installation directories. You can imagine how painful this is compared to installing a simple patch. After the customer has signed off on the private drop, the bug is moved to the “patch building” phase. Following this, we move to “smoke testing”, where the built package is quickly tested by the setup team. VCQA then performs one more test, which focuses on the customer scenario and not just the fix functionality. Only when the package addresses the problem can it be delivered to the customer.
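To make the flow easier to follow, here is a rough sketch of the stages above as a tiny state machine. The stage names in quotes come from this post; the last two labels ("VCQA scenario test" and "delivered") and the function name are placeholders I picked for illustration, and the sketch only models the two loops back to private coding that the text actually describes.

```python
from enum import Enum


class Stage(Enum):
    REPRO = "repro"
    PRIVATE_CODING = "private coding"
    QA_PRIVATE_TESTING = "QA private testing"
    CHECKIN = "checkin"
    CUSTOMER_PRIVATE_TESTING = "customer private testing"
    PATCH_BUILDING = "patch building"
    SMOKE_TESTING = "smoke testing"
    VCQA_SCENARIO_TEST = "VCQA scenario test"  # assumed label for the final VCQA test
    DELIVERED = "delivered"                    # assumed label for the end state


# Stages in the order a successful QFE passes through them.
ORDER = list(Stage)

# Failures that send the QFE back for more tuning, as described in the text.
REWORK = {
    Stage.QA_PRIVATE_TESTING: Stage.PRIVATE_CODING,
    Stage.CUSTOMER_PRIVATE_TESTING: Stage.PRIVATE_CODING,  # fix is backed out first
}


def next_stage(stage, passed=True):
    """Return the stage that follows `stage`, given whether it succeeded."""
    if stage is Stage.DELIVERED:
        raise ValueError("the QFE has already been delivered")
    if not passed:
        if stage not in REWORK:
            # The post does not say what happens when other stages fail.
            raise ValueError(f"no rework path described for {stage.value!r}")
        return REWORK[stage]
    return ORDER[ORDER.index(stage) + 1]
```

A fix that fails QA private testing goes back to private coding (`next_stage(Stage.QA_PRIVATE_TESTING, passed=False)`), and every such loop means another hand-built private drop, which is exactly the cost we want to remove.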
This process could be significantly simplified if private drop creation were automated and the final result were a signed patch instead of an archive. In this case, the dev could focus more squarely on the scope of the fix. Checking in before “customer private testing” would no longer be necessary and the customer would be able to quickly install the drop. This means that iterations back to private coding would not be as time consuming as they are now. In addition, the setup build team would not need instructions about the content of the patch and the customer would not need instructions for installing the private drop.
For Whidbey and Everett, servicing in this manner isn’t possible due to older build technology, but Orcas servicing will benefit from a lighter QFE process. A special certificate has been created for signing QFE private patches and any Microsoft employee is able to use it. We no longer need to perform official builds in the lab to obtain signed binaries. The deployment team also recently provided new authoring scenarios applicable to patches. We can now define patch content for all our QFE categories. There is an engine able to interpret this authoring, automatically detect differences between a baseline layout (what the customer has) and an upgrade layout (obtained from the binaries built by the dev, containing the fix) and then invoke WiX APIs to create the MSP. Only the differences that are both detected and authored are included in the package. What's really great is that all the steps can be easily automated in scripts, making this process completely transparent to the dev.
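The selection rule the engine applies (detected differences, filtered by what is authored) can be illustrated with a few lines of Python. This is not the real engine and it does not touch WiX or build an MSP; the function names, the use of SHA-256 to compare files, and the idea of passing the authoring as a plain set of relative paths are all assumptions made for the sake of the example.

```python
import hashlib
from pathlib import Path


def layout_digests(root):
    """Map each file's path relative to `root` (POSIX style) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def select_patch_files(baseline_dir, upgrade_dir, authored):
    """Return the files that would go into the patch, as sorted relative paths.

    A file qualifies when it is new or changed in the upgrade layout
    AND it is listed in `authored` (hypothetical stand-in for the patch authoring).
    """
    baseline = layout_digests(baseline_dir)
    upgrade = layout_digests(upgrade_dir)
    # New files have no baseline digest, so baseline.get() returns None for them.
    changed = {rel for rel, digest in upgrade.items() if baseline.get(rel) != digest}
    return sorted(changed & set(authored))
```

With a baseline and an upgrade layout on disk, a call such as `select_patch_files("baseline", "upgrade", {"bin/c2.dll"})` would return `["bin/c2.dll"]` only if that file actually differs between the two layouts; a changed file that is not authored is left out, which keeps unrelated build noise out of the patch.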
Hopefully you've learned a little about how the Visual C++ team manages the QFE process. Moving forward, we should be able to create better, faster fixes for customers who experience serious problems.
Visual C++ Libraries Team