-
For some time now I’ve been promising to lay out all my thinking on Testing in Production (TiP). I first introduced the topic in a blog post in June 2009 titled “Ship your service into production and then start testing!” Over the summer I had some family commitments, in late summer I finished up my internal Microsoft ThinkWeek (full article in Seattle Times here) white paper on the subject (five 5-star reviews so far), and I had to prepare a couple of talks on the subject for STARWest and the TwinSPIN & Benchmark QA SIG in Minneapolis. All those excuses are over and now I’m ready to get back to the blog.
What follows is an updated version of the executive summary of the ThinkWeek paper that I co-authored with Seth Eliot and Ravi Vedula on TiP-ing Service Testing. The paper did quite well: we received numerous 5-star reviews, and the chair gave us a four-star review with a must-read comment. The paper has seen over five hundred views so far, so I'd call it a success. It is my best overview of the subject and makes the case for why TiP is an important framing concept for evolving our approach to services testing. For more on ThinkWeek check out the YouTube video or news article links above.
“How could Test have missed this bug?”
When it comes to services testing there are two diametrically opposed perspectives. The most common perspective is the one that is focused on making test environments in the lab as close to production as possible. This approach grows out of the oft expletive laden post RTW (Release to Web) admonishment, “how could test have missed this bug,” and the equally common retort, “our test environment isn’t enough like production so we couldn’t catch it.”
Many a tester has been burned by the bug that was missed and when this happens they often become defensive and more risk averse. They tend to drift into the first school of thought that attempts to continually increase the precision with which their test environment matches production. All sorts of techniques are used from purchasing the exact same servers for testing as production, purchasing load balancers and even taking sanitized dumps of production data into the test lab. After all, it is the job of testing to make certain bugs don’t get away and if even a single bug is missed in test because there was a missing network device or we only had one SQL Database instead of mirrored databases like production, then we must close that gap and stop the bugs from getting out into the wild.
“Test is the gate keeper, the champion of the customer, and must stop all bad code from escaping!”
The funny thing is, whether it is a game, a desktop application or a web service, if you are a tester that has shipped a product, you have missed a bug. I know I have missed a lot of bugs in my career, and yet I remain employed. While I too work to find all the bugs in a product, when I hear a manager ask that same tired question, “How could test miss this bug,” I tend to take a deep breath and just let it pass.
Personally I am always confident that my team and I have done a good job, so my advice to others is to just let the comment pass. Think of it this way: it’s not your fault. If the bug hadn’t been designed in the first place or coded in the second place it wouldn’t exist. The fact that it exists is not your fault. You found tons of bugs in the product, so if a few got past, go yell at the development team for daring to write the bug in the first place. When you become defensive you become obsessed with never missing a bug. I have a long talk and even a tutorial on what we can learn from bugs that get away, but for now just let it go.

The second perspective is the one that accepts that test can never find every bug in the test lab and, with respect to services, test can never fully emulate production. Trying to find all possible production bugs in test is a daunting, expensive and ultimately unachievable aim. There are simply a vast number of bugs that will only ever be found in production. This line of thinking pushes one toward considering risk mitigation once a bug is out. After accepting that bugs will get out you can free your mind to start thinking about how to leverage production for testing and how to make production more test friendly.
TiP is really a paradigm shift requiring the acceptance that test environments have limitations, test environments cannot catch all the bugs, and testing in production is the most viable option.
I should probably pause a moment and define what I mean by Production and Testing in Production.
Definition of PRODUCTION: For this paper production will be the current version (v-current) of the service to include all the data centers and machines v-current runs across. It will also include v-Next instances that live in the data center and have real world traffic hitting them.
Broad Definition of TiP: TiP constitutes all testing activities occurring on hardware in a data center.
To most testers and operations engineers the notion that you would test in production seems verboten. TiP is anathema. TiP is just not allowed. The reality is that it is allowed and it is much more commonplace than many think. In fact this paper will argue that we need to further invest in and enhance our abilities to test in production.

Figure 2: Test account used frequently on Hotmail for production testing
If you were to look at TiP along a continuum, at one end you would find the “Lab Centric” perspective and at the other you would find the “TiP Zealot.” The Lab argument would focus on making the lab as much like production as possible. If we are looking at Windows or Office that would include all kinds of hardware and peripheral devices, all kinds of applications for compatibility testing, and thousands of machines running automated tests. To be fair, this has worked well for Microsoft with respect to desktop applications and server products. Even so we work with beta partners for them to deploy the next version of our products into production so we can get early feedback. In this case though, production is often treated as a backup to testing in the lab; the test focus is still within the test lab.
This approach to testing has been used by Microsoft and many other companies as the primary approach to testing web services.
The TiP zealot would argue that the lab will never be able to replicate the real world so all final sign off must happen in the real world, in production. Looking again at the software side of things, we see the example where we run betas that sometimes include millions of customers and identify selected Technical Adoption Program (TAP) partners to sign off on final Release to Manufacturing (RTM).

Figure 3: Services TiP Continuum shows the shift to more production testing
The reality is that whether we are talking software or a service, we already conduct much of our testing in the real world and in production. What this series of posts attempts to do is organize a collection of best practices within the services space that will move us further along the TiP continuum safely toward a TiP centric perspective.
Future posts in this series will focus on three main sub-sections from the original white paper. These upcoming sections are essentially the why, the what, and the how of TiP:
· Why – Factors pushing us toward TiP
· What – Best Practices for TiP
· How – Enabling technologies for TiP
Why – Factors pushing us toward TiP discusses how factors such as Scale and Integration are forcing many services to test in the data center and even forcing them to test the data center itself. Technology shifts such as Cloud Computing make it easier to test in production and other shifts such as an ever deepening stack of infrastructure layers make it impossible for us to replicate production in test labs.
What – Best Practices for TiP is likely the most fun for readers as I will simply lay out my top TiP Tips as a collection of best practices. These practices will cover everything from the simple idea of purchasing production hardware in the data center that is used both for testing and production interchangeably to the relatively bold concept of shipping test hooks into production.
How – Enabling technologies for TiP will wrap up this series by outlining some areas where we need new enabling technologies and improved architecture. Many of these solutions should be built into the coming wave of Cloud Computing offerings being made by just about every major software and Telco company in the world.
As I’m building this series off of the white paper written by me (@rkjohnston), Seth Eliot and Ravi Vedula the next installment should come out fairly soon. In the meantime if you have questions about TiP or just want to add your thoughts, please post a comment.
-
Since I have been in software it has been readily accepted that in order to test software one needs hardware. At most software companies this means that there is a test lab filled with computers and devices to assist in the testing. My experience has shown that with each major paradigm shift in software such as desktop to client server (two tier) and from there to nTier and now Software as a Service (SaaS) moving toward Cloud Computing, the amount of equipment needed to test the software has increased. Right now within Microsoft the physical test labs we have are bursting at the seams.
In the book “How We Test Software at Microsoft,” that I co-wrote with Alan Page and Bj Rollison, I shared the statistics that we employ more than 8,000 testers (called SDETs or Software Development Engineers in Test) and we have more than 100,000 test machines in test labs around the world. Those statistics are now a few years old and the current numbers are much larger. In fact, the ratio of machines per tester in labs across Microsoft has been increasing steadily for years. Across the industry the ratios of machines to testers might be different than within Microsoft but the trend line of more machines (or virtual machines) per tester seems to be constantly going up.
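As a rough back-of-the-envelope check, the snippet below computes the machines-per-tester ratio implied by the two figures quoted from the book. Both are "more than" values, so the result is only an approximation, and the function name is mine, chosen purely for illustration.

```python
def machines_per_tester(machines: int, testers: int) -> float:
    """Return the average number of test machines per tester."""
    if testers <= 0:
        raise ValueError("testers must be positive")
    return machines / testers


# Figures quoted above from "How We Test Software at Microsoft".
# Both are stated as "more than", so treat the ratio as approximate.
ratio = machines_per_tester(100_000, 8_000)
print(f"{ratio:.1f} machines per tester")  # 12.5 machines per tester
```

Tracking this one number over time is a simple way to see the trend line described above for your own organization.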
While this is all quite impressive, the reality is that we are on the cusp of the demise of the isolated test lab. I believe that within the next five years the notion that a tester can go into a test lab and run a test or check on something will virtually fade away. I don’t know about you, but I love having access to my test lab. For years it has been the heart of any test organization I have been fortunate enough to manage.
Unfortunately, close proximity and open access to one’s test lab will not continue much longer. Further on in this post I touch on a few of the factors driving us toward the demise of the test lab, and I may cover even more points in a later post. For now, I wanted to reminisce over some of my favorite Test Lab stories. Maybe you have some of your own.
When the test lab was the heart of the testing organization
Several years back I was the Test Manager for the MSN billing team. I would walk through the test lab every single morning. I did this out of a habit I’d developed based upon the technique known as Management By Walking Around (MBWA). If you want to know more about MBWA I recommend reading the works of Tom Peters (@tom_peters) and specifically his bestselling book titled, “A Passion for Excellence.” Fortunately his book was required reading when I was an undergrad and the concept of walking around as a management technique has stuck with me to this day.
Each morning I would arrive a little after 9am, just in time to get one of the last few parking spaces on the second floor of the parking garage. Developers, as a breed, at Microsoft tend to like to work late into the night and check their code in under the cloak of darkness when there are no testers around to notice them ignoring the results of their check-in tests. This of course means that most developers don’t start to arrive back at the office until late morning. Conversely Microsoft testers were actually early birds as compared to their developer counterparts.
The elevator ride up from the parking garage always seemed slow and I was always in desperate need of more coffee. When the doors opened to my floor, I was deposited directly across from the kitchenette and my one saving grace, the free-and-cheap-and-chunky morning mocha. Now this mocha was not the fancy Starbucks barista kind. No, my recipe was much simpler and much less expensive. Simply take three packages of pre-ground and measured coffee, pour them into the coffee machine, add half the normal amount of water and brew up a small pot of coffee syrup. Next, pour half a cup of coffee syrup into an orange polystyrene cup with the “Microsoft” logo on it, add two packages of powdered hot cocoa mix, and stir until partially mixed. It was a recipe that lacked any sense of taste, much like my “bachelor spaghetti” recipe before I got married, but I digress…. For a better recipe see this one on eHow “How to Make a Cheap Mocha Latte.” One other note, we have replaced the polystyrene cups with compostable coffee cups.
With “Free-and-Cheap-and-Chunky Mocha” in hand, I began to make my morning MBWA circuit. The first stop was to pass by my boss’s office to make certain I had arrived before him, next I would check a couple of emails and then it was on to my team’s test lab. By now it was just past 10am and the lab was buzzing with testers finding bugs and loudly discussing problems over the noise of several hundred test machines and overhead air-conditioning dumping down on us. As I passed through the lab I would take a detour down various aisles to chat with the testers who were busily working in front of the keyboards and monitors at their Easyup stations. The racks of servers behind them were in cages and despite belching out massive amounts of heat, all the testers had on coats or sweaters desperately trying to stay warm.
Let’s face it; a densely packed test lab with forced air ventilation is not a very hospitable environment for humans.
Within just a few minutes I would know whether setup was broken again or not, how the BVTs were going and what the unstable areas looked like that day. I would also be able to catch up on the standings in the FIFA World Cup, in which Brazil had just advanced (Brazil would eventually win it all that year), and on how Lance Armstrong was doing in his quest to win his 3rd Tour de France.
It really wasn’t all that much fun living in the drafty test lab
A few years later I was a group test manager and had an even bigger lab with more than a thousand test machines and even more forced air ventilation. As I was now a true morning person arriving well before 9am each day, I would wait until just before lunch time to conduct my MBWA routine. The test lab was still part of my regular routine and so I would pass through it on my way to and from the cafeteria. It seemed that these days it was more hit or miss as to who might be in the lab. Often I would find only a lab engineer or two and no testers at all. I would still stop and chat, often about the latest Angelina Jolie movie and whether or not Lance Armstrong would try for another Tour de France.
It occurred to me that the increases in test automation and the use of Terminal Server meant testers just didn’t need to live in the lab anymore. It also occurred to me that while discussing movies with the lab team was fun, I’d forgotten a coat and was freezing my rear off in this cold drafty cave of a lab. Even without testers in the lab, massive amounts of critical test results were generated by the lab. It was still the engine of the test organization and when something did go wrong we needed quick and immediate access to it.
With testers out of the labs we progressed down a path of making labs even denser and less hospitable to human life. The Easy-Ups with the monitors and keyboards were removed, lights were set low to save power and further reduce heat, and the test lab became much more like a small datacenter.
Mr. Bill Gates “explains” how expensive test labs are to building construction
Over the years I have had the great fortune to be in a few BillG reviews. In one review I was the Director of Test Excellence and we were presenting the initiatives we had going on within the test community. We were actively shutting down the “little labs” that were really converted offices with too much equipment for the power and AC, but there were quite a few of these labs and we needed to find room for the equipment. We were working with many test teams on the concepts of lights out labs and moving some of the equipment out of their immediate building.
Traditionally at Microsoft, when a team moved from one building to another, they would pack up all their lab equipment and take it with them. Our new goal was to simply get them to leave the equipment where it was. After all, they rarely needed physical access to the equipment, so why incur the enormous cost of moving a lab along with the people? One could argue our penchant for moving people around so they can sit in intact teams is over the top, but that’s a different discussion.
All of this sounded like good progress but it wasn’t enough.
In the review with BillG we talked about reducing the target for the percent of any new building dedicated to lab space. I don’t remember the exact number but it was going from something like 5% of square footage targeted for test labs down to just 3.5%. While this seemed like good progress to us, it wasn’t good enough for Bill.
I know there have been many posts on what it is like when Bill “explains” something to you in a review. Fortunately for us, Bill did take a few extra minutes to explain how expensive it was to build labs in buildings, and how much it added to the total cost of the building to put in all the extra fancy power, ducting and cooling, and to reinforce the floors for the heavy weight of thousands of servers. It was quite clear that from Bill’s perspective reducing lab square footage closer to zero was the desired goal.
To tell you the truth BillG made a good, if somewhat firm, point. If you are going to set a goal, set a clear, definitive and aggressive goal. Personally I want all the test equipment out of the buildings and into low cost test data centers. We still haven’t stopped building lab space in new buildings but we have reduced the % allocated with each new project and we do have several data center facilities dedicated just to getting the lab equipment out of the buildings.
The Lights-out-Lab is a data center service.
For years the impetus for the lights-out-lab within Microsoft has been test automation. These mega labs contained thousands of commodity machines or bricks. We often lovingly called these machines bricks as they were the small form factor, sub-$1000 desktops sold by most OEMs and were quite literally stacked on Easy-Ups or in racks, one on top of the other. These machines were truly like a service that would run non-stop automated tests and spit out results day in and day out.
These labs were frequently running out of power and over-heating. We were constantly investing in massive infrastructure upgrades. The labs were often visited just once a month by technicians that would pull out the dead machines and replace them with new equipment. They truly functioned as a data center service and were the first target to move completely out of buildings and into low cost datacenter facilities.
We still had the challenge of what to do with all the machines used for manual testing or software configuration and compatibility testing. These scenarios often include more manual interaction between the tester and the test machine. We’ve been stymied for a few years on how to move this class of test equipment out of the local lab and out from under the tester’s desk.
The final demise of the test lab will be Virtualization and the shift to Cloud Computing
The real rationale for why the on-premise lab is near its end of life is the shift to web services and cloud computing. My friend and co-author on HWTSaM, Alan Page, recently delivered a talk at STAR West titled “Virtualization: The Path to Multiple Efficiencies.” He posted the slides from the talk here. In the presentation he does a good job outlining how Virtual Machines (VMs) are changing how we manage and allocate test hardware. It is well worth a quick read of his slides.
Virtualization certainly improves the dynamic allocation of test resources, and experience is showing us that it works better on reasonably scaled up hardware. I’m a big believer in commodity hardware for services, so when I say scaled up hardware I’m not personally supportive of massive multi-quad-core servers hooked up to large Storage Area Networks (SAN). Still, virtualization doesn’t add much value when run on the aforementioned brick machines.
Lately there has been a need for desktop applications to be certified to run within virtual machines. Microsoft recently announced the new MultiPoint Server for the classroom that allows multiple students to share one physical machine. This capability leverages virtualization within the classroom. Virtualization, as a way to check out a test machine and even save and share a bug with a developer, has become very common within Microsoft. The trend is likely to continue and that will help us move even more equipment out of test labs and into a more reliable off-site data center.
Cloud Computing is the last element to really tip us away from testing in labs.
In the upcoming Testing in Production (TiP) series my first TiP to service developers and testers is to move the test equipment out of the test lab and into the data center (DC). The justification for this is that a DC will always be different than a test lab and the sooner we find the bugs that are specific to how a DC is managed, the better. The other argument is that within the next 5 years a large portion of web services will be developed and shipped into production within a cloud infrastructure such as the Microsoft Azure service or those offered by our competitors such as Google, Amazon and VMware.
With cloud computing a service typically ships as a VM into the cloud. Those VMs increase in number as more compute power is required to keep up with user demand. Test scenarios are very demand driven as well and it only makes sense that if we are going to ship into the cloud, we might want to conduct most of our testing in the cloud.
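To make the demand-driven idea concrete, here is a minimal sketch of how the number of VM instances might track load, whether that load comes from real users or from a test run. The per-VM capacity, the minimum instance count, and the function name are all assumptions I made up for illustration; this is not how Azure or any other cloud actually decides to scale.

```python
import math


def vms_needed(requests_per_sec: float, capacity_per_vm: float, min_vms: int = 1) -> int:
    """Hypothetical sizing rule: enough VMs to cover demand, never below a floor."""
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    if requests_per_sec < 0:
        raise ValueError("requests_per_sec must not be negative")
    return max(min_vms, math.ceil(requests_per_sec / capacity_per_vm))


# Assumed number: each VM comfortably handles 500 requests per second.
for load in (0, 400, 1200, 5000):
    print(f"{load:>5} req/s -> {vms_needed(load, capacity_per_vm=500)} VM(s)")
```

The same rule serves a load test and a production traffic spike equally well, which is part of why testing where you ship starts to look natural.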
The Test Lab will not go completely away but its necessity has certainly peaked.
Hardware compatibility labs, usability test labs, and some amount of buddy testing are likely to stay in near proximity to the tester either in a lab or on a machine parked under a desk. The number of machines used for this type of testing has always been a small percent of the more than 100,000 test machines we have here at Microsoft. So while the test lab will likely continue, it will dramatically shrink as equipment moves out of labs and into the data center. Virtualization and Cloud Computing will help push most manual and compatibility tests into the DC.
It is clear to me at least that we are on the cusp of the demise of the test lab. I have loved my test labs and clearly have some fun personal stories. I’d love to hear your thoughts on the future of the test lab and any stories you may want to share.
-
Who in their right mind agrees to fly all morning from Seattle to Minneapolis, drive to a local software testing special interest group (SIG), talk for 90 minutes, and fly back home that same night? For a number of reasons, the most important of which was being home in the morning to help get my kids ready for school, I did and it was a very good experience. This is the story of my experience along with a few reflections of traveling for the Benchmark QA forum and TwinSPIN SIG Nov. 5, 2009.
About 6:30 in the morning I went into my 8 year old son’s room to wake him up for school. He rubbed his eyes a moment and then bolted upright in bed. “Dad, you’re back already,” he exclaimed.
“No, my trip is today, not last night,” I replied.
The kids quickly got ready for school and as my flight didn’t leave until after 10am, I was able to drop them off at their classroom door. Seattle had on her typical October gray cap of clouds. Even after sixteen years of living in the Northwest I wasn’t sure if it was just going to be a cloudy day or if we would see some actual rain. It wasn’t as if the weather would have any real significance on my travel plans but it is always fun to escape bad weather when traveling as opposed to flying into it. The weather stayed dry as we boarded the plane but just as we were taxiing for takeoff the clouds finally let loose a light shower, just enough to darken the tarmac and run rivulets down the little portal windows of the plane. Seattle was therefore rainy and the only question was whether or not Minneapolis would be better.
On the flight I went through the presentation one more time and shuffled a few slides to suit my mood. Once that was complete, I filed a bunch of emails into folders in Outlook, drafted some notes on feature ideas for Office 15 (yes we will ship Office 14 in early 2010 so it’s time to start thinking of what’s next), and then began writing this blog post and a few others. I finally found time to start writing my blog series on Testing in Production (TiP).
Cool Crisp day in Minneapolis
The flight out was a bit turbulent but we arrived early. So early, in fact, that there was no one at the gate to let us de-board the plane. There is nothing worse than being in the back of a plane, watching it pull into the terminal, seeing the fasten-your-seatbelt light go off, standing up and pulling your bag from the overhead compartment, and then just waiting. In this case we waited an extra twenty minutes before someone finally arrived to open the door.
Larry Decklever, the founder and President of Benchmark QA, met me just outside the airport. On the drive to the event we talked a little about his company, golf, and oh yes some points about testing. As we drove, I took the chance to appreciate the clear blue skies and the wonderful crisp cool fall air. For this particular day at least Minneapolis weather beat Seattle.
At the Benchmark QA Offices I was introduced to Molly and Cindy who together helped me find some food, hook my laptop up to the projector, and settle in. We had some 90+ attendees registered for the event and they began showing up right on time for the pre-event networking hour.
Show time!
My presentation, “Testing in the Cloud,” is not about testing on any specific cloud infrastructure. It doesn’t focus on Microsoft’s Azure, Amazon’s AWS, or Google’s App Engine. It encompasses a series of concepts and shows how most of them apply equally to testing a web service whether that service is built out on bare metal hardware or runs as a virtual machine (VM) in a cloud, or in a hybrid mode like my favorite example from SmugMug.com. I also love to share a YouTube audio clip of Larry Ellison, CEO of Oracle, and what I consider to be the best rant on cloud computing ever, “What the hell is cloud computing?”
In the talk I cover many concepts best summarized in James Hamilton’s paper, “On Designing and Deploying Internet-Scale Services,” and the multiple presentations he has given on the topic. My presentation, though, introduces a little bit on Cloud Computing, a little on designing services correctly, and then spends the next 45 minutes discussing why Testing in Production (TiP) is vital as we move into the Cloud era.
Given that TwinSPINers seem to have much more experience testing software and embedded devices than I will likely ever amass, I expected several challenges on the topics. Instead I was pleased to see many nodding heads.
When I got to the last slide, the hands shot up and we had probably 30 minutes of Q&A. Interestingly the majority of the questions were about how Microsoft approaches this test problem or that test problem. I could have replied, just go buy “How We Test Software at Microsoft,” as all those questions are answered in the book, but I didn’t. I discussed how we use Static Code Analysis, that we have almost as many full time testers as we do developers at Microsoft, and how we heavily emphasize test automation.
For all those TwinSPINers that came up afterwards with questions and praise, thank you. Within ten minutes of finishing I was packed up and jetting back to the airport. Within an hour I was through security and quickly choking down a Philly Cheesesteak sandwich in the food court. Yes, what was I thinking, eating a Philly Cheesesteak made in a Minneapolis airport?
It was just past midnight when I got home. My wife left the lamp on my nightstand turned on so that I wouldn’t have to stumble in the dark. My six year old daughter had written me a little book welcoming me home and telling me how much she loved her daddy. My son, who is both a bit of an engineer and a bit of a writer, built me a little box filled with little two inch square pieces of paper so that I could write my own book about my trip. I think however I will leave this blog as my trip report and use the little papers to write him a story of how much he is loved by his Daddy.
· The slides for this presentation can be found here.
· Also the slides from my presentation last month at STARWest can be found here.
· Lastly it looks like I’ll be at the Better Software Conference this coming June. I’ll post upcoming talks to my personal website here.
Thanks to Larry, Molly, Cindy, and the TwinSPINers for a fun whirl-wind trip.
-
All three authors of "How We Test Software at Microsoft", Bj Rollison, Alan Page, and Ken Johnston actually made it for this recording. Warning, I had a bit too much fun with my green screen. The discussion covers how different teams within Microsoft use Agile, Waterfall, Spiral and Feature Crew approaches for their engineering life cycles. New music, “Works on My Box,” by Art Leonard.
-
Recently I ran into a friend who was reading chapter 14 of How We Test Software at Microsoft. He commented on the picture in the book of the Rackable Ice Cube container full of servers. This was pretty cool to think about purchasing servers pre-racked in a cargo container. We joked that for our next purchase of machines for the lights out automation lab we could just order one of these, park it on the fourth floor of the parking garage near the big fan that pulls air up and out of the garage, and then run a few extension cords and Ethernet cables down to it. Done.
We'd have our new machines online in no time and we'd lose just a couple parking spaces. That was better than giving up office space for lab machines.

Of course it wouldn't really be that easy but it is fun to think outside of the box, or cargo container, at times.
Cargo Container Data Centers are Real and growing
The funny thing was my friend thought that this whole Cargo Container idea was new and cool. He got excited all over again when on June 29th Microsoft announced that it would be bringing two new data centers online. The one in Chicago is a huge facility designed to be a container based data center. Even though I wrote about this in HWTSaM, I only did enough research to present what it was. I did not take the time to research the origins of container based data center modules nor really research where we might be headed. For this post, I decided to spend a few hours digging through old blog posts and updating my knowledge. I found a few very interesting posts and links that are well worth sharing with everyone.
First off, Google was awarded a patent on container based modular data centers. The patent was originally filed back in 2003. Recently Google announced it has been using containers in its data centers since 2005. Below I have a link to a video on the subject.
So, clearly this idea has been around a few years and even been used in production for several years. The new Microsoft data center looks to be very advanced in terms of power management and cooling. We can see that the innovation isn't just about racking machines in a big metal box, there are a lot of innovations happening in this space to include the actual design of the data center. Check out this cool animation Microsoft produced a while back on a concept container based data center. I love the thought that the data center might not even need to have a roof on it.
No Roof and No Floor, let's take the data center off-shore
Back in 2004 I was working on the new SDET career stage profiles with several members of the Microsoft Test Leadership Team (MS TLT) when we wandered off topic. Electrical power costs were spiking again and our lab budgets were getting squeezed. One of us, I'm pretty sure it was Darrin, chimed in and said, "we should just take all the lab machines, put them on a boat and park it in whatever harbor had the cheapest power and be done with it. If they raised prices we'll just unplug the machines and drive the ship to a new location." Everyone thought it was a great idea if a bit impractical with current technology.
We probably should have explored the idea some more. I found another Google patent application and this one is for a water-based data center. They have some really good ideas on heat management and using the ocean to generate some of the electricity needed to run the data center. The more I think about a water-based data center the more I see a tie in for an action movie. I could see a serious James Bond or Mission Impossible sequence here where the agent has to sneak aboard a floating data center, find the right server, and copy the hard drive to a portable device without being detected.
Seriously though, there are some very interesting financial and legal advantages to this idea as well as good old fashioned positive environmental impacts. On the financial side you can avoid real estate taxes and avoid dealing with local permitting processes when you don't use land. If the servers are offshore, do local laws around gambling or sales tax still apply for transactions that happen on these servers? If the data center is in international waters, what laws apply and how do you assess the impact to GNP for a country? Also the ocean is a much better vehicle for releasing the heat generated by servers so a floating data center would certainly need much less air conditioning to cool the equipment. This would have a very positive impact on the data center's overall Power Usage Effectiveness.
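For readers who have not run into Power Usage Effectiveness (PUE) before, it is simply the total power drawn by the facility divided by the power that actually reaches the IT equipment, so a value closer to 1.0 means less overhead spent on things like cooling. Here is a minimal sketch of that ratio. The kilowatt figures are made up purely for illustration and do not come from any real data center.

```python
def power_usage_effectiveness(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return PUE: total facility power divided by IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical numbers, for illustration only: both sites run the same
# 1000 kW of servers, but the second spends far less on cooling.
air_cooled = power_usage_effectiveness(total_facility_kw=2000.0, it_equipment_kw=1000.0)
water_cooled = power_usage_effectiveness(total_facility_kw=1300.0, it_equipment_kw=1000.0)

print(f"air cooled PUE:   {air_cooled:.2f}")
print(f"water cooled PUE: {water_cooled:.2f}")
```

The only point of the comparison is that cutting cooling power lowers the numerator while the server load stays the same.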
Most of the links below can also be found by tracking James Hamilton's blog perspectives. I have him in my RSS feed and recommend his blog to everyone tracking innovations in high scale computing.
Cargo Container Data Centers
Google's cargo container DC links
Water Based data Center Links
-
This BLOG post is a thank you to all the individuals who have read “How We Test Software at Microsoft,” and posted something on the web about it (that Alan, BJ and I have been able to find that is).
We are thrilled by the very positive response readers of HWTSaM have had. There are several five star, four star, and even a ten out of ten review of the book. If you’ve read the introduction to the book you know the lead author, Alan Page, didn’t originally feel the world needed yet another book on software testing but after a bit of reflection he saw how a book about Microsoft testing would be worth writing.
We didn’t write HWTSaM to be the ultimate book on software testing but rather to be a good companion to the other excellent books that have come before it. Regardless of the intended design, all three of us want the book to do well and for the first few months after the release we have been hovering over the Amazon rankings and scouring the Internet for comments and reviews. Through all this activity we’ve learned a few things about promoting a book that make us a bit more educated but not experts.
1. It pays to know your niche market and the popular bloggers
2. Claiming our book and making an Author Page on Amazon.com is cool but doesn’t drive sales.
3. Climbing to #1 on the New York Times Best Seller list or Amazon.com is not going to happen (today’s book was “Common Sense”)
4. It is important to say thank you when someone praises your work
We have said thank you directly to most bloggers and reviewers but I wanted to collect a bunch of them together and say thanks in a more public forum such as this.
In the book I wrote or researched many of the sidebar stories so I thought it might be a good way to start this thank you blog with a personal story.
I call this one the “Perfect and Unexpected Plug.”
More than one hundred senior engineering managers and architects along with a few executives gathered this week to discuss some of the evolving ideas around Office 15 for a one day offsite focused on improving the Office engineering system. One of the key elements of the meeting was to bring in new thinking from outside of the Office organization. The first guest presenter was Mike Kelly who had been part of the Office organization before joining the Microsoft core engineering strategy team Engineering Excellence. In his presentation he shared some really amazing prototypes of new collaboration and build system tools as well as some industry analysis. The second presenter was Craig Fleischman, also a former Office manager who was now working on Windows. Craig presented some of the plans the Windows team has for Windows 8.
Right about now most readers are probably hoping I’ll spill some information about Windows 8 and Office 15 but sorry, I can’t do that. All I can say is that while neither Windows 7 nor Office 14 have shipped, we are hard at work developing plans for the next versions.
Craig finally got to a slide about Windows 8 and services. As this was an event focused on engineering systems rather than services and features he announced he would skip most of this slide. It was a long slide that slowly built with three different sections and more than fifteen bullets. Craig paused on the last bullet and simply said, “I don’t have time to go over this whole slide so I’ll just cover the last bullet. Read chapter 14.”
I hadn’t actually been paying much attention. That’s the problem with laptops and WiFi, it’s too easy to respond to the inbox. Speaking of focus I should re-read Alan Page’s recent blog posts on productivity and distractions. Somehow I did hear “skip the slide on services” and “Read chapter 14.”
This caught my attention. I paused and looked up. All across the room multi-taskers were lifting their heads up from laptop screens and looking directly at Craig. The whole room was actually engaged, wondering what Craig meant by “Read chapter 14.”
A nervous energy shot up my spine and a wide smile broke out across my face. I suspected what this mysterious chapter 14 might be but I wasn’t certain enough to say anything.
Craig attempted to move on to his next slide but of course he was interrupted.
“Chapter 14 of what,” someone yelled out.
This seemed to catch Craig off guard. Of course everyone in Office knew about and must have read chapter 14 by now. “You know the services chapter,” he said. Blank eyes stared back at Craig. “Chapter 14 from Ken Johnston’s book,” he said.
A little “yes,” escaped my lips. I had been plugged!
Certainly I could not have asked for a better plug for one of my chapters in, “How We Test Software at Microsoft.” Here we were in a gathering of the most senior engineers from across Microsoft Office, one of the most successful businesses in the history of Software. Most of the attendees were not even testers, and now they were left wanting to know more about this mysterious chapter 14; the chapter 14 that Craig said must be read. I don’t think one could plan a better hook than that!
Craig was able to continue to his last slide but the room still wanted to know more about this book and Chapter 14. Fortunately I’d remembered one lesson from John Kremer’s book, “1001 ways to Market Your Books,” which is that an author should always have a copy of their book on hand. I had three and made them available in short order.
During the break I came up to my friend and colleague Craig and jokingly asked who I should make the check out to. Craig commented that he thought we needed a core set of reading for everyone who moves from software into services and this should be one of those. Again this was high praise and I was thankful.
The end of the “Perfect and Unexpected Plug.”
This experience reminded me to be thankful for all the blogs and reviews HWTSaM has received so far. I know Alan, Bj and I read every one and we always try to comment or contact the author to say thank you.
What follows are excerpts from several blog postings and reviews of How We Test Software at Microsoft. Most are linked to the official book site at http://www.hwtsam.com.
BLOGs and Book Reviews and Links:
· Microsoft Press has been a great partner to work with. Here is a portion of the Model Based Testing chapter with Graphics
· Six Reviews on Amazon.com. Here’s just a couple of quotes (the nice ones):
o Sally Foster “History buff” gave HWTSaM 5 out of 5 stars, “The writers are drawing from experience, they understand testing software, and more importantly, they understand how to position a tester, and a test team, for success. This book goes far beyond Kaner's "Testing Computer Software", and is a must for any software tester who is passionate about shipping quality products.”
o Manfred Dietz gave us 4 out of 5 Stars, “So, why not 5 stars? Because you guys did not mention anything about metrics and its influence on our work and the results.”
· Barnes & Noble has two five star reviews. Boulderdash wrote “A best practice book, it is loaded with real life experience of the authors…”
· Asaf – Boulderdash Blog “Alan, Ken and Bj have divided the chapters authoring among them. Each has his own way of writing, although different in style, the final result is excellent. I highly recommend the book for all those who are into software testing”
· Michael J. F. (SQA Blogs) “The excellent explanations of Equivalence Class Partitioning and Boundary Value Analysis are among the best I have ever read.”
· The Evil Tester posted a review on Compendium Development “The first 2 chapters present Microsoft as a great company to work for, one that really values the testing staff and reads as the best recruitment literature for any company I've ever read.”
· James Whittaker was one of the first to comment on HWTSaM back in January. James has a new book coming out soon that we look forward to reviewing. “…it will also be the year that I expose more insider details about testing culture and practice at Microsoft…Although, much of the thunder has already been aired by my colleagues in their new book How We Test Software at Microsoft. That book's a good read and a high bar for me to match when my own book comes out in a few months.”
· Linda Wilkinson, Practical QA “This book grabbed me right away; it was a glimpse into the culture of a vast, complex, and interesting company with some challenges that are unique in the field. And after reading this book, I’m STILL fascinated”.
o William Echlin commented, “..this is a book that on the face of it, I would not have attempted to read in a million years. Yet based on what you've said this in now somehwere near the top of my must read books.”
· iTWire Book review by David M Williams “All in all, this is an impressive work with a great deal of wisdom and principles – underpinned by sound theory – that would be of interest to any company that produces software of reasonable complexity.”
· Debra Martinez review on StickyMinds “This book has made its rounds in my testing department. There is not a day that goes by when I am not asked if I still have the book. I feel this book is great addition to any testing department”
· Michael Hunter the Braidy Tester “HWTSAM is chockablock full of details regarding fundamental testing techniques, strategies, and processes which I believe every tester should be familiar with (even if you disagree with the utility of some of them).”
· Kawal Banga on BCS Book Reviews scored 10 out of 10, “More than a million test cases were written for Microsoft Office 2007 and the automated tests for many Microsoft products have more lines of code than the products they test… All in all, this is an excellent book, and should be on every tester's bookshelf.”
· Phil Kirkham on Expected Results “I found it to be an excellent book, lots of tales from the trenches, explanations of the problems MS faces, how they try to overcome them - all intermingled with general testing theory.”
· Javier Andres Caceres Alvis and his Windows Mobile, Testing & Multi-core programming group used HWTSaM for several group discussions.
Thank you to everyone who has commented on HWTSaM whether positive or less. For those that were less positive I’m sorry I didn’t include direct links to your comments but I know Alan, Bj and I have read and reflected upon them. There are a few foreign language postings and a video podcast that I didn’t include in this article and I’m certain I missed some review somewhere. My apologies.
Thank you all,
KJ
-
Alert: this post has nothing to do with S+S or operations or testing but it’s a small slice of life that I just had to share.
“P” words such as penultimate, pontificate, plethora, plebian, and polyglot have failed me more than once.
Have you ever found yourself reaching for that perfect word? Sometimes it’s when you are writing but other times it might be in the middle of a conversation or, heaven forbid, a debate around the conference table or marker board where you don’t have the luxury of looking up the definition to make sure you are picking the right word. There you are in real time reaching back for that perfect word and one comes to you and out it goes. Everyone pauses, turns and looks at you and you realize that is not the word you meant to use. So much for my credibility.
In my life that has happened mostly with a set of multi-syllabic “P” words. I’ve used penultimate instead of ultimate to mean the pinnacle or zenith, polyglot instead of plethora, and pontificate thinking I was just being smart. I’ve worked through my issues with all those words except for penultimate.
penultimate - second to last: second to last in a series or sequence, “the penultimate chapter”
Definition from Encarta.msn.com
It’s just not one of those words used much in American English. We are so focused on the winner or the next big thing or sometimes even the underdog that we rarely consider what or who came next to last. It doesn’t matter to us whether it was an elite group to be a part of in the first place. If you didn’t finish first or second, it doesn’t matter.
Some notable penultimates of the past few months of 2009:
· Kentucky Derby – Friesan Fire
· Indianapolis 500 – Driver Ryan Hunter Reay
· Car and Driver 10 Best Cars – 2009 Porsche Boxster and Cayman
· Overall American League Standings (as of this post) – Baltimore Orioles with a .431 winning %
· My son’s finish in the sack races at Field Day this year
I have decided, however, that it is time for me to find a way to work the word penultimate into my life and in order to do that I have emphasized it with my children. We unofficially created a new iHoliday last year at the end of school. We call it “The Penultimate Day of School!”
I’ve been ruminating on this idea for about a year now. Last year, my kids were all excited that the next day would be their last day of school for the year and they were upset that it wasn’t here yet. I then seized upon the opportunity to explain to them that this day was very important. It was the penultimate day of the school. When I explained what penultimate means, they got so excited that they started announcing it to everyone they saw. “I’m brilliant!” I thought to myself. But actually, by their enthusiasm, they created a day for which Hallmark should make cards.
Let’s face it, penultimate is a very funny sounding word. Most adults that hear it will give you quite the odd look. But upon learning the definition, the penultimate day of school was born. We all agreed to go to school that morning and share this new and titillating word by wishing everyone we met with “Happy penultimate day of school.”
The children in their classrooms all giggled at this greeting, but had no clue what it meant. The teachers were all taken aback to hear a word they didn’t know from such a small child using it with significant confidence.
At the end of the day there were a dozen or so individuals who now knew what the word penultimate meant and they now had a good place to use it, at least once a year.
Now there are many other good places to use the word penultimate, the penultimate lap of the Indy 500, or the penultimate game of the season, or the penultimate batch of salmonella tainted produce are all good examples. If I were to write an article on the rankings of search engines I would love to write, “Microsoft’s new Bing service is climbing but ask.com is still the penultimate search engine.”
Still it feels awkward to use the word penultimate and it is such a good word. The world needs a time and a place to use the word penultimate and I have decided to try and help this great word along. I am creating and promoting a new holiday, one born solely of the internet and social networking. This new holiday will be called the “Penultimate Day of School” and will be celebrated at every school in every part of the world and will be celebrated on the next to last day of school.
The rules of the Penultimate Day of School are quite simple. Students are encouraged to hail their fellow students and faculty with, “Happy penultimate day of school.” They are also encouraged to use as many big words as they can in conversation. They can even carry a thesaurus with them and whenever possible substitute a multi-syllabic word for a more common word. Penultimate Day of School is a day to revel in the use of really, really big words.
Since everybody is so excited for the start of summer, the best place to use the word penultimate is to describe the next to last day of a school year.
For 2009 I have a modest goal to simply double the involvement of parents and students in Penultimate Day of School. I have launched a Facebook page to promote the event and soon should have a SharePoint site where I hope students may post Penultimate Day of School thank you notes to teachers and faculty. Happy Penultimate Day of School, everyone!
I promise that the next post will get back to S+S testing. For more on that subject read chapter 14 of “How We Test Software at Microsoft.”
-
Today I hosted four hours of interactive learning on S+S testing with table topics such as "Testing in Production, How far can we go?" and "Release Cadence in an S+S era." Every time I get together with smart engineers new better ideas are generated.
One interesting example that came up in the afternoon session was the impact background or maintenance tasks can have on a data center's infrastructure. In this particular example a rather large Microsoft service was getting ready for a big launch and needed to upgrade thousands of servers in a data center to the latest version of the service. Well the deployment of this service is fully automated (I'll write a post ranting about the importance of deployment in the near future) and so with the push of a single button the deployment was off. The "bug" if you will occurred because all the machines became very busy pulling down bits, conducting reads and writes to disk, and actually hit a higher average CPU utilization than they would during normal production use. This massive load actually caused power failures in the data center. So where is the bug?
Should we fix this in software or rely on a new policy of never upgrade thousands of machines at the exact same time ever again? This is a very interesting edge case and I don't have the answer. I offer it up as an example of how much there is to consider and learn as we move into S+S and Web 2.0 worlds with cloud computing and multiple devices.
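If we did choose to fix this in software, one option would be to have the deployment tool push in waves sized to fit the data center's power headroom rather than hitting every machine at once. What follows is only a rough sketch of that idea, not how the service in question actually deploys. The function names, the server names and the wattage numbers are all hypothetical.

```python
from typing import Iterator, List, Sequence


def wave_size_for_power_budget(headroom_watts: float, extra_watts_per_server: float) -> int:
    """How many servers can upgrade at once without exceeding the power headroom."""
    if extra_watts_per_server <= 0:
        raise ValueError("extra draw per upgrading server must be positive")
    return int(headroom_watts // extra_watts_per_server)


def deployment_waves(servers: Sequence[str], max_concurrent: int) -> Iterator[List[str]]:
    """Yield the servers in consecutive batches of at most max_concurrent."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    for start in range(0, len(servers), max_concurrent):
        yield list(servers[start:start + max_concurrent])


# Hypothetical figures: 5000 servers, each drawing an assumed extra 60 W while
# upgrading, in a facility with an assumed 30 kW of spare power capacity.
servers = [f"srv{i:04d}" for i in range(5000)]
wave_size = wave_size_for_power_budget(headroom_watts=30_000.0, extra_watts_per_server=60.0)
waves = list(deployment_waves(servers, wave_size))

print(f"{len(waves)} waves of up to {wave_size} servers each")
```

A real tool would also have to wait for each wave to finish and settle back to normal load before starting the next one, which this sketch leaves out.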
A few more ideas that I gathered today: measure not just time to deploy but also time to roll back in case the deployment is flawed; decide whether to target the 75th percentile or the 95th percentile when measuring and signing off on Page Load Times (PLT); and include post RTW measurements in Release Criteria before you really sign off. All of these are great ideas that are at least a new twist on an important topic if not completely new. The great thing is they all came up during the training session today and I was lucky enough to hear them. Though most of the content is not public I will dive deeper into some of the hot topics next week.
The other experience I had today was delivering a webinar to a SIG in Bogota over Live Meeting. This session was in support of the book I helped write with my colleagues Alan Page and B.J. Rollison titled “How We Test Software at Microsoft.” For more information on the book visit www.hwtsam.com. For this session I was to deliver some content on Chapter 14 which focuses on S+S Testing and then answer some questions.
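The 75th versus 95th percentile question is easier to see with numbers. Below is a small sketch using the nearest-rank method, which is one of several common ways to compute a percentile. The page load samples and the 2.0 second sign-off target are invented for illustration.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], percent: int) -> float:
    """Return the value at the given percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(percent * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical page load times in seconds; note the single slow outlier.
plt_seconds = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 1.8, 2.4, 6.0]
target_seconds = 2.0  # assumed sign-off threshold

p75 = percentile_nearest_rank(plt_seconds, 75)
p95 = percentile_nearest_rank(plt_seconds, 95)

print(f"75th percentile: {p75}s, meets target: {p75 <= target_seconds}")
print(f"95th percentile: {p95}s, meets target: {p95 <= target_seconds}")
```

With these made-up samples the same release would pass a 75th percentile sign-off and fail a 95th percentile one, which is exactly why the choice is worth debating.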
Doing a Webinar with Q&A can be challenging. Add to it the translation piece and no video of the audience to register their reaction and it becomes very challenging.
In the Webinar I introduced the topic of Testing in Production (TiP) for services. It is a growing field of thought within Microsoft and from what I can tell a process used very heavily by some of our competitors. The notion that one would ship something into production and then test it seems anathema to software testers. Needless to say this became the major topic of Q&A.
The real way to look at TiP is to ask what can safely and effectively be tested in production. The next question is to ask how to make testing in production a fast turnaround process that is cheaper than testing in a lab. When the cost and turnaround time of production testing are lower than in labs, and we are getting there with cloud computing, then you really should move all the testing that you can out of the lab and into production.
I have an article under way just on TiP and hope to publish it in the near future.
Thank you to Javier Andres Caceres Alvis for the opportunity to discuss S+S and Services testing with your group.
-
Think about Services shifting the Testing into Production
The topic of Testing in Production (TiP) is a growing area of debate in SaaS and S+S testing groups. While I don’t personally really believe you should ship your service and then start testing it, I often introduce the topic this way. It is very challenging to get testers to consider shipping anything that hasn’t been thoroughly tested. The notion that a tester could build out an adequate test environment or set of environments and find every single major bug in their service is just as flawed as ship then test.
I use this “Ship first and then Test,” approach to jar audiences of testers and to move the conversation from what must be done in a lab to what can and would be best tested in production. I will try to make a much longer post on the subject in the future but for now simply think about these questions.
How often have you shipped a service and found that you missed a bug because the test environment wasn’t enough like production? Now ask yourself, as the scale and complexity of services increase, is it reasonable to think you can really make test enough like production that you can catch every bug?
The process of thinking about what can be tested in production can be an exhilarating exercise. You quickly get to a discussion around what testability features need to ship with the service to make this possible. What impact can TiP have on revenue and customer experience, how do we isolate this impact, and what tests are best conducted in labs are other very important questions.
Think about it and let me know what opinions you have and questions you would like answered.
-
1+1 redundancy for production services is a flawed design approach.
1+1 redundancy is like the kind of logic my wife uses with me when I go on an overnight business trip. She will insist that I take at least two pairs of socks in my bag even though I plan to be home the very next evening. The logic is that I might accidentally step in a puddle and would need a clean pair of socks. Yes, she also wants me to take an extra pair of shoes but let’s just stick with the socks for now. Certainly there is a chance that I might step in a puddle and need an extra pair of dry socks but I find that when such an incident does occur and I need my extra pair of socks something else tends to happen like my flight getting canceled and me being forced to stay over an extra night.
To be clear I am very pleased that my wife insisted on the extra pair of socks and I wish I’d listened to her about the shoe thing too but the real problem was that when given a chance, more than one thing will eventually go wrong and then you find yourself in a long line in the airport waiting to go standby on a flight that only gets you half way to your destination and you wonder where that awful smell is coming from. Then you realize you are the one wearing the dirty socks from the day before and you are stinking up the place.
When I talk about 1+1 redundancy, it is usually in terms of two servers performing the same role like a pair of SQL Servers doing log shipping to keep each one up to date or a pair of routers with redundant routing tables. There is a great paper on flatter network architecture from Microsoft Research on the Monsoon project that you can find here. Whatever device or service you want to imagine is just fine. The key point is that they are a pair doing the same job and they are designed in such a way that if one should fail the other will pick up and charge forward. Unfortunately that just isn’t good enough.

Figure 1: The Monsoon project (see paper here) flattens the network architecture and moves networking to commodity hardware. This reduces cost and spreads risk.
In services 1+1 redundancy does not equal 2
The thing about 1+1 redundancy that people often forget is that when the 1 goes down you have lost your safety net and now you must react quickly in case the +1 should also fail. If this was the only exposure we had, then maybe we could live with it, but the reality is that due to ongoing maintenance and patching our 1+1 is down to just the +1 far more often than just failure scenarios. If you add up all the maintenance windows for a service, you will probably find something on the order of .005% of a year is spent in maintenance. In other words, we are at 1+1 redundancy just 99.995% of a year.
Even the best of our services struggle to maintain four 9’s of availability. It is therefore reasonable to expect that over the course of several years both units in a 1+1 configuration will experience coinciding down times. In my reviews of scores of critical outage summary reports, I have seen this pattern of cascading failures in 1+1 redundant topologies time and again.
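To get a rough feel for how much time a service spends with no working unit at all, here is a back-of-the-envelope sketch. It assumes every unit fails independently, which is optimistic given the cascading failures described above, so real numbers would be worse for both designs. The 99.9% per-unit availability is a made-up figure, and the function names are mine.

```python
HOURS_PER_YEAR = 365 * 24


def all_down_probability(unit_availability: float, active_units: int) -> float:
    """Chance that every active unit is down at once, assuming independent failures."""
    return (1.0 - unit_availability) ** active_units


def expected_outage_hours_per_year(unit_availability: float, units: int,
                                   maintenance_fraction: float) -> float:
    """Expected hours per year with no working unit.

    Assumes one unit is out for maintenance during maintenance_fraction of
    the year and all units are in service the rest of the time.
    """
    if units < 1:
        raise ValueError("need at least one unit")
    normal = all_down_probability(unit_availability, units)
    degraded = all_down_probability(unit_availability, units - 1)
    blended = (1.0 - maintenance_fraction) * normal + maintenance_fraction * degraded
    return blended * HOURS_PER_YEAR


# Hypothetical per-unit availability of 99.9%; maintenance at .005% of the year.
one_plus_one = expected_outage_hours_per_year(0.999, units=2, maintenance_fraction=0.00005)
one_plus_three = expected_outage_hours_per_year(0.999, units=4, maintenance_fraction=0.00005)

print(f"1+1 expected outage hours per year: {one_plus_one:.6f}")
print(f"1+3 expected outage hours per year: {one_plus_three:.9f}")
```

The absolute numbers mean little because of the independence assumption. What matters is the gap between the two topologies, and the fact that in 1+1 the maintenance window leaves a single unit carrying everything.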
1+1 is hard on operations and adds to the COGS (Cost of Goods Sold) for a production service.
We’ve established that due to maintenance windows and cascading failures 1+1 is at an above average risk of having both units fail at the same time. To date our automation for repair and failback in these situations is not very mature, so most services make up for this by increasing staffing levels in operations. Of course, this increases COGS.
Since I have managed operations teams, I can say confidently that this architecture bears a heavy burden for the on-call Operations and Product Engineers. The added cost is not when both fail, but when just one device fails. Every engineer immediately knows that the safety net is now gone and the risk of a second failure is looming. It becomes an urgent rush to get back to 1+1 redundancy as quickly as possible. This is true even in maintenance.
We hire smart engineers to be good at quickly performing repetitive manual tasks because risk of a customer impacting outage is high. That is not the most efficient use of our engineering talent.
The solution to the problems of 1+1 is 1+N.
For me, 1+N is the right solution as long as N is >= 3 and all 3 are fully active. I have had debates with individuals that had 1+1 topologies with a pair of warm spares. They would insist that the warm spares should count. In another posting I’ll go deeper into the problems with non-active backup solutions.
If everyone agrees that 1+N is the right way to go and most services launch with some portions of their service in a 1+N configuration, then why do we have so many places where we truncate down to 1+1? The answer is as simple as dollars and cents. The answer is a bad assessment of COGS and risk. The answer is a poor application of the phrase “high availability”.
Everyone knows to avoid any single point of failure in a service so we design to eliminate them. The mistake in going with 1+1 usually occurs around the very expensive devices and the decisions folks make there to “save” money.
Big routers are very expensive. Big load balancers are very expensive. SQL Servers with multi-terabytes of unique user data are very expensive. We can’t have a single point of failure but we can’t afford more than one extra of any device in the system or the COGS model will be too high. That leads to the 1+1 mistake. Perceived cost and a need for some kind of redundancy cause teams to make this mistake time and again.

Figure 2: Portion of a physical topology diagram stacked by rack. Note that the SQL Servers are spread across 2 racks to reduce the risk of a power failure hitting both; however, they are still being launched as pairs.
Operations and Test need to drive out 1+1 during design reviews
In my role I do a lot of service topology reviews. I love the big Visio diagrams printed on a large plotter with all the little pictures of servers, machine names and IP addresses listed all over the place, and lines for logical network connections. Don’t get me started. One of the things I do in these reviews is look for places in the diagram where there are two instances of something. In some cases, say an edge server for copying bits from corpnet that is not really part of production, two of that machine “role” makes economic sense, but that is rare in these reviews. Here are some questions I like to ask when going through one of these topology reviews.
1. How can you say we are blocked by a technology limitation when we picked the technology? Can we pick something new or write it ourselves?
2. We could run this on lower end hardware couldn’t we? That would give us more instances of the same device wouldn’t it? Would this get us at least to 1+3?
3. Test to see if we can combine this machine role with another. If we can combine them then we can flatten the architecture.
4. Software can automate many processes. Can we automate the replication of data so we have more than two instances?
5. Can we at least look at how to break the data store down into smaller stores with a hashing algorithm? Can we then get our 1+1 exposure down to less than 5% of our user base per pair?
6. Microsoft engineers, operations and product team are on the hook if we have a production outage. If one of these devices goes down in the middle of the night, will you simply roll over and go back to sleep or will you immediately begin to troubleshoot? If not, why not? Are you willing to be on call for the next 365 days to respond to any outage? Fine. You wear the pager and you can keep your design.
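Question 5 above can be sketched in a few lines. This is only an illustration of hashing users across many small 1+1 pairs so that losing one pair touches a small slice of the user base. It is not how any particular Microsoft service partitions its data, and the function names and user IDs are hypothetical.

```python
import hashlib
from collections import Counter


def min_pairs_for_exposure(max_percent: int) -> int:
    """Smallest pair count where an even split puts under max_percent of users on each pair."""
    if not 0 < max_percent <= 100:
        raise ValueError("max_percent must be in the range 1..100")
    return 100 // max_percent + 1


def pair_for_user(user_id: str, pair_count: int) -> int:
    """Map a user to a server pair with a stable hash (the same on every machine and run)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % pair_count


pair_count = min_pairs_for_exposure(5)  # "less than 5% of our user base per pair"

# Hypothetical user IDs, used only to see how evenly the hash spreads them.
users = [f"user{i}" for i in range(21_000)]
load = Counter(pair_for_user(u, pair_count) for u in users)
worst_share = max(load.values()) / len(users)

print(f"pairs needed: {pair_count}")
print(f"largest share of users on one pair: {worst_share:.2%}")
```

A hash only spreads users approximately evenly, so a real design would want a few more pairs than the minimum to stay safely under the limit. Plain modulo hashing also moves most users when the pair count changes, so a scheme such as consistent hashing may be a better fit if pairs are added often.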
When I do these reviews I don’t like answers such as “the technology won’t let us do it that way” or “team X does it this way so we should too” or the worst is “we just don’t have time to do it right.” Those answers ring hollow. 1+1 just isn’t redundant enough and everyone should stop defending it and get on to good design.
As a parting thought, consider the reliability of the US Space Shuttle program where they have five computers involved in making critical decisions for takeoff and landing.
Four identical machines, running identical software, pull information from thousands of sensors, make hundreds of milli-second decisions, vote on every decision, check with each other 250 times a second. A fifth computer, with different software, stands by to take control should the other four malfunction.
Charles Fishman, “They Write the Right Stuff,” Dec 18 2007
Despite 1+4 redundancy, and with billions of dollars in hardware, software and fuel, as well as human lives, at stake, the Space Shuttle program has had two major disasters. One occurred during takeoff in 1986 and the other upon re-entry in 2003. If you calculate the reliability of the program as 130 successful flights out of 132, that gives the program a 98.5% reliability rating. Please don’t think that I’m denigrating the space program; I have been a fan since my father worked with the shuttle astronauts at the Houston space center decades ago.
The takeaway here is that virtually all of our services aim for 99.75% or higher availability, but NASA with 1+4 redundancy has not been able to achieve that level. Yes, space flight is much riskier than running services, but my point is that critical systems need more than one level of redundancy, and this is just one of many examples we can cite from multiple industries.
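For anyone who wants to check the arithmetic, here is a quick sketch. The flight counts and the 99.75% target come from the text above; the conversion of that target into hours per year is my own addition and assumes a 365-day year. Keep in mind the two numbers measure different things (missions completed versus fraction of time up), so the comparison is a loose one.

```python
# Figures taken from the post: 2 lost flights out of 132.
flights = 132
losses = 2
shuttle_success = (flights - losses) / flights  # 130 / 132

# Availability target quoted in the post, converted to a yearly downtime budget.
target = 0.9975
hours_per_year = 365 * 24
allowed_downtime_hours = (1 - target) * hours_per_year

print(f"shuttle success rate: {shuttle_success:.2%}")
print(f"downtime budget at {target:.2%}: {allowed_downtime_hours:.1f} hours per year")
```

That works out to about 98.48% for the shuttle, and a 99.75% service gets roughly 21.9 hours of downtime a year before it misses its goal.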
I fully and firmly believe 1+1 redundancy will not produce high availability and will be higher cost than a 1+N solution.
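To put rough numbers on that belief, here is a back-of-the-envelope sketch. It assumes every instance is down independently with the same probability, which is my simplification and not something real systems promise (failures are often correlated, as the shuttle example suggests). The 1% figure and both function names are hypothetical, and the sketch says nothing about cost.

```python
def all_down_probability(p_down: float, spares: int) -> float:
    """Chance that the primary and every spare are down at once, assuming independence."""
    return p_down ** (1 + spares)

def capacity_lost_on_one_failure(spares: int) -> float:
    """Share of total capacity that disappears when a single instance fails."""
    return 1 / (1 + spares)

P_DOWN = 0.01  # hypothetical: each instance is unavailable 1% of the time

for spares in (1, 3, 4):
    total_outage = all_down_probability(P_DOWN, spares)
    lost = capacity_lost_on_one_failure(spares)
    print(f"1+{spares}: all instances down {total_outage:.0e} of the time, "
          f"one failure removes {lost:.0%} of capacity")
```

The part I care about most is the second column. In a 1+1 design a single failure takes away half the capacity and all of the redundancy, so the survivor has to carry everything until someone gets out of bed. At 1+3 the same failure removes a quarter and leaves two spares.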
In the end, it really is like the dirty sock analogy. It’s nice to have an extra pair, but if you travel enough, you will probably run into a situation where you have both wet feet and a canceled flight. In the case of socks, it’s just a bit of discomfort and rude odors. In the case of services, though, we have customer impact and wear and tear on our staff as they struggle to keep a flawed design functioning. And when our customers experience poor service due to bad design, they are left smelling our foul stench.
This is the first of what I hope to be a regular set of blog postings. The focus of this blog will be on topics that intersect service design, testing and operations. I have many ideas for future posts but I’d like to hear from you. If you’ve read this post and found it useful and would like information on another topic, please send your suggestion directly to me. Thank you in advance for any comments you may have on this post (even disagreements) and any suggestions you send my way.
For more content on testing Software Plus Services see chapter 14 of “How We Test Software at Microsoft.”