You’re working on a feature and think there’s an obvious customer improvement to be made. The tester thinks you’re in obvious need of medical attention from a psychiatric professional. She believes the shipped design was fine from the start. The PM insists that your suggestion doesn’t fit the design language (?). He wants to make some overblown artistic statement. Your manager believes it’s the feature’s performance that’s the real problem. She wants you to stop arguing over design minutiae and start being a developer. Who is right? None of you. All of you. But most of all—the customer.
Customers can’t design the product for you. They don’t know what they want. However, they do know the good stuff when they see it. They vote with their money and their mouse (or their fingers and fortune; Windows 8, baby!). Your job is not to read customers’ minds. You can’t. Neither can your boss, your PM, or your tester. Each of you, particularly the PM, should be able to construct a good initial guess, but it’s only an educated guess. Your job is to ship feature improvements, measure customer reactions, and then iterate. I repeat: you can’t read customers’ minds. You aren’t correct—customers are. Ship, measure, repeat.
It used to be that the ship-measure-repeat cycle took a year or more per iteration. With cloud services, that cycle can be measured in weeks or even days. However, to take full advantage of shorter iterations and faster convergence to customer nirvana, you must change your coding and design habits, as well as your attitude. Or you can think you know better than customers and everyone else. You know, like a loser.
As with any iteration scheme, you need an initial guess. Usually, that’s the currently shipping design, but sometimes you’re starting fresh in a new area or changing the game against a competitor. You can’t afford to try every combination of every feature; there are far too many permutations. You must start with a well-educated guess and test only those aspects that are the most controversial or experimental.
Your initial guess is determined by traditional market research and product planning—business development, customer and technology research, competitive analysis, focus groups, brainstorming, and prototyping. (A great framework for this is called scenario-focused engineering.)
The outcome of this market research and product planning should be a set of prototypes. Old-school Microsoft thinking would advocate shipping a single product with the best attributes of the different prototypes. Actually, real-old-school Microsoft thinking would support just shipping the first prototype. (The horror! The horror!)
You can read more about proper prototyping in My experiment worked!
With today’s cloud services, you don’t ship a single product with the best attributes of the different prototypes. You ship all the best ideas from all the prototypes and let customers’ mice and money (or fingers and fortune) tell you which ideas drive the best results.
Trying different ideas on real customers is known as A/B testing or multivariate testing. The idea is simple. You send a known percentage of real customers to one version of a feature and a different known percentage to a different version. You measure desired end results (more purchases, deeper engagement, and/or greater discovery) and then choose to send future customers to the version that got better results. Even packaged products can take advantage of A/B testing for their service-connected and/or frequently updated features.
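Here is one way the "known percentage" split might look in code. This is only a sketch, not how any particular experimentation system does it. The function name assign_variant, the experiment name, and the variant labels are all made up for illustration. The one real design choice is hashing the customer ID, which keeps each customer on the same version across visits.

```python
import hashlib
from typing import Mapping


def assign_variant(user_id: str, experiment: str, weights: Mapping[str, float]) -> str:
    """Deterministically map a user to a variant in proportion to the weights.

    All names here are hypothetical; a real system would supply its own.
    """
    if not weights:
        raise ValueError("weights must name at least one variant")
    # Hash experiment + user so the same user always lands in the same bucket,
    # and different experiments split the audience independently.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1]
    total = sum(weights.values())
    cumulative = 0.0
    chosen = ""
    for variant, weight in weights.items():
        chosen = variant
        cumulative += weight / total  # normalize, so 90/10 and 0.9/0.1 both work
        if point < cumulative:
            break
    # If rounding left the point past the last boundary, the last variant is used.
    return chosen


if __name__ == "__main__":
    split = {"current": 0.9, "candidate": 0.1}
    print(assign_variant("user-42", "checkout-layout", split))
```

Because the assignment is a pure function of the customer and the experiment, you can recompute who saw what later without storing a lookup table.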
For more on measuring desired end results, read How do you measure yourself?
As with any statistical method, you must be careful in your analysis. For example, those mathematically inclined will enjoy Simpson’s Paradox, which has a way of twisting A/B test results when you mix them with audience segmentation. There are many such gotchas, so read a good A/B testing primer and follow its links if you’re serious.
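To see how Simpson’s Paradox can bite, consider a small worked example. The counts below are invented purely for illustration, and the segment names ("new" and "returning" customers) are assumptions, not data from any real test. Version A wins inside each segment, yet version B wins when you lump everyone together, because the two versions saw very different mixes of customers.

```python
# (conversions, visitors) per version and segment.
# These numbers are made up to demonstrate the paradox.
results = {
    "A": {"new": (81, 87), "returning": (192, 263)},
    "B": {"new": (234, 270), "returning": (55, 80)},
}


def segment_rate(version: str, segment: str) -> float:
    """Conversion rate for one version within one audience segment."""
    conversions, visitors = results[version][segment]
    return conversions / visitors


def overall_rate(version: str) -> float:
    """Conversion rate for one version with all segments pooled."""
    conversions = sum(c for c, _ in results[version].values())
    visitors = sum(v for _, v in results[version].values())
    return conversions / visitors


def segment_winner(segment: str) -> str:
    """Version with the higher conversion rate inside the given segment."""
    return max(results, key=lambda version: segment_rate(version, segment))


def overall_winner() -> str:
    """Version with the higher pooled conversion rate."""
    return max(results, key=overall_rate)


if __name__ == "__main__":
    for segment in ("new", "returning"):
        print(segment, "winner:", segment_winner(segment))
    print("overall winner:", overall_winner())
```

The usual defense is to keep the traffic split the same across segments, which the lopsided visitor counts above deliberately violate.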
So how on earth do you ship two (or more) versions of the same feature at the same time? Easy—code it twice.
“Code it twice?! Heck, I don’t have enough time to code it once!” Calm yourself. Teams typically don’t have design arguments over whether the feature is an air freshener or a surface-to-air missile. Their design arguments are over very specific elements, like the layout and sequence for UI or the calling pattern for an API. And for those specific design decisions, the vast majority of the feature’s code is the same, and often one version is the one already written and deployed in production.
Coding the same feature two different ways is particularly easy with a model-view-controller (MVC) design pattern. With MVC, or similar design patterns, the model (the code that really does something) is separate from the UI (the view) and the actions (the controller). To ship the same feature two different ways, you share the same model, but invoke different views and/or controllers based on the customer. You can do this for web services as easily as for web UI.
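A toy version of the shared-model idea might look like the following. It is a sketch under assumptions: the Cart model, the two view functions, and the "A"/"B" labels are all hypothetical stand-ins for whatever your feature actually does. The point is only that the model is written once and the variant picks the view.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Cart:
    """The model: the code that really does something. Shared by both versions."""
    prices: List[float]

    def total(self) -> float:
        return sum(self.prices)


def view_compact(cart: Cart) -> str:
    """Version A: show only the total."""
    return f"Total: ${cart.total():.2f}"


def view_detailed(cart: Cart) -> str:
    """Version B: itemize each price, then show the total."""
    lines = [f"Item {i}: ${price:.2f}" for i, price in enumerate(cart.prices, 1)]
    lines.append(f"Total: ${cart.total():.2f}")
    return "\n".join(lines)


# Hypothetical variant labels mapped to the view each one should get.
VIEWS: Dict[str, Callable[[Cart], str]] = {"A": view_compact, "B": view_detailed}


def render_cart(cart: Cart, variant: str) -> str:
    """The controller's job here: pick the view for this customer's variant."""
    return VIEWS[variant](cart)


if __name__ == "__main__":
    cart = Cart([1.50, 2.50])
    print(render_cart(cart, "A"))
    print(render_cart(cart, "B"))
```

Adding a third alternative means writing one more view function and one more dictionary entry. The model doesn’t change.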
Naturally, you must add instrumentation that records which version was invoked and what results followed. The infrastructure and instrumentation to support A/B testing and experimentation should be built into Azure and other cloud systems, but for now you’ll have to rely on shared code within your division. (Just like you do for the continuous deployment and exposure control technologies that I described in There’s no place like production.)
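To make "records which version was invoked and what results followed" concrete, here is a minimal in-memory sketch. The ExperimentLog class and its method names are invented for illustration. A shared division system would persist events and handle far more than this, so treat it as a picture of the data you need, not a replacement for that system.

```python
from collections import defaultdict
from typing import Dict, Set


class ExperimentLog:
    """Hypothetical recorder of who saw which version and who converted."""

    def __init__(self) -> None:
        self._exposed: Dict[str, Set[str]] = defaultdict(set)
        self._converted: Dict[str, Set[str]] = defaultdict(set)

    def record_exposure(self, variant: str, user_id: str) -> None:
        """Note that this user was shown this version."""
        self._exposed[variant].add(user_id)

    def record_conversion(self, variant: str, user_id: str) -> None:
        """Note a desired end result, but only for users who saw the version."""
        if user_id in self._exposed[variant]:
            self._converted[variant].add(user_id)

    def conversion_rates(self) -> Dict[str, float]:
        """Fraction of exposed users who converted, per version."""
        return {
            variant: len(self._converted[variant]) / len(users)
            for variant, users in self._exposed.items()
            if users  # skip versions nobody was actually shown
        }


if __name__ == "__main__":
    log = ExperimentLog()
    log.record_exposure("A", "u1")
    log.record_exposure("A", "u2")
    log.record_exposure("B", "u3")
    log.record_conversion("A", "u1")
    log.record_conversion("B", "u3")
    print(log.conversion_rates())
```

Counting distinct users rather than raw events keeps one enthusiastic customer from skewing the rate.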
If your team isn’t doing A/B testing already, check with long-term services teams within your division. Chances are they have an experimentation or instrumentation system you can share. If not, check with established services around the company (like Bing). Please don’t reinvent your own.
Once you realize that arguments over design are silly (just ship both and see), you quickly get addicted to data-driven decision making. It takes all the drama, politics, and chest-pounding out of arguments and puts the control firmly in the hands of our business results as determined by the people who drive those results—customers.
Pretty soon you’ll want to inform all your decisions with data. How many branches should we have? What features must we cut? What teams are in trouble? There are endless real business questions that data can help resolve.
Please note two pitfalls:
As I mentioned in Cycle time—the soothsayer of productivity, shorter cycle times for products and their supporting services have all kinds of benefits. Being able to make data-driven decisions is one of the biggest.
Start with a great initial design based on deep customer research. Have a bunch of solutions in mind for the key design problems and ship all the main alternatives. Select the desired business results you seek, instrument your alternatives accordingly, and let the results from real customers inform your decisions about which refinements to make.
It’s easy to do once you’ve tried it a few times, and it’s really addictive. Why guess and argue when you can know and decide? Knowledge is powerful. Go get powerful.