You hurt your shoulder playing volleyball, so you make an appointment to see your doctor. You enter the office and wait in line for five minutes just to let the receptionist know you've arrived. He has you verify your contact and insurance information, which haven't changed in ages, and then tells you to sit in the waiting room.
You sit in the waiting room for ten minutes, inhaling all kinds of ailments from the crowd, seething about how you're going to leave sicker than you were when you came in, till a nurse shows you to an exam room.
After five minutes in the exam room, another nurse comes in, takes your vital signs, and has you repeat the reason you gave for the appointment when you originally made it. Ten minutes later, your doctor arrives and actually addresses your shoulder injury.
Welcome to a user's experience running our software. You wait forever just for it to launch. You provide your credentials again, even though you gave them when you logged in. You wait again for your personalized environment to load. You click a few menu items or buttons to launch the specific functionality you want. Finally, you wait again while the feature prepares to do what you actually launched the software to do in the first place. That's assuming there aren't network delays.
Waiting is dull. Waiting is frustrating. Waiting is agonizing. Waiting is just an unbelievably bad time all around, on every level. Nobody likes to wait. Nobody asks to wait. So, why the heck do we make people wait?
Actually, why do our customers even put up with it? Why do I put up with it at my doctor's office? I guess because all doctors' offices are slow. But, if one of my friends told me about a quick doctor's office that provided comparable care, I'd switch in a heartbeat.
This means a competitor's quicker program could flatline our business. How do we make our programs quick, before our competitors do? I'm glad you asked.
If your doctor is good enough, you may put up with significant hardship instead of switching. However, giving people a reason to switch is inexcusable, especially when it's easy to do better.
So, you want better performance from your software. Where do you start? "I know, I know!" says our resident performance pariah, Mr. Speedy. "Profile your code, find out where it's spending all its time, and then optimize, perhaps even parallelize, those inner loops."
Well, Mr. Speedy, aren't you clever. Let's profile our doctor's office, shall we? Whoa! It turns out the doctor is always busy, and that's the bottleneck. Who would have guessed? According to Mr. Speedy, all we need to do is speed up the doctor, find a faster doctor, or get two doctors to do the job of one. Right? Wrong!
There's nothing wrong with my algorithm—I mean, my doctor. If you made her faster, she wouldn't be any better. In fact, she'd probably be worse. Doing the job right takes time, and making all kinds of optimizations might improve things a little, but might also cause mistakes. I don't want a different doctor who happens to be faster, either. I like my doctor. I know her well, and she knows me.
I also don't want two doctors. Even if they are twins, I'll never know which thread—I mean, which doctor—I'll get. They might both try to treat me. They'll have to communicate with each other all the time to avoid mistakes. They might even get stuck waiting on each other. It's way more complicated, and it really doesn't solve the problem even if two doctors are twice as fast. I've still got to deal with the receptionist, waiting room, exam room, and nurses.
"But what about multi-core processors?" you might ask. Look, there's using technology for the sake of technology, and there's using technology for a purpose. If the user experience demands threading across multiple processors, I'm all for it. If not, you're just giving yourself a cheap thrill at the expense of the customer.
"Hold on," says Mr. Speedy. "What you need is a cache. That always speeds things up." Hello! We've got cache fever in the doctor's office. That's part of what's slowing us down. We've got a reception line cache, a waiting room cache, and an exam room cache.
It seems like everyone at the doctor's office is concerned about speeding up their own work, so they all created their own caches. The receptionist created a cache, the nurses created a cache, and the doctor created a cache. The result is that patients spend all their time waiting and moving between caches instead of being processed by the doctor.
Think this doesn't happen in code? You've obviously never looked inside those database, shell, and system calls you use. All that data you're caching for "performance" is already being cached for the same reason by those functions. Sometimes there are as many caches as there are layers. Every cache has a fetch and memory cost to it. Well, I'm cached out.
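Here's a minimal sketch of that pile-up, in Python. The three lookup functions are hypothetical stand-ins for a database call, a service layer, and a UI layer; each team adds its own cache with the standard `functools.lru_cache`, and the same three values end up stored three times.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def database_lookup(key: str) -> str:
    # Hypothetical data layer; pretend this is the expensive call.
    return key.upper()


@lru_cache(maxsize=None)
def service_lookup(key: str) -> str:
    # The service team caches "for performance" on top of the data layer.
    return database_lookup(key)


@lru_cache(maxsize=None)
def ui_lookup(key: str) -> str:
    # The UI team does the same thing, unaware of the caches below.
    return service_lookup(key)


def total_cached_entries() -> int:
    # Count the entries held across every layer's cache.
    layers = (database_lookup, service_lookup, ui_lookup)
    return sum(layer.cache_info().currsize for layer in layers)
```

Look up three keys through `ui_lookup` and `total_cached_entries()` reports nine entries for three distinct values. Only one of those layers needed to remember anything.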
Let's start over, shall we? Instead of speeding up the existing doctor's office, as marginally effective as that might be, let's think about things from the patient's perspective. What would you and I, as patients, like the experience to be?
Here's what I'd like. Check my contact info and insurance when I call in for an appointment. Write down my symptoms and include them in the appointment. Heck, let me do it all online (wait, that's crazy talk)!
When I show up at the doctor's office, I can walk right to my exam room (just one level of caching). It's the room with my name above it, just like at the good rental car agencies. A big sign in the room says, "Please take off your shirt and have positive identification ready for the nurse, then hit the big 'I'm ready' button."
The nurse, seeing that I've hit the "I'm ready" button, comes in, checks my ID, takes my vitals, and hits the "Vitals taken" button, which adds me to my doctor's queue. As soon as my doctor's available, she comes in and addresses my needs. That would be great! Heck, there could even be a monitor in the exam room with queue stats and predicted wait times. The stats could be used to fine-tune the number of appointments available per hour to minimize wait times, while still ensuring that each doctor is fully utilized.
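The queue, the stats, and the predicted wait can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `DoctorQueue` class, its method names, and the 10-minute default visit length are made up, and the prediction is a deliberately simple one (patients ahead of you times the average visit so far) that ignores whoever is already with the doctor.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DoctorQueue:
    """Hypothetical single-doctor queue that predicts wait times."""
    waiting: deque = field(default_factory=deque)
    visit_minutes: list = field(default_factory=list)  # observed visit lengths

    def vitals_taken(self, patient: str) -> None:
        # The nurse's "Vitals taken" button adds the patient to the queue.
        self.waiting.append(patient)

    def next_patient(self) -> str:
        # The doctor sees whoever has been waiting longest.
        return self.waiting.popleft()

    def record_visit(self, minutes: float) -> None:
        # Each finished visit becomes a data point for the stats.
        self.visit_minutes.append(minutes)

    def average_visit(self) -> float:
        # Assume 10 minutes per visit until real measurements exist.
        if not self.visit_minutes:
            return 10.0
        return sum(self.visit_minutes) / len(self.visit_minutes)

    def predicted_wait(self, patient: str) -> float:
        # Patients ahead in the queue times the average visit length.
        position = list(self.waiting).index(patient)
        return position * self.average_visit()

    def appointments_per_hour(self) -> float:
        # How many appointments keep one doctor busy without a backlog.
        return 60.0 / self.average_visit()
```

With visits of 10 and 20 minutes on record, the third patient in the queue sees a predicted wait of 30 minutes, and the office learns that four appointments per hour is about what one doctor can handle.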
If you haven't read about the Theory of Constraints or its drum-buffer-rope approach to optimizing results, you are in for such a treat. They should be required reading for anyone trying to rethink and revolutionize performance of everything from software to cafeterias.
There, that wasn't so hard. Setting up a doctor's office the way I described would be easy and not that expensive. It doesn't require more doctors or faster doctors, and it actually saves floor space. Sure, the online appointments and predicted wait time monitors would require special software, but those aren't essential to get a better and faster experience. What is essential is to think through the customer's experience with a view toward minimizing wait time.
Here are some questions for your consideration:
- When was the last time your team thought through the end-to-end customer experience, including the wait time?
- How would the customer want to deal with the inevitable constraints that every process has, besides giving them a CANCEL button? (Associating our software and services with "cancel" seems unwise.)
- How could you minimize the impact of errors, network delays, and device I/O in a way that customers would find natural and unobtrusive?
- What measures and statistics could you use to fine-tune the experience, minimizing wait times while getting full utilization of key resources?
Right now we design experiences as if these performance constraints don't exist. Everything is modal and synchronous, as if functions always return and people never select the wrong option. We design a feature at a time, instead of end-to-end; or if we do design end-to-end, we only think about the ideal scenarios, not the likely ones. We assume exceptions and delays are unusual, even exceptional. That's naïve, which is a kind way of saying "stupid."
Of course, there are great examples in Microsoft software for thinking through the end-to-end experience quite nicely. I mention some in the next section.
Performance tuning does have its place. There are functions and services that must scale up and out. There are issues with blocking and locking that require special care, which a real performance expert can help you resolve. It's just that those aren't the common case.
The common case is the ordinary case. A customer is trying to get something done. It involves network access and I/O. Those interactions could fail or cause delays. The customer generally has experienced these problems and knows they exist. The best way to handle them is to talk to customers and understand what they would ideally want to happen.
Perhaps the customer would be happy if the I/O completed asynchronously—the solution Outlook and OneNote use to vastly improve the customer experience. Perhaps the customer would be happy to work on a local copy and synchronize on demand—the solution ActiveSync and FrontPage use. Perhaps the customer is happy to queue their requests and get a status report later—the solution build systems and test harnesses use.
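The first option, letting the I/O complete asynchronously, can be sketched with Python's standard `asyncio` module. This is not how Outlook or OneNote are implemented; `slow_save`, `handle_user_session`, and the simulated 50-millisecond delay are hypothetical stand-ins that only show the shape of the idea: start the slow work, let the user carry on, and wait only when the result is needed.

```python
import asyncio


async def slow_save(document: str, log: list) -> None:
    # Stand-in for network or disk I/O; the delay is simulated.
    await asyncio.sleep(0.05)
    log.append(f"saved {document}")


async def handle_user_session() -> list:
    log = []
    # Start the save without waiting for it to finish.
    save_task = asyncio.create_task(slow_save("notes", log))
    # The user keeps working while the save is in flight.
    for keystroke in ("a", "b", "c"):
        log.append(f"typed {keystroke}")
        await asyncio.sleep(0)  # yield so background work can progress
    # Wait for the save only when its result is actually needed.
    await save_task
    log.append("status: all changes saved")
    return log


def run_session() -> list:
    # Convenience wrapper that runs the session to completion.
    return asyncio.run(handle_user_session())
```

In the log that `run_session()` returns, all three keystrokes appear before the save finishes. The customer never sat and stared at a progress bar.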
The key is to look at the world from the customer's perspective and to design an experience that anticipates failure modes and minimizes their impact on users. Performance should be specified in the experience with specific measures and guidelines, not left to chance or hope. It typically doesn't require complex algorithms or fancy caching, both of which can be overdone. It requires being thoughtful and deliberate.
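One way to make "specific measures" concrete is to write the wait-time budget down next to the experience and check it like any other requirement. The sketch below is only an illustration in Python; the step names and the budget numbers are invented, not taken from any real product spec.

```python
import time

# Hypothetical wait-time budgets, in seconds, for each step of the experience.
BUDGETS = {"launch": 2.0, "load_environment": 1.0, "open_feature": 0.5}


def measure(step) -> float:
    """Return how long a callable takes to run, in seconds."""
    start = time.perf_counter()
    step()
    return time.perf_counter() - start


def over_budget(timings: dict, budgets: dict = BUDGETS) -> list:
    """Return the names of steps whose measured time exceeds the budget."""
    return sorted(
        name for name, seconds in timings.items() if seconds > budgets[name]
    )
```

Feed `over_budget` the timings from a test run, say `{"launch": 1.5, "open_feature": 0.9}`, and it names the steps that made the customer wait too long (here, `open_feature`). Run it from the first build onward and nobody has to become a performance expert the week before shipping.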
When performance is specified through the experience, it's built-in and tested from the start. No one gets surprised or has to scramble at the end of the project to suddenly become a performance expert. The only surprise is on the customer's face when what they thought would be an agonizing doctor visit turns out to be a delightful one.