With the task list in place and participants recruited, it's time to run the study. My experience has been that running an API usability study is really no different from running any other type of study. Here's a description of what I do.

The day before the study I make sure that the machines I'll be using in the usability lab are set up. I install the necessary builds of Visual Studio and associated SDKs etc to be able to run the study. I always do this first thing in the morning just in case anything goes wrong as setting the machines up properly can be quite time consuming. Once that is done, I then go about setting up the screen capture software so that I have video recordings of each participant's session in the lab.

While our usability labs are well equipped with audio and video recording equipment, I prefer to use HyperCam for screen recording since the quality is decent and the format (AVI) makes it easy for me to create video clips to show highlights from each session. Running HyperCam and VisualStudio on the same machine can sometimes cause the machine to slow down pretty significantly though, so I run HyperCam on one machine and then use remote desktop from that machine to connect to the machine running Visual Studio. Setting up HyperCam, including setting up the PC microphone so that the recording level is just right usually takes no more than 10 or 15 minutes.

Our usability labs are split between the participant side and the observer side. After setting up the participant side of the room, I set the machine up on the observer side. We have in-house tools that we use to record our observations during each session. The tools allow us to create a simple tagging scheme that we use to tag our observations as we take them so that afterwards we can easily refer to all events that involved use of the ServiceContract attribute for example, or events that were particular highlights for whatever reason. I take some time to think about how I might want to analyze the observations after all the sessions and set the tools up to use the particular tagging 'schema' that I come up with. Sometimes the schema is as simple as three tags labeled 'Highlight', 'Success' and 'Failure' but other times it will be more complex. It all depends on the nature of the study.

Lastly, I print out the task list and a copy of the tutorial materials that were sent to participants, just in case participants forget to bring their own copy in. I always leave printing the task list to last as invariably I'll make small changes to the task list right up the last minute.

After all this setup, all that is left is to run each participant's session.

The first time each participant comes into the lab I spend some time getting them familiar with the lab. It can be quite disconcerting to be sitting in a lab knowing that people are watching you behind a one way mirror so I reassure participants that the purpose of the study is to learn how usable the API is, not to test each participant that comes into the lab. I also give some participants time to practice talking out loud while they work. We always ask participants to provide us with a running commentary of what they are doing as they are working as that is the best way to get at the intent behind the actions that they take. However it can feel a little odd to talk so much while working since it's not something that most people are used to doing. So for participants who have not been in the lab before, I will sometimes ask them to do some task (such as using Paint to draw a picture of a mouse) and ask them to talk out loud while they are working. Many participants find that after just a few minutes of practice that it's actually not that difficult to keep talking while working.

Then participants start working on the tasks. While they work on each task, I try to take as many notes about what they are doing as I can. I typically try to record a transcript of everything they say as they are working. When something interesting happens, I'll make sure to annotate my notes accordingly so that I can easily refer back to that event (our note taking software timestamps each note that I take so that I can associate a note with a point in time on the video recording of the session). Paying attention to what is going on and taking good notes can be very tiring but it's critical, especially if you don't want to be spending hours reviewing video tapes and other logs afterwards. Time and effort spent here pays off in the long run.

Probably the most important thing to be aware of when running a usability session is the impact that anything you say to the participant has on their behavior. The most obvious way in which you influence the behavior of the participant is in how you describe the task you ask them to perform. If you mention the specific classes, namespaces, methods etc that you want them to use, then you've led them down a path that they might not necessarily have taken had you not mentioned it. But during the study there are other less obvious ways that you can inadvertently influence participants. Very often, participants will get stuck with a task and might get a little frustrated. They'll ask for your help and the natural inclination is to offer some help, particularly if they are very frustrated. You can simply tell them how to get unstuck, but that doesn't really provide you with useful data about why they got stuck and how they expected to be able to fix the problem themselves. All you know is that when told to do the right thing, they were able to do it. Instead, you should see this as an opportunity to understand what it is about the design of the API that leads participants to get stuck.

These sort of situations are where the benefits of running a usability study really become apparent. As soon as a participant gets stuck on some task, they are in the best position to describe how and why they are stuck because they are literally going through the experience right there and then in front of you. They don't have to rely on their memory to tell you what happened, as they would if you simply interviewed somebody about their use of an API after they were done using it. They are motivated to explain things to you because that will help them recover the situation, unlike if you were asking them to describe the situation after the event.

Whenever participants ask me questions when they are stuck, I'll very often not tell them anything and will instead ask them to describe what they think is happening and how they expect the API to work. With that information I'm able to start forming hypotheses about the conceptual model that participants are forming while working with the API and how that leads to breakdowns in using the API. I can then start testing these hypotheses by asking participants to provide feedback about the way that these conceptual models are formed. I don't ask them to comment explicitly on their conceptual model but instead I'll ask them about salient aspects of the API that I think might lead to a particular model being formed. These are the points in a usability study during which I learn the most from participants.

Outside of these situations I try to say very little during the session. For example, to begin with, lots of participants ask me when they should move on to the next task. I try never to answer these questions, telling participants that they should move on to the next task when they think they are done with a task. I don't want participants to get used to my confirmation that they have done the right thing. If I keep telling participants that they are doing the right thing, participants tend to do something then wait to hear my confirmation before moving on. Instead of using their own intuition to figure out if they have done the right thing, they simply wait for an acknowledgement from me. At that point the data I'm getting is questionable.

The key criteria for success in running a usability study are: good observation and note taking skills, and the ability to understand observed behavior from the participant's perspective, instead of your own. You'll notice that running a successful study does not depend on having expensive usability labs. They're definitely nice to have, but not necessary.