Over the weekend, I read a great article by one of my favorite New Yorker writers, Atul Gawande.  The article (currently available here) introduced me to a recent development in intensive care units: the checklist.  A professor at Hopkins one day decided to see if traumatic complications could be reduced by simply mandating that nurses and doctors follow daily checklists.

The results were so dramatic that they weren’t sure whether to believe them: the ten-day line-infection rate went from eleven per cent to zero. So they followed patients for fifteen more months. Only two line infections occurred during the entire period. They calculated that, in this one hospital, the checklist had prevented forty-three infections and eight deaths, and saved two million dollars in costs.

Medicine has gotten so complex that even trained experts will mess something up at some point.  What worked was a simple list of tasks and an organization willing to enforce that the list was followed.

Reading this article reminded me of one way my team keeps a high level of code quality - checkin checklists.  We have a wiki where we list out all of the things a developer is expected to do before checking in any code changes.  These tasks include building (duh) but also running all the unit tests, running various static code analysis tools, measuring code coverage, etc.

While this list is helpful in and of itself, we've found over time that as these steps go from being manual and unenforced to automated and required, the level of code quality goes up.   Build breaks are non-existent.  Unit tests don't get stale and broken.  Many bugs are caught before the code changes are committed.

We use a home-brewed tool that takes a developer's changes and runs them through a battery of steps, including various kinds of builds, running tests, measuring code coverage, and really anything that can be put in a batch script.  We try to keep this process to between 30 and 60 minutes by parallelizing across multiple machines and trimming the set of tests that have to run.  We've even gone so far on one project as to have every checkin deploy a website to a server, install client code on two other machines, and run automated multi-machine tests against it.
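To make the idea concrete, here is a minimal sketch of what such a gate could look like. This is not our actual tool; the function names (`run_step`, `run_checklist`, `checkin_allowed`) and the placeholder commands are made up for illustration, and local threads stand in for the multiple machines we really use. Each step is just a command that passes if it exits with 0, and the checkin is allowed only if every step passes.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_step(command: List[str]) -> bool:
    """Run one checklist step; it passes only if the command exits with 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def run_checklist(steps: Dict[str, List[str]], max_workers: int = 4) -> Dict[str, bool]:
    """Run all steps concurrently and report pass/fail per step name.

    Assumption: steps are independent of each other, so they can run in
    parallel. Threads are a stand-in here for farming work out to machines.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_step, cmd) for name, cmd in steps.items()}
        return {name: future.result() for name, future in futures.items()}


def checkin_allowed(results: Dict[str, bool]) -> bool:
    """The gate: at least one step ran, and every step passed."""
    return bool(results) and all(results.values())


def main() -> int:
    # Hypothetical placeholder commands; a real checklist would call the
    # build, the unit test runner, static analysis, coverage, and so on.
    steps = {
        "build": [sys.executable, "-c", "print('build ok')"],
        "unit tests": [sys.executable, "-c", "print('tests ok')"],
        "coverage": [sys.executable, "-c", "print('coverage ok')"],
    }
    results = run_checklist(steps)
    for name, passed in results.items():
        print(f"{name}: {'PASS' if passed else 'FAIL'}")
    # Return a non-zero exit code so a calling script can block the checkin.
    return 0 if checkin_allowed(results) else 1
```

A wrapper script would end with `sys.exit(main())`, so that whatever submits the change can refuse it on a non-zero exit code. The important design choice is that the gate is all-or-nothing: a single failed step blocks the checkin, which is what turns a helpful list into an enforced one.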

Obviously, process has to be balanced against agility, but I think there is a lot of value in checklists as long as they can be enforced without unreasonable overhead.