Update: this blog is no longer active. For new posts and RSS subscriptions, please go to http://saintgimp.org.
For some reason, most developers (me included) have this idea that the number of classes in a code base is a strong indicator of the design's complexity. Or to put it another way: we tend to freak out when we see the class count in a project climb, and we intuitively prefer software designs that minimize the number of classes.
I often find myself ignoring clear evidence that I need to break out some responsibility into its own class. I have this fear that clicking Add | New Item in Visual Studio will make my code much more complicated, so I look for ways to jam the responsibility into an existing class instead.
This impulse is usually dead wrong.
I had an experience a few weeks back where my unit tests were getting to be pretty messy and painful to deal with. I knew this was probably evidence that my design needed some serious improvement, but I ignored it. It got worse. Finally I said, “Ok, fine, I’m going to refactor the code until my unit tests quit sucking.” I started breaking up classes that had too many responsibilities, creating additional interfaces and concrete classes, and generally creating an explosion of new files. It was terrifying. It felt like I was turning the code into a nightmare of complexity.
The funny thing was, though, when I was all done, everything just fell into place and all of a sudden I had an elegant, easy-to-understand, maintainable design sitting there. Sure, there were a lot more code files than before, but each class did one small thing and did it well. It was remarkable how much easier it was to reason about the new design than the old.
Michael Hill wrote an excellent blog post on this a while back. He points out that in addition to the usual low-coupling, high-cohesion arguments in favor of small single-responsibility classes, there’s also an argument from cognitive science: we can only hold a few “chunks” of information in short-term memory at once, but we can get around that limit by gathering closely related chunks together, giving the collection a name, and tracking it as a single chunk.
When we have a class that contains 500 lines of source code and does five different things, we have to think about all of that code more or less simultaneously. It’s really difficult to handle all of that detail at once. If we break up that class into five classes that each do one thing, we only have to track five class names in order to reason about the system. Much easier.
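To make that concrete, here’s a small sketch of the idea in Python (the post doesn’t include code, and every class and field name below is invented for illustration). The first class does three unrelated jobs at once; the second version splits each job into its own tiny, named class without changing the behavior:

```python
# Hypothetical "god class": filtering, formatting, and presentation all in one.
class ReportManager:
    def __init__(self, records):
        self.records = records

    def run(self):
        rows = [r for r in self.records if r["active"]]                # filtering
        body = "\n".join(f"{r['name']}: {r['total']}" for r in rows)   # formatting
        return f"REPORT\n{body}"                                       # presentation


# The same behavior split into single-responsibility classes. Each one is
# tiny, but each is now a named "chunk" you can reason about in isolation.
class ActiveRecordFilter:
    def filter(self, records):
        return [r for r in records if r["active"]]


class LineFormatter:
    def format(self, records):
        return "\n".join(f"{r['name']}: {r['total']}" for r in records)


class Report:
    def __init__(self, record_filter, formatter):
        self.record_filter = record_filter
        self.formatter = formatter

    def render(self, records):
        body = self.formatter.format(self.record_filter.filter(records))
        return f"REPORT\n{body}"


records = [{"name": "a", "total": 3, "active": True},
           {"name": "b", "total": 5, "active": False}]
old = ReportManager(records).run()
new = Report(ActiveRecordFilter(), LineFormatter()).render(records)
assert old == new  # behavior unchanged; only the structure differs
```

There are three classes where there was one, but now each class name stands for exactly one job, so you can think about filtering, formatting, or presentation without loading the others into your head.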
Of course this can be overdone. My recent zombie-fighting involved (among other things) chopping out a bunch of pointless classes that were apparently built to satisfy someone’s concept of proper architectural layers but didn’t really handle any responsibilities of their own. They didn’t create useful names for chunks of code; they were just pointless abstractions.
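For contrast, here’s a hypothetical sketch (again, all names invented) of the kind of class that deserves to be chopped out: a “layer” that forwards every call without adding any behavior of its own.

```python
class CustomerRepository:
    """Owns a real responsibility: looking up customer data."""
    def get(self, customer_id):
        # Imagine a real data lookup here.
        return {"id": customer_id, "name": "example"}


class CustomerService:
    """Exists only to satisfy a 'service layer' box on a diagram.
    Every method is a one-line pass-through, so it names no new
    chunk of behavior -- it's just an extra hop for the reader."""
    def __init__(self, repository):
        self.repository = repository

    def get(self, customer_id):
        return self.repository.get(customer_id)  # no added logic at all


# Callers lose nothing by talking to the repository directly:
repo = CustomerRepository()
assert CustomerService(repo).get(42) == repo.get(42)
```

The test for keeping a class, in other words, isn’t “does it fit the architecture diagram” but “does it give a useful name to a real chunk of behavior.”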
It’s interesting that two apparently contradictory principles can be true at the same time: on one hand source code is a liability and should be minimized, but on the other hand more, smaller classes are better than fewer, bigger classes, even if that raises the file count and line count.