Coding Against the Grain
The other day, I was reviewing a coworker's design for an area she was writing. As she was describing the responsibilities of each class, my attention was drawn to one in particular:
"Why is this method in this class Foo? I mean, it's true that the operation does have something to do with Foo objects in general. But isn't a really specific method, one you'd expect the user of this object to be doing? It's a bit if you were to have this really app-specific string manipulation function, and then throwing it into the global string class even though no one else would ever need it".
Her response:
"Well, yes, I can see that point. But you're always saying that we should write small classes, and well, the calling class is already quite big. So I thought I would spread the complexity around a bit".
That got me to thinking. Because it is true -- I am always going on about "one class, one responsibility" and the dangers of monolithic classes. But in this situation, the "fix" struck me as a bit naive. It's a bit like how everyone knows that long functions are bad, but if you have a bit of code that genuinely does call for a long function (for example, some function that does step A, then step B, then step C ... all the way to step Z, in a very procedural fashion), the common "solution" is to chop it up into a bunch of single-purpose functions.
But it's hard to say that that's any improvement. What was once straightforward to read now requires jumping around in the editor. And now it's not at all clear that these dozens of (frequently poorly named) funclets are only called by that once-ginormous "DoStuff" function. At best you've gotten control over all of the local variables, but anonymous scopes could have solved that problem.
There are a million ways a program can be written, but only a subset of those are agile, maintainable designs. How do you go about finding them?
When I was younger I learned that most sheets of paper have a sort of "grain" to them. You can't notice it by looking, but there's a simple test: rip the same piece of paper vertically, then horizontally. One of the two will be a nice, clean, straight tear, but the other will be an irregular, raggedy mess. In the same way, any given program has a "grain", an underlying structure that dictates an optimal factorization into classes and functions. That underlying structure may not be immediately apparent to the naked eye -- but you'll know soon enough the moment you start "coding against the grain". To avoid ending up with a big ball of mud, a programmer has to listen carefully to find out how the program "wants" to be written.
Here are some common rookie mistakes I see in many newbie designs that indicate to me that someone's been coding against the grain:
No Functions, Just Objects
It's a pity that the plain-ole' function has been all but forgotten due to the object orientation craze of the 90s. Some mainstream languages like Java and C# pretend flat functions don't even exist (although .Net 2.0's static class concept has rehabilitated their image somewhat).
The contracts of functions are simpler and easier to express than the contracts of objects. I've never seen this exact code, but I've seen the spirit of it many times:
class Multipier_t
{
public:
void setX(int x);
void setY(int y);
void doCalculation();
int getResult();
private:
int m_x;
int m_y;
int m_result;
};
Factory functions seem to suffer from this anti-pattern the most. For some odd reason factory method I've ever seen came swaddled in a do-nothing class -- even in C++, where it's unnecessary. It's almost as if it were called the Factory Class pattern and not the Factory Method pattern.
Every application is different and the exact ratio varies, but it shouldn't be considered unusual when half the code is plain old functions and the other half is regular OO code with classes.
Paramaphobia
A similar pathology is the fear of function parameters. You've been there, I'm sure. It starts innocently with one or two arguments, and before you know it, you're hurtling down that slippery slope. Is pszSecurityParams the eighth or ninth parameter to FrobnicateFoo? Ah, to hell with it, let's just make them all takes-void-returns-void and throw them into the same class, where they can share information with impunity.
And then we're back to that happy sunny world of global variables (hello, world!). A rabid outreak of function parameters is no good thing, but there are far better cures for this disease (such as Introduce Parameter Object). A function that take many parameters is probably doing too much, anyways.
Or perhaps you're simply in the mindset of, "hey, these two functions are in the same class. Why do I need to pass parameters that could be fished out of the this parameter?" The answer: referential transparency. Without parameters, you can't readily see what the called function uses as input, what it modifies and what it returns as output.
Objects should only carry around information that must persist between invocations and which can't be easily recalculated.
In my own code, the vast majority of functions neither rely on nor generate side effects -- they are strictly dependent on their input parameters only. This style makes writing tests and refactoring code hella easier, and I cannot overestimate the amount of simplicity it buys me.
No Mapping Between Responsibilities And Objects
Violating the Single Responsibility Rule is another way to get into the same "global variable" mess. A class with two responsibilities is a blatant invitation for one responsibility to interfere with the other. And a class that implements half a responsibility isn't much better: it's constantly communicating with some other object in the system to get its own work done.
Sometimes it's hard to nail down exactly what constitutes a responsibility. After all, a class is constructed of many methods, each of which should also have a single responsibility (albeit a more limited one). Neither can it be seen easily on a UML diagram: does that single line between two boxes represent a simple interaction or an incestuous coupling? It's hard to say.
At such times, it's important to listen to the code and keep an open mind. All programmers come up with mental models of how problems should be decomposed. Some of those models are accurate. Most are not. The difference between a good designer and a bad one (other than coming up with the right model in the first place) is that a good designer won't rigidly impose their beliefs on reality when they don't fit. Rather they adjust their model based on what the code is telling them, letting the shape of the code dictate who the real actors are and how they co-ordinate together.
No Objects, Just Subjects
On the Portland Pattern Repository wiki there is a page on a proposed "best practice" entitled: "Dont Name Classes Object, Manager, Handler, or Data". Sure, it's somewhat tongue-in-cheek, but I think they may be on to something.
In English, most sentences have a subject, an object, and a verb. The "verb" is that venerable workhorse, the function, and the object is usually ... some object. But what's the subject map to? Actually, the subject is usually not important. Often it's some fictional construct, such as "Handler" or "Manager" or -- dear Lord, "Wrapper" (shudder). Empty words such as these can always be omitted.
Stay away from the Kingdom of Nouns, and challenge any class that ends in -er. Functions do things. Objects have things done to them.
No Distinction Between the Necessarily True and the Contingently True
Some time back I was tracking down a memory corruption. The code looked like this:
void OnNotification(int type, void* pNotification)
{
if (type == DB_NOTIFICATION)
{
((DbNotification_t*)pNotification)->fHandled = true;
}
else // it's a UI_NOTIFICATION
{
((UiNotification_t*)pNotification)->fHandled = true;
}
}
At the time the code was written, the comment was entirely correct. There were only two types of notifications, so if it wasn't one, it had to be the other. And the code worked for a good long while. But when the third type of notification was introduced into the system, all hell broke loose. This is a good example of the difference between something that is "contingently true" -- something that just so happens to be true for the moment, but not rigidly true in the same sense that "A or not A" necessarily has to be true according to the laws of Boolean algebra.
Newbies muddle this up all the time. You can see it, for example, with hard-coded numbers that assume that the display size will always be 800 x 600, or the same mathematical tax-calculation formula duplicated in various locations throught the code.
A key software principle is "alternate soft and hard layers". What this means is that a good design makes an attempt to separate things that are highly unlikely to change from the things most susceptible to change. If successful, you can end up having a large amount of code that, once written, needs to be looked at rarely if not never again. Your energy can be better spent on the small bits of code that do need changing in light of rapidly changing requirements.