Larry Osterman's WebLog

Confessions of an Old Fogey
Blog - Title

Coding with Style

Coding with Style

  • Comments 30

AKA, Software archaeology, a working example.

I've touched on this this a couple of times before, but I'd like to talk a bit about style.

This isn't an issue for people working on a single person project, but once you start working in a team, the issue of coding style comes up often.

The problem is that everyone has their own style, and they're usually pretty different.  K&R lays out several different styles, and Kernighan and Plaugher lay out still more.  Style encompasses everything from where the braces go, to how far they're indented (and where they're indented), to the names of identifiers (both variables and function names), to where the variables are declared.  Even if source files contain spaces or tabs (and what the tab setting is) is an aspect of programming style.

Most programmers learn a particular style when they start to learn how to program, and then they tend to stick with that style for their career (with minor modifications).  One thing to keep in mind about style is that it's personal.  Nobody's style is any better than anyone else's style, they're all ultimately a matter of personal choice (having said that, some personal style choices can be rather distracting, for a simple example, see this Daily WTF post).

When you're laying out a project at the beginning, you essentially have two ways to go when determining the style for your code.  The first is to get all the developers together in a meeting, hash out the details, and write a specification for the coding style for your project.  And then you've got to be diligent in enforcing that style, especially if the style you're enforcing is different from that of an individual developer in your group.

The other way is to be somewhat more ad-hoc in coding style - typically each developer owns one or more components in the system, you let them develop those components with their own style, and trust that the final product will be harmonious.  If you're using paired programming, or if there is more than one owner of a piece of software, then clearly the individual developers need to come to consensus on their style.

There are times that the ad-hoc coding style is effectively forced on an organization.  This happens a lot when dealing with legacy code - often times the code was written by developers who have long left the organization, and thus they are no longer maintaining the code.

There's one critical thing to deal with when you're dealing with disparate styles - you should never, ever change existing code to match your personal style.  I've seen this happen in real life, two developers on a team each have their own idea of what a variable should be named, and as the two developers work on the module, each of them goes through and makes sweeping changes to rename the variables to meet their personal naming convention.  The biggest problem with this is that sweeping changes like this cause huge issues with code reviews - each gratuitous change to the code shows up as a change when diff'ing source files, and they increase the signal-to-noise ratio, which reduces the effectiveness of the code review - the code reviewer can easily miss a critical error if it's buried deep in a sea of variable name changes.

For those of you who have worked on long running projects, how many times have you run across code like this?

typedef int CB;
typedef wchar * SZ;
#define length 20
void MyFunctionName(SZ szInput, CB cbInput)
{
    CB stringlength = strlen(szInput);
    char myBuffer[length];
    if (stringlength < length)
    {
        if (szInput[0] == 'A')
            {
                <Handle the case where the input string starts with 'A'>
            }
        else {
        if ('B' == szInput[0]) {
            <Handle the case where the input string starts with 'B'>
        }
        }
    }
}

What happened here?  Well, the code was worked on by four different developers, each of whom had their own style, and who applied it to the source.  The one that authored the routine used strict Hungarian - they defined types for SZ and CB, and strictly typed the parameters to the routine.  The next developer came along, and renamed the local variables to match their own personal style, and added the check for case "A".  The second developer used the following style for their if/then statements:

if (if condition, constants in conditionals appear on the right)
    {
        <if statement>
    }
else
    {
        <else statement>
    }

Now, a 3rd developer came along, this developer added the check for case "B".  The third developer used the following style for their if/then statements:

if (if condition, constants in conditionals appear on the left) {
    <if statement>
}
else {
    <else statement>
}

And then, finally the 4th developer came along and added the buffer overflow case.  Their if/then style was:

if (if condition, constants in conditionals appear on the left)
{
    <if statement>
}
else
{
    <else statement>
}

None of these styles is wrong, they're just different.  But when put together, they turn into a totally unmaintainable mess.

So it's utterly critical, if you're working on a project that uses ad-hoc coding standards that you adopt your coding style to match the code.  The future maintainers of the code will thank you for it.  Even if your project has strict coding standards, the instant you have to deal with legacy code, you MUST adopt the coding standard of the original author of the code, even if it doesn't match your groups standards.

  • You didn't close the brace on the else, although it's hard to see because of the poor coding style.

    :)
  • Most of these issues are solved by using a good code formatting tool. Once that is done all that is left to argue over is capitalization and naming convention.

    At my place of work we just bought everyone a copy of ReSharper. Now if one programmer decides to be a formatting rebel it doesn't affect everyone. We also run a formatter prior to builds as well.
  • Heh, just wanted to say you have more {'s than }'s in your code, but no surprise that i'm not the only smartass here :-)

    I suspect you did this on purpose, didn't you?
  • Ilya, actually it wasn't on purpose, I just messed up.

    Kevin,
    The problem with a good code formatting tool is the unnecessary diff problem. You can do this IF you're going to enforce a particular style, but if you're going to use an ad-hoc style, then...

    Also, it's not always clear that a "good" formatting tool exists. The only "good" formatting tool would be one that parsed the source file, and re-generated it - otherwise you've got the potential of the formatting tool introducing bugs.

    Remember, there are a myriad of styles - I didn't even include the ones that leave out the {} on if statements, the ones that hide conditionals inside macros, etc. A good code formatter can handle some of it, at the cost of inflating diffs.
  • Problems of coding style are a legacy of the choice of storing code in simple text files, with limited interpretation by the coder's IDE package. That is part of what Charles Simonyi's "Intentional Software" company is all about - taking our coding environments to the next level, where coding style, or even choice of language becomes a non-issue.

    My point being, that as long as we stick with plain text files as a code-storage medium, we will be stuck with these issues. The problem can be solved to some extent by clever IDE's, but ultimately we need to start storing code in a more structured way.

    Its the classic problem of the data being too closely coupled to the presentation. We really should know better.
  • A good point Steven.

    But for the forseeable future, I suspect we're stuck with the text files.
  • In Eclipse you can right click and do Source>Format and Source>Correct Indentation. The default coding style is the Sun style. You can customize the style to your liking though and then pass that out to your devs. It is quite fancy.
  • css, that's cool, but it misses the point.

    Restructuring existing code to match an individuals coding style is almost ALWAYS a bad idea. You just about never want to do it, especially if you're dealing with production code.

    As I mentioned above, the problems are two-fold. First, they increase the change log size of the file - this can be a big deal with certain source control systems (Microsoft's old source code system, SLM used to store the diffs of the files, and to resurrect old versions of the files, it replayed the diffs - large scale changes made this process painful).

    Second, they're not guaranteed to be correct. So by beautifying the code, you're potentially introducing new defects (you might also find existing defects). You don't know, and can't know, unless you diff the code. And the number of changes is typically so great that the diffs are meaningless.
  • We have tried a couple of things at work to alleviate this problem.

    1. We TRIED to enforce a coding standard as much as possible. I think this is just too difficult to pull off effectively. People process input in different ways. Coding in your own style might actually help you figure something out quicker.

    2. We all agree on some basic rules and try to stick to them.

    3. We constantly berate (in a very non threatening, fun way) each other on mistakes or congratulate and adopt clever techniques.

    4. If you find yourself in someone else's code, try to adhere to the standard or format they have already written it in. We feel this has a couple of advantages.

    You learn to write code the way someone else does (hopefully this broadens your stylistic horizons or at least gives you insight into another thought process) and the code stays consistent at least in one file.

  • Jeff, that's actually a GREAT set of guidelines, I really like them.
  • Right on Larry. Luckily I have always tried to use the same style as the code I was changing, regardless of how esoterically it was written (even though it was an unconcious decision until fairly recently :) .

    Simple reformatting can introduce bugs, and when clients pay a premium to bring me in to fix them, well, I best not create any new ones! My goals was to modify as *few* lines as possible to get things to work (and keep those lines grouped together too!).

    Besides, code reformatters don't usually handle naming of variables, capitalization, etc. The best I've seen is indentation and placement of curly braces.
  • Coding in style is a major thing lacking in the skill set of developers today. It drives me insane as there don't seem to be any good programmers anymore and you can really tell alot abuot a person by how much pride they take into the style of their code(...or of their kitchen, or of their car, etc) Soon I'll have a short FAQ or article on my website on this exact topic. But I call it Feng Shui Programming.
  • What I meant from my post was that if the company didn't like Sun's Java style, that someone on the product team, or the team itself, would come up with their preferred coding style. Then that style could be distributed to all the developers on the team so then whenever someone works on code, before they submit it, it would be formatted to the team's style spec. It wouldn't be be formated to an individual team member's style when submitted to version control.
    Of cource this is only half a solution because it only takes care of whitespace issues (brackets, tabs, if, elses, while, catch, etc, etc) and doesn't deal with issues like naming conventions.
    Concerning working on an existing code base, yes there would be a one time cost where diffs might be messed up when files were committed with the new coding standard. I guess it would be up to the team if it would be worth the hassle.
    As far as introducing defects, I don't think the feature would be in Eclipse if it caused defects. By no means is it an entire solution but it might be a step in the right direction.
  • css,

    My basic point is that if you're working in a group, once the code's been written it doesn't get reformatted, because reformatting is essentially a gratuitous change - it needlessly adds load to source control servers, and is very difficult to review.

    As far as Eclipse introducing defects, you're underestimating the stupidity of developers.

    Here's a simple example. My coding style says that I need to have a {} character around each if statement. In other words, I want to have:

    if (<condition>)
    {
    <if statement>
    }

    How would Eclipse correct the following:

    if (<condition>)
    <if statement>
    <another statement, indented at the same depth as the if statement>

    Would it be conservative and only insert braces around the <if statement>? Or would it attempt to discern my intent and move the <another statement> within the braces?

    How about the user who did:

    #define RpcBeginTry __try {
    #define RpcExcept(flag) } __except(flag) {
    #define RpcEndTry }

    Would it correctly change:
    RpcBeginTry
    <8 spaces>Code
    RpcExcept(1)
    <8 spaces>Code
    RpcEndTry

    to:
    RpcBeginTry
    <4 spaces>Code
    RpcExcept(1)
    <4 spaces>Code
    RpcEndTry

    Or would it treat RpcBeginTry as a statement and complain because it wasn't syntactially valid?

    Code reformatting is a VERY tricky business.
  • Grr. .Text's formatting strikes again. It looked better when I wrote it in the editor :)
Page 1 of 2 (30 items) 12