Larry Osterman's WebLog

Confessions of an Old Fogey
Blog - Title

What does style look like, part 7

What does style look like, part 7

Rate This
  • Comments 37
Over the course of this series on style, I've touched on a lot of different aspects, today I want to discuss aspects C and C++ style specifically.

One of the things about computer languages in general is that there are often a huge number of options available to programmers to perform a particular task.

And whenever there's a choice to be made while writing programs, style enters into the picture.  As does religion - whenever the language allows for ambiguity, people tend to get pretty religious about their personal preferences.

For a trivial example, consider the act of incrementing a variable.  C provides three different forms that can be used to increment a variable. 

There's:

  • i++,
  • ++i,
  • i+=1, and
  • i=i+1.

These are all semantically identical, the code generated by the compiler should be the same, regardless of which you chose as a developer (this wasn't always the case, btw - the reason that i++ exists as a language construct in the first place is that the original C compiler wasn't smart enough to take advantage of the PDP-8's built-in increment instruction, and i++ allowed a programmer to write code that used it).

The very first time I posted a code sample, I used my personal style, of i+=1 and got howls of agony from my readers.  They wanted to know why on EARTH I would use such a stupid construct when i++ would suffice.  Well, it's a style issue :)

There are literally hundreds of these language specific style issues.  For instance, the syntax of an if statement (or a for statement) is:

if (conditional) statement

where statement could be either a single line statement or a compound statement.  This means that it's totally legal to write:

if (i < 10)
    i = 0;

And it's totally legal to write

if (i < 10)
{
    i = 0;
}

The statements are utterly identical from a semantic point of view.  Which of the two forms you choose is a style issue.  Now, in this case, there IS a fairly strong technical reason to choose the second form over the first - by putting the braces in always, you reduce the likelihood that a future maintainer of the code will screw up and add a second line to the statement.  It also spaces out the code (which is a good thing IMHO :) (there's that personal style coming back in again)).

Other aspects of coding that ultimately devolve to style choices are:

if (i == 10)

vs

if (10 == i)

In this case, the second form is often used to prevent the assignment within an if statement problem - it's very easy to write:

if (i = 10)

which is unlikely to be what the developer intended.  Again, this is a style issue - by putting the constant on the left of the expression, you cause the compiler to generate an error when you make this programming error.  Of course, the compiler has a warning, C4706, to catch exactly this situation, so...

Another common stylistic convention that's often found is:

do {
    < some stuff >
} while (false);

This one exists to allow the programmer to avoid using the dreaded "goto" statement.  By putting "some stuff" inside the while loop, it enables the use of the break statement to exit the "loop". Personally, I find this rather unpleasant, a loop should be a control construct, not syntactic sugar to avoid language constructs.

Speaking of goto...

This is another language construct that people either love or hate.  In many ways, Edsger was totally right about goto - it is entirely possible to utterly misuse goto. On the other hand, goto can be a boon for improving code clarity.  

Consider the following code:

HRESULT MyFunction()
{
    HRESULT hr;

    hr = myCOMObject->Method1();
    if (hr == S_OK)
    {
        hr = myCOMObject->Method2();
        if (hr == S_OK)
        {
            hr = myCOMObject->Method3();
            if (hr == S_OK)
            {
                hr = myCOMObject->Method4();
            }
            else
            {
                hr = myCOMObject->Method5();
            }
        }
    }
    return hr;
}

In this really trivial example, it's vaguely clear what's going on, but it suffices.  One common change is to move the check for hr outside and repeatedly check it for each of the statements, something like:

    hr = myCOMObject->Method1();
    if (hr == S_OK)
    {
        hr = myCOMObject->Method2();
    }
    if (hr == S_OK)
 

What happens when you try that alternative implementation?

HRESULT MyFunction()
{
    HRESULT hr;

    hr = myCOMObject->Method1();
    if (hr == S_OK)
    {
        hr = myCOMObject->Method2();
    }
    if (hr == S_OK)
    {
        hr = myCOMObject->Method3();
        if (hr == S_OK)
        {
            hr = myCOMObject->Method4();
        }
        else
        {
            hr = myCOMObject->Method5();
        }
    }
    return hr;
}

Hmm.  That's not as nice - some of it's been cleaned up, but the Method4/Method5 check still requires that you indent an extra level.

Now consider what happens if you can use gotos:

HRESULT MyFunction()
{
    HRESULT hr;

    hr = myCOMObject->Method1();
    if (hr != S_OK)
    {
        goto Error;
    }
    hr = myCOMObject->Method2();
    if (hr != S_OK)
    {
        goto Error;
    }
    hr = myCOMObject->Method3();
    if (hr == S_OK)
    {
        hr = myCOMObject->Method4();
    }
    else
    {
        hr = myCOMObject->Method5();
    }
    if (hr != S_OK)
    {
        goto Error;
    }
Cleanup:
    return hr;
Error:
    goto Cleanup;
}

If you don't like seeing all those gotos, you can use a macro to hide them:

#define IF_FAILED_JUMP(hr, tag) if ((hr) != S_OK) goto tag
HRESULT MyFunction()
{
    HRESULT hr;

    hr = myCOMObject->Method1();
    IF_FAILED_JUMP(hr, Error);

    hr = myCOMObject->Method2();
    IF_FAILED_JUMP(hr, Error);

    hr = myCOMObject->Method3();
    if (hr == S_OK)
    {
        hr = myCOMObject->Method4();
        IF_FAILED_JUMP(hr, Error);
    }
    else
    {
        hr = myCOMObject->Method5();
        IF_FAILED_JUMP(hr, Error);
    }

Cleanup:
    return hr;
Error:
    goto Cleanup;
}

Again, there are no right answers or wrong answers, just choices.

Tomorrow, wrapping it all up.

  • You forgot "++i;"
  • And now I've read the lot... A stange habit I noticed in reading PERL code, and even Delphi, would be to use 'lazy logical operators', something like this:
    if(Method1()==S_OK && Method2()==S_OK && (Method3() && !Method4()))Method5();
  • Man, how could I have forgotten ++i;. Doh!

    Fixed.
  • Siggma,
    Interestingly enough, I've seen the lazy && thing done before. And I've seen it screw up hideously.

    Every time I've seen someone try this, it was to try to be clever, and they've almost always gotten it wrong. At a minimum, using constructs like this insert stumbling blocks when reading code - you have to stop and puzzle out what the author was trying to do, and if they got it right.
  • The 'lazy evaluation' style might look big and clever, but whenever I've done it before I've had horrible problems when debugging - it's just too difficult to find out which operation failed without stepping into each individual one. With Larry's method, you can just step over each function call and check the return value. I tend to write kinda verbose code for the same reason - it's so much easier to debug if you're not calling 6 functions on the same line.
  • IMHO, the goto statement is very useful when writing COM-style code that acquires/releases interface pointers, when you don't have stuff like CComPtr available. Pyramid-style code (like Larry's first MyFunction version) is just horrible and I really can't stand it. :)
    Ofcourse, I'd rather just use C++ construction/destruction semantics (i.e. smart pointers) to 'flatten' the code for stuff like that, but sometimes it's not possible.

    I've seen some code example on MSDN that did something similar but instead of using goto they used __try, __finally and __leave.
    Kinda weird if you ask me.

    I'll try to find the URL.
  • Note that choice of prefix or postfix increment in C++ can have performance implications if your variable happens to be an object that overloads these two operators (iterators are a good example). While prefix increment can simply change the internal state of the object, postfix increment must first create a copy of the object, change internal state, then return the copy.
  • __try/__finally/__leave are fascinating constructs. In particular, it's unbelievably expensive to return from inside a __try block - it takes literally tens of thousands of instructions on non x86 platforms. So most developers use a "goto-the-finally-clause" construct (usually hidden by a macro).
  • As promised: http://support.microsoft.com/default.aspx?scid=kb;en-us;118626
  • Of course there's an easier way to do what the KB recommends:

    Simply call CheckTokenMembership(mytoken, CreateWellKnownSid(Administrators), &isadmin).

  • That do/while trick caught me out for about two hours not so long ago. I really had to stare hard to work out what it was up to. Use a goto - it's clearer.

    I've got into the habit of using ++i from some STL iterator code. When you try to implement operator++(int) (the equivalent of i++) you realise that in order to provide the correct semantics, you have to copy the object, increment the original, then return the copy. There's a degree of inefficiency here. The compiler is allowed to elide (remove) temporaries, but not anything you actually declared, IIRC.

    Having said that, I rarely have more than one assignment operator in a single statement, and I very rarely use expressions containing ++. I normally use it in a line on its own, or in a for statement's third controlling expression. I also rarely use the comma operator, especially in for statements. I tend to limit myself to simple for statements in C-based languages, where a single variable is initialised in the first part, tested in the second, and modified in the third (this does include linked-list walking, not just integer variables). Anything more complicated belongs in a while, do/while, or for(;;)/break structure, depending on whether the test is needed at the top, bottom, or somewhere in the middle.
  • Mike, I agree with your comments pretty much 100% - I also rarely use comma, etc.
  • I personally like the do { } while(false) construct, just so I can avoid being yelled at for using goto.

    One thing I hate is macros that hide jumps. There is some sample code in the WMP Format SDK that has macros with whacky names (such as CPRg and COrg) which test function return values, set specially-named local variables, and then jump to specially-named labels. *shiver*
  • >> As does religion - whenever the language allows for ambiguity, people tend to get pretty religious about their personal preferences.

    Not sure what you are implying here. But, I'll use this opportunity to share the truth.

    "For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life." (Jhn 3:16)
  • John,
    Religion in this case means an unswerving faith in the correctness of your belief.

    For instance, if I say that ++i is better than i++ (or do/while(false), or if () <statement>, without facts to back it up, it's a religious argument.
Page 1 of 3 (37 items) 123