Aaargh, Part Six: One More Thing About Comments

Aaargh, Part Six: One More Thing About Comments

Rate This
  • Comments 16

Gripe #7: Use The Right Struct For The Job

I meant to include this one in yesterday's gripe about comments, as this illustrates a time when I found a comment that should never have been there.  The person who wrote the comment should have realized that the very fact that they needed to put a comment in indicated that the code needed to be rewritten.  It slipped my mind at the time though, so here it is today.

Here's a fragment of code I found in an unnamed Microsoft product once. The code is part of a fairly complicated routine for clipping a rectangle -- the details are unimportant for the purposes of this discussion. The code is correct -- it performs the correct calculation and spits out the right answer -- but boy, is it hard to read:

if (client.x > rect.right)
  default.x += ((client.x - rect.right) / 2 - rect.left);
// right == width
if (client.y > rect.bottom)
  default.y += ((client.y - rect.bottom) / 2 - rect.top);
// bottom == height

Holy cow, was that ever a bug waiting to happen. The data member that holds the width was named right and the data member that holds the height was named bottom ! I needed to change this code and spent a good thirty solid minutes looking at the routine (which was much longer than shown here), running it in my head before I concluded that the comment and the code were in fact correct.

Someone got lazy -- they already had a struct defined for a {right, left, bottom, top} rectangle but they had the information {left, width, top, height}, so they just stuffed the information they were manipulating into the struct they had handy. Aaargh! It's drivin' me nuts!

Worse, the comment is cryptic in the extreme.  It took me quite a while to figure out whether the equality operator in the comment was documenting a logical invariant (that is, the comment means “we know that the right coordinate is equal to the width because the left coordinate is always zero in this case“) or whether they were just using the equality operator to mean “has the semantics of“. 

Please don't put cryptic comments in the code explaining why code is misleading -- fix the code so that it isn't misleading anymore! In this particular case, the rectangle in question was entirely local so there was no need to have it in a struct at all. I needed to modify this code anyway, so I got rid of the struct altogether and just declared four local variables for the left, width, top and height, appropriately named.

  • I think there's a general trend that people feel that they should have very few local variables, so they reuse them to mean something different at a later point in the same routine. They fail to realise that if the compiler can deduce that, if the scope of two variables never overlaps, it can reuse the memory location for variable A for variable B (i.e. A and B actually refer to the same location). Admittedly, it only does this if you've turned optimisations on.

    IMO, the first thing to do is to split the different tasks into their own routines. I'm a bear of little brain (at times, anyway) and can't remember more than a few locals - up to about ten, perhaps. A number of my colleagues happily write four- or five-screen functions, but I can't cope with that. I start getting lost if I can't see the beginning and end braces of each control structure on the same screenful. Any bigger and I have to print it out to understand it (and don't get me started on control structures that span multiple pages, and code that won't print properly because it's too wide).
  • I would love to see more examples of comments from code by Microsoft. I'm a bit of a stickler regarding my coding and, even though I am, I know there are time when I neglect commenting. It bothers me. I guess it's one of those things that let's me know that it can happen at the best of places.

    I know it probably would never happen, but I would be fascinated with a book full of these examples with bits of code snippets and commentary on them. Kind of like a bloopers type of book for nerds. Actually, it would be better if the book had both the good and bad!!

    Start cooking one of those up!!
  • I'm terribly lazy when it comes to comments *and* variable names. I don't bother with the former unless I'm feeling particularly conscientious (i.e. almost never), and the latter follows a strange naming scheme I've been using for years:

    Any kind of number tends to get a single-character name, normally in the order q, s, v, a, b, c... Everything else either gets called foo, bar, etc., or some kind of higly abbreviated mnemonic ("flnm" for "filename" and the like).

    I blame both of these on learning to code on ancient micro BASICs which tended to only allow single-character variable names, and with the limited memory available wasting it on comments was never a good idea. The 7 or so years I spent on those before migrating to a "real" language did me no favours!

    This bit is completely off-topic, I'm afraid, but this has been puzzling me for a couple of days now and you're the expert on this sort of thing. :)

    In VB (well, VBA in Word), I keep getting ByRef errors when passing parameters to functions, and have to put an additional set of parens in the calls. In the following example I have to use "bar = dosomething((foo))" and I can't see why. The function declaration and type of the variable I'm passing are the same, so what gives?

    Here's a VB code fragment:

    Option Explicit
    Function dosomething(param As String) As String
    dosomething = param
    End Function

    Sub otherstuff
    Dim foo, bar As String
    foo = "Lorem ipsum"
    bar = dosomething(foo) ' ByRef error!
    End Sub
  • VB uses reference parameters by default (meaning that changes to the param will change the actual param variable).

    Function dosomething(param As String) As String

    should be written as

    Function dosomething(ByVal param As String) As String

    Then you avoid the ByRef error.

  • > Dim foo, bar As String

    This declares foo as a variant, and bar as a string, therefore passing foo byref through a string parameter is a no-no.

    The correct way to dim foo and bar on the same line is:

    Dim foo as string, bar as String

    You have to specifiy the type for each variable in VB.
  • Mike -- I think this explains where I've been going wrong all this time. That's what you get for making assumptions, I guess...
  • Right, the byref error is likely because the byref object cannot be coerced to a byref string.

    I've written about these issues several times in my blog.

    See:

    http://blogs.msdn.com/ericlippert/archive/2003/09/15/52996.aspx

    http://weblogs.asp.net/ericlippert/archive/2003/09/15/53005.aspx

    http://blogs.msdn.com/ericlippert/archive/2003/09/15/53006.aspx

    http://blogs.msdn.com/ericlippert/archive/2003/09/29/53117.aspx
  • It's a bit odd that nothing more than putting an extra set of parens fixes the problems. On the surface of it, (foo) and ((foo)) should be the same... This is one of those situations where I'd prefer it would Do What I Mean, Not What I Say!
  • The extra parens force an expression evaluation, which returns an RValue which can't be passed ByRef, so the compiler dummies up a temp variable of the correct type in order to store the RValue result and pass it ByRef.
  • Some discussions and links about code commenting can be found at:

    http://lambda.weblogs.com/discuss/msgReader">http://lambda.weblogs.com/discuss/msgReader$9893

    and

    http://lambda.weblogs.com/discuss/msgReader">http://lambda.weblogs.com/discuss/msgReader$3025

    Although LtU appears to be down at the moment :-(
  • Whoops, posting the links broke them. You can try to locate the posts using their id number.
  • Has for good reasons been the origin. And all right, bottom follows.
  • [ Waste Book ] 04-06-05
  • one small point on the topic - actually, I remember that that was quite common on the side of winapi's code. There, most of the locations/areas were described by tagPOINT or tagRECT, tagRECT was {top left bottom right}. Now check this:

    - point is long, long

    - rect is long,long,long,long

    very very very often they may be used as a 'vectors' and the coordinates may be simply added accordingly, ie. point+point = (x+x,y+y), rect+rect etc

    so, actually:

    - the RECT may be treated as a union of two points: topleft, bottomright

    ---- what gives you a way to use it as a ... vector {point start, point end}, and use classic rect operations for simple vector translation

    - the RECT of origin (0,0) may be treated as: 0,0,height,width [ -> client.bottom, -> client.right !! ] and further be used as undrawn window size! no real location, just the size. Usage like that is/was VERY common..

    Can't say why exactly in such way, and why not simply add few datatypes for different semantics.. probably due to the laziness and "we already have this similar thing.. hmmm", but still, things like that are very common inside some .Net control libraries :/

  • I know nobody will ever read my comment, since the article is from 2004, and it's 2010 (we're getting old, uh?). I heard somewhere once (probably one of my teachers back in college) that great code is written for humans, poor code is written for the compiler. And that we should always do the previous.

    In this case, the compiler doesn't care whether the variable is called top, bottom, right, left or sky, floor, orient, ocident. It only cares that it's a integer. But for humans, it makes a very large difference.

Page 1 of 2 (16 items) 12