Loops are gotos


Here's an interesting question I got the other day:

We are writing code to translate old mainframe business report generation code written in a BASIC-like language to C#. The original language allows "goto" branching from outside of a loop to the interior of a loop, but C# only allows branching the other way, from the interior to the exterior. How can we branch to the inside of a loop in C#?

I can think of a number of ways to do that.

First, don't do it. Write your translator so that it detects such situations and brings them to the attention of a human who can analyze the meaning of the code and figure out what was meant by the author of this bad programming practice. Branching into the interior of a loop is illegal in C# because doing so would skip all of the code that is necessary to ensure the correct behaviour of the loop. Odds are good that this pattern indicates a bug or at least some bad smelling code that should be looked at by an expert.
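One way a translator might flag these cases is sketched below. It assumes the translator already knows the line range of each FOR...NEXT loop and the source and target line of each GOTO. The `BranchChecker` class and the `Loop` and `Jump` records are hypothetical names made up for illustration, not part of any real tool.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BranchChecker
{
    // Hypothetical model: a FOR..NEXT loop occupying source lines Start..End,
    // and a GOTO on line From that targets line Target.
    public sealed record Loop(int Start, int End);
    public sealed record Jump(int From, int Target);

    // Returns the jumps that start outside a loop and land inside its body.
    // Jumping to the FOR line itself is fine: that just starts the loop normally.
    public static List<Jump> JumpsIntoLoops(
        IEnumerable<Loop> loops, IEnumerable<Jump> jumps)
    {
        var loopList = loops.ToList();
        return jumps
            .Where(j => loopList.Any(l =>
                j.Target > l.Start && j.Target <= l.End &&   // lands in the body
                (j.From < l.Start || j.From > l.End)))       // starts outside
            .ToList();
    }
}
```

For a loop on lines 30 to 60, a GOTO 50 on line 20 is reported, while a GOTO 70 on line 40 (a branch out of the loop) is not.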

Second, this pattern is not legal in C# or VB.NET, but perhaps it is legal in another useful modern language. You could write your translator to translate to some other language instead of C#.

Third, if you want to do this automatically and you need to use C#, the trick is to change how you generate your loops. Remember, loops are nothing more than a more pleasant way to write a "goto". Suppose your original code looks something like this pseudo-basic fragment:

10   J = 2
20   IF FOO THEN GOTO 50
30   FOR J = 1 TO 10
40     PRINT J
50     PRINT "---"
60   NEXT
70   PRINT "DONE"

The "obvious" way to translate this program fragment into C# doesn't work:

     int J = 2;
     if (FOO) goto L50;
     for(J = 1 ; J <= 10; J = J + 1)
     {
       PRINT (J);
L50:   PRINT ("---");
     }
     PRINT ("DONE");

because there's a branch into a block. But you can eliminate the block by recognizing that a "for" loop is just syntactic sugar for "goto". If you translate the loop as:

     int J = 2;
     if (FOO) goto L50;
     J = 1;
L30: if (!(J <= 10)) goto L70;
       PRINT (J);
L50:   PRINT ("---");
     J = J + 1;
     goto L30;
L70: PRINT ("DONE");

then there is no problem: there is no block to branch into, so the goto is legal. This code is hard to read, so you probably want to detect the situation of branching into a loop and generate the "unsugared" loop only when you need to.
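To check that the unsugared form behaves as intended, here is a minimal self-contained sketch. It assumes that PRINT simply appends a line to a list and that FOO is passed in as a parameter; the `UnsugaredLoop` class, the `Run` method and the `output` list are made-up names for illustration.

```csharp
using System.Collections.Generic;

public static class UnsugaredLoop
{
    // Runs the translated fragment and returns the lines it would print.
    public static List<string> Run(bool foo)
    {
        var output = new List<string>();
        int j = 2;
        if (foo) goto L50;          // branch into what used to be the loop body
        j = 1;                      // loop initializer
    L30:
        if (!(j <= 10)) goto L70;   // loop condition
        output.Add(j.ToString());
    L50:
        output.Add("---");
        j = j + 1;                  // loop increment
        goto L30;
    L70:
        output.Add("DONE");
        return output;
    }
}
```

With `foo` false this produces 1, ---, 2, --- and so on up to 10, ---, DONE. With `foo` true it starts at the separator with J still equal to 2, so the first number printed is 3.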

It is difficult to do a similar process that allows arbitrary branching in a world with try-finally and other advanced control flow structures, but hopefully you do not need to. Old mainframe languages seldom have complex control flow structures like "try-finally".

  • Back in high school, in my first programming class (QBasic), our instructor forbade us from using gotos... if your assignment had one goto in it... he would trash it... THANK GOD! 8 years later, MCAD certified, working for an awesome company... and I would never have gotten this far if I had been allowed to think that goto was a good thing...

  • Eliminating goto was supposed to be the panacea for spaghetti code, and yet it continues to plague what could otherwise be fabulous code... Curious...

  • Guys, goto is not by itself a bad thing in any way. Its evilness is that it is very easy to misuse, but by itself it is a useful tool in any imperative programming language. Did you never have to break out of a nested loop? Or a switch nested inside a loop? And if you're coding in C (not C++), goto is essentially the only robust way to do proper error handling, when you have to check for errors (and potentially clean up) after every single function call.

    The extreme argument against goto ("Goto is always evil. Always!") is also the argument against break and return, and, to some extent, throw. If you are truly willing to go to these lengths, then you should go all the way, and claim that nothing less than the purity of Haskell is desirable. Which would at least be a consistent position, even though not a very pragmatic one.

    But, so long as you stick to an imperative language, some form of goto is necessary and desirable.
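As an illustration of the nested-loop case mentioned above, here is a small sketch in C#. The `NestedSearch` class and its `Find` method are hypothetical; the point is only that a single goto leaves both loops at once, which a plain `break` cannot do.

```csharp
public static class NestedSearch
{
    // Returns the position of the first occurrence of target,
    // or (-1, -1) when it is not present.
    public static (int Row, int Col) Find(int[,] grid, int target)
    {
        int row = -1, col = -1;
        for (int r = 0; r < grid.GetLength(0); r++)
        {
            for (int c = 0; c < grid.GetLength(1); c++)
            {
                if (grid[r, c] == target)
                {
                    row = r;
                    col = c;
                    goto Done;   // leaves both loops in one step
                }
            }
        }
    Done:
        return (row, col);
    }
}
```

Branching from the interior of a loop to the exterior like this is the direction that C# allows.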

  • By the way, you can do a Duff's device in C#:

    static void DuffsDevice<T>(T[] from, T[] to)
    {
        int count = from.Length;
        int position = 0;
        // Guard: with an empty array, case 0 would try to copy eight elements.
        if (count == 0) return;
        switch (count % 8)
        {
            case 0:
                to[position] = from[position++];
                goto case 7;
            case 7:
                to[position] = from[position++];
                goto case 6;
            case 6:
                to[position] = from[position++];
                goto case 5;
            case 5:
                to[position] = from[position++];
                goto case 4;
            case 4:
                to[position] = from[position++];
                goto case 3;
            case 3:
                to[position] = from[position++];
                goto case 2;
            case 2:
                to[position] = from[position++];
                goto case 1;
            case 1:
                to[position] = from[position++];
                // Copy the next full group of eight, if any elements remain.
                if ((count -= 8) > 0) goto case 0; else break;
        }
    }

  • @pminaev:

    An example of combined error-handling in C without using goto:

    bool got_error = FALSE;
    got_error = do_func_1();
    got_error = got_error && do_func_2();
    got_error = got_error && do_func_3();
    got_error = got_error && do_func_4();
    if (got_error) {
        // combined error handling
    }

  • Oops, wrong logic (it's early) but you get the idea.

  • That's assuming your functions return bool to indicate error - what about an int error code (note that in that case, && cannot be used, as it will coerce any non-0 "truth" value to 1)?

    In practice, most C code I've seen uses "if ... goto error" and so on. Probably also because it's far clearer than this trick, too. As I've said earlier, sometimes goto is the right way to solve the problem, and working around it only makes things more complicated, all for the sake of "not saying the dreaded word".

  • We constantly use gotos in all forms except for jumping into a loop or back to the top of a function.

    This is our way of drastically reducing the nesting level of if then else/for loops/etc. This point is lost when most younger developers trust the framework to clean up their own objects/allocations and ignore the longer-term support cost for a particular block of code.

    We've been burned with offshore written code that has 15+ levels of nesting in 500+ line functions because the developers refused to use gotos, returns and apparently got paid more for each line of code.

    Common code we've seen

    if (file open is ok)
    {
        500+ lines of code
    }
    else
    {
        do some error handling
    }

    This would be nested inside of a loop or multiple levels down inside of an if/then else block.

    Another example would be multiple nested try/catch statements, each handling a resource allocation/connection failure, instead of the commonly used block:

    if allocate 1 fails goto cleanup_one
    if allocate 2 fails goto cleanup_two
    if allocate 3 fails goto cleanup_three
    body of function (or better, a function call to the body, to make it easier to see that the allocate and cleanup operations are done in the correct order)
    cleanup_three: do cleanup of object 3
    cleanup_two: do cleanup of object 2
    cleanup_one: do cleanup of object 1
    return

  • @pminaev: An int error code can easily be handled in the same way by combining got_error with the result of comparing the function's return value to the known good value, e.g. if a negative return value indicates an error, then:

    got_error = got_error || (func1() < 0);

    If you dislike the "trick" with the logic short-circuiting then you can use a more explicit version:

    bool got_error = FALSE;
    got_error = do_func_1();
    if (!got_error)
        got_error = do_func_2();
    if (!got_error)
        got_error = (do_func_3() < 0);
    if (got_error) {
        // combined error handling
    }

    Either way, the only advantage to the goto approach is that it saves a couple of clock cycles. On most platforms, including most embedded systems I've worked on, a few clock cycles are negligible.

  • > Either way, the only advantage to the goto approach is that it saves a couple of clock cycles.

    You still miss my point. The advantage of the goto error handling approach is that it is usually more readable than any other workaround. Workarounds for the sake of them are not a good idea.

  • I always used to think that the GOTO statement was a bad thing, and programming should only use loops and other "prettier" flow techniques.

    Over the last few weeks I was playing around writing an MSIL disassembler to look deeper into the code I write... and the first thing I noticed was that the compiler translates loops into a series of goto blocks. Typically a for loop would look something equivalent to:

            (Initialise looping variable)
            goto LABEL1;
    LABEL2: (Whatever instructions are inside the block)
            (Increment or decrement looping variable as required)
    LABEL1: if (Check for loop to be run) goto LABEL2;

    So a trivial loop:

    for (int ix = 0; ix < 10; ix++)
    {
        do_something();
    }
    do_loop_finished();

    gets translated to:

            int ix = 0;
            goto LABEL1;
    LABEL2: do_something();
            ix++;
    LABEL1: if (ix < 10) goto LABEL2;
            do_loop_finished();

    So I am now wondering why we are always told to use loops rather than gotos, when they are just converted back to gotos, thus adding an additional level of translation (and hence sub-optimal code)!
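The shape described in the comment above can also be written by hand in C#. The sketch below puts an ordinary for loop next to a goto version with the test at the bottom; the `LoopShapes` class and both method names are hypothetical, and this is not a claim about exactly what any particular compiler emits. Both methods visit the same values in the same order.

```csharp
using System.Collections.Generic;

public static class LoopShapes
{
    // The loop as one would normally write it.
    public static List<int> WithFor(int limit)
    {
        var seen = new List<int>();
        for (int ix = 0; ix < limit; ix++)
        {
            seen.Add(ix);
        }
        return seen;
    }

    // The same loop with the condition checked at the bottom.
    public static List<int> WithGoto(int limit)
    {
        var seen = new List<int>();
        int ix = 0;
        goto Label1;                 // check the condition before the first pass
    Label2:
        seen.Add(ix);
        ix++;
    Label1:
        if (ix < limit) goto Label2;
        return seen;
    }
}
```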

  • I agree that gotos can be VERY evil...

    BUT, they are very useful. I had to write a parser for files that could reach up to 20 gigs. Their content was never known until they were read. I came up with a 400 line optimised function; each of the other programmers came up with 700+ line functions.

    They had similar flow patterns, except I used goto statements.

    <quote>

    Common code we've seen

    if (file open is ok)
    {
        500+ lines of code
    }
    else
    {
        do some error handling
    }

    </quote>

    solution...

    if (!FileOpenIsOk)
    {
        error handling
    }
    function continues

  • @pminaev:

    I agree that avoiding "goto" purely out of superstition is a bad idea. But I disagree that a goto version of that code is actually any more readable.

    @Russell Anscomb:

    Of course. And this is true of most languages. Loops are all just syntactic sugar for various Branch and Jump assembly instructions. Languages themselves are just abstractions of machine code. If you want absolute speed at all cost then why are you using a high-level language at all? You need to write directly in assembler (and you also need to be a genius to outperform most modern compilers).

    However the rest of us see the benefit of readable and maintainable high-level code over absolute performance.

  • It's probably also worth pointing out that in a modern language, like C#, most error handling will be done by exceptions - which in many ways offer all the same benefits as gotos in this scenario, with the added advantage of carrying information about why they were raised.

  • Of course ALL flow control becomes "goto" (possibly conditional) at the machine level. Anyone who does not understand that needs to just hang up their hat.

    Much more dangerous than "goto" is the infamous "come from" instruction. When using this, you can cause an arbitrary transfer from any point in the code:

    10 x = 1
    20 y = 1
    30 x = 2
    40 Print x,y,z
    5000 Come From 20
    5001 y = 99
    5002 GoBack

    (Sorry for posting this 14 days before the 35th anniversary of its original publication.)
