Loops are gotos

Here's an interesting question I got the other day:

We are writing code to translate old mainframe business report generation code written in a BASIC-like language to C#. The original language allows "goto" branching from outside of a loop to the interior of a loop, but C# only allows branching the other way, from the interior to the exterior. How can we branch to the inside of a loop in C#?

I can think of a number of ways to do that.

First, don't do it. Write your translator so that it detects such situations and brings them to the attention of a human who can analyze the meaning of the code and figure out what was meant by the author of this bad programming practice. Branching into the interior of a loop is illegal in C# because doing so would skip all of the code that is necessary to ensure the correct behaviour of the loop. Odds are good that this pattern indicates a bug or at least some bad smelling code that should be looked at by an expert.

Second, this pattern is not legal in C# or VB.NET, but perhaps it is legal in another useful modern language. You could write your translator to translate to some other language instead of C#.

Third, if you want to do this automatically and you need to use C#, the trick is to change how you generate your loops. Remember, loops are nothing more than a more pleasant way to write a "goto". Suppose your original code looks something like this pseudo-basic fragment:

10   J = 2
20   IF FOO THEN GOTO 50
30   FOR J = 1 TO 10
40     PRINT J
50     PRINT "---"
60   NEXT
70   PRINT "DONE"

The "obvious" way to translate this program fragment into C# doesn't work:

     int J = 2;
     if (FOO) goto L50;
     for(J = 1 ; J <= 10; J = J + 1)
     {
       PRINT (J);
L50:   PRINT ("---");
     }
     PRINT ("DONE");

because there's a branch into a block. But you can eliminate the block by recognizing that a "for" loop is just sugar for "goto". If you translate the loop as:

     int J = 2;
     if (FOO) goto L50;
     J = 1;
L30: if (!(J <= 10)) goto L70;
       PRINT (J);
L50:   PRINT ("---");
     J = J + 1;
     goto L30;
L70: PRINT ("DONE");

then there is no block to branch into, so there's no problem with the goto. This code is hard to read, so you probably want to detect branches into a loop and generate the "unsugared" form only when you need to.
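To watch the unsugared form run, here is a minimal sketch written as a C# 9 (or later) top-level program. It assumes FOO arrives as a bool parameter and that PRINT appends to a list instead of writing a report; `RunDesugared` and `output` are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the translated fragment: FOO becomes a parameter
// and PRINT becomes output.Add, so the result can be inspected.
static List<string> RunDesugared(bool foo)
{
    var output = new List<string>();
    int j = 2;                    // 10 J = 2
    if (foo) goto L50;            // 20 IF FOO THEN GOTO 50
    j = 1;                        // 30 FOR J = 1 TO 10 (initializer)
L30:
    if (!(j <= 10)) goto L70;     // 30 (loop condition)
    output.Add(j.ToString());     // 40 PRINT J
L50:
    output.Add("---");            // 50 PRINT "---"
    j = j + 1;                    // 60 NEXT (increment)
    goto L30;
L70:
    output.Add("DONE");           // 70 PRINT "DONE"
    return output;
}

Console.WriteLine(string.Join(" ", RunDesugared(false)));
Console.WriteLine(string.Join(" ", RunDesugared(true)));
```

With FOO false the list starts 1, ---, 2, ---. With FOO true it starts ---, 3, ---, because the jump skips both the initializer and the first PRINT J, so J still holds the 2 from line 10 when it is first incremented.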

It is difficult to do a similar process that allows arbitrary branching in a world with try-finally and other advanced control flow structures, but hopefully you do not need to. Old mainframe languages seldom have complex control flow structures like "try-finally".

  • So you can't jump to a goto label within a loop from outside the loop? I didn't know this. But what's that piece of code that Reflector shows when decompiling a yield return state machine?

       private bool MoveNext()
       {
           try
           {
               switch (this.<>1__state)
               {
                   case 0:
                       this.<>1__state = -1;
                       this.<>7__wrap4 = this.<>4__this.GetNodes().GetEnumerator();
                       this.<>1__state = 1;
                       while (this.<>7__wrap4.MoveNext())
                       {
                           this.<node>5__3 = this.<>7__wrap4.Current;
                           this.<>4__this.m_alreadyReturnedNodes.Add(this.<node>5__3);
                           this.<>2__current = this.<node>5__3;
                           this.<>1__state = 2;
                           return true;
                       Label_008C:
                           this.<>1__state = 1;
                       }
                       this.<>m__Finally5();
                       break;
                   case 2:
                       goto Label_008C;
               }
               return false;
           }
           fault
           {
               this.System.IDisposable.Dispose();
           }
       }

    Is Reflector just making up a while statement, while the original IL statements are just gotos?

    Regards Florian
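On the Reflector question: IL has only branch instructions and no loop construct, so any "while" in decompiled output is the decompiler's reconstruction. The effect is easy to observe from ordinary C#. The sketch below is a C# 9 (or later) top-level program; the iterator `Countdown` and the `trace` list are hypothetical names chosen for this illustration. The second MoveNext() call resumes in the middle of the loop body, which is exactly the jump that cannot be written by hand as a goto.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical iterator: each MoveNext() call resumes just after the previous
// yield return, which sits in the middle of the loop body.
static IEnumerable<int> Countdown(int from, List<string> trace)
{
    for (int i = from; i > 0; i--)
    {
        trace.Add("before yield " + i);
        yield return i;
        trace.Add("after yield " + i);   // execution re-enters the loop here
    }
}

var events = new List<string>();
using (var e = Countdown(2, events).GetEnumerator())
{
    e.MoveNext();   // runs from the top of the method to the first yield
    e.MoveNext();   // resumes inside the loop body, then runs to the next yield
}
Console.WriteLine(string.Join("; ", events));
// prints "before yield 2; after yield 2; before yield 1"
```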

  • The discussion is moving a little away from the first remark about automatically migrated code.

    Yes, excessive and wrong use of goto is very bad. But manually looking at 10,000,000 lines to remove them is a huge task. Analyzing and refactoring code solves a number of them, but still not all.

    The IL itself does allow it so why not the compiler. Let the programmer be the judge to use it or not.

    In VB.NET a goto into a For is indeed not allowed, but a goto into an If is. And possibly other code levels?

           Dim j As Integer
           j = 1
           GoTo l50    ' <---- allowed
           If (1 = 1) Then
               j = 2
    l50:
               j = 3
           End If
           GoTo l60  '<----------- not allowed
           For j = 1 To 5
    l60:
           Next

  • I'm surprised that people can be so 'religious' in their renouncing of gotos. I thought that kind of single-mindedness was a Java guy's thing.

    Yes, most of the time the higher-level control flows are better, but there are exceptions.

    Take the example of a control flow schema from Visio. When you want to implement that with functions, you have a problem because after the call ends, you end up where you called the function. That's not what you want; you wanted to go to another Process or Decision. (Note the use of the words 'go' and 'to' ;-) In those cases a program with gotos is much more readable than trying to do that with functions.

    Also, every function call is a push to the stack, and they are not cheap. If you need really fast execution of your code, gotos can actually help you squeeze out that last bit of speed, although this can be at the expense of readability, I'll agree there.

    Regards Gert-Jan

  • > The IL itself does allow it so why not the compiler. Let the programmer be the judge to use it or not.

    Because the compiler must enforce a semantic level of meaning on the code so that it can effectively generate IL code that does what you wrote in the higher level language.

    Consider the difference between for loops and if-else statements. There is quite a bit of difference in the amount of setup required to make each work.

    if-else: No setup required, beyond performing the test condition and branching to the else part. Since there's no setup, it's no problem to goto into either the if-body or else-body.

    for loop: At minimum, initializing the loop index and checking that it meets the test condition. (foreach is even worse... find the appropriate enumerator, get that all set up, etc., etc...) With all this setup, how do you reasonably do a goto into the body of a for loop AND get provably correct code for EVERY case? Not too likely. (On the other hand, you can goto out of a for loop, because that amounts to little more than a break.)
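To make the foreach setup concrete, here is a small sketch (a C# 9 or later top-level program) that sums a sequence twice: once with foreach and once with a hand-written expansion of roughly the shape the compiler produces for an IEnumerable<int>. `SumWithForeach` and `SumExpanded` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// The loop header hides the enumerator setup and teardown.
static int SumWithForeach(IEnumerable<int> items)
{
    int total = 0;
    foreach (int item in items) total += item;
    return total;
}

// Roughly what the foreach above expands to, written out by hand.
static int SumExpanded(IEnumerable<int> items)
{
    int total = 0;
    IEnumerator<int> e = items.GetEnumerator();   // setup
    try
    {
        while (e.MoveNext())                      // advance and test
        {
            int item = e.Current;
            total += item;   // a goto landing here would skip GetEnumerator()
        }
    }
    finally
    {
        e.Dispose();                              // teardown
    }
    return total;
}

var numbers = new List<int> { 1, 2, 3 };
Console.WriteLine(SumWithForeach(numbers) + " " + SumExpanded(numbers));
// prints "6 6"
```

A jump into the body of the expanded form would use an enumerator that was never created, which is the kind of skipped setup the comment describes.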

  • I meant flowchart from Visio to be precise.

  • Nothing like stepping on a setjmp / longjmp landmine. I've seen some pretty clever exception-like handling built around this construct using C/C++. It drove me nuts till I figured it out. Another developer had called setjmp from main, then whenever his deep library code encountered an error he'd call longjmp. The longjmp would set the instruction pointer back to the beginning of the program.

  • What would people say at goto's funeral?

    goto's wife: "Oh my sweet goto, I can't loop without you"

    goto's kids: "we want our goto back"

    goto's 3rd cousins' brother in-laws' 5th pets' owner: "He was pure evil, evil I tell you!"

  • I would suggest converting the code to MSIL, then decompiling the MSIL into C#. That way you get C# code, but it will still have loops wherever possible.

  • This code

    10   J = 2
    20   IF FOO THEN GOTO 50
    30   FOR J = 1 TO 10
    40     PRINT J
    50     PRINT "---"
    60   NEXT
    70   PRINT "DONE"

    can be written in C# using a new bool variable:

    j = 2;
    bool foo1 = foo;
    if (!foo)
       j = 1;
    for (; j <= 10; j ++)
    {
       if (!foo1)
           print (j);
       foo1 = false;
       print ("---");
    }
    print ("DONE");

    No code is repeated twice!

    You can use foo instead of foo1 when this variable is not required in another part of the code.
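The flag-variable rewrite can be checked by running it. This sketch (a C# 9 or later top-level program) assumes foo is a bool parameter and print appends to a list; `RunWithFlag` and `skipFirstPrint` (the comment's foo1) are illustrative names, not part of the original suggestion.

```csharp
using System;
using System.Collections.Generic;

// Goto-free version: a flag suppresses the first PRINT J when FOO is set.
static List<string> RunWithFlag(bool foo)
{
    var output = new List<string>();
    int j = 2;
    bool skipFirstPrint = foo;        // the comment's foo1
    if (!foo) j = 1;                  // normal loop entry starts at 1
    for (; j <= 10; j++)
    {
        if (!skipFirstPrint) output.Add(j.ToString());
        skipFirstPrint = false;       // only the first iteration is special
        output.Add("---");
    }
    output.Add("DONE");
    return output;
}

Console.WriteLine(string.Join(" ", RunWithFlag(false)));
Console.WriteLine(string.Join(" ", RunWithFlag(true)));
```

With foo true the output starts ---, 3, ---, matching what the BASIC fragment does when it jumps to line 50 with J still equal to 2.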

  • I don't think anyone here fails to realize that all flow-of-control structures end up as GOTOs at a machine level. The point is that the machine can manage the GOTOs, but for humans it's like reading a story, telling someone to jump two pages ahead, then jump three pages back. Yeah, you can do it, but geez, does that make it a good idea? Heavens - the computer can count to a zillion in binary, but that doesn't mean it's a good idea for human consumption.

    That aside, my personal beef isn't with "goto's" per se, but when they are being used in obviously lazy situations when just a few minutes' effort would have created a more streamlined control flow structure. In *most* cases, a GOTO is a hack, an amputation of logic where a few stitches of design and discretion would have avoided an ugly and usually unnecessary scar. I've rarely seen a structure made *more* elegant with a blunt GOTO. But that's just me...

  • GOTO is all about sequence in the context of state. Code in this form is inimical to parallelisation.

    This is easy to automate with functional programming languages, where dependencies on evaluation sequence are necessarily explicit. It is damn near impossible to automate in procedural languages where many dependencies are not only implicit but often indirectly implicit (due to side effects).

    When a language requires explicit declaration of dependencies, laziness inhibits their spurious creation. Better quality code results, and it is incidentally more amenable to optimisation for multiple CPUs.

    As in life, time is the great enemy.

  • I think people are confusing GOTO with branching.

    GOTO is a programming statement which allows unstructured branching. Other programming statements (if-then, for, while, throw) only allow branching within a particular context or programming structure. Unstructured branching allows you to write incomprehensible code *very* easily. For example (using pseudo-basic)

    10 x = 110
    20 if ( x is even ) goto 40
    30 x = 3 * x + 1
    40 x = x / 2
    50 if ( x > 10 ) goto 20
    60 print x

    I mean, never mind what x might be at the end, can we even be sure this will finish for different start values of x?

    (Please don't get caught up on this being a mathematical example, BTW. Statements 20 to 50 could be totally non maths related - as long as they interfere with each other you are going to get a problem).

    It's very easy with unstructured branching to start getting this sort of effect. Of course, you could use GOTO to duplicate structured branching (like duplicating a while loop), but why do it? It is far nicer when you come to look at your code to know there are *no* GOTOs in it rather than have to analyse whether each individual GOTO is sensible or not.
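For comparison, here is a sketch of the fragment above written both ways in C# (a C# 9 or later top-level program). `WithGoto` and `Structured` are hypothetical names, and both take the start value as a parameter instead of hard-coding 110. The rewrite does not answer the termination question, but it does make the shape visible: one loop whose body always halves x, after first applying 3x + 1 when x is odd.

```csharp
using System;

// Direct transcription of lines 20 to 50, using labels and goto.
static int WithGoto(int x)
{
L20:
    if (x % 2 == 0) goto L40;   // 20 if ( x is even ) goto 40
    x = 3 * x + 1;              // 30
L40:
    x = x / 2;                  // 40
    if (x > 10) goto L20;       // 50
    return x;                   // 60 print x
}

// The same logic as a single do-while loop.
static int Structured(int x)
{
    do
    {
        if (x % 2 != 0) x = 3 * x + 1;
        x = x / 2;
    } while (x > 10);
    return x;
}

// Compare the two versions for a handful of start values.
foreach (int start in new[] { 11, 12, 110 })
{
    Console.WriteLine(start + ": " + WithGoto(start) + " " + Structured(start));
}
```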

    As to the oft quoted example of error handling in a sequence of method calls, I believe this indicates bad design. First of all I think you need to carefully consider what *is* an error, then isolate your error handling in one group of methods or classes or whatever, so that the rest of your code isn't burdened with constantly having to check whether things are happening ok. i.e., it is better to write:

    method_a()
    {
        method_a_1();
        method_a_2();
        etc
    }

    - than:

    method_a()
    {
        if ( method_a_1() == ok )
        {
            if ( method_a_2() == ok )
            {
                etc
    }}}}}}}}}}}}}}}.

    - or some equivalent using gotos.

    Where you simply are unable to pre-check for an error (like that nuclear refinery you were monitoring has just gone off-line), you should probably use an exception and exit (gracefully) until somebody fixes the problem.

    Richard
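One way to read the advice above in C# terms is sketched below as a C# 9 (or later) top-level program. It is only an illustration under assumptions: `StepOne`, `StepTwo`, and `MethodA` are hypothetical stand-ins for method_a_1, method_a_2, and method_a, and the steps signal failure by throwing instead of returning a status code.

```csharp
using System;

// Hypothetical step that throws instead of returning an error code.
static void StepOne(bool fail)
{
    if (fail) throw new InvalidOperationException("step one failed");
}

static void StepTwo()
{
    // Nothing to do in this sketch; a real step would do its work here.
}

// The happy path reads straight down; error handling lives in one place.
static string MethodA(bool failFirstStep)
{
    try
    {
        StepOne(failFirstStep);
        StepTwo();
        return "ok";
    }
    catch (InvalidOperationException ex)
    {
        return "error: " + ex.Message;
    }
}

Console.WriteLine(MethodA(false));   // prints "ok"
Console.WriteLine(MethodA(true));    // prints "error: step one failed"
```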

  • for(J = FOO ? 3 : 1; J <= 10; J++)
      PRINT (string.Format("{0}\r\n---",J));
    PRINT ("Done");

  • What I was trying to say is that the code should be refactored properly. Using Goto is ugly and difficult to read. Simply check what the code is trying to achieve and find the best way to replicate that functionality.

    Which you will note was my first suggestion, yes. But what if you have five hundred thousand lines of goto-ridden mainframe code to translate? "Ugly and difficult to read" is better than "not working", and cheaper and more accurate than "translate five hundred thousand lines of code by hand". -- Eric

    Just because the original code uses a Goto, that doesn't mean our new code has to use any at all. We could similarly do this:

    int j = FOO ? 3 : 1;
    while (j<=10)
      PRINT (string.Format("{0}\r\n---",J++));
    PRINT ("Done");

    Using your human intelligence to create a goto-free program that matches the semantics of one particular code fragment is easy. Writing a program that does that to any arbitrary fragment is rather more difficult, I assure you. -- Eric

  • (with a lower case j in line 3, obviously)
