Loops are gotos

Loops are gotos

Rate This
  • Comments 58

Here's an interesting question I got the other day:

We are writing code to translate old mainframe business report generation code written in a BASIC-like language to C#. The original language allows "goto" branching from outside of a loop to the interior of a loop, but C# only allows branching the other way, from the interior to the exterior. How can we branch to the inside of a loop in C#?

I can think of a number of ways to do that.

First, don't do it. Write your translator so that it detects such situations and brings them to the attention of a human who can analyze the meaning of the code and figure out what was meant by the author of this bad programming practice. Branching into the interior of a loop is illegal in C# because doing so would skip all of the code that is necessary to ensure the correct behaviour of the loop. Odds are good that this pattern indicates a bug or at least some bad smelling code that should be looked at by an expert.

Second, this pattern is not legal in C# or VB.NET, but perhaps it is legal in another useful modern language. You could write your translator to translate to some other language instead of C#.

Third, if you want to do this automatically and you need to use C#, the trick is to change how you generate your loops. Remember, loops are nothing more than a more pleasant way to write a "goto". Suppose your original code looks something like this pseudo-basic fragment:

10   J = 2
20   IF FOO THEN GOTO 50
30   FOR J = 1 TO 10
40     PRINT J
50     PRINT "---"
60   NEXT
70   PRINT "DONE"

The "obvious" way to translate this program fragment into C# doesn't work:

     int J = 2;
     if (FOO) goto L50;
     for(J = 1 ; J <= 10; J = J + 1)
     {
       PRINT (J);
L50:   PRINT ("---");
     }
     PRINT ("Done");

because there's a branch into a block. But you can eliminate the block by recognizing that a "for" loop is just a sugar for "goto". If you translate the loop as:

     int J = 2;
     if (FOO) goto L50;
     J = 1;
L30: if (!(J <= 10)) goto L70;
       PRINT (j);
L50:   PRINT ("---");
     J = J + 1;
     goto L30;
L70: PRINT ("done");

and now, no problem. There's no block to branch into, so there's no problem with the goto. This code is hard to read so you probably want to detect the situation of branching into a loop and only do the "unsugared" loop when you need to.

It is difficult to do a similar process that allows arbitrary branching in a world with try-finally and other advanced control flow structures, but hopefully you do not need to. Old mainframe languages seldom have complex control flow structures like "try-finally".

  • Your point about the difficulty in understanding convoluted code is well taken.  In fact some of the coded solutions produced wrong results.  In the case where FOO is true they fail to start and end the list with "---".

    Actually, your example and solution point out the fundamental translation problem and solution.  Antique BASIC treated for a FOR loop index as a global variable and the loop condition was checked in the NEXT statement (Always on iteration was executed).  The best way to handle the translation problem is to break the FOR loops into initialization, Value adjustment, and conditional jumps, much like you did ( I would check end conditions at the NEXT statement ).

    There are also wonderful dialect specific factors like what is the value of j after the loop?  Is it 10 or 11?  That would depend on the specific implementation and is unfortunately something that may matter for the code that followed.  After all it was often possible to jump out of FOR loops before termination.

  • if (StatusOK) {

      if (DataAvailable) {

         ImportantVariable = x;

         goto MID_LOOP;

      }

    } else {

      ImportantVariable = GetSomeValue();

      MID_LOOP:

      //Lots of code

    }

    I'm disappointed that no-one has mentioned Steve McConnell yet.

  • Eric:

    If you have 500,000 lines of working code which is riddled with GOTO statements, and you then replicate it on a line-by-line (or fragment-by-fragment if you prefer) basis, there's no guarantee that it's going to work in the target environment. There are other factors such as that pointed out by bmann: The value of J upon exit may be different, which means the code could fail catastrophically at its first run, or even at some later point when FOO eventually does equal True, and then you'd be left with the serious headache of having to debug it. If you untangled it properly, it would make that debugging much more simple and the code would be readable to anyone else who came to look at it.

    Aside from that, if it's working, why would you move it to another language?

    bmann: Good point... but with our new refactored code we should be able to see exactly where the problem is and what the problem is at aa glance :)

  • Seriously, we have methods now...... If you need to jump to something inside a loop, then it should probably be in a separate method.

  • With regard to Duff's device (see comments from A.C.Hynes and Matthew Whited above) here is another C# version which is closer to the original.   C# is surprisingly expressive at times:

    namespace DuffsDevice

    {

       using System;

       class Program

       {

           static void Main(string[] args)

           {

               unsafe

               {

                   short[] data = new short[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

                   short register = -1;

                   fixed (short* dataPtr = &data[0])

                   {

                       DuffsDevice(&register, dataPtr, data.Length);

                   }

                   Console.WriteLine(register);

               }

           }

           /// <summary>

           /// C# approximation to Duff's device.

           ///

           /// void send(register short *to, register short *from, register int count)

           /// {

           ///     register int n=(count+7)/8;

           ///     switch(count%8)

           ///     {

           ///         case 0: do{ *to = *from++;

           ///         case 7:     *to = *from++;

           ///         case 6:     *to = *from++;

           ///         case 5:     *to = *from++;

           ///         case 4:     *to = *from++;

           ///         case 3:     *to = *from++;

           ///         case 2:     *to = *from++;

           ///         case 1:     *to = *from++;

           ///         }while(--n>0);

           ///     }

           /// }

           /// </summary>

           /// <param name="to"></param>

           /// <param name="from"></param>

           unsafe static void DuffsDevice(short* to, short* from, int count)

           {

               int n = (count + 7) / 8;

               switch (count % 8)

               {

                   case 0:

                       *to = *from++;

                       goto case 7;

                   case 7:

                       *to = *from++;

                       goto case 6;

                   case 6:

                       *to = *from++;

                       goto case 5;

                   case 5:

                       *to = *from++;

                       goto case 4;

                   case 4:

                       *to = *from++;

                       goto case 3;

                   case 3:

                       *to = *from++;

                       goto case 2;

                   case 2:

                       *to = *from++;

                       goto case 1;

                   case 1:

                       *to = *from++;

                       if (--n > 0) goto case 0; else break;

               }

           }

       }

    }

  • re: Duff's Device... Hand-rolled optimization in C#? You guys are funny.

  • Turns out that a simple loop to iterate through a safe array can be a bit faster than my version of Duff's Device in .NET.   A loop to iterate through allocated stack space (using stackalloc) using an unsafe pointer is much faster.   So, 'Duff's device' in C# is sub-optimal.   A case of hand-rolled de-optimization!   Full marks to C# for being so expressive, but don't use the technique!

  • this is funny stuff.

    We are being guided to the old, frustrating, spider webs of looping with GOTOs.

    this shouldnt practically be an approach in professional development using C#.

  • One time long ago I inherited a program written in C-tran.  The author knew and thought in FORTRAN but had to code the app in C, and a veritable mess of code resulted that contained over 400 GO TO stetements.   We stumbled along maintiaing it, never given the time to rewrite what was "working" but always allowed the time to analyze the mess of spaghetti for unintended consequences of the changes we were planning to make (and deal with the ones we did not foresee).

    We recruited a bright guy from the customer support team to come into the product development team and his soel condition for accepting the position was they he be allowed to rewrite that app.  It was accepted and the final product after3 months was easier to understand, easier to maintain and easier to extend.

    GOTO is just a tool - but it is VERY dangerous in the hands of those who do not fear it.

  • Bad code is bad code. One needs to understand what branching into a FOR statement in their compiler means. I have used languages where there is only the goto.  I have also see bad FOR statements.

    10 FOR i = 1 TO 10

    20     PRINT i

    30 NEXT i

    40 LET i = i * 2

    50 PRINT i

    What is the value of i? 20 or 22.  This depends on the BASIC version.

    Of course we should not use i outside to loop.

  • Reading most of this thread, I was amazed to discover how interesting it is to look at GOTO from 20 or 30 different points of view.

    Getting back to the question that started this thread: the reason she/he wants to convert the old BASIC code to C# is (presumably) so that it will use the .NET CLR and interoperate with other .NET code. The problem is a kind of "impedance mismatch" meaning that BASIC is so dumb, and C# is so smart, that conversion is impractical. What we need is a cruder version of C# (call it "C-dumb") that is closer to the BASIC code. Has anyone considered FORTRAN?

    Seriously, I once had to write a state machine once to handle Asterisk telephone events and it was a heck of a lot easier to write, read and modify using a few GOTOs versus keeping it "pure" with strange WHILEs.

    The argument that MSIL uses GOTO so we should use them too is suspect. Consider writing a specification for a steel screw. Yes, it is made of molecules but when you order a certain screw do you really want to describe how all the molecules relate to one another? It is the wrong level of abstraction.

    Maybe God Almighty will write the answer with fiery letters in the sky: "GOTO IS/IS NOT HARMFUL!"

  • How much knowlege is available about how the original program works, or (more importantly) what it is supposed to be doing?

    A lot of old code can be reduced (in lines of codes) by 20 to 80%, purely on the basis that much of it was cut/pasted from other code that did the same thing on a different instance of data.  This is why classes/objects/interfaces/etc came about to begin with.

    I agree it is painful to take in hand thousands of lines of code and re-design and re-write it, but it many cases it is well worth it, and would take a lot less time than one might think (there is a certain moment of inertia that begins to build once you get familiar with a coding style no matter how ugly it might seem).

  • For the love of god!!

    Why would you do a direct translation of the code? Just use a loop and some if conditions in it to skip sections.

    DO NOT USE GOTO!

    Why? Well, suppose you have a million lines of code to translate. Your choices are (1) spend 10 hours writing a literal translator that generates ugly code, (2) spend 10 hours writing a literal translator and then 90 hours rewriting all the code that involves gotos, or (3) spend 10000 hours rewriting the whole system from scratch. The costs of those three choices are $1000, $10000 and $1000000, respectively. Suppose you have to pay for it. Which would you choose? How much are you willing to spend on beautifying code that already works just fine? -- Eric

     

Page 4 of 4 (58 items) 1234