What's tail recursion

If you know its nothing to do with any of your pet's tail then get onto the next section.

Tail recursion is a special case of recursion where the last call in a method is a recursive call.

The major issue with recursion is that with each recursive call the stack grows by the stack frame allocated to the function call and soon it hits a stack overflow in case the recursion is too deep. Consider the following code which recursively counts down from a given number. In case you call this method with 100000 or some such large number it'll crash on your face with a StackOverFlow exception.

static void CountDown(int num)
{
    Console.WriteLine("{0} ", num);
    if (num > 0)
        CountDown(--num);
}

Optimizing tail recursion

Interestingly the above code is of the special tail-recursion form. Since the last call is the recursive call there is no need to preserve stack frame of the calling function and the compiler can easily use this information to generate machine instruction such that the stack doesn't grow at all.

In C# terms the above code can be optimized by the compiler into something like

static void CountDown(int num)
{
START:
    Console.WriteLine("{0} ", num);
    if (num > 0)
    {
        --num;
        goto START;
    }
}

In this code the parameter is replaced on the same stack and the execution is simply short-circuited to re-start on that.

All of this is fine but does .NET support this

Instead of a straight answer lets try to see ourselves. I'll first comment out the Console.WriteLine code in the first method to make it simply, build it and disassemble it using the trick mentioned in my previous post. The generated IL for this method looks something as follows

.method private hidebysig static void  CountDown(int32 num) cil managed
  {
    // Code size       25 (0x19)
    .maxstack  2
    .locals init (bool V_0)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldc.i4.0
    IL_0003:  cgt
    IL_0005:  ldc.i4.0
    IL_0006:  ceq
    IL_0008:  stloc.0
    IL_0009:  ldloc.0
    IL_000a:  brtrue.s   IL_0018

    IL_000c:  ldarg.0
    IL_000d:  ldc.i4.1
    IL_000e:  sub
    IL_000f:  dup
    IL_0010:  starg.s    num
    IL_0012:  call       void TailRecursion.Program::CountDown(int32)
    IL_0017:  nop
    IL_0018:  ret
  } // end of method Program::CountDown

From the IL marked with red its evident that the C# compiler at-least doesn't do this optimization and emits a normal recursive call.

However, .NET platform needs to support all sorts of languages including those like scheme where the only way to do iteration is to convert the code into tail-recursion and the compiler is supposed to do the above mentioned optimization.

It's exactly for this reason there is a supported IL instruction named tail. Using this I can modify the above code to as follows.

.method private hidebysig static void  CountDown(int32 num) cil managed
  {
    // Code size       25 (0x19)
    .maxstack  2
    .locals init (bool V_0)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldc.i4.0
    IL_0003:  cgt
    IL_0005:  ldc.i4.0
    IL_0006:  ceq
    IL_0008:  stloc.0
    IL_0009:  ldloc.0
    IL_000a:  brtrue.s   IL_0029

    IL_000c:  ldarg.0
    IL_000d:  ldc.i4.1
    IL_000e:  sub
              tail.
    IL_0023:  call       void TailRecursion.Program::CountDown(int32)
    IL_0029:  ret
  } // end of method Program::CountDown

From CIL ECMA 335 spec "use a tail. prefix to specify that the current stack frame should be released before transferring control. If the call would transfer control to a method of higher trust than the original method the stack frame will not be released."

This single instruction is very significant in this context. So now if I assemble this using ilasm and run it with as big a number I fancy it'll not crash (obviously it may take hours to compute though).

Update: Through our internal C# DL I got to the gory details. Check out the links here and here