How Not To Teach Recursion

How Not To Teach Recursion

  • Comments 65

A Joel On Software reader asked the other day for examples of recursive functions other than old chestnuts like Fibonacci or factorial.  Excellent question. I suggested topological sort, of course, but there are plenty of other examples that are way better than the Fibonacci numbers.  Why do I think Fibonacci is bad?  Read on.

In case it's slipped your mind, the Fibonacci numbers are defined as follows:

fib(1) = 1
fib(2) = 1
fib(n) = fib(n-1) + fib(n-2)

thus, they go 1, 1, 2, 3, 5, 8, 13, 21, …

The Fibonacci numbers have many interesting properties.  For example, suppose you have a pair of rabbits that every year gives birth to a pair of rabbits, rabbits take one year to reach maturity, and rabbits never die.  In that (admittedly unrealistic) situation, the number of pairs alive at any given time is a Fibonacci number.  Clearly the number alive this year is the number alive the year before (who all live) plus the number alive two years before (who are now mature enough to breed.)  Fibonacci numbers also have a relationship to the Golden Mean, logarithmic spirals, and many other interesting nooks of mathematics.  But that's not what I want to talk about today.

Because they have a straightforward recursive definition, it's natural to introduce recursive programming to students by simply translating the definition into code:

function fib_1(n)
{
    if (n == 1 || n == 2)
        return 1;
   
else
      return fib_1(n-1) + fib_1(n-2);
}

I am in principle very much for this pedagogic approach.  Recursive programming can be difficult to get your head around.  Starting with recursive definitions, getting your head around those, and then using those definitions in code is a pretty good way to approach the material.  If you think of lists as being "zero or more items in a row", it is conceptually hard to think of anything other than an iterative approach to processing the list.  If you think of a list as "a list may be empty, or may be an item followed by a list", then the definition of the data leads itself to a recursive programming approach very naturally.

Practically however, this is probably the worst possible simple way to solve the Fibonacci problem.  When you introduce recursion by using it to solve a problem which iteration solves much, much better, students come away thinking that recursion is dumb -- that it is both harder to understand and produces worse solutions.  Better to pick an example where the iterative solution is not better!

In fact, this is a good whiteboarding question for interviews:  implement me a recursive Fibonacci algorithm, tell me why it is terrible, and write me a better algorithm. A good candidate should be able to come up with the above and something like this:

function fib_2(n)
{
    if (n == 1 || n == 2)
        return 1;
    var fibprev = 1;
    var fib = 1;
    for (var cur = 2 ; cur < n ; ++cur)
    {
        var temp = fib;
        fib += fibprev;
        fibprev = temp;
    }
    return fib;
}

Typical questions that I might ask at this point in an interview are

  • What are the asymptotic running time and memory usage of the two algorithms you've written?
  • Can you write me an algorithm with even better asymptotic running time and memory usage? Alternatively, can you convince me that this is as good as it gets?
  • Suppose I asked you to make this algorithm absolutely as fast as possible.  What would you do?
  • Now suppose that I do not care much about raw speed but I do want the method to be robust in the face of bad input.  Now what do you do?

and so on.  (Anyone who cares to propose answers to these can leave comments; consider this a spoiler warning if you want to work them out for yourself.  Working out the asymptotic running time of recursive fib is quite easy and produces a somewhat surprising result.)

Another old chestnut that's often used as an introduction to recursion is the famous Tower Of Hanoi problem.  Briefly stated, you have a number of disks with a hole in the middle, and three spikes upon which you can place disks.  The disks are all of different sizes and are arranged from biggest to smallest in a pyramid on one spike.  You must move one disk at a time from spike to spike, you must never place a larger disk on top of a smaller disk, and you must end up with the entire pyramid on a different spike than you started with. 

The recursive solution is straightforward.  Moving a "sub-pyramid" of size one is obviously trivial.  To move a sub-pyramid of size n, first move a sub-pyramid of size n-1 to an  extra spike.  Then move the bottommost disk of the size n pyramid to the other extra spike.  Then move the sub-pyramid of size n-1 onto the disk on the second extra spike.  A nice recursive solution!

Many recursive implementations for this old chestnut exist.  But I'm not a big fan of this problem as a pedagogic approach to recursion for two reasons.  First, it's totally contrived (as is “fib“).  Second, there is a very simple iterative solution for this problem, which is almost never mentioned when it is used as an example of the power of recursive programming!  Again, we should pick examples where recursion is clearly preferable.

In case you're curious, the iterative solution can be produced by writing a program that implements these rules in a loop:

  • Number the disks 1 to n starting from the top. 
  • Never move an odd disk onto an odd disk.
  • Never move an even disk onto an even disk.
  • Never move a larger disk onto a smaller disk.
  • Never move the same disk twice in a row.
  • Never move a disk onto an empty spike unless you have to. 
  • Move disks until there are no legal moves.

If you don't believe me, try it yourself with ace through ten of a deck of cards sometime.  Or, perhaps start with ace through four -- ace through ten will take you a while.  Both the recursive and iterative solutions are optimal in the number of moves.  The recursive solution is O(n) in terms of memory usage, the iterative solution is O(1).

  • I still remember a homework assignment to solve the tower of Hanoi problem without recursion.  I got a 60% on a well written, well commented, and perfectly working program - I didn't google the answer, but probably came up with something similar to what was listed here (it may not have been optimal number of moves, but that wasn't part of the assignment).  

    I lost 40% because my professor was convinced the best (or maybe the only?) way to solve this problem was using a stack to simulate recursion (which is probably the dumbest solution that actually works).  

    Somehow we need to actually teach students how to think - otherwise they grow up to be Professors that teach this way, and poison more minds.  Teaching someone to actually think about and understand what they are doing is hard, if not impossible - but I do appreciate reading posts like this.  I feel better knowing someone is trying.

  • My favorite example to teach recursion is that of printing a linked list in reverse. Heaven help you it's a slow process to handle iteratively.

    To print the contents of said list in forward order (assume C++ classes)

     printr(list *node){

       if(NULL==node)

         return;

       cout << node->data;

       printr(node->next);

     }

    To print in reverse order merely requires moving the cout to after the call to printr(node->next).

  • Lets have a look at a development trend that seems to be doing the rounds these days in both software and hardware. How a simple idea can gather momentum initially, and then some mould and fungii, and then a ton of other crap that no one wants as well.

  • PingBack from http://mike.hostetlerhome.com/2007/01/19/about-recursion/

  • Binet's formula with the rounding is sort of cheating. The best exact algorithm I know of for fibonacci uses matrix exponentiation, and it runs in log(n) time.

    [1 1] ^ n

    [1 0]

    puts fib(n) in the top right element (easy to prove). With binary exponentiation, it can be extremely fast.

  • My last comment isn't entirely clear - what i mean with that equation is to raise a 2x2 matrix to the nth power.

  • Eric,

    I observe that it is usually easier to demonstrate that recursive algorithms do what they are designed for.  This is particularly well illustrated in your Tower of Hanoi examples, where you give an iterative algorithm, which to me, at least, is somewhat cryptic: I can understand its description, but it is not straightforward to see why it works.

    I think code's being clearly correct is an important benefit.  Whether this benefit is more or less important than efficiency depends on the circumstances.  As a (oversimplified) rule of thumb, I make my code as clearly correct as possible, and then if it is not fast enough, I sacrifice some of this clarity for speed.

    To summarize: I think there are strong arguments for the recursive forms of your functions, and I don't agree with your unqualified characterisations of them as 'the worst' or'terrible'.

    If the naive fibonacci example isn't fast enough, I'd try a recursive definition based on a helper function returning adjacent pairs before I resorted to your iterative definition.  I think your iterative definition would be faster, but that you have paid a cost in code clarity to make it so.

  • Eric,

    "What if you pass in too many arguments? Too few? Strings? Dates? Arbitrary COM objects? null? undefined? "

    This makes a number of presumptions about the working environment that weren't specified. A strongly, statically typed language deals with these errors already (at compile time).

    The same goes, for that matter, for Steven Bone's comments about the recursive method and stack frames. Most introduction to programming students are (hopefully) not being bombarded with such implementation details; the goal of such courses should be to teach the concepts (and the relevance of such comments varies a lot by language and platform).

    Fundamentally, the students are being taught that a) there are recursive algorithms that are easy to write and easy to verify (as astrolabe noted), a valuable property to say the least and b) that recursion is even possible, something that most are not familiar with. If a student is objecting on grounds of the recursive method requiring too many function calls, I wouldn't think he's advanced--it shows, in fact, that he's entirely missing the point.

    Perhaps there are better examples, but the student is (hopefully) being taught how to express concepts in code. That's what programming languages are all about, after all; expressing abstract concepts in an unambiguous, machine-comprehensible manner. If the student is worried, at such an early stage (freshman year!) about real-world applicability, perhaps he should consider enrollment in such an institution as DeVry University, where the cut-and-paste approach to software development is not frowned upon! That is, by far, the fastest way to real-world relevance.

    Bogdan Sintoma wrote, "These days its seems that the algorithms optimizations isn't the top priority for most of the real applications (unfortunately), instead the top priority is to reduce the developing time and of course the cost," and I can't help but reply. In a nutshell, Bogdan, programmer time is, in nearly every domain, far more expensive than CPU time. Even more to the point, incorrect execution is nearly always more expensive than slow execution. This is why garbage collection (which can, in fact, be far faster than manual memory management, as an aside), strong typing, and a complete *non* focus on preemptive optimization is so prevalent.

    That, or because most programmers are stupid. Correctness hasn't necessarily improved by leaps and bounds.

  • But consider for a minute a tail recursive optimized language such as scheme... and/or an implementation of the fib function that is linearlly recursive. This is O(n), and uses recursion.

    (define (fib n)

     (letrec ((fh (lambda (n nxt rs)

                    (cond

                      ((= n 0) rs)

                      (else

                       (fh (- n 1) (+ nxt rs) nxt))))))

       (fh n 1 0)))

    (fib 3)

  • Andrew,

    tail recursion an iteration really are the same thing. Your fib implemantation is a good example for this, since it uses the same algorithm fib_2 does.

  • This wasn't an issue for me. I learned recursion before iteration. My first programming language was scheme -- iteration was introduced after recursion as "tail-recursion".

  • Here is the sample provided by Andrew Gwozdziewycz reworded in OCaml (another programming system with tail call optimization):

    <pre>

    open Printf;;

    let fib = function n ->

     let rec fh fh_num fh_km1 fh_km2 =

       if fh_num = 0 then fh_km2

       else fh (fh_num - 1) (fh_km1 + fh_km2) fh_km1

     in

       fh n 1 0;;

    printf "fib: %d" (fib 9);;

    </pre>

  • I first saw recursion in programming demonstrated as a way of computing factorials. Later (in college) I was taught Towers of Hanoi, Fibonacci, and the usual. I don't believe this hurt me, because the instructor was careful to warn us of the inefficiency. This example coincidentally stresses the principle that recursion often gives very easy-to-program and understand solutions to problems, but is rarely the most efficient way to solve something.

    This also reminded me of the closed formula for the Fibonacci sequence, which you can derive from generating functions:

    http://mathworld.wolfram.com/BinetsFibonacciNumberFormula.html

  • For the towers of Hanoi, a significant advantage of the recursive solution is that it goes hand-in-hand with an inductive proof of its correctness. That is, you can show the recursive code is correct using a pretty straightforward inductive argument.

    But showing that the iterative code is correct is harder ... of course correctness proofs are rarely of value in practice, but they are certainly good things when you can get them so easily.

    Indeed, how do you convince a reader of the non-iterative program who doesn't believe it works in every case?

    And the concern for the O(1) versus O(n) memory usage is a red herring: it takes 2**n moves, and so it is dominated by time.  If you are solving a 64-disk instance, then there's hardly any difference between using 1 unit of memory as opposed to using 64 units when you have to wait thousands of years for a solution.

  • Eric, your very premise is fundamentally flawed.  Others such as Andrew Gwozdziewycz hinted at it, but even more directly,.. if one uses a real compiler your assumptions are rendered valueless.

    Using the Franz Allegro Lisp compiler (first compiler I had at had - I'm sure others can do it too), and this definition:

    (defun fib (x)

     (cond

      ((or (= x 1)

           (= x 2)) 1)

      (t

       (+ (fib (- x 1)) (fib (- x 2))))))

    compiled and tracing Fib I see exactly one call.  If I force it to interpret, say for pedagogical reasons, I can see the recursive call structure when tracing.

    Pedagogically if you want to teach students to be good programmers, you first teach them the tools of the trade.  I would much rather see anyone working for me write that then your fib_2.  Why?  it's easier to maintain, it's easier to see what it does, I guarantee it took less time to write and debug.  It's good clean functional code with no side effects.

    But more importantly it works and it works right.  And with my modern compiler it works efficiently (because most modern compliers can get rid of tail recursion quite easily).  Why do I want to teach my students to second guess and try to out-think the compiler?  They should write what's simplest, then they should profile the thing and it turns out you forced them to use a dumb complier for some reason, THEN and ONLY THEN should you and they worry about the next steps to optimize.  Trying to optimize before understanding the problem needing optimizing is a huge part of the problem in the way most students are currently taught and try to program.  Please don't perpetuate this.

    I agree that assignments are more interesting if they have a purpose.  And computing Fibonacci numbers for no apparently good reason is a little pointless.  More grounded tasks always help.  But for just looking under the hood, sometimes it helps when they can see the call graph, and not have to worry about all the underlying processing, because it's simple.  As such I would be more inclined to use an example like that in class to show the idea without getting mired in the details, and then have them practice with something more meaningful, maybe like sort as others have mentioned.

Page 3 of 5 (65 items) 12345