How Not To Teach Recursion

A Joel On Software reader asked the other day for examples of recursive functions other than old chestnuts like Fibonacci or factorial.  Excellent question. I suggested topological sort, of course, but there are plenty of other examples that are way better than the Fibonacci numbers.  Why do I think Fibonacci is bad?  Read on.

In case it's slipped your mind, the Fibonacci numbers are defined as follows:

fib(1) = 1
fib(2) = 1
fib(n) = fib(n-1) + fib(n-2)

thus, they go 1, 1, 2, 3, 5, 8, 13, 21, …

The Fibonacci numbers have many interesting properties.  For example, suppose you have a pair of rabbits that every year gives birth to a pair of rabbits, rabbits take one year to reach maturity, and rabbits never die.  In that (admittedly unrealistic) situation, the number of pairs alive at any given time is a Fibonacci number.  Clearly the number alive this year is the number alive the year before (who all live) plus the number alive two years before (who are now mature enough to breed).  Fibonacci numbers also have a relationship to the Golden Mean, logarithmic spirals, and many other interesting nooks of mathematics.  But that's not what I want to talk about today.

Because they have a straightforward recursive definition, it's natural to introduce recursive programming to students by simply translating the definition into code:

function fib_1(n)
{
    if (n == 1 || n == 2)
        return 1;
    else
        return fib_1(n-1) + fib_1(n-2);
}

I am in principle very much for this pedagogic approach.  Recursive programming can be difficult to get your head around.  Starting with recursive definitions, getting your head around those, and then using those definitions in code is a pretty good way to approach the material.  If you think of lists as being "zero or more items in a row", it is conceptually hard to think of anything other than an iterative approach to processing the list.  If you think of a list as "a list may be empty, or may be an item followed by a list", then the definition of the data lends itself to a recursive programming approach very naturally.
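For instance, here is a minimal sketch of that second way of thinking, using a hypothetical cons-cell representation (a list is either null, meaning empty, or an object with a head item and a tail list); the names are my own, not from the post:

function list_length(list)
{
    if (list == null)
        return 0;                          // the empty list has zero items
    else
        return 1 + list_length(list.tail); // one item, followed by a list
}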

Practically, however, this is probably the worst possible simple way to solve the Fibonacci problem.  When you introduce recursion by using it to solve a problem which iteration solves much, much better, students come away thinking that recursion is dumb -- that it is both harder to understand and produces worse solutions.  Better to pick an example where the iterative solution is not better!

In fact, this is a good whiteboarding question for interviews:  implement me a recursive Fibonacci algorithm, tell me why it is terrible, and write me a better algorithm. A good candidate should be able to come up with the above and something like this:

function fib_2(n)
{
    if (n == 1 || n == 2)
        return 1;
    var fibprev = 1;
    var fib = 1;
    for (var cur = 2 ; cur < n ; ++cur)
    {
        // invariant here: fib == fib(cur) and fibprev == fib(cur - 1)
        var temp = fib;
        fib += fibprev;
        fibprev = temp;
    }
    return fib;
}

Typical questions that I might ask at this point in an interview are

  • What are the asymptotic running time and memory usage of the two algorithms you've written?
  • Can you write me an algorithm with even better asymptotic running time and memory usage? Alternatively, can you convince me that this is as good as it gets?
  • Suppose I asked you to make this algorithm absolutely as fast as possible.  What would you do?
  • Now suppose that I do not care much about raw speed but I do want the method to be robust in the face of bad input.  Now what do you do?

and so on.  (Anyone who cares to propose answers to these can leave comments; consider this a spoiler warning if you want to work them out for yourself.  Working out the asymptotic running time of recursive fib is quite easy and produces a somewhat surprising result.)

Another old chestnut that's often used as an introduction to recursion is the famous Tower Of Hanoi problem.  Briefly stated, you have a number of disks with a hole in the middle, and three spikes upon which you can place disks.  The disks are all of different sizes and are arranged from biggest to smallest in a pyramid on one spike.  You must move one disk at a time from spike to spike, you must never place a larger disk on top of a smaller disk, and you must end up with the entire pyramid on a different spike than you started with. 

The recursive solution is straightforward.  Moving a "sub-pyramid" of size one is obviously trivial.  To move a sub-pyramid of size n, first move the sub-pyramid of size n-1 sitting on top of it to the spare spike.  Then move the bottommost disk of the size-n pyramid to the destination spike.  Then move the sub-pyramid of size n-1 from the spare spike onto that disk.  A nice recursive solution!
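Here is a minimal sketch of that recursion in the same style as the code above; the function name and the moves array are my own scaffolding, not a standard API:

function hanoi(n, from, to, spare, moves)
{
    if (n == 0)
        return moves;                                     // an empty sub-pyramid needs no moves
    hanoi(n - 1, from, spare, to, moves);                 // park the n-1 sub-pyramid on the spare spike
    moves.push("disk " + n + ": " + from + " -> " + to);  // move the bottommost disk
    hanoi(n - 1, spare, to, from, moves);                 // stack the sub-pyramid back on top of it
    return moves;
}

Calling hanoi(3, "left", "right", "middle", []) yields the familiar seven moves.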

Many recursive implementations for this old chestnut exist.  But I'm not a big fan of this problem as a pedagogic approach to recursion for two reasons.  First, it's totally contrived (as is "fib").  Second, there is a very simple iterative solution for this problem, which is almost never mentioned when it is used as an example of the power of recursive programming!  Again, we should pick examples where recursion is clearly preferable.

In case you're curious, the iterative solution can be produced by writing a program that implements these rules in a loop:

  • Number the disks 1 to n starting from the top. 
  • Never move an odd disk onto an odd disk.
  • Never move an even disk onto an even disk.
  • Never move a larger disk onto a smaller disk.
  • Never move the same disk twice in a row.
  • Never move a disk onto an empty spike unless you have to. 
  • Move disks until there are no legal moves.

If you don't believe me, try it yourself with ace through ten of a deck of cards sometime.  Or perhaps start with ace through four -- ace through ten will take you a while.  Both the recursive and iterative solutions are optimal in the number of moves.  The recursive solution is O(n) in terms of memory usage; the iterative solution is O(1).
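For the curious, here is a rough sketch of those rules as a loop, in the same JavaScript style as above.  The rules alone don't say where to send the smallest disk on the very first move (both other spikes are empty), so the sketch assumes the convention a commenter below also notes: to the middle spike when n is even, to the far spike when n is odd, which lands the finished tower on the far spike.

function hanoi_iterative(n)
{
    var spikes = [[], [], []];              // spikes numbered 0, 1, 2 from left to right; last element is the top disk
    for (var d = n; d >= 1; --d)
        spikes[0].push(d);                  // disks numbered 1 to n starting from the top

    var moves = [];
    var first = (n % 2 == 0) ? 1 : 2;       // assumed tie-break for the ambiguous first move
    spikes[first].push(spikes[0].pop());
    moves.push("disk 1: 0 -> " + first);
    var lastMoved = 1;

    while (true)
    {
        var move = null;
        for (var from = 0; from < 3; ++from)
        {
            if (spikes[from].length == 0)
                continue;
            var disk = spikes[from][spikes[from].length - 1];
            if (disk == lastMoved)
                continue;                   // never move the same disk twice in a row
            for (var to = 0; to < 3; ++to)
            {
                if (to == from)
                    continue;
                if (spikes[to].length == 0)
                {
                    if (move == null)
                        move = [disk, from, to];    // an empty spike, only if we have to
                    continue;
                }
                var top = spikes[to][spikes[to].length - 1];
                if (disk > top)
                    continue;               // never move a larger disk onto a smaller disk
                if (disk % 2 == top % 2)
                    continue;               // never move odd onto odd or even onto even
                move = [disk, from, to];    // a non-empty legal target beats an empty one
            }
        }
        if (move == null)
            break;                          // no legal moves left; we're done
        spikes[move[2]].push(spikes[move[1]].pop());
        moves.push("disk " + move[0] + ": " + move[1] + " -> " + move[2]);
        lastMoved = move[0];
    }
    return moves;                           // with this first-move convention: 2^n - 1 moves, tower ends on spike 2
}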

  • >> bounds checking to prevent overflows and negative numbers in fib_2() would be sufficient to make it robust.

    > Oh really?

    A hint to what Eric may be referring to - JavaScript is not statically typed.

    With regard to recursion in general: not all programming languages were created equal. Supporting tail recursion can make all the difference between a valid recursive implementation and an invalid one.

    Another thing is that some languages make it very easy to define lists recursively without specifying the functions explicitly. For example Haskell:

    fib = 1 : 1 : zipWith (+) fib (tail fib)

    For more Haskell examples see http://cubbi.org/serious/fibonacci/haskell.html

    BTW, BeyondJS makes it possible to define the Fibonacci sequence in JavaScript in the same way:

    var fibs = (0).lazy(1);
    fibs.extend(fibs.zipWith("+", fibs.tail()));

    May not be as pretty as Haskell, but it works! Now to display the first 10 values in the sequence:

    fibs.head(10).foreach(alert);

    The interesting thing is that, thanks to lazy evaluation, the performance of the Haskell and BeyondJS versions is actually quite good. This is because once a value in the sequence has been calculated it's saved and not calculated again.

    You can find BeyondJS at http://w3future.com/html/beyondJS/
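    The same save-it-once idea can be bolted onto the plain recursive fib_1 from the post with an explicit cache; a minimal sketch, with the cache object being my own addition rather than anything from BeyondJS:

    var fib_cache = {};
    function fib_memo(n)
    {
        if (n == 1 || n == 2)
            return 1;
        if (!(n in fib_cache))
            fib_cache[n] = fib_memo(n - 1) + fib_memo(n - 2);  // compute each value only once
        return fib_cache[n];
    }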
  • I don't think you can have an algorithm with O(1). Using the Binet Formula leads to an algorithm with O(log N) time complexity (even though apparently you can use a Field so that you only use powers of integers, not real numbers). I wouldn't know how to do that, but it's mentioned here: www.cs.wisc.edu/~mhock/SSL/fibcalc.pdf

    Space complexity should always be at least proportional to the length of the resulting Fibonacci number, which (I believe) is proportional to phi^n.
  • I meant proportional to phi^n / 2^n. Of course if you implement it using say int32s this is not really relevant, you'd just overflow.
  • > Binet’s Formula: fib(n) = (Phi^n-(-Phi)^(-n))/sqrt(5), where Phi = (sqrt(5)+1)/2

    You can simplify this by noticing that Phi is greater than 1 so Phi^-1 is less than 1 and Phi^-n is much less than 1 for large n. So you'll get the same answer by rounding Phi^n/sqrt(5) to the nearest integer. Note that this formula doesn't work for negative integers.

    The easiest way to derive the formula (that I know of) is using a generating function.

    > I don't think you can have an algorithm with O(1)

    It's O(1) as long as you have a fixed precision for numbers. In general, F_n has O(n) bits in the result so every algorithm must take at least O(n) time.

    > apparently you can use a Field so that you only use powers of integers

    Well, rationals, which can be implemented using integers and some fixup work. The trick requires creating a field extension, a field extension requires a field (naturally), and the integers are not a field. If you had a suitable implementation of integers mod p you could do it with that finite field. Your results may look a little strange at times though!
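    For concreteness, a quick sketch of the rounding trick described above, in the same JavaScript style as the post; with double-precision floats it is only exact while fib(n) fits in 53 bits, so roughly up to n = 70:

    function fib_binet(n)
    {
        var phi = (1 + Math.sqrt(5)) / 2;
        return Math.round(Math.pow(phi, n) / Math.sqrt(5));  // round Phi^n / sqrt(5) to the nearest integer
    }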
  • So if neither "Fibonacci numbers" nor the "Tower of Hanoi" problems are good pedagogic examples ... what would you use to teach recursion to students?
  • It depends on the purpose of the class of course. But in general, I'd try to find an example where a recursive approach is clearly better, and where the data being manipulated is defined recursively.

    I agree with the commenters above who suggested that file systems are a good place to start. Many students understand them at a practical level already, and you can quite easily define them recursively: a file system is a DAG of folders, and a folder may contain zero or more files and zero or more folders. Once you have the concept of a recursively defined DAG down, lots of topics become teachable.
  • Directed Acyclic Graph, is it? A traditional DOS or Windows 9x or NT <=4.0 file system is just a directed tree for one drive, and a directed forest for the whole system. Expanding the class to DAGs allows for directories that are accessible via several distinct paths. Symlinks (aka junction points)? But those same symlinks can introduce cycles into the graph. Although not recommended, they are possible.

    That leads to a practical problem. Suppose you have a file system supporting symlinks, namely, NTFS 5. How do you traverse it, starting at a given root, visiting each directory only once (even if it is accessible from the root through more than one path), and not getting caught in loops?
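    One way to do it, sketched below with two hypothetical helpers rather than real NTFS APIs: getSubdirectories(dir) returns the child directories (following junction points), and canonicalPath(dir) resolves a directory to a single identity so that two paths to the same place compare equal:

    function visitOnce(dir, visited, callback)
    {
        var key = canonicalPath(dir);               // one identity per directory, however it was reached
        if (visited[key])
            return;                                 // already seen via another path, or we're in a cycle
        visited[key] = true;
        callback(dir);
        var subdirs = getSubdirectories(dir);
        for (var i = 0; i < subdirs.length; ++i)
            visitOnce(subdirs[i], visited, callback);
    }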
  • Bah, no one liked my example. :(

    But yes, most forms of graph searching are more easily understood when phrased in recursive terms (and frequently easier to write).
  • A correct recursive function for Fibonacci can be based on Lucas numbers:

    F(n+m) = 1/2 * (F(n)*L(m) + F(m)*L(n))

    where L(m) is the m-th Lucas number (like Fibonacci, but starting from (1, 3) instead of (1, 1)).

    Taking into account that L(x) = F(x-1) + F(x+1), and optimizing a little, this becomes:

    F(n+m) = F(n-1)*F(m) + F(n)*F(m+1)

    This way you can create a pretty fast recursive algorithm to calculate big Fibonacci numbers.

    Fibonacci calculation was once a problem at the All-Ukrainian Informatics Contest (4 hours total to solve 4 CS problems). Test cases went up to F(65000), which is about 14KB of output.
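    A minimal sketch of that last identity turned into a fast-doubling recursion (plugging in m = n for F(2n), and n+1 and n for F(2n+1)); fib_pair is a made-up name and returns [F(n), F(n+1)] with F(0) = 0. Plain JavaScript numbers overflow long before F(65000), so a real contest solution would need bignum arithmetic on top of this:

    function fib_pair(n)
    {
        if (n == 0)
            return [0, 1];                          // [F(0), F(1)]
        var half = fib_pair(Math.floor(n / 2));     // [F(k), F(k+1)] for k = floor(n/2)
        var a = half[0];
        var b = half[1];
        var c = a * (2 * b - a);                    // F(2k)   = F(k) * (2*F(k+1) - F(k))
        var d = a * a + b * b;                      // F(2k+1) = F(k)^2 + F(k+1)^2
        return (n % 2 == 0) ? [c, d] : [d, c + d];
    }

    The recursion depth is O(log n) rather than O(n).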
  • IMHO, teaching recursive functions starting with these very simple examples isn't such a bad practice. Recursive factorial, Fibonacci and the Tower of Hanoi are easy to understand and useful for explaining the concept. The recursive solution for the Tower of Hanoi is far from optimal, but it is elegant, and a very simple solution to a problem that a novice could believe is very complex. After the concept is understood, other concepts can be introduced starting from those solutions: lazy evaluation, backtracking cuts, algorithmic complexity, optimizations and so on. Graphs are more complex concepts, and should be introduced later.

    If we talk about the basic concepts, maybe we should start with a = a + 1 :) Most people have a basic math background before they start to learn programming, so that basic concept alone could raise some understanding problems. After that, many will use a++ but maybe not so many will know why ++a is recommended ;).

    These days it seems that algorithmic optimization isn't the top priority for most real applications (unfortunately); instead, the top priority is to reduce development time and, of course, cost. So RAD tools, scripting languages, GC and so on are used, and all of them have optimization tradeoffs. Also, how many of you have encountered the famous bubble sort algorithm in real application code? :) But perhaps the application worked well and the client was happy. Until the 'profiler' says otherwise, almost any solution is acceptable if the application is delivered on time, isn't it? ;)

    So, if optimizations aren't such a problem in real applications, why should we bother with them when the purpose is understanding a concept? But I totally agree that teachers should also explain the alternatives.

    Best regards,
    Bogdan
  • One particularly good teaching application for recursion is the Merge Sort.

    Among the benefits: Merge Sort is one of the simplest O(N log N) sorting algorithms to write.
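    A minimal sketch in the same JavaScript style as the post, just to show how little code it takes; the function name is mine:

    function merge_sort(items)
    {
        if (items.length <= 1)
            return items;                           // a list of zero or one items is already sorted
        var middle = Math.floor(items.length / 2);
        var left = merge_sort(items.slice(0, middle));
        var right = merge_sort(items.slice(middle));
        var result = [];
        while (left.length > 0 && right.length > 0) // merge the two sorted halves
            result.push(left[0] <= right[0] ? left.shift() : right.shift());
        return result.concat(left).concat(right);   // at most one half still has leftovers
    }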
  • We have internal email lists for questions about programming languages. Here's one that came across...
  • Hi Eric,

    I implemented your iterative rules for solving Towers Of Hanoi in VBScript and had lots of fun, thanks!

    The problem has always been presented to me in a slightly more stringent way, in that the tower starts on the left spike and must be moved to the right hand spike to win.
    Solving this slightly more stringent scenario can be achieved by adding one more rule concerning the first move:

    The first move should always be from the first column to:
    a) The second column for even numbers of disks.
    b) The third column for odd numbers of disks.

    Also, by playing around with my code, I found by trial and error that:

    The number of iterations needed to solve for n disks is equal to double the number of iterations required for n-1 disks plus one.

    Thanks for such a great blog and good luck with the marriage!

    Regards,
    Tim



  • Well, it would be surprising if the number of moves was any different! After all, the algorithm for moving a tree of size n is essentially

    * move a tree of size n-1 onto an empty spike in k moves.
    * move the bottom disk, one move
    * move a tree of size n-1 onto the bottom disk, again, in k moves.

    So the total number of moves has to be 2k+1.
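    Solving that recurrence with moves(1) = 1 and moves(n) = 2*moves(n-1) + 1 gives moves(n) = 2^n - 1, which is exactly the doubling-plus-one pattern Tim found by experiment.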

  • Um... Brett's suggestion of Directory/File search was exactly what I was thinking when I read this post. The student would be less distracted by understanding the problem domain and more focused on the concept of recursion. Add to that, a Directory/File search is something that a person is more likely to actually use.
