Every Tree There Is

Every Tree There Is

Rate This
  • Comments 19

[This is part of a series on generating every string in a language. The previous part is here. The next part is here.]

BinaryTrees1Last time we talked about how the number of binary trees with n nodes is C(n), where C(n) is the nth Catalan number. I asked if there were more or fewer trees – not restricted to binary trees – of size n than there are binary trees of size n. If you worked it out, the answer might have surprised you; it is certainly not immediately obvious.BinaryTrees2

First off, a common response I get to this question is immediately "well, since binary trees are a special case of arbitrary trees, there must be more arbitrary trees." Can you see why this argument is wrong? Binary trees have more structure than arbitrary trees! There are two binary trees with two nodes: one with the child on the right of the root, one with the child on the left of the root. There is only one arbitrary tree with two nodes; there is no difference between the "left" and "right" child.

As it turns out, the answer to my question is that except for the trivial cases of zero- and one-node trees, there are always more binary trees of size n than trees of size n.

If you actually work it out though you find an interesting pattern; the number of arbitrary trees of size n is C(n-1) – that is, there are 14 binary trees of size 4, and 14 arbitrary trees of size 5.  There are 1430 binary trees of size 8 and 1430 arbitrary trees of size 9. Weird! (Or, another way of looking at it is that the number of binary trees with n "null leaf nodes" is equal to the number of arbitrary trees with n nodes.)

Clearly this cannot be a coincidence; there must be a one-to-one relationship between the set of binary trees of size n and that of arbitrary trees of size n+1. And in fact the mapping is easy to see if you look at it in just the right way. On the left I’ve drawn a mapping from the first five binary trees of size 4 enumerated by my code of last time and the first five corresponding trees of size 5. (Click on the picture for a larger version.) 

To make it more clear exactly what the relationship is between the two trees, check out the diagram on the right. To turn a binary tree into its corresponding arbitrary tree, you rotate it 45 degrees anticlockwise, stick a root node above the whole thing, and then fix up all the horizontal lines to point to the appropriate parent.

Or, from a data structures point of view, you can represent every node in an arbitrary tree as a reference to the first child and a reference to the next sibling. This is exactly the same as a binary tree, it's just that the labels on the nodes are "first child" and "next sibling", not "left" and "right". The only difference between binary trees and arbitrary trees in this system is that the "right" child of the root is always null, since the root has no siblings. Once you see that the difference between a binary tree and an arbitrary tree is just the names of the fields in the data structure, it becomes apparent that there is a strong relationship between them.

So, interestingly enough, the solution to the second part of my challenge is: we’re already done. Since we have a device which enumerates all binary trees, and binary trees are 1-1 with arbitrary trees, then all we need is a device which can change binary trees to arbitrary trees. Doing so is left as an exercise, but for interest’s sake, here’s a device which takes a binary tree and gives a string representation of the corresponding arbitrary tree, using the compact syntax for an arbitrary tree I described last time.

public static string TreeString(Node node)
{
     // Get the arbitrary tree string corresponding to
     // a given binary tree.
     var sb = new StringBuilder();
     Action<Node> f = null;
     f = n =>
     {
         sb.Append("{");
         for (Node child = n.Left; child != null; child = child.Right)
             f(child);
         sb.Append("}");
     };
     f(new Node(node, null));
     return sb.ToString();
}

To get all the trees of size 5, we just ask for all the binary trees of size 4 and print them as trees of size 5:

foreach (var n in AllBinaryTrees(4))
    Console.WriteLine(TreeString(n));

and we get

{{}{}{}{}}
{{}{}{{}}}
{{}{{}}{}}
{{}{{}{}}}
{{}{{{}}}}
{{{}}{}{}}
{{{}}{{}}}
{{{}{}}{}}
{{{{}}}{}}
{{{}{}{}}}
{{{}{{}}}}
{{{{}}{}}}
{{{{}{}}}}
{{{{{}}}}}

Notice that if you remove the outer set of braces on each then we have also solved the problem of “print all the legal combinations of four sets of matched braces”! If you can enumerate all the binary trees then it turns out you can enumerate all the solutions to dozens of different equivalent problems.

Next time: what else can we generate all of?

[This is part of a series on generating every string in a language. The previous part is here. The next part is here.]

  • It's interesting that for binary trees the following are considered distinct configurations

      A             A

        \             \

         B            B

        /               \

      C                 C

    but if they are viewed as general trees they are not distinct, it's the same configuration.

    There's a distinction in the definitions given for the binary and arbitrary trees, the binary tree definition contains node position information (which also implies ordering), but the definition of general trees only contains ordering information. (without position information)

    If we were to define the the binary trees without any position information, something like : a binary tree is a tree where every node has a maximum of two child nodes (maybe as a linked list) then the number of possible configurations  is reduced by a large factor. (Intuitively I think 2n!, but I need to verify)

  • Generating all possible pictures:

    http://aboutintelligence.blogspot.com/2008/05/generating-all-possible-pictures.html

  • Eric wrote: "Binary trees have more structure than arbitrary trees! There are two binary trees with two nodes: one with the child on the right of the root, one with the child on the left of the root. There is only one arbitrary tree with two nodes; there is no difference between the "left" and "right" child."

    I think this argument depends on what function you are using to generate new nodes in the tree.

    For example, if you have a binary tree, and you are adding a new value, you could create a new node on the left if the incoming value is less than the current node value. That rule is arbitrary, however, because you could reverse that and instead create a new node on the right. Now does that constitute one binary tree, or two?

    Also, with arbitrary trees, it depends on how you define them. To say there is no difference between the left and the right child really depends on what type of arbitrary tree you are describing. If you have some type of function which determines how new nodes are created, they may in fact have more structure than a binary tree.

    For instance, if an incoming value is compared to an existing node, you could have it create a left node if the value is less, a right node if the value is greater, and a middle node if the value is equal. (would that make it a tertiary tree?)

    Anyway, you could scale this "gateway" function so that four, five, or six (or more) nodes are created. I would argue that a six-node tree would have more structure than a binary tree in this case. I haven't done the math on it, but would it possibly have 6! different configurations?

  • Oops...I meant to write "ternary tree" not "tertiary tree". And...a four-node tree would be a quadtree, etc.

Page 2 of 2 (19 items) 12