Every year my group of friends undertakes a Secret Santa gift exchange. When we started we each drew names from a hat and bought a gift for the names we drew. Being budding programmers, we soon dispensed with the hat and wrote a program to do the work for us. In 1999 a friend and I wrote a C++ app to do the work. Though we've been running it every year, the code hasn't much changed in the intervening seven years. It is getting dated and needs reworking.
Toward that end, I've been contemplating the match routine lately. The current one does something naive like:
Pick a person Choose a match at random If there is a conflict with that person, slide to the next one on the list. Once a match is found, remove those people from the relative lists and pick a new person. If a match cannot be made, start the process over.
Pick a person
Choose a match at random
If there is a conflict with that person, slide to the next one on the list.
Once a match is found, remove those people from the relative lists and pick a new person.
If a match cannot be made, start the process over.
With a small number of people and not a lot of blacklisting (husband shouldn't draw wife's name, etc.), this algorithm will work. However, for a complicated list of people, this algorithm is nondeterministic and could theoretically run forever. I was thus searching for a better algorithm. One which was faster than a complete enumeration of all options and deterministic in nature.
This weekend I took the final for my Formal Models of Computation course. (Yes, this ties in with the above--be patient) The last thing we covered was complexity classes and the concepts of P and NP. What follows is a brief description of this concept. For a more formal handling of the subject, check out the Wikipedia entry.
In theoretical computer science, they don't use real computers. Rather, they use formal models called Turing Machines. These have the same power to solve problems that modern computers do, just a bit less efficiently. They are a good proxy for real computers. The speed of these machines is measured roughly in terms of the input. So given an input of length n, a linear algorithm would take O(cn) time, where c is a constant. We usually ignore these constants and just call it O(n) time.
There is a class of problems called P or Polynomial-time which represent those problems that can be solved by a Turing machine in a time which is a polynomial of the input length. That is, O(n^2), O(n^3), ... , O(n^k). These are generally thought of as those problems that computers can efficiently solve.
There is another class of problems called NP or Nondeterministic-Polynomial-time. These represent those problems that can be solved by a nondeterministic Turing machine in polynomial time. A nondeterministic Turing machine is bascially one that can do more than one thing at once. When it comes to a fork in the algorithm, it can take both forks simultaneously.
It is assumed that NP describes a bigger universe of problems than P. That is, P !=NP. What takes nondeterministic Turing machines polynomial time takes regular Turing machines exponential time. That is, they take something like O(2^n) time.
Back to my Secret Santa problem. It was just after my final that I turned my thoughts back to solving this problem. It then hit me that what I was trying to do was impossible. There is a well-known NP-class problem which involves finding a Hamiltonian circuit. A Hamiltonian circuit is a path that traverses an entire graph by visting each node exactly one time. It turns out that this is exactly the problem I was trying to solve. Imagine my Secret Santa problem as a graph where each person is a node and there are edges between all nodes that are not blacklisted. In this view of the problem, I'm trying to find a path around the graph, visting each node once. In theoretical computer science this is known as a reduction.
This analysis pretty much dashes my chances of finding an elegant solution to the problem. There is no true solution other than brute force trying each combination. With the small number of nodes in my usual matching, this works but I still want something better. All is not lost, however. There are some techniques I can use to get close to the solution without necessarily trying all of the combinations which I intend to investigate. I'll write about them after I understand more.
Wow. I guess I did learn something practical in that class after all.