Do beginners even need to learn sorting algorithms? Let’s face it most modern programming environments have a sort function built in. In the so-called real world just about the only people who write sorting algorithms have PhDs in math or computer science and have likely published papers in peer reviewed journals on the sorting subject. So why take up time in a first programming course to talk about sorting? Because its good for them! Well sort of.
I was looking through the first textbook I wrote some 11 years ago and finding that it include both the bubble and selections sorts. Now these are terribly performing sorts. No one uses these. Quick sort perhaps. Maybe Insertion sort. Bubble sort? What was I thinking! As best I can recall I was thinking about helping explain how arrays worked with loops. It was more about developing an algorithm and seeing how it worked with real (or realish) data. Oh and if I recall correctly, the publisher asked me to make sure I covered sorting and searching. But maybe that is me trying to blame someone else. In a more advanced course one would use different sorting algorithms to learn about algorithm complexity, performance and probably also discuss Big-O notation. That works as a good excuse to talk about different sorting algorithms for many.
Students understand what sorting is all about. Well they don’t understand all of what it is about but at least if you ask students to sort data by hand they can usually do it. In fact asking students to sort data, perhaps on index cards, and asking them to think through the process they use can be very helpful. This helps them understand the complexity. The value though is really about understanding algorithm creation and translating that into code. Having to involved a number of important concepts like arrays, loops, and decision structures is also a good thing.
I think today I would teach the Quick Sort. It has all the goodness of other sort PLUS it involves recursion. Recursion is still a bit of magic to me. What it has taken me a while to really appreciate recursion I do now. The typical recursion example is really looping – things like calculating the Fibonacci Sequence. That’s not a bad thing but an application that requires recursion is better, my opinion, for getting students to really appreciate its power.
But I wonder if there are better algorithms to use as teaching tools? We, the teaching community, should probably be looking for some. Perhaps some that students are actually likely to have to implement some day. What do you think? Are sort algorithms essential? Why? What sort or sorts do you think students should learn? Or is there a better problem that solves the same issues in learning programming and algorithm development?
Many years ago I made this observation - want to reduce the size of your Programming II class? Teach sorting algorithms in Programming I.
Agree with your comment that the sorting problem helps students think about algorithms. Everyone understands what the problem is, so you spend zero time explaining that. Plus, they all THINK they know how to solve it. But ask them to write their detailed algorithm and turn that into a program -- they'll appreciate how difficult it can be to be very specific in recording the steps for a process they THOUGHT they fully understood! It's also a good exercise for beginners in how to model and represent the problem as a simple collection.
So we're not just teaching "sorting". Rather, we're really teaching algorithms, and sorting is just the easy-to-understand problem domain. Given this purpose, it makes perfect sense to start with the more intuitive bubble and insertion sorts first. Those are the algorithms that most beginning students will come up with in their first try. Then, once you challenge them by saying it costs $10 each time you swap two numbers, now you're motivating them to look for a better solution!
Also remember we have no reason to call it "Quick Sort" until we understand why bubble / insertion / etc. are the "Slow Sorts". Quick Sort is "quick" compared to what?
I like teaching searching and sorting because these are problems everyone knows, everyone understands how it's relevent to computers (i.e. sorting names in Excel, showing the high scores on a video game, web searches in Google -- sorry, in Bing :-) ). And it's quite easy to demonstrate (VISUALLY!) the big-Oh relationships -- for example, linear search O(n) vs. binary search O(log n) vs. hash table O(1)). And I think the basic algorithm-efficiency concepts are really important to CS, whether or not you present it using the formal big-oh notation.
Many stuff mentioned here are good stuff.
Only one I have reservations on is on benefits of recursion.
It may just be a personal thing, but I'd prefer to stay away from recursion as it eats up program stack space.
90% of cases, it's not even a problem.
It's only a problem when handling really huge amounts of data...
If you want to teach recursion in the abstract, go ahead. If you want to teach top-down or object-oriented programming in the abstract, go ahead. Use whatever examples you find apposite. The question is really whether the examination of specific algorithms provides a benefit to the student.
Computers (apart from the peripherals they can control) are useful for two and only two reasons:
1. Their ability to do arithmetic;
2. Their ability to store and retrieve large amounts of data.
Everything else is a consequence of those two powers. As a rule, students best come to appreciate the second of those two powers by studying access methods -- which invariably involves the study of searching and sorting methods. Despite the many advances in "prefab" kits of data manipulation and access methods these past two decades, a true understanding of what's possible in a computer still requires an understanding of searching, sorting, and how they synergize.
Why yes, I DO believe introductory CS students should be required to learn a machine language! However did you guess?
The only sorting algorithm I can remember is bubble sort. I've learned all sorts of sorting algorithms while also learning Big-O notation.
For instance, I can't reproduce the quick sort algorithm without looking somewhere else. I believe that something is worth teaching if it's going to be remembered.
Sorting algorithms are too specific the be remembered and given attention to. It can help if someone is actually gonna do some work with it, but that's rare. They are however good tools to help teaching something else, like data structures or as you said, the Big-O notation. They will be forgotten, but the concepts they require to be implemented will not. But then, it could be any other type of algorithm and it would have the same effect: "Imprinting fundamental concepts to the student's brain".
We teach 7 sorts of sorts to illustrate that there is more than one way to skin a category. But recursion is worse than useless: an approach that exists solely to show how clever CS professors are. There is never any reason to solve a problem recursively, and the non-recursive algorithms tend to be much cooler anyway.
Tree sorts are a good one to teach: an efficient algorithm for these is minimalist to the point of Zen, and it's also highly recursive. Of course, once the students get the basics down, you then teach them how to rebalance the tree... then you go for B-trees... and so on... there's a lifetime of fiddling and optimization into which tree sorts can be extended, and in the meanwhile, you get to learn a whole lot of programming and algorithmic techniques.
Teaching Quick Sort as the only way to sort is potentially dangerous. It is a well known fact that the Quick Sort algorithm sometimes will run in N squared time if given already nearly sorted data. This could have disastrous consequences if it happened unexpectedly in a mission critical application.
Consider the case of an already sorted array, which contains a list of drivers and their current position in an auto race, such as the Indy 500. Every now and then a driver will overtake another driver, requiring that the rank table be resorted. As long as only a few elements are out of place, and even then only by a few slots, it is it quite possible that in this case that either Bubble Sort or straight insertion will compete favorably as the fastest possible sort - whereas Quick Sort will quite likely choke and die if it isn’t running on a fast system - much to the annoyance of the television sports reporters and fans.
Now let's up the stakes. In pretty much any online stock trading environment, there is usually a nice big list of biggest gainers and losers of the day, most shares traded, etc, at any given time. It's kind of nice to see this kind of list updated in real time. Of course, ideally, given that a lot of people are possibly looking at the data, it might be worth while to re sort the list after every tick of the market, just in case something has moved. Again, Quick Sort fails miserably, and Bubble sort works admirably, because just as long as we resort previously sorted list after every trade, or few trades, we stay in good shape.
Now it would be unthinkable of course, to imagine a single unified system that attempts to track every single aircraft in the sky at any one time, its position, its destination, its anticipated arrival time, and of course, predicted airspace conflicts based on flight plan, current airspeed, etc - and where that that system relies heavily on either Quick Sort, or Insertion Sort or any other basic sorting algorithm as it is taught; for just those reasons discussed. Just imagine if a single change in flight plan, or whatever caused the master air traffic database to have to be re sorted over and over again by a O(N log N) which turns out to be O(N^2). Yeah, better hope that the list that we are trying to update isn’t "aircraft which are getting dangerously close to each other.”
IMHO - Quick sort, insertion sort, bubble sort and all of the rest have their places, usually in code that is under development, but not quite ready to ship. Otherwise I would prefer to look at the types of data on a case by case basis, and then to consider based on the nature of the data, where it comes from, how it might evolve over time, etc, I would decide whether or not to use one of the standard sorts, or whether or to use something more specialized, like a radix sort, or a heap sort, etc. Customizing a sort algorithm to an application, takes some work however, especially when you start contemplating using a variety of techniques like using radixes to hash the keys into multiple heaps, etc; and thus the standard libraries have this thing called Quick Sort, which can be quickly be put to the task during application prototyping.
After that - beware and be aware.
Sorting is not just about sorting. It is a convenient place to explain some elementary algorithmic techniques such as array traversal, nested looping, recursion, in-place processing... And also to experience the basics of creating bugs by array overflow, wrong loop termination criteria, data overwriting...
So easy to state the problem, so easy to exercise creativity. An excellent opportunity to show students that it is allowed (and beneficial) to use one's brain!
I can see my idea of "beginners" is very differents from most commenters here. My idea of "beginners" are kids that would get something out of the Computer Science Unplugged sorting lessons. Now those are fun and won't scare a beginner.
Found exactly what i needed for my sorting class...
Excuse me, Dr. Fred?! Recursion is "worse than useless"? Recursion is often an elegant solution to a problem such as traversing hierarchical structures. How would you calculate N!, hmm? Recursion works well. You'd better take another look a recursion. It sounds like you got your nose out of joint about this technique.
Yikes! The insertion/selection sorts are only trivially more complex than a bubble sort; and I'd think either was almost as easy to recreate from scratch. For a higher performing sort, without access to any external references I'd probably go with a merge sort; I find it much simpler conceptually and it's only major liability vs quicksort is that it needs 2n memory to sort an array of size n.