# Why bubble when you can insert?

April 13, 2009 8:35 PM

This is very pretty, and a great teaching tool, but there'll always be a place in my geeky heart for *Sorting Out Sorting* (Part 1, Part 2, or the entire thing super duper fast).

posted by thisjax at 8:47 PM on April 13, 2009 [5 favorites]

No monkeysort? The cool thing about a monkeysort is that even someone who claims to be 100% ignorant of algorithms will tell you that a monkeysort is a stupid way to sort things.

posted by idiopath at 8:48 PM on April 13, 2009 [1 favorite]
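Monkeysort is better known as bogosort: shuffle until sorted. A toy Python sketch, strictly for illustration:

```python
import random

def bogosort(items):
    """Shuffle until sorted -- expected O(n * n!) running time."""
    items = list(items)
    # Keep shuffling until no adjacent pair is out of order.
    while any(a > b for a, b in zip(items, items[1:])):
        random.shuffle(items)
    return items

print(bogosort([3, 1, 2]))  # [1, 2, 3] (eventually)
```

Even someone 100% ignorant of algorithms can see why this is a stupid way to sort things.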

This is my Big O face.

posted by cortex at 8:56 PM on April 13, 2009 [15 favorites]

*I like:*

SELECT

foo

FROM

bar

ORDER BY foo DESC

And how do you think you got that, eh? Magical sorting fairies, eh? +2 Helm of Sorting, eh? A visit from Baron Sortram von Sortington, eh?

You buy 'em books, you buy 'em books, and they chew on the covers.

posted by middleclasstool at 9:11 PM on April 13, 2009 [20 favorites]

A pseudocode implementation of Magical Faerie Sort was worth 10% extra credit on my CS2232 final. I got half-credit for drawing a cartoon of a pixie horking up some Befunge.

posted by cortex at 9:17 PM on April 13, 2009 [2 favorites]

Sorting the Slow Way: An Analysis of Perversely Awful Randomized Sorting Algorithms [pdf]

From the wikipedia bogosort page. Wonderful.

posted by cortex at 9:21 PM on April 13, 2009

Evidently Quicksort is Optimal. It's a pretty neat result, and it's surprisingly understandable for a set of presentation slides. Only took 40 years from the discovery of the algorithm, too.

posted by jedicus at 9:30 PM on April 13, 2009

ahem.

metafilter: i have very little idea of what is going on here. but i'm pretty enthralled. thanks!

posted by es_de_bah at 9:34 PM on April 13, 2009

*And how do you think you got that, eh? Magical sorting fairies, eh? +2 Helm of Sorting, eh? A visit from Baron Sortram von Sortington, eh?*

If you're sorting on a column with an index, it will never be unsorted.

For selecting something that doesn't have an index, a regular sorting algorithm will probably be used; you can read about what MySQL (for example) does here. It's a bit more complex than what you see there, because they have to be able to sort more records than fit into memory.

posted by delmoi at 9:38 PM on April 13, 2009
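The "more records than fit into memory" trick is an external merge sort: sort each chunk that does fit, spill it as a sorted run, then merge the runs. A toy in-memory sketch of the idea (an illustration, not MySQL's actual code; the chunking here stands in for disk spills):

```python
import heapq

def external_sort(records, chunk_size):
    # Sort each memory-sized chunk; in a real database these
    # sorted runs would be written out to temporary files.
    runs = [sorted(records[i:i + chunk_size])
            for i in range(0, len(records), chunk_size)]
    # K-way merge the runs with a heap, reading each run front-to-back.
    return list(heapq.merge(*runs))

print(external_sort([5, 2, 9, 1, 7, 3, 8, 6, 4], chunk_size=3))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```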

Does merge sort appear in "Sorting Out Sorting" under a different name? I thought it was one of the major sorting algorithms, but SOS doesn't mention it by name.

posted by abc123xyzinfinity at 9:54 PM on April 13, 2009

Quicksort is fine, if you're working on a single thread. But there are faster sorts for vectors, multithreaded and parallel applications.

And at the lowest level of Quicksort, Bubblesort can be faster than Insertion, if your vector register is the right size.

posted by Araucaria at 9:58 PM on April 13, 2009

My mind just went for a trip watching SOS... too cool.

posted by JoeXIII007 at 10:40 PM on April 13, 2009

Here's a "Sorting Out Sorting" video link from Google Video that is the complete video in one shot, at normal speed, with a spiffy MP4 download link.

posted by Mikey-San at 11:01 PM on April 13, 2009 [1 favorite]

Visualizing sorting algorithms another way.

posted by swift at 11:10 PM on April 13, 2009 [8 favorites]

*Sorting the Slow Way: An Analysis of Perversely Awful Randomized Sorting Algorithms [pdf]*

Want to play a game of "who can name the slowest sort"?

posted by Chuckles at 11:44 PM on April 13, 2009

*If you're sorting on a column with an index, it will never be unsorted*

I don't think that's true for typical databases; you're simply doing a tree-based selection sort. If it is true, then you're doing an insertion sort. In either case, the cost of the sort is split between the INSERT and the SELECT operations.

abc, sleslie: I think mergesort is missing. What makes mergesort so useful, esp. for large data sets, is its extreme locality / predictability of reference (consider the classic four-way tape sort). The Wikipedia entry for it has a nice visualization similar to the one at the end of the *Sorting Out Sorting* video.

posted by hattifattener at 11:56 PM on April 13, 2009
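For anyone following along, a minimal top-down mergesort in Python (the tape/external variants merge runs bottom-up instead, but the merge step is identical):

```python
def mergesort(items):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = mergesort(items[:mid]), mergesort(items[mid:])
    # The merge only ever reads both inputs front-to-back, which is
    # the locality/predictability of reference mentioned above.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(mergesort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```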

I've always been impressed by timsort, which is optimized to take advantage of real-world data.

posted by reventlov at 12:11 AM on April 14, 2009 [3 favorites]

Oh so awesome! I can remember the feeling I had in class when I realized I had wasted so much time in my life when sorting.

Seriously, does anyone implement quicksort by hand? I generally use merge sort nowadays.

posted by FuManchu at 12:42 AM on April 14, 2009

My favorite has long been a variant of Quicksort that simply *stops recursing* when the sublist size becomes small (say, 5 or 6 elements), followed by a single pass of insertion sort over the entire list.

Since most of the recursion-related time overhead in any recursive algorithm occurs in the deepest levels of recursion, this method gets rid of quite a lot of it. Insertion sort can be programmed with simple *copies* instead of swaps, which makes its already very low overhead for nearly-sorted lists even lower; also, since it does no more data movement when called *once* for the entire list than it would do if called as a base-case sublist sort when the recursion bottoms out, there's no need for the overhead involved in setting up its loops to happen more than once.

posted by flabdablet at 12:48 AM on April 14, 2009 [2 favorites]
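That variant is easy to sketch in Python (details such as the middle-element pivot and the exact cutoff of 6 are assumptions here, not flabdablet's specifics):

```python
CUTOFF = 6  # stop recursing below this sublist size ("say, 5 or 6 elements")

def quicksort_partial(a, lo, hi):
    """Quicksort that leaves sublists smaller than CUTOFF unsorted,
    so the array ends up nearly sorted."""
    while hi - lo + 1 > CUTOFF:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:  # Hoare-style partition
            while a[i] < pivot: i += 1
            while a[j] > pivot: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1; j -= 1
        # Recurse into the smaller half, loop on the larger.
        if j - lo < hi - i:
            quicksort_partial(a, lo, j)
            lo = i
        else:
            quicksort_partial(a, i, hi)
            hi = j

def insertion_finish(a):
    """One insertion-sort pass over the whole nearly-sorted list,
    using copies rather than swaps."""
    for k in range(1, len(a)):
        item = a[k]
        j = k - 1
        while j >= 0 and a[j] > item:
            a[j + 1] = a[j]  # copy up, don't swap
            j -= 1
        a[j + 1] = item

def hybrid_sort(a):
    quicksort_partial(a, 0, len(a) - 1)
    insertion_finish(a)
    return a

print(hybrid_sort([9, 3, 7, 1, 8, 2, 6, 5, 4, 0]))  # [0, 1, 2, ..., 9]
```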

"If you're getting bored, let this be a lesson to you about order n-squared sorts!"

Boy, if I had a nickel for every time I head that growing up.

Better yet would be if I got a nickel for each time I heard that growing up, each time I heard that growing up!

Oh wait that wouldn't be so much after all

posted by Bokononist at 2:02 AM on April 14, 2009

*I've always been impressed by timsort, which is optimized to take advantage of real-world data.*

Java Bug Database: Bug ID 6804124

posted by effbot at 2:03 AM on April 14, 2009

*Evidently Quicksort is Optimal. It's a pretty neat result, and it's surprisingly understandable for a set of presentation slides. Only took 40 years from the discovery of the algorithm, too.*

Holy crap. QS is optimal?

My professors told me time and time again that QS was *considered* optimal since nobody'd figured out anything empirically faster, but that no proof had been made.

I wish I could read the linked pdf... but it refuses to render.

posted by Netzapper at 2:18 AM on April 14, 2009

This: Was cool.

The links to Sorting Out Sorting: Also cool. (How has this 90s-era CS degree never seen this video?)

The link to the alternative visualization: Also also cool.

The link from there to the Cairo graphics lib: Also also also cool.

To sum up: Cool.

posted by DU at 4:33 AM on April 14, 2009

I've always loved how the Spaghetti Sort demonstrates and makes use of the inherent parallel processing power of Nature.

posted by vacapinta at 4:35 AM on April 14, 2009 [1 favorite]

No wait, I have seen SOS. Who could forget that awesome music?

posted by DU at 4:43 AM on April 14, 2009

Back in college, when we were studying Occam as a hardware design tool (think: silicon compilers - Occam source code goes in *here*, chip layout diagram comes out *there*), we briefly considered a sorting algorithm they called "pumpsort" or the "sort pump". In order for it to work, you need at least as many processing elements as you have items to sort. Each processing element executes the following:

-- initialize

x := -infinity

-- main loop

while(get(item))

if (item >= x) then

put(x)

x := item

else

put(item)

end

-- flush

put(x)

As you feed the unsorted items in one end, they are sorted as the pipeline fills. Then, when the input is all done, you yank the chain and flush the pipeline and the items come out all sorted in O(n) time. Ta-da!

posted by kcds at 6:26 AM on April 14, 2009 [1 favorite]
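The pipeline is easy to simulate sequentially. A Python sketch of the same idea (this collapses the N parallel nodes into a list, so the O(n) claim doesn't hold here; it just shows the data movement):

```python
def pump_sort(items):
    """Simulate the 'sort pump': node i keeps the largest value it has
    seen so far and forwards everything smaller to node i+1. Flushing
    the pipeline reads the kept values out largest-first."""
    nodes = [None] * len(items)  # each node's kept value (None = empty)
    for item in items:
        for i, kept in enumerate(nodes):
            if kept is None or item > kept:
                nodes[i], item = item, kept  # keep the larger, pass the rest on
            if item is None:
                break  # nothing left to push down the pipeline
    # Flush: node 0 holds the overall max, node 1 the next largest, ...
    return nodes[::-1]

print(pump_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```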

*In order for it to work, you need at least as many processing elements as you have items to sort.*

Here's my algorithm for Unicorn Sort, which is O(1):

-- have the unicorn sort the list

newlist = unicorn_sort(list)

But I puzzled over your algorithm for a while anyway and still didn't get it.

posted by DU at 6:46 AM on April 14, 2009 [3 favorites]

*Here's my algorithm for Unicorn Sort, which is O(1):*

What if you had as many unicorns as you have items to sort?

posted by Combustible Edison Lighthouse at 6:49 AM on April 14, 2009

*But I puzzled a while over your algorithm for a while anyway and still didn't get it.*

It's just a Bubble Sort but with N processors. So instead of O(N^2) you get O(N) because you have N processors...

posted by vacapinta at 6:57 AM on April 14, 2009

unicorn sort, also known as quantum computing.

posted by ShadowCrash at 6:59 AM on April 14, 2009

My focus in high-performance computing is more on numerical work than this sort of thing, but I always assumed Quicksort was a bad idea. It's not cache-friendly, it's not pipeline-friendly, it's not thread-friendly, why would you use it?

Also, here's a hint for those of you still in undergrad (or hobbyist programmers). You are taught the basic sorting algorithms so you have an understanding of how to analyze how much more complicated algorithms will perform in the real world, with real data, real software, and real bugs (yay!). You are NOT taught Quicksort, Merge Sort, and Heap Sort so that you will be able to implement your own the very first day you show up at your new job at FinanceCorp.

posted by onalark at 7:18 AM on April 14, 2009

*Its just a Bubble Sort but with N processors*

That's what it looks like, but I couldn't figure out the put() stuff. Maybe because I know naught of chip design.

*What if you had as many unicorns as you have items to sort?*

You *always* have as many unicorns as you have items to sort.

*You are taught the basic sorting algorithms so you have an understanding of how to analyze how much more complicated algorithms will perform...*

Not even "analyze" I would say. More like "estimate". Even in the scientific(ish) computing I do, we never sit down and Analyze An Algorithm. We just kind of informally mention that one method is n^2 vs another that's closer to n and leave the details up to the implementor (i.e. me).

posted by DU at 7:24 AM on April 14, 2009

*I've always loved how the Spaghetti Sort demonstrates and makes use of the inherent parallel processing power of Nature.*

"The space complexity is also minimal: O(n), measured in rods of spaghetti."

Ha.

posted by grouse at 7:31 AM on April 14, 2009

*That's what it looks like, but I couldn't figure out the put() stuff.*

The nodes are pipelined. get() receives a value, put() pushes it to the next node. In the main loop, each node keeps the max value it has seen so far, and sends the rest down the pipeline. When you've fed N values into N nodes, they're sorted.

posted by effbot at 7:36 AM on April 14, 2009

*...put() pushes it to the next node.*

Aha! I've been using Tcl too much, put() looks like a print statement to me....

posted by DU at 7:39 AM on April 14, 2009

Yet another sorting demo page (with a slightly different take on it).

posted by joe vrrr at 8:42 AM on April 14, 2009 [4 favorites]

Dude, this just rocks!

I've been bubble-sorting all my life, it's cool to see so many other methods.

posted by Westgate at 9:41 AM on April 14, 2009

occam on transputers was a revelation, but my followers scattered and my cult collapsed when the coming of the T800 did not happen according to the promises of scripture. Apparently the dust devils couldn't be kept off the holy wafers.

posted by StickyCarpet at 11:43 AM on April 14, 2009

Joe's link to the static visualizations was posted up-thread by swift yesterday.

I kind of agree with the take that animations are pretty but unillustrative and that the static versions have a certain stark beauty.

But still it seems to me that the static line diagrams also fail to tell the story: they're very good at showing the swaps, but most of the cost in inefficient sorts is the compares, and they're not visualized at all.

(And as time on these diagrams is measured only by swaps, his "trivial" time-based interpretations of the diagrams are meaningful only if the ratio of compares to swaps is static throughout the sort. Is this true for the algorithms he's visualizing?)

posted by We had a deal, Kyle at 4:27 PM on April 14, 2009 [1 favorite]

*...if the ratio of compares to swaps is static throughout the sort*

Obviously untrue for bubblesort.

posted by DU at 4:23 AM on April 15, 2009
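Easy to check: a bubble sort instrumented to count compares and swaps per pass shows the ratio drifting as the list gets closer to sorted (a quick sketch; the input is arbitrary):

```python
def bubble_sort_stats(items):
    """Bubble sort, recording (compares, swaps) for each pass."""
    a = list(items)
    passes = []
    for end in range(len(a) - 1, 0, -1):
        compares = swaps = 0
        for i in range(end):
            compares += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
        passes.append((compares, swaps))
    return a, passes

_, passes = bubble_sort_stats([3, 1, 4, 1, 5, 9, 2, 6])
print(passes)  # [(7, 4), (6, 2), (5, 1), (4, 1), (3, 0), (2, 0), (1, 0)]
```

Early passes swap on most compares; late passes barely swap at all.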

It's just occurred to me why my immediate reaction to the spaghetti sort was that it's some kind of cheat.

The O(n log n) speed limit applies only to comparison sorts. As soon as you start allowing some kind of mapping from list items to a simple linear quantity, e.g. spaghetti length, you're not doing that kind of sort any more. Spaghetti sort corresponds roughly to a pigeonhole sort.

There are lots of ways to code essentially O(n) sorts without requiring parallel processing or quantum trickery or physical cuteness, if you're allowed to map sort keys to numbers.

posted by flabdablet at 6:04 PM on April 19, 2009
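The spaghetti/pigeonhole correspondence, sketched for small non-negative integer keys (length of rod = key value; a toy illustration):

```python
def pigeonhole_sort(items, max_key):
    """O(n + k): drop each item into the hole for its key, then read
    the holes out in order. No comparisons anywhere -- keys map
    directly to positions, which is why the O(n log n) bound
    doesn't apply."""
    holes = [[] for _ in range(max_key + 1)]
    for item in items:
        holes[item].append(item)
    return [item for hole in holes for item in hole]

print(pigeonhole_sort([4, 1, 3, 1, 0], max_key=4))  # [0, 1, 1, 3, 4]
```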

The other "cheat" in spaghetti sort is that, while the actual sort step is O(1), input/output is O(n).

posted by Netzapper at 11:37 PM on April 19, 2009

This thread has been archived and is closed to new comments

Sorting Out Sorting. Classic computer science video with that unforgettable music.

posted by Godbert at 8:47 PM on April 13, 2009 [7 favorites]