[Haskell-cafe] Re: (flawed?) benchmark : sort

Tue Mar 11 02:34:46 EDT 2008

ok:
> 
> On 11 Mar 2008, at 12:27 pm, Krzysztof Skrzętnicki wrote:
> 
> >I've written little framework to work on. See sortbench.hs and  
> >sortbench.py attachments.
> >Furthermore, I checked Yhc's implementation of sort: it is indeed  
> >very fast:
> 
> I took his earlier code and plugged yhc's sort into it.
> Compiling with ghc -O2 using GHC 6.8.2, I found the yhc code (basically
> variant of natural merge) to be considerably slower than some of the
> alternatives.
> 
> There is a pretty obvious way to speed up the YHC code which you would
> expect to provide nearly a factor of two speedup, and with the random
> integer data, it does.
> 
> However, there is one simple but extremely important point which must be
> considered in evaluating a sorting routine:  the library 'sort' function
> is, or should be, a *general-purpose* sort.  It should be useful with  
> any
> data type which is an instance of Ord or for which you can write a `cmp`
> function, and it should work at least as well with already-sorted input
> as with randomised input.  quicksort (whose original reason for  
> existence
> was to sort on a machine whose memory would disgrace today's  
> wristwatches)
> is well known for doing deceptively well on randomised integer  
> sequences.
> 
> When I run Krzystztof's test harness (which I have currently brought  
> up to
> 25 different sorting functions) with a list of the form [1..N] instead  
> of
> a random list, suddenly all the variants of merge sort come out ahead of
> all the variants of quick sort.  In fact his best version of quicksort,
> qsort_iv, comes out fully 1155 times slower than the YHC algorithm on a
> list of 10,000 ordered integers.  That can be improved by spending a bit
> of effort on choosing a good pivot, but then the quicksort algorithms  
> are
> no longer so competitive for randomised inputs.
> 
> The classic "Engineering a Quicksort" paper by Bentley and McIlroy from
> Software : Practice & Experience recommends a whole bunch of  
> distribution
> shapes (one run, two runs, sawtooth, organ pipes, random, ...) that  
> should
> be benchmarked before drawing too many conclusions.
> 
> It is also wise to try more than one data type.  How do the different
> algorithms compare on random samples from a Scrabble dictionary?  (Why
> that particular question?  Because I mean to try it.)
> 
> Right now, I remain happy with merge sort, because it is never  
> mysteriously
> several thousand times slower than expected.

Do you have these tests bundled up in a repository? It would be very
useful to drop these into the base library testsuite, so we can point to
this when needed.

-- Don