[Haskell-cafe] (flawed?) benchmark : sort

Tue Mar 4 12:47:27 EST 2008

2008/3/4, Krzysztof Skrzętnicki <gtener at gmail.com>:
> Hi
>
> I was playing with various versions of sorting algorithms. I know it's very
> easy to create flawed benchmark and I don't claim those are good ones.
> However, it really seems strange to me, that sort - library function - is
> actually the worse measured function. I can hardly belive it, and I'd rather
> say I have made a mistake preparing it.
>
> The overall winner seems to be qsort_iv - which is nothing less but old sort
> replaced by mergesort now.
>
> Any clues?

Part of what you may be missing :
-- cut here --
module Main where

import Control.Parallel.Strategies
import Control.Arrow
import System.CPUTime
import System.IO
import System.Environment
import System.Random
import Data.List( partition, sort )

data Tree a =
    Node (Tree a) a (Tree a)
        | Leaf

qsort_i []  = []
qsort_i (x:xs) = qsort_i (filter (< x) xs) ++ [x] ++ qsort_i (filter (>= x) xs)

qsort_ii [] = []
qsort_ii (x:xs) = let (ls,gt) = partition (< x) xs in qsort_ii ls ++
[x] ++ qsort_ii gt

qsort_iii xs = qsort_iii' [] xs
qsort_iii' acc [] = acc
qsort_iii' acc (x:xs) =
    let (ls,gt) = partition (< x) xs in
    let acc' = (x:(qsort_iii' acc gt)) in qsort_iii' acc' ls

qsort_v [] = []
qsort_v (x:xs) = let (xlt, xgt ) = foldl (\ (lt,gt) el -> case compare x el of
                                                            GT -> (el:lt, gt)
                                                            _  -> (lt,
el:gt) ) ([],[]) xs
                 in qsort_v xlt ++ [x] ++ qsort_v xgt

-- zmodyfikowany i
qsort_vi [] = []
qsort_vi (x:xs) = qsort_vi (filter (\y-> compare x y == GT) xs) ++ [x]
++ qsort_vi (filter (\y-> compare x y /= GT) xs)

-- zmodyfikowany iii
qsort_vii xs = qsort_vii' [] xs
qsort_vii' acc [] = acc
qsort_vii' acc (x:xs) =
    let (ls,gt) = partition (\y-> compare x y == GT) xs in
    let acc' = (x:(qsort_vii' acc gt)) in qsort_vii' acc' ls

-- qsort is stable and does not concatenate.
qsort_iv xs = qsort_iv' (compare) xs []

qsort_iv' _   []     r = r
qsort_iv' _   [x]    r = x:r
qsort_iv' cmp (x:xs) r = qpart cmp x xs [] [] r

-- qpart partitions and sorts the sublists
qpart cmp x [] rlt rge r =
    -- rlt and rge are in reverse order and must be sorted with an
    -- anti-stable sorting
    rqsort_iv' cmp rlt (x:rqsort_iv' cmp rge r)
qpart cmp x (y:ys) rlt rge r =
    case cmp x y of
    GT -> qpart cmp x ys (y:rlt) rge r
    _  -> qpart cmp x ys rlt (y:rge) r

-- rqsort is as qsort but anti-stable, i.e. reverses equal elements
rqsort_iv' _   []     r = r
rqsort_iv' _   [x]    r = x:r
rqsort_iv' cmp (x:xs) r = rqpart cmp x xs [] [] r

rqpart cmp x [] rle rgt r =
    qsort_iv' cmp rle (x:qsort_iv' cmp rgt r)
rqpart cmp x (y:ys) rle rgt r =
    case cmp y x of
    GT -> rqpart cmp x ys rle (y:rgt) r
    _  -> rqpart cmp x ys (y:rle) rgt r

-- code by Orcus

-- Zadanie 9 - merge sort
mergeSort :: Ord a => [a] -> [a]
mergeSort []    = []
mergeSort [x]   = [x]
mergeSort xs    = let(l, r) = splitAt (length xs `quot` 2) xs
                  in mergeSortP (mergeSort l) (mergeSort r)

-- funkcja pomocnicza scalajÄ…ca dwie listy uporzÄ…dkowane w jednÄ…
mergeSortP :: Ord a => [a] -> [a] -> [a]
mergeSortP xs []            = xs
mergeSortP [] ys            = ys
mergeSortP (x:xs) (y:ys)
    | x <= y = x:(mergeSortP xs (y:ys))
    | otherwise = y:(mergeSortP (x:xs) ys)

-- Zadanie 10 - tree sort
treeSort :: Ord a => [a] -> [a]
-- pointless po raz drugi
treeSort = (treeSortInorder . treeSortToTree)

treeSortToTree :: Ord a => [a] -> Tree a
treeSortToTree []       = Leaf
treeSortToTree (x:xs)   = let (xlt, xgt) = foldl (\ (lt,gt) el -> case
compare x el of
                                                            GT -> (el:lt, gt)
                                                            _  -> (lt,
el:gt) ) ([],[]) xs
                          in Node (treeSortToTree xlt) x (treeSortToTree xgt)

treeSortInorder :: Ord a => Tree a -> [a]
treeSortInorder Leaf            = []
treeSortInorder (Node l x r)    = (treeSortInorder l) ++ [x] ++
(treeSortInorder r)

-- end code by Orcus

-- begin benchmark making code

makeBenchs benchs xs = do
  let (funcNames, funcs) = unzip benchs
  tBegin <- getCPUTime
  timers <- mapM (\f-> print (f xs) >> getCPUTime) funcs
  let times = zipWith (-) timers (tBegin:timers)
      sortedResults = sort $ zip times funcNames
      minT = fromIntegral $ fst $ head sortedResults
      scaled = map (((/minT) . fromIntegral) *** id) sortedResults
  hPutStr stderr $ unlines $ map show scaled

onRandom eltCnt = do
  gen <- getStdGen
  let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf
  xs `seq` return xs

onSorted eltCnt = do
  gen <- getStdGen
  let xs = take eltCnt (randomRs (1::Int, bigNum) gen) `using` rnf
      sxs = sort xs `using` rnf
  xs `seq` sxs `seq` return sxs

bigNum = 1000000 :: Int

-- end of benchmark making code

main = makeBenchs [("i",qsort_i),
                   ("ii",qsort_ii),
                   ("iii",qsort_iii),
                   ("iv",qsort_iv),
                   ("v",qsort_v),
                   ("vi",qsort_vi),
                   ("vii",qsort_vii),
                   ("sort",sort),
                   ("mergeSort",mergeSort),
                   ("treeSort",treeSort)]
            =<< onSorted . read . head =<< getArgs

-- cut here --

It could probably be improved (with classics solution (better
selection of the pivot...)), but the mergesort is only 3 times slower
in worse case, and much more regular, if someone needs a faster sort
in a specific case, it isn't hard to code.

-- 
Jedaï