[Haskell-cafe] will the real quicksort please stand up? (or:
sorting a > million element list)
Thomas Hartman
thomas.hartman at db.com
Mon Oct 22 18:09:36 EDT 2007
It has been noted in a few places that the 2 line quicksort demo in the
Introduction section of the haskell wiki
http://www.haskell.org/haskellwiki/Introduction
isn't a "real" quicksort, and wouldn't scale well with longer lists.
Interested, and wanting to put my recently learned test data generating
skills to the test, I put this to the test with a little suite for various
versions of quickcheck I found lying around.
My patience extends to <3 minutes for waiting for a result, which for my
tests was about the amount needed for a million integer long list. This
was definitely faster for the "treesort" below than the naive quicksort
from the wiki.
(Prelude sort, which I think is mergesort, just blew the stack.)
I wonder if there are "tricks" or lore that could be applied to get better
results or insight.
t.
{-import Control.Monad (when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.IArray
import Data.Array.MArray
import System.IO.Unsafe
import Control.Monad
-}
import Test.QuickCheck hiding (test)
import System.Random
import Data.List
{-
$ time ghci -e 'test treesort 6' quicksort.hs # a "real" quicksort,
according to roconnor
1000000
real 2m18.702s
user 1m3.296s
sys 0m0.453s
$ time ghci -e 'test qs1 6' quicksort.hs # "naive" (from the wiki intro to
haskell, not a "real" quicksort
1000000
real 4m18.174s
user 1m57.615s
sys 0m0.665s
$ time ghci -e 'test sort 6' quicksort.hs # mergesort from the prelude
*** Exception: stack overflow
real 0m13.634s
user 0m3.805s
sys 0m0.446s
hartthoma at linuxpt:~/ProjectRepos/learning/quicksort>
-}
-- For interactive development, I wound up working in hugs +S +I,
-- because ghci in emacs tended to lock up
-- hugs just segfaults, but at least the computer doesn't lock.
-- by the way, is there a way to get ghci to behave similarly to hugs?
(restricted memory mode?)
-- t1 and t2 are about the same speed for n = 4, got bored of waiting for
n great.
-- naive, "not really" quicksort according to recent reddit discussion.
test sortf n = genlist n >>= return . ( length . sortf )
-- naive
qs1 [] = []
qs1 (x:xs) = qs1 (filter (< x) xs) ++ [x] ++ qs1 (filter (>= x) xs)
-- roconnor claims that "real" quicksort in haskell is really treesort.
-- from http://programming.reddit.com/info/2h0j2/comments
--I'm talking about a deforested version of tree sort. In tree sort you
build a binary search tree by adding the head of the list to the root of
the tree, and then building binary search trees for those items less than
the root, and those items greater than the root.
qs2 l = treeSort l
treeSort l = foldTree (\x lft rht -> lft++[x]++rht) [] (mkTree l)
-- After building the search tree, it is turned into a list by folding \x
lft rht -> lft++[x]++rht.
foldTree f g Leaf = g
foldTree f g (Node x l r) = f x (foldTree f g l) (foldTree f g r)
mkTree [] = Leaf
mkTree (x:xs) = Node x (mkTree (filter (<= x) xs)) (mkTree (filter (x <)
xs))
-- If you deforest this algorithm (removing the intermediate tree
structure) you are left with
-- treeSort' [] = []
-- treeSort' (x:xs) = treeSort' (filter (<= x) xs) ++ [x] ++ treeSort'
(filter (x <) xs)
-- for testing
genlist n = rgen ( vector $ 10^n ) :: IO [Int]
rgenIntList = rgen (arbitrary :: Gen [Int]) :: IO [Int]
rgen gen = do sg <- newStdGen
return $ generate 10000 sg gen
data Tree a = Leaf | Node a (Tree a) (Tree a)
---
This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20071022/d457f729/attachment.htm
More information about the Haskell-Cafe
mailing list