[Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was:
Genuine Eratosthenes sieve)
Dave Bayer
bayer at cpw.math.columbia.edu
Mon Jul 23 16:54:08 EDT 2007
As an exercise, trying to understand the beautiful paper
Stream Fusion
From Lists to Streams to Nothing at All
Duncan Coutts, Roman Leshchinskiy and Don Stewart
http://www.cse.unsw.edu.au/~dons/papers/CLS07.html
http://www.cse.unsw.edu.au/~dons/streams.html
I recoded my prime sieve using a pared down version of their Stream
datatype; this is the simplest version I could write that achieves a
significant speedup.
My reaction to their paper was, if streams are better internally than
lists, why not code directly in streams? Lists enjoy a serious
notational advantage in Haskell, but one could imagine a language
where the list notation was reserved for stream semantics.
My sieve was spending half its time in "merge", so I made only the
changes necessary to convert "merge" to use streams. My streams are
infinite, and "merge" can be written to not use Skip, so Step goes away.
Even though "nextx" and "nexty" only have one case now, using case
statements is significantly faster than using let or where clauses.
I'm imagining that I read about this somewhere, but if I did, it
didn't sink in until I was tuning this code. I don't know if this is
related to fusion optimization, or a general effect.
The timings are
> [Integer] -O2 1*10^6 2*10^6 3*10^6 4*10^6 5*10^6
> ---------------------------------------------------------
> ONeillPrimes | 3.338 | 7.320 | 11.911 | 18.225 | 21.785
> StreamPrimes | 3.867 | 8.405 | 13.656 | 21.542 | 37.640
> BayerPrimes | 3.960 | 8.940 | 18.528 | 33.221 | 38.568
Here is the code:
> {-# OPTIONS_GHC -fglasgow-exts #-}
>
> module StreamPrimes (primes) where
>
> -- stream code
>
> data Stream a = forall s. Stream (s -> (a,s)) s
> data AStream a = A a (AStream a) | B (Stream a)
>
> stream :: [a] -> Stream a
> stream xs = Stream next xs
> where
> next [] = undefined
> next (x:xt) = (x,xt)
>
> astream :: [a] -> AStream a
> astream [] = undefined
> astream (x:xt) = A x $ B $ stream xt
>
> merge :: Ord a => Stream a -> Stream a -> Stream a
> merge (Stream nextx vs) (Stream nexty ws) =
> Stream next (vt,ws,Left v)
> where
> (v,vt) = nextx vs
> next (xs,ys,Left x) =
> case nexty ys of
> (y,yt) ->
> if x < y
> then (x,(xs,yt,Right y))
> else (y,(xs,yt,Left x))
> next (xs,ys,Right y) =
> case nextx xs of
> (x,xt) ->
> if x < y
> then (x,(xt,ys,Right y))
> else (y,(xt,ys,Left x))
>
> mergeA :: Ord a => AStream a -> AStream a -> AStream a
> mergeA (A x xt) ys = A x (mergeA xt ys)
> mergeA (B xs) ys = mergeB xs ys
>
> mergeB :: Ord a => Stream a -> AStream a -> AStream a
> mergeB s@(Stream next xs) ys@(A y yt) =
> case next xs of
> (x,xt) ->
> if x < y
> then A x (mergeB (Stream next xt) ys)
> else A y (mergeB s yt)
> mergeB xs (B ys) = B $ merge xs ys
>
> -- Code for venturi :: Ord a => [[a]] -> [a]
>
> root :: Ord a => AStream a -> [AStream a] -> [a]
> root (A x xt) yss = x : (root xt yss)
> root (B xs) (ys:yst) = root (mergeB xs ys) yst
> root _ _ = undefined
>
> pair :: Ord a => [AStream a] -> [AStream a]
> pair (x:y:xt) = mergeA x y : (pair xt)
> pair _ = undefined
>
> group :: Ord a => [AStream a] -> [AStream a]
> group (x:xt) = x : (group $ pair xt)
> group _ = undefined
>
> venturi :: Ord a => [[a]] -> [a]
> venturi (x:xt) = root (astream x) $ group $ map astream xt
> venturi _ = undefined
>
> -- Code for primes :: Integral a => [a]
>
> diff :: Ord a => [a] -> [a] -> [a]
> diff xs@(x:xt) ys@(y:yt) = case compare x y of
> LT -> x : (diff xt ys)
> EQ -> (diff xt yt)
> GT -> (diff xs yt)
> diff _ _ = undefined
>
> trim :: Integral a => a -> [a] -> [a]
> trim p = let f m x = mod x m /= 0 in filter (f p)
>
> seed :: Integral a => [a]
> seed = [2,3,5,7,11,13,17]
>
> wheel :: Integral a => [a]
> wheel = drop 1 [ m*j + k | j <- [0..], k <- ws ]
> where m = foldr1 (*) seed
> ws = foldr trim [1..m] seed
>
> multiples :: Integral a => [a] -> [[a]]
> multiples ws = map fst $ tail $ iterate g ([], ws)
> where g (_,ps@(p:pt)) = ([ m*p | m <- ps ], trim p pt)
> g _ = undefined
>
> primes :: Integral a => [a]
> primes = seed ++ (diff wheel $ venturi $ multiples wheel)
More information about the Haskell-Cafe
mailing list