# [Haskell-cafe] Simple FAST lazy functional primes

Sjoerd Visscher sjoerd at w3future.com
Mon Nov 2 08:07:45 EST 2009

```You can remove the "take k" step by passing along the list of primes
smaller than p instead of k:

primes = 2 : 3 : 5 : 7 : sieve [3, 2] (drop 2 primes)
sieve qs@(q:_) (p:ps)
= [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]]
++ sieve (p:qs) ps

I also removed the "x" parameter by generating it from the previous
prime. But this also means you have to start at p=5 and q=3, because
you want q*q+2 to be odd.

Sjoerd

On Nov 2, 2009, at 8:41 AM, Will Ness wrote:

>
> First, here it is:
>
> primes   = 2: 3: sieve 0 primes' 5
> primes'  = tail primes
> sieve k (p:ps) x
>         = [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k
> primes']]
>           ++ sieve (k+1) ps (p*p+2)
>
> (thanks to Leon P.Smith for his brilliant idea of directly generating
> the spans of odds instead of calling 'span' on the infinite odds
> list).
>
> This code is faster than PriorityQueue-based sieve (granted, using
> my own
> ad-hoc implementation, so YMMV) in producing the first million primes
> (testing  was done running the ghc -O3 compiled code inside GHCi).
> The relative performance vs the PQ-version is:
>
>  100,000   300,000   1,000,000  primes produced
>   0.6       0.75      0.97
>
> I've recently came to the Melissa O'Neill's article (I know, I know)
> and
> she makes the incredible claim there that the square-of-prime
> optimization
> doesn't count if we want to improve the old and "lazy"
>
> uprimes = 2: sieve [3,5..] where
>  sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
>
> Her article gave me the strong impression that she claims that the
> only way
> to fix this code is by using Priority Queue based optimization, and
> then
> goes on to present astronomical gains in speed by implementing it.
>
> Well, I find this claim incredible. First of all, the "naive" code
> fires up
> its nested filters much too early, when in fact each must be started
> only
> when the prime's square is reached (not only have its work started
> there -
> be started there itself!), so that filters are delayed:
>
> dprimes  = 2: 3: sieve (tail dprimes) [5,7..]  where
>  sieve (p:ps) xs
>         = h ++ sieve ps (filter ((/=0).(`rem`p)) (tail t))
>           where (h,t)=span (< p*p) xs
>
> This code right there is on par with the PQ-base code, only x3-4
> slower at
> generating the first million primes.
>
> Second, the implicit control structure of linearly nested filters,
> piling up
> in front of the supply stream, each with its prime-number-it-filters-
> by
> hidden inside it, needs to be explicated into a data structure of
> primes-to-filter-by (which is in fact the prefix of the primes list
> we're building itself), so the filtering could be done by one
> function call
> instead of many nested calls:
>
> xprimes  = 2: 3: sieve 0 primes' [5,7..]  where
>  primes' = tail xprimes
>  sieve k (p:ps) xs
>           = noDivs k h ++ sieve (k+1) ps (tail t)
>             where (h,t)=span (< p*p) xs
>  noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes'))
>
>> From here, using the brilliant idea of Leon P. Smith's of fusing
>> the span
> and the infinite odds supply, we arrive at the final version,
>
> kprimes  = 2: 3: sieve 0 primes' 5  where
>  primes' = tail kprimes
>  sieve k (p:ps) x
>           = noDivs k h ++ sieve (k+1) ps (t+2)
>             where (h,t)=([x,x+2..t-2], p*p)
>  noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes'))
>
> Using the list comprehension syntax, it turns out, when compiled with
> ghc -O3, gives it another 5-10% speedup itself.
>
> So I take it to disprove the central premise of the article, and to
> show
> that simple lazy functional FAST primes code does in fact exist, and
> that the PQ optimization - which value of course no-one can dispute
> - is
> a far-end optimization.
>
>
> _______________________________________________