# [Haskell-cafe] Simple FAST lazy functional primes

Sjoerd Visscher sjoerd at w3future.com
Mon Nov 2 08:56:33 EST 2009

```Excuse me, 2 doesn't have to be in the list of smaller primes, as
we're only generating odd numbers:

primes = 2 : 3 : 5 : 7 : sieve [3] (drop 2 primes)
sieve qs@(q:_) (p:ps)
= [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]]
++ sieve (p:qs) ps

Sjoerd

On Nov 2, 2009, at 2:07 PM, Sjoerd Visscher wrote:

> You can remove the "take k" step by passing along the list of primes
> smaller than p instead of k:
>
> primes = 2 : 3 : 5 : 7 : sieve [3, 2] (drop 2 primes)
> sieve qs@(q:_) (p:ps)
>        = [x | x<-[q*q+2,q*q+4..p*p-2], and [(x`rem`p)/=0 | p<-qs]]
>          ++ sieve (p:qs) ps
>
> I also removed the "x" parameter by generating it from the previous
> prime. But this also means you have to start at p=5 and q=3, because
> you want q*q+2 to be odd.
>
> Sjoerd
>
> On Nov 2, 2009, at 8:41 AM, Will Ness wrote:
>
>>
>> First, here it is:
>>
>> primes   = 2: 3: sieve 0 primes' 5
>> primes'  = tail primes
>> sieve k (p:ps) x
>>        = [x | x<-[x,x+2..p*p-2], and [(x`rem`p)/=0 | p<-take k
>> primes']]
>>          ++ sieve (k+1) ps (p*p+2)
>>
>> (thanks to Leon P.Smith for his brilliant idea of directly generating
>> the spans of odds instead of calling 'span' on the infinite odds
>> list).
>>
>> This code is faster than PriorityQueue-based sieve (granted, using
>> my own
>> ad-hoc implementation, so YMMV) in producing the first million primes
>> (testing  was done running the ghc -O3 compiled code inside GHCi).
>> The relative performance vs the PQ-version is:
>>
>> 100,000   300,000   1,000,000  primes produced
>>  0.6       0.75      0.97
>>
>> I've recently came to the Melissa O'Neill's article (I know, I
>> know) and
>> she makes the incredible claim there that the square-of-prime
>> optimization
>> doesn't count if we want to improve the old and "lazy"
>>
>> uprimes = 2: sieve [3,5..] where
>> sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
>>
>> Her article gave me the strong impression that she claims that the
>> only way
>> to fix this code is by using Priority Queue based optimization, and
>> then
>> goes on to present astronomical gains in speed by implementing it.
>>
>> Well, I find this claim incredible. First of all, the "naive" code
>> fires up
>> its nested filters much too early, when in fact each must be
>> started only
>> when the prime's square is reached (not only have its work started
>> there -
>> be started there itself!), so that filters are delayed:
>>
>> dprimes  = 2: 3: sieve (tail dprimes) [5,7..]  where
>> sieve (p:ps) xs
>>        = h ++ sieve ps (filter ((/=0).(`rem`p)) (tail t))
>>          where (h,t)=span (< p*p) xs
>>
>> This code right there is on par with the PQ-base code, only x3-4
>> slower at
>> generating the first million primes.
>>
>> Second, the implicit control structure of linearly nested filters,
>> piling up
>> in front of the supply stream, each with its prime-number-it-
>> filters-by
>> hidden inside it, needs to be explicated into a data structure of
>> primes-to-filter-by (which is in fact the prefix of the primes list
>> we're building itself), so the filtering could be done by one
>> function call
>> instead of many nested calls:
>>
>> xprimes  = 2: 3: sieve 0 primes' [5,7..]  where
>> primes' = tail xprimes
>> sieve k (p:ps) xs
>>          = noDivs k h ++ sieve (k+1) ps (tail t)
>>            where (h,t)=span (< p*p) xs
>> noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes'))
>>
>>> From here, using the brilliant idea of Leon P. Smith's of fusing
>>> the span
>> and the infinite odds supply, we arrive at the final version,
>>
>> kprimes  = 2: 3: sieve 0 primes' 5  where
>> primes' = tail kprimes
>> sieve k (p:ps) x
>>          = noDivs k h ++ sieve (k+1) ps (t+2)
>>            where (h,t)=([x,x+2..t-2], p*p)
>> noDivs k = filter (\x-> all ((/=0).(x`rem`)) (take k primes'))
>>
>> Using the list comprehension syntax, it turns out, when compiled with
>> ghc -O3, gives it another 5-10% speedup itself.
>>
>> So I take it to disprove the central premise of the article, and to
>> show
>> that simple lazy functional FAST primes code does in fact exist, and
>> that the PQ optimization - which value of course no-one can dispute
>> - is
>> a far-end optimization.
>>
>>
>> _______________________________________________
>
> --
> Sjoerd Visscher
> sjoerd at w3future.com
>
>
>
> _______________________________________________