[Haskell-cafe] A tale of Project Euler

Daniel Fischer daniel.is.fischer at web.de
Wed Nov 28 17:56:06 EST 2007


Am Mittwoch, 28. November 2007 22:31 schrieb Andrew Coppin:
> There are problems for which it's important to know how many times a
> given prime factor occurs. And there are other problems where it is
> merely necessary to know which primes are factors. I would say it's
> useful to have *both* functions available...

Yup
>
>
>
> By the way, I'm now using a new and much uglier prime seive:
>
> size = 1000000 :: Word64
>
> calc_primes :: IO [Word64]
> calc_primes = do
>     grid <- newArray (2,size) True
>     seive 2 grid
>   where
>     seive :: Word64 -> IOUArray Word64 Bool -> IO [Word64]
>     seive p g = do
>       mapM_ (\n -> writeArray g n False) [p, 2*p .. size]
>       mp' <- next (p+1) g
>       case mp' of
>         Nothing -> return [p]
>         Just p' -> do
>           ps <- seive p' g
>           return (p:ps)
>
>     next :: Word64 -> IOUArray Word64 Bool -> IO (Maybe Word64)
>     next p g = do
>       if p == size
>         then return Nothing
>         else do
>           t <- readArray g p
>           if t
>             then return (Just p)
>             else next (p+1) g
>
> Seems moderately fast. At least, I can find the 10,001st prime in under
> 1 second... 

One thing: since You check the array bounds, the system needn't check them 
again, use unsafeWrite and unsafeRead. And use Int for the index, that would 
be MUCH faster.
Another thing: you can stop sieving when p*p > size, another speedup
Third thing: In my experience mapM_ is relatively slow, hand coded loops are 
faster (though much uglier), but YMMV
Fourth thing: sieve only odd numbers
Fifth thing: better use an STUArray, don't drag IO in if it's not necessary.

> The obvious downside of course is that you must know how big
> to make the array at the start of your program, and you must compute the
> entire array before you can use any of it. Both limitations not precent
> in the lazy recursive list implementation...
>
The size of the array can easily be estimated by the prime number theorem,
pi(x) ~ x/log x, where pi(x) is the number of primes not exceeding x, so for 
10,001 primes, find x with x/log x about 10,000, add a bit for safety, a 
bound of 120,000 will do.
If you want the n-th prime, you don't want to use the smaller primes anyway, 
if you need all primes, chances are that prime generation will take 
negligible time in comparison to your main algo anyway.

Keep enjoying Project Euler,
Daniel


More information about the Haskell-Cafe mailing list