[Haskell-cafe] A tale of Project Euler

Tue Nov 27 15:53:42 EST 2007

On Nov 27, 2007 2:34 PM, Andrew Coppin <andrewcoppin at btinternet.com> wrote:

> Hi guys.
>
> Somebody just introduced me to a thing called "Project Euler". I gather
> it's well known around here...
>
> Anyway, I was a little bit concerned about problem #7. The problem is
> basically "figure out what the 10,001st prime number is". Consider the
> following two programs for solving this:
>
> First, somebody else wrote this in C:
>
> int n = 2 , m , primesFound = 0;
>
> for( n=0;n < MAX_NUMBERS;n++ )
> if( prime[n] )
> {
>  primesFound++;
>
>  if( primesFound == 10001 )
>   cout << n << " is the 10001st prime." << endl;
>
>  for(m=2*n;m<MAX_NUMBERS;m+=n)
>   prime[m]=false;
> }
>
> Second, my implementation in Haskell:
>
> primes :: [Integer]
> primes = seive [2..]
>    where
>      seive (p:xs) = p : seive (filter (\x -> x `mod` p > 0) xs)
>
> main = print (primes !! 10000)
>
> Well, we know who's winning the beauty contest. But here's the worrying
> thing:
>
>  C program: 0.016 seconds
>  Haskell program: 41 seconds
>
> Eeeps! o_O That's Not Good(tm).
>
> Replacing "primes :: [Integer]" with "primes :: [Word32]" speeds the
> Haskell version up to 18 seconds, which is much faster but still a joke
> compared to C.
>
> Running the Haskell code on a 2.2 GHz AMD Athlon64 X2 instead of a 1.5
> GHz AMD Athlon drops the execution time down to 3 seconds. (So clearly
> the box I was using in my lunch break at work is *seriously* slow...
> presumably the PC running the C code isn't.)
>
> So, now I have a Haskell version that's "only" several hundred times
> slower. Neither program is especially optimised, yet the C version is
> drastically faster. This makes me sad. :-(
>
> By the way... I tried using Data.List.Stream instead. This made the
> program about 40% slower. I also tried both -fasm and -fvia-c. The
> difference was statistically insignificant. (Hey guys, nice work on the
> native codegen!) The difference in compilation speed was fairly drastic
> though, needless to say. (The machine is kinda low on RAM, so trying to
> call GCC makes it thrash like hell. So does linking, actually...)
>
>
> Also, I'm stuck with problem #10. (Find the sum of all primes less than
> 1 million.) I've let the program run for well over 15 minutes, and still
> no answer is forthcomming. It's implemented using the same primes
> function as above, with a simple filter and sum. (The type has to be
> changed to [Word64] to avoid overflowing the result though.) The other
> guy claims to have a C solution that takes 12 ms. There's a hell of a
> difference between 12 ms and over half an hour...(!) Clearly something
> is horribly wrong here. Uh... help?
>

The algorithm you use to compute primes is actually quite inefficient, and
in particular, it is NOT the same algorithm that the C program is using,
despite first appearances!  The call to 'filter' in the sieve function works
by checking *every* number for divisibility by p, and only keeping those
that aren't divisible by p; whereas the C program simply marks multiples of
p as non-prime, skipping over those which are not multiples.  So the Haskell
version basically searches for primes by doing trial division on every
single integer, whereas the C version is actually using a sieve.

For more information you might want to read this paper, which includes a
much better primes implementation:

www.cs.hmc.edu/~*oneill*/papers/*Sieve*-JFP.pdf

In fact, this very same topic was discussed on the list a while back, you
can read it here:

http://thread.gmane.org/gmane.comp.lang.haskell.cafe/19699/focus=19798

-Brent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20071127/417b0612/attachment-0001.htm