[Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)

Mon Jul 23 19:44:32 EDT 2007

Dave Bayer wrote:
> Here is another prime sieve.

It's great to know people are still having fun with this stuff...   
I've added your implementation to the zipfile available at

    http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip

(FWIW, I added specializations for Int and Integer and also  
primesToNth and primesToLimit).

> It is about half the length of the fastest contributed code  
> (ONeillPrimes)

I think you must be comparing it against an older version of my code  
(the zip file above now contains the most up to date version).  That  
older version actually contained two slightly different copies of the  
algorithm.  The more recent version doesn't.

FWIW, here are the statistics I see for lines of code (ignoring  
comments, blank lines, and compiler directives):

ONeillPrimes: 91 lines, 750 words, 4111 chars, 75628 bytes in .o file
BayerPrimes:  72 lines, 604 words, 2649 chars, 74420 bytes in .o file

So, I'd say the difference is at best 25% in source size and 2% in  
final binary size.

But in reality, a big chunk of my code is a general purpose heap/ 
priority-queue implementation.  If priority queue operations were  
provided as a standard part of Haskell in the same way that lists and  
BSTs are, the statistics would be:

ONeillPrimes: 47 lines, 331 words, 2039 chars

> and nearly as fast

Your results are great!  It actually beats early versions of my  
method, before I made a couple of tweaks to improve performance.

But for the current version of my code, there is still a bit of a  
performance gap between our two methods.  Here are the stats I get  
(ghc -O3, 2.4GHz x86):

                 1*10^6 | 2*10^6 | 3*10^6 | 4*10^6 | 5*10^6 | 6*10^6
                 -------+--------+--------+--------+--------+---------
   ONeillPrimes    1.36 |   3.08 |   4.98 |   6.98 |   9.05 |  11.21
   ONeillPrimes*   1.35 |   3.07 |   4.94 |   6.93 |   8.99 |  11.14
   BayerPrimes     2.18 |   4.49 |   8.99 |  11.18 |  16.60 |  25.77

The "*" version is one that uses ``calcPrimes()'' rather than  
``primes'' to provide its list of primes, and thereby avoids needless  
remembering of the whole list of primes and all the memory that entails.

>  until it blows up on garbage collection:

I think that is the biggest issue with many of the other prime  
generators.  At a glance (just looking at RSS with Unix's top  
command), your space usage seems like its about double mine.  And  
when I use ``calcPrimes()'' rather than ``primes'' I barely need any  
space at all (O(sqrt(N)) at the point where I've calculated N  
primes.  The difference there is striking -- a couple of MB vs hundreds.

Anyway, fun stuff...

     Melissa.