[Haskell-cafe] Re: FASTER primes

Mon Jan 4 13:16:32 EST 2010

Daniel Fischer <daniel.is.fischer <at> web.de> writes:

> 
> 
> Am Montag 04 Januar 2010 13:25:47 schrieb Will Ness:
> 
> > Euler's sieve is
> >
> >  sieve (p:ps) xs = h ++ sieve ps (t `minus` map (p*) [p,p+2..])
> >                       where (h,t) = span (< p*p) xs
> 
> Not quite. That's a variant of the postponed filters, it crosses off e.g.
> 45 twice, once as 3*15 and once as 5*9 (okay, it has already been removed by
> the first, so let's say 45 appears in two lists of numbers to be removed if
> present).

there won't be any such. whatever is removed first, is just skipped second (and 
third, etc). 45 does appear twice on the two multiples ists (3- and 5-). But it 
is "removed" by first, and skipped by second. And there's no removal here in 
the first place. There is no pre-filled storage. All there is, is some lists 
comparison, and lists are just generators (I so keep hoping).

I don't see any problem here. As Melissa (and yourself, I think) have shown, 
double hits on multiples are few and far between.

Also, it uses no filters, i.e. no individual number divisibility testing. 
The "filters" refers first of all to testing an individual number to decide 
whether to keep it in or not. Euler's sieve removes multiples in advance, so 
there's no testing and no filtering, only comparison. It follows the 
_Postponed_ Filter's framework in postponing the action until the right moment; 
the action itself is two lists comparison and skipping of the equals (i.e. 
the "minus" action).

> Euler's sieve is never attempting to remove a number more than once, that's

How's that possible? On wikipedia is says, it removes multiples of 3; then 
multiples of 5; etc. So it just looks for each number it is removing, if it is 
there, and if so, removes it. I would argue that looking whether to remove or 
not, is part of attempting to remove. It can't have foresight, right?

> the outstanding feature of it. Unfortunately, that also makes it hard to
> implement efficiently. The problem is that you can't forget the primes
> between p and p^2, you must remember them for the removal of multiples of p
> later on.

you're right, every formulation of these algorithms is always done in the 
imperative, mutable-storage setting. They always speak of "removing" numbers 
etc. 

The more pressing problem with that code is its linear structure of course 
which gets addressed by the tree-folding merge etc.

>  An (inefficient but faithful) implementation would be
> 
> euler ks@(p:rs) = p : euler (rs `minus` map (*p) ks)

I think it does exactly the same thing, computationally, except it does so, 
again,  _prematurely_  exactly in Turner's sieve's fashion - for each prime 
found, _as_ _soon_ as it is found. If it is faithful, so is my code. :)

> Its performance is really horrible though. 

makes sense, as for Turner's. See, the _when_ thing really helps to see what's 
going on here. 

> A much better implementation is
> 
> primes = 2:3:euler hprs [] (scanl (+) 5 (cycle [2,4]))
>   where
>     hprs = 5:drop 3 primes
>     euler (p:ps) acc cs = h ++ euler ps (tail (acc ++ h)) (t `minus` comps)
>       where
>         (h,t) = span (< p*p) cs
>         comps = map (*p) (acc ++ cs)

this look like {2,3} wheel reimplemented and inlined. No point in improving 
anything until the linear structure isn't turned into a tree. 

> > >
> > > To be fair, she writes:
> > >
> > > ... (This optimization does not aﬀect the time complexity of the sieve,
> > > however, so its absence from _the_ _code_ in Section 1
> > > is _not_ our cause for worry.)"
> >
> > A-HA!
> >
> > But its absense from _*that*_ _*code*_ WAS the *major* cause for worry, as
> > dealing with it worked wonders on its complexity and constant factors.
> 
> I think you're overinterpreting it. 

I might have. I don't mean to neatpick, honest. It's just how it looked to 
me: "Turner's? - bad. Squares? - won't help, it's a _minor_ thing; don't even 
go there!" etc. 

As I've said, geniuses leap, and fly. I just wanted to _walk_ down that road, 
not skipping any steps. And it turned out, the very first step _really_ pays up 
_big_. So big in fact, it might be argued that any subsequent improvement is 
just an advanced optimization, in terms of presenting an introductory code 
which isn't horribly inefficient and yet is short and clear.

> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe <at> haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>