[Haskell-cafe] Re: FASTER primes

Will Ness will_n48 at yahoo.com
Wed Dec 30 04:57:40 EST 2009


Daniel Fischer <daniel.is.fischer <at> web.de> writes:

> 
> 
> Am Mittwoch 30 Dezember 2009 01:04:34 schrieb Will Ness:
> >
> > > While I haven't detected that with the primes code, I find that in my
> > > ghci your code is approximately 2.5 times faster than ONeill or Bayer
> > > when interpreted (no difference in scaling observed), while when 
> > > compiled with -O2, ONeill is approximately three times as fast as 
> > > your code
> > 
> > that was what I was getting at first too, before I've put into my code the
> > _type_signatures_ and the "specialize" _pragmas_ as per her file. Then it
> > was only 1.3x slower, when compiled (with about same asymptotics and 
> > memory usage).
> >
> 
> Specialising your code to Int makes it half as fast as ONeill here 
> (as an executable).
> That is largely due to the fact that your code uses much more memory here 
> (54MB vs. 2MB for the millionth prime), though, the MUT times have a ratio 
> of about 1.5.

I'm an unsophisticated tester. I just use

 GHC -O2 -c filename.hs
 GHCi filename

and then it says ( for primes()!!1000000 )

(8.24 secs, 1906813836 bytes) for my code, and   
(6.09 secs, 1800873864 bytes) for O'Neill's      

But now when I've looked at system resources I see this too. Well, it means
we've found where the PQ code is better. OK.


> Now an interesting question is, why does it use so much memory here?
> Can you send me your exact code so I can see how that behaves here?

will do. It's probably doing a lot more bookkeeping. Or it might be some impl
issue with scanl or span etc., and it'll go away if we'd recode it directly, 
who knows. We can only guess if we don't know the compiler in and out. That's
exactly what kept me off using the compiler. Guessing. Sheesh.

(and I did see a 2.0x speedup once when replacing one simple code snippet for 
its operationally equivalent twin).


> 
> > >and twice as fast as Bayer as an executable, about twice as fast as your
> > > code and slightly slower than Bayer in ghci.
> >
> > see, this kind of inconsistencies is exactly why I was concentrating only
> > on one platform in measuring the speed - the interp'/GHCi combination.
> 
> The problem with that is that one is primarily interested in speed for
> library functions, which are mostly used as compiled code.

right and I used it as a measure of code's "fitness" to the problem so it was
only comparative to me.

> 
> > Especially when developing and trying out several approaches, to test with
> > compiler just takes too long. :) And why should it give (sometimes) wildly
> > different readings when running inside GHCi or standalone ??
> 
> Good question.
> 
> >
> > > And I have huge memory problems in ghci with your code.
> > > That may be due to my implementation of merge and minus, though. You
> > > wrote 'standard' and I coded the straightforward methods.
> >
> > Here's what I'm using ...
> 
> More or less the same that I wrote.

The rest is exactly as I've posted it, I've only added type signatures and
specialize pragmas, as per Melissa's code,


{-# SPECIALIZE primes :: () -> [Int] #-}
{-# SPECIALIZE primes :: () -> [Integer] #-}
primes :: Integral a => () -> [a]
primes () = 2:3:5:7:primes'
   where
    primes' = [11,13] ++ drop 2 (rollFrom 11) `minus` comps
    (comps,_)  = tfold mergeSP (pairwise mergeSP mults)
    mults   = map (\p-> fromList $ map (p*) $ rollFrom p) $ primes'


maybe it's about memoization of primes'. She writes something about it in 
her code. 


> >
> > It's just that the mutating code tends to be convoluted, like in the
> > example I mentioned of quicksort. One has to read the C code with good
> > attention to understand it.
> 
> Convoluted is (often) an exaggeration. But I agree that the specification 
> of 'what' is usually easier to understand than that of 'how'.

well put.

> 
> > "Normal" Haskell is much more visually apparent, like
> >
> >   primes = 2: 3: sieve (tail primes) [5,7..]
> >    where
> >     sieve (p:ps) xs = h ++ sieve ps (t `minus` tail [q,q+2*p..])
> >                       where (h,~(_:t)) = span (< q) xs
> >                             q          = p*p
> >
> 
> Yes.
> 
> > or
> >
> >   primes = 2: 3: sieve [] (tail primes) 5
> >    where
> >     sieve fs (p:ps) x = [i | i<- [x,x+2..q-2], a!i]
> >                           ++ sieve ((2*p,q):fs') ps (q+2)
> >      where
> >       q           = p*p
> >       mults       = [ [y+s,y+2*s..q] | (s,y)<- fs]
> >       fs'         = [ (s,last ms)    | ((s,_),ms)<- zip fs mults]
> >       a           = accumArray (\a b->False) True (x,q-2)
> >                              [(i,()) | ms<- mults, i<- ms]
> >
> 
> Umm, really?
> I'd think if you see what that does, you won't have difficulties with a
> mutable array sieve.

You're right, bad example. :)


> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe <at> haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 






More information about the Haskell-Cafe mailing list