[Haskell-cafe] Re: Optimization fun

apfelmus at quantentunnel.de apfelmus at quantentunnel.de
Sat Feb 10 17:00:27 EST 2007


Creighton Hogg wrote:
> Hello Haskell-ers,
> So a friend and I were thinking about making code faster in Haskell, and I
> was wondering if there was a way to improve the following method of
> generating the list of all prime numbers.  It takes about 13 seconds to
> run, meanwhile my friend's C version took 0.1.
> I'd love to learn a bit more about how to optimize Haskell code.

Of course, the best optimization is a better algorithm. In case this is
what you're after, have a look at

   Colin Runciman, Lazy Wheel Sieves and Spirals of Primes
   http://citeseer.ist.psu.edu/runciman97lazy.html

While Haskell makes it possible to express very complicated algorithms
in simple and elegant ways, you have to expect to pay a constant factor
(roughly 2x-10x) when competing against the same algorithm in low-level C.

> -- Naive way to calculate prime numbers, testing each new n to see if it
> has
> prime factors less than sqrt(n).
> import Data.List
> primes = 2:(foldr (\x y -> if isPrime x then x:y else y) [] [3..])
>    where isPrime x = foldl' (\z y -> z && (if x `mod` y == 0 then False
> else True)) True (take (floor $ sqrt $ fromIntegral x) primes)

Your code has two glitches and a serious flaw: foldl' is strict but not
fast, use foldr instead. Premature strictness is the root of all evil :)

To see what went wrong, I take the freedom to rewrite the code as

  primes    = 2 : filter isPrime [3..]
  isPrime x = all' (\p -> x `mod` p /= 0) . take sqrtn $ primes
    where sqrtn = floor $ sqrt $ fromIntegral n
  all' prop = foldl' (\z y -> z && prop y) True


The first and most minor glitch is the missing type signature. Every
Haskell compiler will default your integers to arbitrary precision Integers:

  > :t primes
  [Integer]

I doubt that your C friend uses arbitrary precision arithmetic, so you'd
better write down

  primes  :: [Int]
  isPrime :: Int -> Bool


The second glitch is that you 'take sqrtn primes'. This is not the same
as 'takeWhile (<= sqrtn) primes', i.e. taking primes as long as they are
smaller than the square root of n. I guess you know that this results in
far fewer primes taken.


The glitches may have been unintentional, but the flaw intentionally
degrades performance: you should not use a strict all' but the lazy

  all prop = foldr (\y z -> prop y && z) True

from the Prelude. The point is that the lazy version stops inspecting
the elements of the remaining list whenever (prop y) turns False for the
first time. This is because && is lazy:

  False && x = False

for whatever x we supply. For example, take the list

  [True, False, True, True] ++ replicate 100 True

Here, all returns False after inspecting the first two elements while
all' inspects every one of the 104 list elements just to return False
afterwards. As every second number is even, your all' is busy wasting
time like crazy.


Regards,
apfelmus



More information about the Haskell-Cafe mailing list