speedup help

Mark P Jones mpj@cse.ogi.edu
Mon, 3 Mar 2003 20:46:22 -0800

Hi Damien,

| So, I'm having to calculate 'n choose k' an awful lot.  At 
| the moment I've got this:
| comb :: Integer -> Integer -> Integer
| comb m 0 = 1
| comb m n = (numerator(toRational (fact m) / toRational (fact 
| n * fact (m-n))))
| where fact is a memoized factorial function.  It's not 
| perfectly memoized, though; I use lists, since that's easier 
| by default.  They should be arrays, and possibly just 
| changing that would speed comb up a lot.  (Comb is currently 
| 40% of runtime, fact is 23%.)  But I think it should be 
| possible to speed up comb itself, too.

If you're looking to calculate memoized n choose k, then I'd suggest
looking at the following.  First, where do the n choose k numbers come
from?  Forget factorial!  Think Pascal's triangle!

 pascal :: [[Integer]]
 pascal  = iterate (\row -> zipWith (+) ([0] ++ row) (row ++ [0])) [1]

Now we can define a variant of your comb function like this:

 comb    :: Int -> Int -> Integer
 comb n m = pascal !! n !! m

(Add an extra line if you want comb to work for values outside the
range:  0 <= m <= n.  You could also replace the rows of the triangle
with arrays, if you want.  No factorials, multiplies, or divides here,
and natural memoization ...)

| comb is only called from here:
| sumbn n = sum [ bernoulli i * fromIntegral(comb (n+1) i) | i 
| <- [0 .. n-1] ]

In that case, you can take further advantage of using Pascal's triangle
by recognizing that numbers of the form (comb (n+1) i) are just the
entries in the (n+1)th row.  (All but the last two, for reasons I
don't understand ... did you possibly want [0..n+1]?)  So we get the
following definition:

  sumbn n = sum [ bernoulli i * fromIntegral c
                | (i,c) <- zip [0..n-1] (pascal!!(n+1)) ]

Actually, I prefer the following version that introduces an explicit
dot product operator:

  sumbn n = take n (map bernoulli [0..]) `dot` (pascal!!(n+1))

  dot xs ys = sum (zipWith (*) xs ys)

I don't have your setup to test the performance of these definitions,
but I'd be curious to know how they fare.  Even if they turn out to be
slower, I thought these definitions were interesting and different
enough to justify sharing ...

All the best,