[Haskell-cafe] speeding up fibonacci with memoizing

ajb at spamcop.net ajb at spamcop.net
Sun Feb 18 18:15:25 EST 2007

```G'day all.

On 2/18/07, Yitzchak Gale <gale at sefer.org> wrote:

> Besides memoizing, you might want to use the fact
> that:
>
> fib (2*k) == (fib (k+1))^2 - (fib (k-1))^2
> fib (2*k-1) == (fib k)^2 + (fib (k-1))^2

Quoting Felipe Almeida Lessa <felipe.lessa at gmail.com>:

> Implementation details:
> -----------------------------------------
> another_fibs = 0 : 1 : 1 : map f [3..]
>     where
>       square x = x * x
>       sqfib = square . another_fib
>       f n | even n = sqfib (k+1) - sqfib (k-1) where k = n `div` 2
>       f n          = sqfib k + sqfib (k-1) where k = (n + 1) `div` 2
> another_fib = (!!) another_fibs
> -----------------------------------------

First off, your memo structure is a linked list, which takes O(n) time
to access the nth element.  The first call takes O(n) time, the second
takes O(n/2) time, the third takes O(n/4) time etc etc, so in total, it's
O(n).

That's the same complexity as the naive iterative algorithm:

fib n = fib' 0 1 !! n
where fib' a b = a : fib' b (a+b)

Secondly, the memo data structure here leaks memory.  If you need
fib 2000000 once and only low Fibonacci numbers after that, you
keep the data structure up to size 2000000, even though you don't
need it.  Now that's fine for a benchmark, but you should never do
this in a library.

Taking one and two together, it seems that it would be better to
use an array.  Let's try that:

fib :: Integer -> Integer
fib n
| n < 0 = error "fib"  -- A slight lie, fixed below.
| n < fromIntegral memoSize = memoTable ! fromIntegral n
| even n = let n2 = n `div` 2
a = fib (n2+1)
b = fib (n2-1)
in a*a - b*b
| otherwise = let n2 = (n+1) `div` 2
a = fib n2
b = fib (n2-1)
in a*a + b*b
where
memoSize :: Int
memoSize = 10000

memoTable
= array (0,memoSize-1) (take memoSize (fibs 0 1))
where
fibs a b = a : fibs b (a+b)

That keepe the memory leak under control, but there's another problem.
If n >= memoSize, then there will be two recursive calls spawned,
which will spawn two recursive calls...  The complexity is O(2^n),
which is  exactly the same problem as the naive recursive Fibonacci
implementation.  We've effectively just made the constant factor
extremely low by optimising a huge number of base cases.

Now consider the recursive calls.  The even case needs fib (n2-1) and
fib (n2+1), and the odd case needs fib n2 and fib (n2-1).  If we have
a = fib n2 and b = fib (n2-1), we can trivially compute fib (n2+1);
it's just a+b.  So let's just modify the recursive call to return two
adjacent Fibonacci numbers.  That way, we only have one recursive call,
and the overall complexity should be O(log n).

We want something like this:

fib :: Integer -> Integer
fib n
| n < 0  -- Because fib (n+1) = fib n + fib (n-1), we can extend
-- Fibonacci numbers below 0.
= let n' = -n
in if even n then -fib n' else fib n'
| otherwise
= fst (fib' n)

fib' :: Integer -> (Integer,Integer)

The base cases:

fib' n
| n < fromIntegral memoSize = memoTable ! fromIntegral n
where
memoSize :: Int
memoSize = 10000

memoTable
= listArray (0,memoSize-1) (take memoSize (fibs 0 1))
where
fibs a b = (a,b) : fibs b (a+b)

The array contains the pairs (0,1), (1,1), (1,2), (2,3), (3,5) etc.
If you think about it, this is redundant; we're holding twice the
number of Integers that we need to.  So let's optimise that a bit:

fib' n
| q < fromIntegral memoSize
= case memoTable ! fromIntegral q of
p@(a,b) | r == 0    -> p
| otherwise -> (b, a+b)
where
(q,r) = n `divMod` 2

memoSize :: Int
memoSize = 10000

memoTable
= listArray (0,memoSize-1) (take memoSize (fibs 0 1))
where
fibs a b = (a,b) : let ab = a+b in fibs ab (ab+b)

Finally, the recursive case.  A little arithmetic gives:

fib' n
| q < fromIntegral memoSize
= {- as before -}
| r == 0
= let (a,b) = fib' (q-1)
c = a+b
c2 = c*c
in (c2 - a*a, c2 + b*b)
| otherwise
= let (a,b) = fib' q
c = a+b
a2 = a*a
in (b*b + a2, c*c - a2)
where
{- as before -}

Now that's an industrial-strength Fibonacci.  It's O(log n) not
including the cost of adding and multiplying large Integers, and
uses a bounded amount of memory between calls, making it suitable
for a library.

The slowest part of the test program is actually the bit that prints
the number.  So I used this driver program:

main :: IO ()
main = do
(n:_) <- getArgs
putStrLn "another_fib"
putStrLn (another_fib (read n) `seq` "done")

I also used a slow machine so we can see the difference.  Your version:

% time ./fibtest 200000
another_fib
done

real    0m1.173s
user    0m1.041s
sys     0m0.121s

And mine:

% time ./fibtest 200000
fib
done

real    0m0.312s
user    0m0.289s
sys     0m0.021s

> Conclusion: it's often better to improve the algorithm than the
> implementation =).

And it's even better to do both.

Cheers,
Andrew Bromage
```