[Haskell-beginners] Haskell Bin Tree Cellular Automata.

Stefan Höck efasckenoth at gmail.com
Sat Jun 6 04:01:59 UTC 2015


On Thu, Jun 04, 2015 at 04:31:07PM -0700, Sourabh wrote:
> Wow, it runs in 0.44 seconds now. This is incredible! Can you explain what
> just happened here?

Here is a try at an explanation. To reduce the complexity involved with
creating automata trees, lets define a very simple function:

  module Memo () where

  calc :: Integer -> Integer
  calc = (1+)
 
And now we build a list of integers in a similar fashion to your
tree-generating function:

  ints :: Integer -> [Integer]
  ints i = i : [calc (ints i !! (n-1)) | n <- [1..]]

Store this in a file, fire up ghci, load the file and enter

  take 500 $ ints 1

You can literally watch things slowing down as new values are being
printed. As you said correctly, at every call to `ints` the
whole list is recalculated up to the value you need. Since
`ints` is called with a new parameter at every iteration, Haskell
has no way of memoizing anything.

Let's add some memoization:

  ints2 :: Integer -> [Integer]
  ints2 i = let is = i : [calc (is !! (n-1)) | n <- [1..]]
            in  is

Again, load this in ghci and run

  take 500 $ ints2 1

Marvel at the difference. `is` has type [Integer] and as such is
a constant. Already calculated values of `is` will therefore be
memoized, so the whole list is only created once. Still, this is
far from ideal. Enter

  ints2 1 !! 50000

and wait some more. The problem is that the algorithm still
has O(n^2) complexity, n being the length of the list you need to
create. At every iteration, `is !! (n-1)` accesses the (n-1)th
element of `is` which takes itself n-1 operations. So, to
generate a list of ten elements, the algorithm runs

  is !! 1
  is !! 2
  .
  .
  .
  is !! 9
 
to a total of 45 operations. For small list sizes, this makes hardly
a difference, but for a list size of 50'000, this means already
1'249'975'000 operations doing nothing but traversing a list.

Let's improve this further:

  ints3 :: Integer -> [Integer]
  ints3 i = i : ints3 (calc i)

Here we use Haskell's lazyness to great effect. This function
says: our list of ints is the initial value, prepended to the list
of ints starting at the next value.

  take 500 $ ints3 1

and

  ints3 1 !! 50000

are both very fast. Every item in the list is calculated exactly once
and the algorithm runs in linear time.
Now you will only run into trouble at very
large indices (above 10'000'000 on my computer). `ints3` is not
tail recursive and will produce a stack overflow for very
large lists. 

When you look at the source code of `iterate`
you will see that it is just a generalized version of `ints3`
taking an additional higher order function as input.

I hope this sheds some light

Stefan
  


More information about the Beginners mailing list