[Haskell-cafe] How to think about this? (profiling)

Tue Dec 16 07:07:20 EST 2008

On Mon, Dec 15, 2008 at 11:33 PM, Lemmih <lemmih at gmail.com> wrote:
> 2008/12/16 Magnus Therning <magnus at therning.org>:
>> This behaviour by Haskell seems to go against my intuition, I'm sure I
>> just need an update of my intuition ;-)
>>
>> I wanted to improve on the following little example code:
>>
>>  foo :: Int -> Int
>>  foo 0 = 0
>>  foo 1 = 1
>>  foo 2 = 2
>>  foo n = foo (n - 1) + foo (n - 2) + foo (n - 3)
>>
>> This is obviously going to run into problems for large values of `n` so
>> I introduced a state to keep intermediate results in:
>>
>>  foo :: Int -> State (UArray Int Int) Int
>>  foo 0 = return 0
>>  foo 1 = return 1
>>  foo 2 = return 2
>>  foo n = do
>>      c <- get
>>      if (c ! n) /= -1
>>          then return $ c ! n
>>          else do
>>              r <- liftM3 (\ a b c -> a + b + c)
>>                  (foo $ n - 1) (foo $ n - 2) (foo $ n - 3)
>>              modify (\ s -> s // [(n, r)])
>>              return r
>>
>> Then I added a convenience function and called it like this:
>>
>>  createArray :: Int -> UArray Int Int
>>  createArray n = array (0, n) (zip [0..n] (repeat (-1)))
>>
>>  main = do
>>      (n:_)  <- liftM (map read) getArgs
>>      print $ evalState (foo n) (createArray n)
>>
>> Then I thought that this still looks pretty deeply recursive, but if I
>> call the function for increasing values of `n` then I'll simply build up
>> the state, sort of like doing a for-loop in an imperative language.  I
>> could then end it with a call to `foo n` and be done.  I replaced `main`
>> by:
>>
>>  main = do
>>      (n:_)  <- liftM (map read) getArgs
>>      print $ evalState (mapM_ foo [0..n] >> foo n) (createArray n)
>>
>> Then I started profiling and found out that the latter version both uses
>> more memory and makes far more calls to `foo`.  That's not what I
>> expected!  (I suspect there's something about laziness I'm missing.)
>>
>> Anyway, I ran it with `n=35` and got
>>
>>  foo n : 202,048 bytes , foo entries 100
>>  mapM_ foo [0..n] >> foo n : 236,312 , foo entries 135 + 1
>>
>> How should I think about this in order to predict this behaviour in the
>> future?
>
> Immutable arrays are duplicated every time you write to them. Making
> lots of small updates is going to be /very/ expensive.
> You have the right idea, though. Saving intermediate results is the
> right thing to do but arrays aren't the right way to do it. In this
> case, a lazy list will perform much better.
>
>> ack n = ackList !! n
>>    where ackList = 0:1:2:zipWith3 (\a b c -> a+b+c) ackList (drop 1 ackList) (drop 2 ackList)

Ah, OK that does explain it.

I understand your solution, but AFAICS it's geared towards limited
recursion in a sense.  What if I want to use memoization to speed up
something like this

  foo :: Int -> Int
  foo 0 = 0
  foo 1 = 1
  foo 2 = 2
  foo n = sum [foo i | i <- [0..n - 1]]

That is, where each value depends on _all_ preceding values.  AFAIK
list access is linear, is there a type that is a more suitable state
for this changed problem?

I realise this particular function can be written using scanl:

  foo :: Int -> Int
  foo n = ackList !! n
      where
          ackList = 0:1:2:(drop 2 $ scanl1 (+) ackList)

but I guess it's not always that easy to construct a solution based on scanl.

Cheers,
M

-- 
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus＠therning．org          Jabber: magnus＠therning．org
http://therning.org/magnus         identi.ca|twitter: magthe