Deforestration harmful?

Simon Marlow
Wed, 8 May 2002 16:45:24 +0100

> I will be a bit more precise. Let f and g have type:
>    g :: Seed -> [Tree]
>    f :: Tree -> [Expr]
> where Tree1 is a tip-valued binary tree, and Tree2 is also a
> binary tree but the forks are arithmetic operators:
>    data Tree = Tip Int | Bin Tree1 Tree1
>    data Expr = Val Int | Add Tree2 Tree2 | Sub Tree2 Tree2 | Mul ....

These don't make sense.  Perhaps for Tree1 you mean Tree, and Tree2 you mean Expr?

> The function g is defined like:
>      g seed = [ z | (s1, s2) <- split seed,
>                      x <- g s1, y <- g s2, z <- Bin x y ]

There's a type error in that definition: 'Bin x y' is not a list.

> The function f converts a Tree to many Exprs by replacing
> Bin with one of Add, Sub, Mul or Div:
>      f (Tip i) = [Val i]
>      f (Bin x y) = [op x' y' | x' <- f x, y' <- f y, op <- join x y]
> The program before fusion looks like:
>      concat . map f . g
> while the fused program simply:
>      fg seed = [ z | (s1, s2) <- split seed,
>                       x <- fg s1, y <- fg s2, z <- join x y ]
>      join x y = [op x y | op <- [Add, Sub, ... ]]
> And the surprise was the fused program ran slower than the
> naive one before fusion.
> The time profiling showed that, in one run, the naive
> program itself took 2.00 sec to finish, with 9% extra GC time. The
> fused program itself, on the other hand, took only 1.84 sec but
> spent an extra 67% of time on GC (the heap size is the default
> 64M).
> Considering their heap profiles, both of them look like several
> steep, slanted triangles standing next to each other. The highest peak
> in the heap profile of the naive program reaches only 50K of memory,
> while that for the fused program reaches as much as 1500K. In
> the former case, the heap is mostly occupied by chunks created
> by g. In the latter case, the heap is mostly occupied by join.
> > Furthermore, deforestation itself does not reduce runtime very much.
> Indeed. Looks like it helps to speed up the mutator a bit but
> end up spending more time on GC, due to increased heap demand.

I think this is a misleading generalisation - in most cases deforestation doesn't affect the heap residency behaviour of the program.  As Olaf said, deforestation tends to result in small constant factor speedups, but can be particularly beneficial if it enables further optimisations to take place.

In your case it looks like the transformation you made by hand had the side-effect of introducing a space leak, but it's quite hard to tell from the code you've shown us. (space leaks are particularly hard to identify by eye, using the profiling tools is the only way to really understand what's happening).

> Monday, May 6, 2002, 12:02  PM, Simon Marlow wrote:
>  > It is entirely plausible that as you increase the heap 
> size the mutator
>  > time increases: this is because you get fewer cache hits 
> with a larger
>  > heap.  GHC starts off with a small heap size so as to try 
> to make the
>  > most of the cache.  However, on most examples we've seen 
> the benefits of
>  > reducing garbage collections by increasing the heap size tend to
>  > outweigh the benefits from staying in the cache.
>  > Did you try heap profiling to discover what structure was 
> filling up
>  > the heap?
> Thanks for the explanation. :)
> More info. about the heap was given above. By the way, could there be
> situations where increasing the heap size also increases the GC time?

Yes, again because of increased cache misses.