[Haskell-cafe] Is fusion overrated?

Dan Doel dan.doel at gmail.com
Wed May 18 18:13:35 CEST 2011


On Wed, May 18, 2011 at 2:04 AM, Ryan Ingram <ryani.spam at gmail.com> wrote:
> Yes, the goal isn't so much to improve complexity (both are O(1)) but to
> reduce the constant factor on that O(1).
>
> In an inner loop like that, allocator/gc calls by far dominate the cost of
> the program.  If you can remove them, you've improved the performance of the
> program by 10-100x.

Yes, this is important. Fusion is an obvious win on strict structures,
because it can make the space usage asymptotically better. However,
even if this doesn't happen for lazy structures, fusion can save a lot
of _time_. It won't make the time complexity asymptotically better,
but it can save an amount of work proportional to the complexity of
the algorithm.
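
To make the strict case concrete, here is a small sketch using the
vector library (uvector's successor rather than uvector itself, but
the same fusion story):

    import qualified Data.Vector.Unboxed as U

    -- Unfused, the map would materialize a full n-element intermediate
    -- vector (O(n) space); vector's stream fusion collapses the whole
    -- expression into a single accumulating loop (O(1) space).
    sumDoubled :: Int -> Int
    sumDoubled n = U.sum (U.map (* 2) (U.enumFromTo 1 n))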

For instance, I was playing around with uvector a while back, and
needed foldr or something similar that wasn't available. So I wrote
something like:

    foldr f z s = if null s then z else f (head s) (foldr f z (tail s))

This ran in constant space, but tail re-unfolds the entire stream
every time, so the function as a whole is O(n^2). Reaching the nth
element chugs through n allocations-followed-by-deallocations before
that element becomes usable: still constant space, but linear time
for that single element.
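
To see where the quadratic term comes from, here is a hypothetical
stream in the stream-fusion style (the Step/Stream/headS/tailS names
are illustrative, not uvector's actual API): tail cannot advance the
hidden state directly, it can only wrap the stream in another layer
that skips the first element, so the element behind k tails is
re-boxed and re-matched k times on its way out.

    {-# LANGUAGE ExistentialQuantification #-}

    -- One step of an unfold: either finished, a state change that
    -- produces no element, or an element plus a new state.
    data Step s a = Done | Skip s | Yield a s

    -- A stream is an unfold function together with an initial state.
    data Stream a = forall s. Stream (s -> Step s a) s

    -- tailS adds a layer that drops the first Yield. Every later step
    -- is then re-wrapped in a fresh (state, flag) pair; that is the
    -- allocation-followed-by-deallocation described above.
    tailS :: Stream a -> Stream a
    tailS (Stream next s0) = Stream next' (s0, True)
      where
        next' (s, True)  = case next s of   -- still skipping element 0
          Done       -> Done
          Skip s'    -> Skip (s', True)
          Yield _ s' -> Skip (s', False)
        next' (s, False) = case next s of   -- past element 0: pass through
          Done       -> Done
          Skip s'    -> Skip (s', False)
          Yield x s' -> Yield x (s', False)

    -- headS runs the unfold until the first Yield.
    headS :: Stream a -> a
    headS (Stream next s0) = go s0
      where
        go s = case next s of
          Done      -> error "headS: empty stream"
          Skip s'   -> go s'
          Yield x _ -> x

Under the foldr above, the nth element sits behind n layers of tailS,
so extracting it costs n wrap-and-unwrap steps: constant space, but
quadratic total time.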

Fusion won't save you in this example (to my knowledge). But if you
compose k list-to-list functions, every element that makes it all the
way through the pipeline goes through k
allocations-followed-by-deallocations. You'll still see O(1) space
usage, but your running time picks up a c*k*n term simply because the
algorithm is expressed as a composition pipeline, where c is the cost
of the unnecessary boxing. Fusion eliminates this term.
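
As a plain-lists sketch of that term disappearing (GHC's foldr/build
fusion rather than uvector's stream fusion, but the same idea): the
staged pipeline below pays one boxing per stage per surviving
element, while the fused version is, morally, the single loop the
rewrite rules produce.

    -- Three stages: each stage allocates a fresh cons cell for every
    -- element it passes along, contributing one of the k allocations
    -- per element.
    pipeline :: [Int] -> [Int]
    pipeline = map (+ 1) . map (* 2) . filter even

    -- Morally what fusion rewrites the pipeline to: one traversal,
    -- one cons cell per surviving element, no intermediate lists.
    pipelineFused :: [Int] -> [Int]
    pipelineFused = foldr step []
      where
        step x acc
          | even x    = x * 2 + 1 : acc
          | otherwise = acc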

-- Dan


