Alexis King lexi.lambda at gmail.com
Tue Sep 17 05:30:43 UTC 2019

```Disclaimer: I am not an expert on fusion.

works, since that makes it easier to develop an intuition for what
will/won’t get fused. Here’s a crash course: fusion is implemented via *rewrite
rules
which are user-defined code transformation rules applied by the GHC
optimizer. The essence of list fusion is a single rewrite rule defined in
the standard library, which rewrites all expressions of the shape
foldr k z (build g) to g k z, where build is a function exported by GHC.Exts
with the following simple definition:

build g = g (:) []

How do you get fusion from that? Here’s the trick: fusable functions that
consume lists are implemented using foldr, while ones that build lists are
implemented using build. For example, sum could be implemented using foldr
like this:

sum = foldr (+) 0

…and map (which both consumes and builds) could be expressed like this:

map f xs = build (\cons nil -> foldr (\x ys -> f x `cons` ys) nil xs)

Now if you write sum (map f xs), and sum and map are both inlined, you’ll
get

foldr (+) 0 (build (\cons nil -> foldr (\x ys -> f x `cons` ys) nil xs))

which matches the rewrite rule from above and gets rewritten into this:

foldr (\x ys -> f x + ys) 0 xs

Like magic, the intermediate list has disappeared!

The details are a little more complicated than that in practice, but that’s
the core idea. Keeping that context in mind, let me answer some of your
questions directly.

On Sep 16, 2019, at 21:44, Hilco Wijbenga <hilco.wijbenga at gmail.com> wrote:
> [To start off, I want to state my, possibly incorrect, understanding
> of fusion and how it does not (in my expectation) apply to Vector,
> Set, et cetera. A List can essentially disappear as it's replaced by a
> loop but a Vector would not: the executing code would create and
> garbage collect multiple intermediate Vectors before finally returning
> the end result.]

Since fusion is implemented via rewrite rules, which can be defined by
anyone, operations on library-defined datatypes *can* be fused if the
library author provides the necessary rewrite rules. Other datatypes use
different fusion strategies than the foldr/build fusion mentioned above,
but the core ideas are similar.

For the specific two types you mentioned, Vector operations *do* happen to
be fused, while Set operations are not. Set operations are hard to fuse
because duplicates need to be removed from the stream, and determining if
an element is already in the set fundamentally requires a data structure.
You could, however, always convert a set to a list, transform the list with
a sequence of fusable operations, and turn it back into a set.

Assume that I need my algorithm to go from initial input to
> IntermediateResult to subsequent IntermediateResult (a few times) to
> EndResult. In my case, each subsequent IntermediateResult is a bit
> smaller than the previous one but that's probably irrelevant.
>

Should I prefer IntermediateResult to be lazy? Should I use [] instead
> of Vector in the IntermediateResult? What about the functions that
> actually operate on IntermediateResult, should I prefer to use [] or
> Vector there? I'm currently able to use Data.Vector.concatMap in some
> places, is that just as optimized?

Remember that fusion is based on rewrite rules, which are fundamentally
transformations on the *code*. Therefore, what matters most is the
structure of the code you have written, not what values are flowing around
your program at runtime. If a list produced by build ends up being passed
to foldr, but GHC couldn’t inline definitions enough to see the *syntactic*
pattern foldr k z (build g), then fusion can’t happen.

Therefore, if you want fusion to happen, make sure that you use standard
list operations to construct or consume lists (e.g. anything in Prelude or
Data.List), not manually-written recursive functions that pattern-match on
lists (GHC doesn’t know how to fuse those). Make sure GHC is able to inline
things enough it will be able to discover chains of fusable operations. If
you’re really worried about it, learn to read the GHC Core that can be
dumped by the optimizer; this blog post