[Haskell-cafe] Fusion & Laziness

Tue Sep 17 02:44:27 UTC 2019

Hi all,

I'd like to have a better understanding of fusion and (maybe?) laziness.

Let's say I have an (exported) type "data EndResult = ... !(Vector
Thing) ..." and an intermediate, unexported type "data
IntermediateResult = ... !(Vector Thing) ...". (I suppose it doesn't
need to be Vector, it could be Set, or some other data structure that
(I imagine) is relatively expensive to map over, unlike, say [].)

[To start off, I want to state my, possibly incorrect, understanding
of fusion and how it does not (in my expectation) apply to Vector,
Set, et cetera. A List can essentially disappear as it's replaced by a
loop but a Vector would not: the executing code would create and
garbage collect multiple intermediate Vectors before finally returning
the end result.]

Assume that I need my algorithm to go from initial input to
IntermediateResult to subsequent IntermediateResult (a few times) to
EndResult. In my case, each subsequent IntermediateResult is a bit
smaller than the previous one but that's probably irrelevant.

Should I prefer IntermediateResult to be lazy? Should I use [] instead
of Vector in the IntermediateResult? What about the functions that
actually operate on IntermediateResult, should I prefer to use [] or
Vector there? I'm currently able to use Data.Vector.concatMap in some
places, is that just as optimized?

I realize that the correct answer more than likely is "don't worry
about it". And that I'm being very vague. :-) I'm not looking for
anything definitive, I'm just hoping to improve my understanding and
intuition.

What should I consider when thinking about these types of things? If I
don't want to create two separate implementations and profile them,
are there "obvious" signs one way or another?

Cheers,
Hilco