[Haskell-cafe] Fusion & Laziness

Tue Sep 17 08:02:23 UTC 2019

Hi Hilco,

To guarantee no intermediate Vectors are ever allocated, you can use the
"Bundle" type in
https://hackage.haskell.org/package/vector-0.12.0.3/docs/Data-Vector-Fusion-Bundle.html
for
intermediate values, and only convert the Bundle to a list or vector in the
final step. In fact, most interfaces in the vector package are wrapped with
"stream" and "unstream" calls under the hood, using Bundle to do the actual
transformation, but still exposing the Vector type as input/output, and the
composed stream/unstream calls are eliminated by the ghc rewrite rules.

Cheers,
Cheng

On Tue, Sep 17, 2019 at 10:45 AM Hilco Wijbenga <hilco.wijbenga at gmail.com>
wrote:

> Hi all,
>
> I'd like to have a better understanding of fusion and (maybe?) laziness.
>
> Let's say I have an (exported) type "data EndResult = ... !(Vector
> Thing) ..." and an intermediate, unexported type "data
> IntermediateResult = ... !(Vector Thing) ...". (I suppose it doesn't
> need to be Vector, it could be Set, or some other data structure that
> (I imagine) is relatively expensive to map over, unlike, say [].)
>
> [To start off, I want to state my, possibly incorrect, understanding
> of fusion and how it does not (in my expectation) apply to Vector,
> Set, et cetera. A List can essentially disappear as it's replaced by a
> loop but a Vector would not: the executing code would create and
> garbage collect multiple intermediate Vectors before finally returning
> the end result.]
>
> Assume that I need my algorithm to go from initial input to
> IntermediateResult to subsequent IntermediateResult (a few times) to
> EndResult. In my case, each subsequent IntermediateResult is a bit
> smaller than the previous one but that's probably irrelevant.
>
> Should I prefer IntermediateResult to be lazy? Should I use [] instead
> of Vector in the IntermediateResult? What about the functions that
> actually operate on IntermediateResult, should I prefer to use [] or
> Vector there? I'm currently able to use Data.Vector.concatMap in some
> places, is that just as optimized?
>
> I realize that the correct answer more than likely is "don't worry
> about it". And that I'm being very vague. :-) I'm not looking for
> anything definitive, I'm just hoping to improve my understanding and
> intuition.
>
> What should I consider when thinking about these types of things? If I
> don't want to create two separate implementations and profile them,
> are there "obvious" signs one way or another?
>
> Cheers,
> Hilco
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20190917/f45db5aa/attachment.html>