good fusion

Tue Apr 11 10:57:27 EDT 2006

On 4/11/06, Malcolm Wallace <Malcolm.Wallace at cs.york.ac.uk> wrote:
> Thanks for pointing to your ICFP'02 paper on this.  Very interesting.

I'm glad you liked it. :)

> The remaining difficulty is how to allow the co-existence of foldr/build
> with your destroy/unfoldr, so we get the benefits of both techniques.
> I have an idea about that.
>
[Details cut out]

First of all, I'm glad to see you've also discovered how fun it can be
to do transformations of functional programs. This is real fun! :)

Secondly, and this is a readers digest of what I write below, I've
thought long and hard about how to marriage foldr/build with
destroy/unfoldr and there simply doesn't seem to be a way which
maintains the initial simplicity of foldr/build and which enables
significantly more fusion.

So, what you demonstrate is that given that we have two functions,
both represented as unfoldr, then they can be fused with zipWith even
though it is represented as a foldr2. But if we try to imagine a more
concrete scenario things become more problematic. Which two functions
could those be? Suppose they were both map functions. That's fine, map
can be represented as an unfoldr. But wait a second! If we want to use
foldr/build then we want to represent map in terms of foldr and build
- not unfoldr! Dang!
And this is really the core of the problem with unifying foldr/build
and destroy/unfoldr - which representation certain functions should
have.

One could imagine having both representations and have GHC's rule
mechanism choose the representation that works at a particular
instance. In case both representations should work one can let GHC
choose non-deterministically. The first problem with this is that it
starts to get messy. But I have no doubt that it is doable. Secondly,
I'm fairly sure (I have a faint memory of this but I can't quite
recall it) that to achieve maximum fusion in certain situations then
there are functions that must change representation from foldr/build
to destroy/unfoldr or vice versa. One might prespond to this by
saying: Fine, we're only dealing with compiler optimisations here
anyway and it is difficult to guarantee that optimisations trigger in
general and why should this be different. But one of the beauties with
foldr/build is the producer/consumer abstraction which makes it very
easy for programmers to predict when fusion will happen. I certainly
don't want to be without that.

I hate to bring so pessimistic opinions, you seem to be on a roll
here. Perhaps there is a special case here lurking which can achieve a
bit of fusion in the case for zipWith. But personally I doubt it.

>   genUnfoldr (g`zipG`f) (s,r) (\ (x,y) z-> consumer x y z) z
>
> Perhaps this is not much better though, because where we used to have
> intermediate list structure, now we have intermediate pair structures.
> I would like to hope that other optimisations might still be able to
> remove these as well, but have not yet investigated.

There are optimisations that tackle this situation and GHC implements
some but not all of them. Here are some relevant papers:
http://research.microsoft.com/%7Esimonpj/Papers/cpr/index.htm
ftp://ftp.cs.kun.nl/pub/Clean/papers/2000/groj2000-OptRecFunYTuples.pdf

Cheers,

/Josef