good fusion

Jan-Willem Maessen jmaessen at alum.mit.edu
Wed Apr 12 09:16:11 EDT 2006


On Apr 11, 2006, at 2:37 PM, Udo Stenzel wrote:

> Malcolm Wallace wrote:
>> Well, the core of my idea is that instead of two stages, foldr/ 
>> build or
>> destroy/unfoldr, there are really three: foldr/build/genUnfoldr.
>
> As Josef already realized, foldr can be expressed in terms of destroy:
>
> *> foldr c n xs = destroy foldrDU xs
> *>   where foldrDU g y = case g y of
> *>           Nothing -> n
> *>           Just (x,y') -> c x (foldrDU g y')
>
> The converse does not seem to be true, destroy is strictly more
> expressive than foldr.

This is an intriguing question, and a way to get hackers like me  
interested...  Proving equivalence would simply be a matter of  
finding a definition (which would have to use a whole lot of higher- 
order magic).  Now I'm going to have to go back and convince myself  
it won't work all over again.

> Similarly any unfoldr can be expressed as a
> build:
>
> *> unfoldr g y = build (unfoldrFB y)
> *>   where unfoldrFB Nothing       c n = n
> *>         unfoldrFB (Just (x,y')) c n = c x (unfoldrFB y' c n)
>
> Using these definitions we get a foldr/unfoldr rule for free.

In fact, the pH compiler (back in 1994) had a deforestation pass  
based on a version of foldr/unfoldr deforestation, with none of this  
build or destroy nonsense.  This is enough to express quite a few  
"good producers" and all the "good consumers" in the GHC sense.  And  
we can indeed obtain a zipWith / unfoldr fusion rule, or rather  
several with an obvious order of application (I'm reconstructing this  
on the fly from recollections of a subtly different form; it'll be  
broadly correct but I'll probably mess up a detail along the way).

   1) zip (unfoldr f z) (unfoldr g y) = unfoldr (hoist f g) (z,y)
        where hoist f g (z,y) = f z >>= \u -> g y >>= \v -> (u,v)  --  
in the Maybe monad

   2) foldr g y (zip (unfoldr f z) xs) = foldr h (const y) xs z
        where h x k z = maybe y (\(v,w)-> g (x,v) (k w)) (f z)

   3) as above, with arguments commuted.

   4) zipWith xs ys = unfoldr u (xs,ys)
        where u (x:xs,y:ys) = Just ((x,y),(xs,ys))
	     u _ = Nothing

We can squeeze reasonable-looking loops out of higher-order folds if  
we're willing to do a little bit of work.  We do end up relying on  
transformations like worker/wrapper quite a bit to get rid of all  
those tuples...  But that's what worker/wrapper is for.

> So to get maximum fusion, the general rule seems to be "prefer to  
> write
> producers in terms of unfoldr and consumers in terms of foldr".

Absolutely.  The frustrating part: all those list transducers like  
reverse, take, drop, inits, tails...  Some of them are expressible as  
folds or unfolds, but it's never terribly natural; we should expect  
the resulting code to be pretty awful.  For these, either build or  
destroy seem to be indispensible.

So the real quest here is for something which unites build and  
destroy.  Then foldr and unfoldr are just a side show.

>> I'm still working out all the details - it may yet turn out that you
>> quickly reach dead-ends where further fusion does not occur.  But I'm
>> hoping that it is possible to express whole trees of computation
>> (composition pipelines + branching at zipWith) rather than just
>> pipelines, in the one framework.

As I say, list transducers are a killer.  Luckily map, concatMap, and  
filter are friendly---these are necessary to make comprehensions work  
out nicely.  But other list functions are quite hard.  I had a hack  
for reverse (we can turn foldr into foldl and vice versa, and foldl  
is just a higher-order foldr), but the general case is quite tricky.

This is all very timely, as I've been taking a trip down memory lane  
lately; in Fortress we are representing catamorphisms in Boom-style  
fold-form directly as objects, and expressing aggregates as build- 
functions.

-Jan-Willem Maessen

> That's my feeling, too.  But that should already be possible using
> destroy/unfoldr alone.  Actually I can think of lots of interesting
> consumers that (seem to) require destroy (foldl, zip, ReadP), but  
> of no
> interesting producers that would require build.  I'd expect  
> problems in
> situations with a single producer and multiple consumers, but those
> aren't currently deforested anyway.
>
>
> Regards,
>
> Udo.
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries



More information about the Libraries mailing list