Extending fold/build fusion

Simon Peyton Jones simonpj at microsoft.com
Wed Jan 15 19:20:51 UTC 2014


Akio

Aha!  So you are really talking about replacing the *entire* foldr/build story with a new one, namely a foldW/buildW story.  Presumably all producers and consumers (map, filter, take, drop etc) must be redefined using foldW and buildW instead of fold and build.  Is that right?

That is much more significant than the wiki page describes.  If you are serious about this, could you perhaps update the wiki page to describe what you propose?   Do you believe that the new story will catch every case that the old one does?  (Plus some new ones.)  Does your data support that?

I'm really not sure about your Tree example.   I agree that the foldl' style code gives the result that you show.  But I tried the more straightforward version:

sumT :: Tree -> Int

sumT t = foldr (+) 0 (build (toListFB t))

This yielded pretty decent code:
FB.$wgo =
  \ (w_sio :: FB.Tree) (ww_sir :: GHC.Prim.Int#) ->
    case w_sio of _ {
      FB.Tip rb_dgM -> GHC.Prim.+# rb_dgM ww_sir;
      FB.Bin x_af0 y_af1 ->
        case FB.$wgo y_af1 ww_sir of ww1_siv { __DEFAULT ->
        FB.$wgo x_af0 ww1_siv
        }
    }

This builds no thunks.  It does build stack equal to the depth of the tree.  But your desired go1 code will also do exactly the same; go1 is strict in its second argument and hence will use call-by-value, and hence will build stack equal to the depth of the tree.

In short, I'm not yet seeing a benefit.
I am probably missing something important.
Suggestion: rather than just reply to this email (soon lost in the email stream), it would be easier for others to join in if you updated your wiki page to say (a) what you propose, and (b) how it can yield benefits that the current setup cannot.  Then an email reply can say "go look at section 3" or whatever.

best wishes

Simon

From: Akio Takano [mailto:tkn.akio at gmail.com]
Sent: 14 January 2014 09:22
To: Simon Peyton Jones
Cc: ghc-devs
Subject: Re: Extending fold/build fusion

Thank you for looking at this!

On Tue, Jan 14, 2014 at 1:27 AM, Simon Peyton Jones <simonpj at microsoft.com<mailto:simonpj at microsoft.com>> wrote:
I've hesitated to reply, because I have lots of questions but no time to investigate in.  I'm looking at your wiki page https://github.com/takano-akio/ww-fusion


*         Does your proposed new fold' run faster than the old one?  You give no data.

No, it runs just equally fast as the old one. At the Core level they are the same. I ran some criterion benchmarks:

source: https://github.com/takano-akio/ww-fusion/blob/master/benchmarks.hs
results: http://htmlpreview.github.io/?https://github.com/takano-akio/ww-fusion/blob/master/foldl.html

The point was not to make foldl' faster, but to make it fuse well with good producers.


*         The new foldl' is not a "good consumer" in the foldr/build sense, which a big loss.  What if you say fold' k z [1..n]; you want the intermediate list to vanish.
For my idea to work, enumFromTo and all other good producers need to be redefined in terms of buildW, which fuses with foldrW. The definition of buildW and the relevant rules are here:

https://github.com/takano-akio/ww-fusion/blob/master/WWFusion.hs



*         My brain is too small to truly understand your idea.  But since foldrW is non-recursive, what happens if you inline foldrW into fold', and then simplify?  I'm betting you get something pretty similar to the old foldl'.  Try in by hand, and with GHC and let's see the final optimised code.
I checked this and I see the same code as the old foldl', modulo order of arguments. This is what I expected.


*         Under "motivation" you say "GHC generates something essentially like..." and then give some code.  Now, if GHC would only eta-expand 'go' with a second argument, you'd get brilliant code. And maybe that would help lots of programs, not just this one.  It's a slight delicate transformation but I've often thought we should try it; c.f #7994, #5809
I agree that it would be generally useful if GHC did this transformation. However I don't think it's good enough for this particular goal of making foldl' fuse well.

Consider a function that flattens a binary tree into a list:

data Tree = Tip {-# UNPACK #-} !Int | Bin Tree Tree

toList :: Tree -> [Int]
toList tree = build (toListFB tree)
{-# INLINE toList #-}

toListFB :: Tree -> (Int -> r -> r) -> r -> r
toListFB root cons nil = go root nil
  where
    go (Tip x) rest = cons x rest
    go (Bin x y) rest = go x (go y rest)

Let's say we want to eliminate the intermediate list in the expression (sum (toList t)). Currently sum is not a good consumer, but if it were, after fusion we'd get something like:

sumList :: Tree -> Int
sumList root = go0 root id 0

go0 :: Tree -> (Int -> Int) -> Int -> Int
go0 (Tip x) k = \m -> k $!  (x+m)
go0 (Bin x y) k = go0 x (go0 y k)

Now, merely eta-expanding go0 is not enough to get efficient code, because the function will still build a partial application every time it sees a Bin constructor. For this recursion to work in an allocation-free way, it must be rather like:

go1 :: Tree -> Int -> Int
go1 (Tip x) n = x + n
go1 (Bin x y) n = go1 y (go1 x n)

And this is what we get if we define foldl' and toList in terms of foldrW and buildW.

I think a similar problem arises whenever you define a good consumer that traverses a tree-like structure, and you want to use a strict fold to consume a list produced by that producer.

Thank you,
Takano Akio

Simon

From: ghc-devs [mailto:ghc-devs-bounces at haskell.org<mailto:ghc-devs-bounces at haskell.org>] On Behalf Of Akio Takano
Sent: 09 January 2014 13:25
To: ghc-devs
Subject: Re: Extending fold/build fusion

Any input on this is appreciated. In particular, I'd like to know: if I implement the idea as a patch to the base package, is there a chance it is considered for merge?

-- Takano Akio
On Fri, Jan 3, 2014 at 11:20 PM, Akio Takano <tkn.akio at gmail.com<mailto:tkn.akio at gmail.com>> wrote:
Hi,

I have been thinking about how foldl' can be turned into a good consumer, and I came up with something that I thought would work. So I'd like to ask for opinions from the ghc devs: if this idea looks good, if it is a known bad idea, if there is a better way to do it, etc.

The main idea is to have an extended version of foldr:

-- | A mapping between @a@ and @b at .
data Wrap a b = Wrap (a -> b) (b -> a)

foldrW
  :: (forall e. Wrap (f e) (e -> b -> b))
  -> (a -> b -> b) -> b -> [a] -> b
foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0
  where
    go = unwrap $ \list z' -> case list of
      [] -> z'
      x:xs -> f x $ wrap go xs z'

This allows the user to apply an arbitrary "worker-wrapper" transformation to the loop.

Using this, foldl' can be defined as

newtype Simple b e = Simple { runSimple :: e -> b -> b }

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial
  where
    wrap (Simple s) e k a = k $ s e a
    unwrap u = Simple $ \e -> u e id
    g x next acc = next $! f acc x

The wrap and unwrap functions here ensure that foldl' gets compiled into a loop that returns a value of 'b', rather than a function  'b -> b', effectively un-CPS-transforming the loop.

I put preliminary code and some more explanation on Github:

https://github.com/takano-akio/ww-fusion

Thank you,
Takano Akio


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140115/ce120237/attachment-0001.html>


More information about the ghc-devs mailing list