[GHC] #8763: forM_ [1..N] does not get fused (allocates 50% more)
GHC
ghc-devs at haskell.org
Tue Sep 4 11:56:39 UTC 2018
#8763: forM_ [1..N] does not get fused (allocates 50% more)
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.8.1
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #7206 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by sgraf):
Sure will do. Let's begin with the original problem. There's two functions
in [[attachment:reprof.hs]]. One is `f`, having a loop of the `forM_
[2..n] body` form, the other is `g` which has the `forM_ [2,3..n] body`
form. The former doesn't allocate (56kB in total), whereas the latter
allocates quite a lot (960MB).
Why is that? The arithmetic sequences desugar, rewrite and specialise to
`build (\c t -> eftIntFB c z 2 n)` and `build (\c z -> efdtIntFB c z 2 3
n)`, respectively. That cancels away with `forM_`'s implementation in
terms of `foldr`:
{{{
forM_= flip mapM_
mapM_ body
= {- definition -}
foldr ((>>) . body) (return ())
= {- eta expand the section -}
foldr (\x k -> body x >> k) (return ())
= {- (>>) of ST, written as `s -> (a, s)` for lighter syntax -}
foldr (\x k s -> case (body x s) of (_, s') -> k s') (return ())
}}}
Note how `k` occurs in tail position within the lambda. Now, this cancels
with the definition of `ef(d)tIntFB`:
{{{
foldr (\x k s -> case (body x s) of (_, s') -> k s') (return ()) . build
(\c z -> efdtIntFB c z 2 3 n)
= {- foldr/build -}
efdtIntFB (\x k s -> case (body x s) of (_, s') -> k s') (return ()) 2 3
n
}}}
So, that lambda is what is bound to `c` when `ef(d)tIntFB` gets inlined.
Now, the implementation of `eftIntFB` only has a single occurrence of `c`,
so it will inline properly and bind its `k` parameter (the continuation in
tail position) to the recursive `go` call here
[[https://github.com/ghc/ghc/blob/fa3143c76ac77ee96fd89559cacc089205abaa20/libraries/base/GHC/Enum.hs#L522]].
The call to `k` turns into a tail call within `go`s body:
{{{
go x = I# x `c` if isTrue# (x ==# y)
then n
else go (x +# 1#)
==> c = \x k s -> case (body x s) of (_, s') -> k s'
go x s =
case (body (I# x) s) of
(_, s') ->
if isTrue# (x ==# y)
then n s'
else go (x +# 1#) s'
}}}
And `go` can be made a join point.
The same isn't possible in the current `efdtIntFB`, because it duplicates
`c` by branching on whether to count up or down (and also within the loop
itself, anyway):
{{{
efdtIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> Int# -> r
efdtIntFB c n x1 x2 y
| isTrue# (x2 >=# x1) = efdtIntUpFB c n x1 x2 y
| otherwise = efdtIntDnFB c n x1 x2 y
}}}
Now, when `efdtInt{Up,Dn}FB` gets inlined, we end up with a let-bound `c`
that still tail-calls its argument `k`, and is even tail-called itself
within `go_up`/`go_dn` (here
https://github.com/ghc/ghc/blob/fa3143c76ac77ee96fd89559cacc089205abaa20/libraries/base/GHC/Enum.hs#L588):
{{{
let c x k s = case (body x s) of (_, s') -> k s'
in ...
let go_up x
| isTrue# (x ># y') = I# x `c` n
| otherwise = I# x `c` go_up (x +# delta)
}}}
Note that `go_up` tail calls `c` and passes itself as the `k` parameter.
If `c` was inlined, all would be fine and `go_up` would turn into a join
point. That's not the case because `c` is duplicated in `efdtIntFB` and
then one more time in `efdtInt{Up,Dn}FB`. My first implementation in
comment:52 (for which you provided a simplification in comment:54) dealt
with the latter, while the idea in comment:60 is supposed to deal with the
former. Sadly, case-of-case seems to undo the painful de-duplication of
the `c` parameter (that's what comment:71 is about).
Why doesn't LLF help here? Well, lifting out `c` to top-level gets rid of
allocations for `c`, but there's still at least the allocation for the
thunk for `go_up (x+1)` (the `Int` box goes away because of strictness).
Also, the call to `go_up` is still an unknown call, as opposed to the
simple join call we would get by inlining `c`.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8763#comment:74>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list