[GHC] #8763: forM_ [1..N] does not get fused (10 times slower than go function)
GHC
ghc-devs at haskell.org
Thu Apr 12 09:28:27 UTC 2018
#8763: forM_ [1..N] does not get fused (10 times slower than go function)
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #7206 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by sgraf):
Here's an implementation of `efdtIntFB` that fits our requirements:
{{{
data Direction = Up | Dn deriving Eq
direction :: Int# -> Int# -> Direction
direction from to
| isTrue# (to >=# from) = Up
| otherwise = Dn
efdtIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> Int# -> r
efdtIntFB c n x1 x2 y = emit first x1
where
-- We can safely emit the first element if an iteration
-- 'moves closer' to @y at . That's exactly the case when
-- @dir_x2@ coincides with @dir_y at .
!first = dir_x2 == dir_y
!dir_x2 = direction x1 x2
!dir_y = direction x1 y
-- We need the overflow flag in 'emit'.
(# delta, delta_ovf #) = x2 `subIntC#` x1
-- | Think of @emit :: Maybe Int -> [Int]@, only unboxed.
-- If the argument is 'Nothing', we reached the end of the list.
-- If the argument is 'Just', we emit an element, compute
-- the next candidate, validate it and recurse.
emit False _ = n
emit True x = I# x `c` emit next_ok next
where
-- Check that @next@ didn't move past @y at .
-- Also, overflow is only allowed iff the computation for
-- @delta@ overflowed.
(# next, next_ovf #) = addIntC# x delta
!next_ok = isTrue# (next_ovf ==# delta_ovf)
&& direction next y == dir_y
-- TODO: evaluate strict && for branchless code
{-# INLINE[0] efdtIntFB #-}
}}}
Some pros:
- I find this much easier to understand. No complicated invariants, etc.
- No Up/Dn variants to maintain. Still, if the direction is statically
known, constant folding and inlining will simplify stuff to the equivalent
code.
- As a result, no more duplication of `c` occurrences
- Also no more duplication of `n` occurrences
Cons:
- `emit`s closure is 4 words big (2 words bigger than the closure of the
original `go_up` helper) in the most general form. It's unfortunate that
we can't pack together `dir_y` and `delta_ovf` into a single word without
confusing constant folding. This would need either some kind of constant
propagation through bit fields (out of scope for GHC, I think) or a
smarter closure allocation theme that packs together non-pointer payloads.
- We pay for the generalisation of Up/Dn variants by having to compare
with `dir_y` all the time.
- `base` lacks `addWordC#` primitives, which I'll probably add now
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8763#comment:60>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list