[GHC] #8763: forM_ [1..N] does not get fused (10 times slower than go function)

Thu Mar 29 20:23:08 UTC 2018

#8763: forM_ [1..N] does not get fused (10 times slower than go function)
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Compiler          |              Version:  7.6.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #7206             |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by sgraf):

 It seems I uploaded the variant where I used `IO` instead of `ST`, where
 things still inline. When you substitute `ST s` for `IO` and use `print $
 runST $ ...` instead of `... >>= print`, it should reproduce with 8.4.1.
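
 To make the substitution concrete, here is a minimal sketch of the two
 shapes. It is not the uploaded test case: `sumIO`, `sumST` and `resultST`
 are hypothetical names, and the loop body is only a stand-in for the real
 one.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST, runST)
import Data.IORef (newIORef, modifyIORef', readIORef)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Hypothetical IO variant: the caller consumes the result with `>>= print`.
sumIO :: Int -> IO Int
sumIO n = do
  ref <- newIORef 0
  forM_ [1 .. n] $ \i -> modifyIORef' ref (+ i)
  readIORef ref

-- The same loop with `ST s` substituted for `IO`.
sumST :: Int -> ST s Int
sumST n = do
  ref <- newSTRef 0
  forM_ [1 .. n] $ \i -> modifySTRef' ref (+ i)
  readSTRef ref

-- Pure wrapper, so the caller can write `print $ resultST n`.
resultST :: Int -> Int
resultST n = runST (sumST n)
```

 The IO version is run as `sumIO 100 >>= print`, the ST version as
 `print $ runST $ sumST 100` (or `print $ resultST 100`). Whether this
 particular body is large enough to show the slowdown is untested.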

 Now here's the funny part: I managed to make this reproduce even for `IO`
 by duplicating the call to `nop`. So it seems like `c` really just hits
 the threshold where the inliner gives up. The only solution I can think of
 is what I described in my second point above: Implement `efdtIntUpFB` in a
 way that doesn't duplicate `c`.

 In general we should avoid calling `c` more than once in `build`s because
 of scenarios like this. Huge `c`s aren't uncommon at all (do blocks in
 `forM_` bodies, the functions passed as first argument to `foldr`, etc.)
 and otherwise we can't guarantee that everything inlines.
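
 As an illustration of what "calling `c` more than once" means, here is a
 sketch of two hypothetical producers in the style of a `build` argument.
 They are not the actual `efdtIntUpFB` code; `upToTwice` and `upToOnce`
 are made-up names, and the sketch assumes the upper bound is below
 `maxBound`, so it ignores the overflow handling a real implementation
 needs.

```haskell
-- Both functions enumerate x, x+1, ..., y and feed each element to the
-- consumer `c`, finishing with `n`.

-- Variant A: `c` occurs at two call sites. If the simplifier inlines a
-- large `c`, its body is duplicated, or it may decide not to inline at all.
upToTwice :: (Int -> r -> r) -> r -> Int -> Int -> r
upToTwice c n x y
  | x > y     = n
  | otherwise = x `c` go (x + 1)        -- first occurrence of `c`
  where
    go i
      | i > y     = n
      | otherwise = i `c` go (i + 1)    -- second occurrence of `c`

-- Variant B: `c` occurs at exactly one call site, so inlining it never
-- duplicates its body.
upToOnce :: (Int -> r -> r) -> r -> Int -> Int -> r
upToOnce c n x y = go x
  where
    go i
      | i > y     = n
      | otherwise = i `c` go (i + 1)    -- the only occurrence of `c`
```

 Both variants produce the same elements, e.g. `upToOnce (:) [] 1 5` and
 `upToTwice (:) [] 1 5` build the same list. The difference is only in how
 many copies of `c` the optimiser has to place.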

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8763#comment:49>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler