[GHC] #10069: CPR related performance issue

GHC ghc-devs at haskell.org
Tue Feb 19 09:36:52 UTC 2019


#10069: CPR related performance issue
-------------------------------------+-------------------------------------
        Reporter:  pacak             |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.6.2
      Resolution:                    |             Keywords:
                                     |  DemandAnalysis
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by sgraf):

 * keywords:  CPRAnalysis, DemandAnalysis => DemandAnalysis


Comment:

 Looking at
 https://ghc.haskell.org/trac/ghc/attachment/ticket/10069/Blah.dump-simpl#L1668,
 I don't think this is related to CPR analysis but to the worker/wrapper
 transformation having issues with NOINLINE functions.

 What happens here is that `f1` to `f4` can't be inlined (so we don't see
 the case on the `A`), but `fa` still gets a strictness signature saying
 that all but arguments 2 to 5 are dead. WW will now split `fa` into a
 wrapper function that scrutinises the `A` just to project out the 4
 arguments that aren't dead and pass them, unboxed, on to the worker `$wfa`.

 So far so good. Now, WW arranges it so that the worker `$wfa` builds up a
 new `A` with dummy values for absent fields (0# for Int#). Normally, this
 new `A` binding would cancel out with case matches in `$wfa`, because the
 strictness signature must ultimately come from some case expression. These
 case expressions, however, are hidden in `NOINLINE` functions, so nothing
 cancels. As a result, we allocate the dummy `A` for nothing; we could have
 just passed along the old `A`.

 Here's an example demonstrating this in the small:

 {{{#!hs
 data C = C !Int !Int

 {-# NOINLINE c1 #-}
 c1 :: C -> Int
 c1 (C _ c) = c

 {-# NOINLINE fc #-}
 fc :: C -> Int
 fc c = c1 c + c1 c
 }}}

 Relevant Core:

 {{{#!hs
 c1_rP
   = \ (ds_d3af :: C) ->
       case ds_d3af of { C dt_d3PA dt1_d3PB -> GHC.Types.I# dt1_d3PB }

 Main.$wfc
   = \ (ww_s7DJ :: GHC.Prim.Int#) ->
       case c1_rP (Main.C 0# ww_s7DJ) of { GHC.Types.I# x_a4kS ->
       GHC.Prim.*# 2# x_a4kS
       }

 fc
   = \ (w_s7DF :: C) ->
       case w_s7DF of { C ww1_s7DI ww2_s7DJ ->
       case Main.$wfc ww2_s7DJ of ww3_s7DN { __DEFAULT ->
       GHC.Types.I# ww3_s7DN
       }
       }
 }}}
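
 For readers who prefer source over Core, the split above can be mimicked
 by hand. This is only a sketch: `wfc` is a hand-written stand-in for
 `$wfc` (not something GHC emits under that name), and `MagicHash` is used
 just to make the unboxed `Int#` visible.

 {{{#!hs
 {-# LANGUAGE MagicHash #-}
 module Main (main) where

 import GHC.Exts (Int (I#), Int#, (*#))

 data C = C !Int !Int

 {-# NOINLINE c1 #-}
 c1 :: C -> Int
 c1 (C _ c) = c

 -- Stand-in for the worker $wfc: it receives only the live field, unboxed,
 -- and has to rebuild a C (dummy 0 in the absent field) just to be able to
 -- call the NOINLINE c1. This rebuilt C is the wasted allocation.
 wfc :: Int# -> Int#
 wfc ww = case c1 (C (I# 0#) (I# ww)) of
   I# x -> 2# *# x

 -- Stand-in for the wrapper: unpack the C, call the worker, rebox the result.
 fc :: C -> Int
 fc (C _ (I# ww)) = I# (wfc ww)

 main :: IO ()
 main = print (fc (C 1 2))  -- c1 c + c1 c = 2 + 2, prints 4
 }}}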

 The problem I see here is that we don't WW `c1`, or that we don't inline
 the resulting wrapper into `fc` before the hypothetical worker `$wc1` of
 `c1` gets inlined back into `c1` because it's so small. If we inlined
 `$wc1` into `$wfc`, the case on `C` would cancel out with the dummy `C`
 and everything would be well.

 So: If we WW `fc`, we should also WW `c1`, otherwise we end up with bad
 code.
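
 As a rough illustration of the outcome argued for here, this is what the
 example could look like if `c1` were split as well. It is a hand-written
 sketch, not GHC output: `wc1` and `wfc` are hypothetical stand-ins for
 `$wc1` and `$wfc`, and the `NOINLINE` moves from `c1` to its worker.

 {{{#!hs
 {-# LANGUAGE MagicHash #-}
 module Main (main) where

 import GHC.Exts (Int (I#), Int#, (*#))

 data C = C !Int !Int

 -- Hypothetical worker for c1: takes only the field c1 uses, unboxed.
 -- NOINLINE now sits here instead of on c1 itself.
 {-# NOINLINE wc1 #-}
 wc1 :: Int# -> Int#
 wc1 c = c

 -- Wrapper for c1: small and inlinable, so call sites get to see the
 -- case on C and can cancel it against a known constructor.
 {-# INLINE c1 #-}
 c1 :: C -> Int
 c1 (C _ (I# c)) = I# (wc1 c)

 -- Worker for fc after c1's wrapper has been inlined and the case on the
 -- dummy C has cancelled: no C is allocated any more.
 wfc :: Int# -> Int#
 wfc ww = 2# *# wc1 ww

 fc :: C -> Int
 fc (C _ (I# ww)) = I# (wfc ww)

 main :: IO ()
 main = print (fc (C 1 2))  -- 2 * 2, prints 4
 }}}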

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10069#comment:32>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

