[GHC] #10069: CPR related performance issue
GHC
ghc-devs at haskell.org
Tue Feb 19 09:36:52 UTC 2019
#10069: CPR related performance issue
-------------------------------------+-------------------------------------
Reporter: pacak | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.6.2
Resolution: | Keywords: DemandAnalysis
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by sgraf):
* keywords: CPRAnalysis, DemandAnalysis => DemandAnalysis
Comment:
Looking at https://ghc.haskell.org/trac/ghc/attachment/ticket/10069/Blah.dump-simpl#L1668,
I don't think this is related to CPR analysis, but rather to
the worker/wrapper (WW) transformation having issues with `NOINLINE` functions.
What happens here is that `f1` to `f4` can't be inlined (so we don't see
the case on the `A`), but `fa` still gets a strictness signature saying
that all arguments except 2 to 5 are dead. WW will now split `fa` into a
wrapper function that scrutinises the `A` just to project out the 4
arguments that aren't dead and pass them on, unboxed, to the worker `$wfa`.
So far so good. Now, WW arranges it so that the worker `$wfa` builds up a
new `A` with dummy values for absent fields (0# for Int#). Normally, this
new `A` binding would cancel out with case matches in `$wfa`, because the
strictness signature must ultimately come from some case expression. Here,
however, those case expressions are hidden in `NOINLINE` functions, so nothing
cancels. As a result, we allocate the dummy `A` for nothing; we could have just
passed along the old `A`.
Here's an example demonstrating this in the small:
{{{#!hs
data C = C !Int !Int

-- NOINLINE hides the case on `C` from callers of `c1`.
{-# NOINLINE c1 #-}
c1 :: C -> Int
c1 (C _ c) = c

{-# NOINLINE fc #-}
fc :: C -> Int
fc c = c1 c + c1 c
}}}
Relevant Core:
{{{#!hs
c1_rP
  = \ (ds_d3af :: C) ->
      case ds_d3af of { C dt_d3PA dt1_d3PB -> GHC.Types.I# dt1_d3PB }

Main.$wfc
  = \ (ww_s7DJ :: GHC.Prim.Int#) ->
      case c1_rP (Main.C 0# ww_s7DJ) of { GHC.Types.I# x_a4kS ->
      GHC.Prim.*# 2# x_a4kS
      }

fc
  = \ (w_s7DF :: C) ->
      case w_s7DF of { C ww1_s7DI ww2_s7DJ ->
      case Main.$wfc ww2_s7DJ of ww3_s7DN { __DEFAULT ->
      GHC.Types.I# ww3_s7DN
      }
      }
}}}
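For readers less used to Core, here is a hand-written, source-level approximation of the split shown above. The names `wfc` and `fcSplit` are made up for illustration (GHC's own names are `$wfc` and `fc`), and the explicit `I#` boxes stand in for the unpacked strict fields that show up as bare `Int#` in the optimised Core. It is a sketch of the shape of the code, not compiler output.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (*#))

data C = C !Int !Int

-- Same as in the example: NOINLINE hides the case on `C` from callers.
{-# NOINLINE c1 #-}
c1 :: C -> Int
c1 (C _ c) = c

-- Hypothetical stand-in for the worker `$wfc`. It receives only the
-- second field, unboxed, and has to rebuild a `C` with a dummy first
-- field (0) just to be able to call `c1`.
{-# NOINLINE wfc #-}
wfc :: Int# -> Int#
wfc ww = case c1 (C (I# 0#) (I# ww)) of
  I# x -> 2# *# x

-- Hypothetical stand-in for the wrapper `fc`. It takes the original `C`
-- apart, drops the absent first field and reboxes the worker's result.
fcSplit :: C -> Int
fcSplit (C _ (I# ww)) = I# (wfc ww)
```

`fcSplit (C 7 21)` evaluates to 42, the same result `fc` gives. The point is only that `wfc` allocates a fresh `C` although the caller already had one.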
The problem I see here is that we don't WW `c1`, or that we don't inline
the resulting wrapper into `fc` before the hypothetical worker `$wc1` of
`c1` gets inlined back into `c1` because it's so small. If we inlined
`$wc1` into `$wfc`, the case on `C` would cancel out with the dummy `C`
and everything would be well.
So: If we WW `fc`, we should also WW `c1`, otherwise we end up with bad
code.
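What "also WW `c1`" could look like can be sketched by hand at the source level. The worker name `wc1` and the pragma placement are assumptions made for illustration; this is not what GHC currently generates. The idea is that `NOINLINE` moves to the worker while the wrapper stays inlinable, so that after WW of `fc` the case in the wrapper would be expected to meet the dummy `C` and cancel with it.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#)

data C = C !Int !Int

-- Hypothetical worker for `c1`: it takes only the field that is used,
-- unboxed. The NOINLINE from the original `c1` is kept here.
{-# NOINLINE wc1 #-}
wc1 :: Int# -> Int#
wc1 c = c

-- Hypothetical wrapper for `c1`: it only takes the `C` apart and
-- reboxes the result, so it is cheap to inline at every call site.
{-# INLINE c1 #-}
c1 :: C -> Int
c1 (C _ (I# c)) = I# (wc1 c)

-- Unchanged from the example above.
{-# NOINLINE fc #-}
fc :: C -> Int
fc c = c1 c + c1 c
```

With this arrangement `fc (C 7 21)` still evaluates to 42. Whether the optimiser then removes the dummy `C` in `$wfc` would have to be checked against the dumped Core (for example with `-ddump-simpl`).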
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10069#comment:32>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler