[GHC] #6070: Fun with the demand analyser
GHC
ghc-devs at haskell.org
Thu Jan 16 15:32:13 UTC 2014
#6070: Fun with the demand analyser
-------------------------------------+------------------------------------
Reporter: simonpj | Owner: simonpj
Type: bug | Status: new
Priority: normal | Milestone: 7.8.1
Component: Compiler | Version: 7.4.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Description changed by simonpj:
Old description:
> Max writes: I've been trying to speed up the supercompiler, and in the
> process
> I've noticed some infelicities in the demand analyser that might be
> worth looking at.
>
> == Infelicity 1: analysis does not take into account extra unboxing done
> by the CPR transformation ==
> An example is:
> {{{
> e :: (Int, Int) -> Int -> (Int, Int)
> e x y = x `seq` if y > 10
> then x
> else e x (y + 1)
> }}}
> Because x is seqed, the occurrence in the "then" branch gets the CPR
> property so e has the CPR property overall. However, the overall
> demand on x is S(AA), i.e. the demand analyser believes the x box is
> used -- if the box were unused it would get the demand U(LL). So the
> overall demand type is S(AA)U(L)m, and the worker looks like:
> {{{
> Rec {
> CPR2.$we [Occ=LoopBreaker]
> :: (GHC.Types.Int, GHC.Types.Int)
> -> GHC.Prim.Int# -> (# GHC.Types.Int, GHC.Types.Int #)
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType S(AA)L,
> Unf=OtherCon []]
> CPR2.$we =
> \ (w_srv :: (GHC.Types.Int, GHC.Types.Int))
> (ww_srt :: GHC.Prim.Int#) ->
> case GHC.Prim.># ww_srt 10 of _ {
> GHC.Types.False ->
> case GHC.Prim.+# ww_srt 1 of sat_ssS { __DEFAULT ->
> CPR2.$we w_srv sat_ssS
> };
> GHC.Types.True ->
> case w_srv of _ { (ww2_srA, ww3_srB) -> (# ww2_srA, ww3_srB #) }
> }
> end Rec }
> }}}
> But if we demand-analysed $we then GHC would say that it had the
> demand type U(LL)L and unbox the pair argument! It seems silly that
> the demand analyser outputs code that can be improved further by just
> running demand analysis again.
>
> Somewhere where this really bit me in practice is in:
> {{{
> d :: M.Map Int Int -> (Int, Int)
> d m = M.foldrWithKey' (\k v (a, b) -> if k + v > 2 then (a, b) else
> (b, a)) (0, 1) m
> }}}
> Which generates this inner loop:
> {{{
> Rec {
> CPR2.$wgo2 [Occ=LoopBreaker]
> :: (GHC.Types.Int, GHC.Types.Int)
> -> Data.Map.Base.Map GHC.Types.Int GHC.Types.Int
> -> (# GHC.Types.Int, GHC.Types.Int #)
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType S(AA)S,
> Unf=OtherCon []]
> CPR2.$wgo2 =
> \ (w_srS :: (GHC.Types.Int, GHC.Types.Int))
> (w1_srQ :: Data.Map.Base.Map GHC.Types.Int GHC.Types.Int) ->
> case w1_srQ of _ {
> Data.Map.Base.Tip ->
> case w_srS of _ { (ww1_srW, ww2_srX) -> (# ww1_srW, ww2_srX #) };
> Data.Map.Base.Bin rb_st1 kx_ss3 x_ssa l_ssk r_ss6 ->
> case kx_ss3 of _ { GHC.Types.I# x1_ssd ->
> case CPR2.$wgo2 w_srS r_ss6 of _ { (# ww1_ssi, ww2_ssh #) ->
> case x_ssa of _ { GHC.Types.I# y_sse ->
> case GHC.Prim.+# x1_ssd y_sse of sat_st0 { __DEFAULT ->
> case GHC.Prim.># sat_st0 2 of _ {
> GHC.Types.False ->
> let {
> sat_ssZ :: (GHC.Types.Int, GHC.Types.Int)
> [LclId]
> sat_ssZ = (ww2_ssh, ww1_ssi) } in
> CPR2.$wgo2 sat_ssZ l_ssk;
> GHC.Types.True ->
> let {
> sat_st6 :: (GHC.Types.Int, GHC.Types.Int)
> [LclId]
> sat_st6 = (ww1_ssi, ww2_ssh) } in
> CPR2.$wgo2 sat_st6 l_ssk
> }
> }
> }
> }
> }
> }
> end Rec }
> }}}
> We can save a number of allocations proportional to the size of the
> Map if the demand analyser didn't have this problem.
>
> I hacked up the analyser to "fix" this by changing the lines at line
> 473 onward to read:
> {{{
> if isTopLevel top_lvl
> then fn_ty -- Don't record top level things
> else case dmd of
> Box (Eval (Poly Abs)) | DmdType _ _ res <- fn_ty,
> returnsCPR res -> addVarDmd fn_ty var (Eval (Poly lazyDmd))
> _
> -> addVarDmd fn_ty var dmd
> }}}
> So if we demand a CPRish variable (such as bound by a case scrutinee
> or a U(LL)-demanded lambda-binder) in the evalDmd then I throw away
> the Box part of the evalDmd on the basis that after CPRing we won't
> demand the box at all.
>
> I doubt this hack is the right solution (and I haven't tried it to see
> how it affects the libraries) but perhaps it gives you some ideas.
>
> == Infelicity 2: a case where demand summarisation hurts us ==
>
> I found practical examples where summarising the demand transfomer of
> a function as a single strictness signature hurt us. The examples were
> code like:
> {{{
> h :: (Int, Int) -> Int -> (Int, Int)
> h x y = if y > 10
> then x
> else h (case h x 0 of (y1, y2) -> (y2, y1)) (y + 1)
> }}}
> If, at the innermost use of h, we re-analyse h in the demand
> C(C(U(LL))) rather than just unleashing the vanilla DmdSig computed
> from the demand C(C(S)) then we can unbox the pair argument. Currently
> GHC just gives h the DmdType SU(L) which doesn't eliminate the
> allocation of the pair at all.
>
> This situation occurs in practice with code like:
> {{{
> c :: M.Map Int Int -> (Int, Int)
> c m = M.foldrWithKey (\k v (a, b) -> if k + v > 2 then (a, b) else (b,
> a)) (0, 1) m
> }}}
> The difference from my earlier example d is that I'm using the version
> of `foldrWithKey` that doesn't call `seq`. Current GHC generates this
> inner loop:
> {{{
> Rec {
> CPR2.c_go2 [Occ=LoopBreaker]
> :: (GHC.Types.Int, GHC.Types.Int)
> -> Data.Map.Base.Map GHC.Types.Int GHC.Types.Int
> -> (GHC.Types.Int, GHC.Types.Int)
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType U(L*)S,
> Unf=OtherCon []]
> CPR2.c_go2 =
> \ (z_spW :: (GHC.Types.Int, GHC.Types.Int))
> (ds_spU :: Data.Map.Base.Map GHC.Types.Int GHC.Types.Int) ->
> case ds_spU of _ {
> Data.Map.Base.Tip -> z_spW;
> Data.Map.Base.Bin rb_sqq kx_sq2 x_sq9 l_sqj r_sq5 ->
> case kx_sq2 of _ { GHC.Types.I# x1_sqc ->
> case CPR2.c_go2 z_spW r_sq5 of wild2_sqk { (a_sqh, b_sqg) ->
> case x_sq9 of _ { GHC.Types.I# y_sqd ->
> case GHC.Prim.+# x1_sqc y_sqd of sat_sqp { __DEFAULT ->
> case GHC.Prim.># sat_sqp 2 of _ {
> GHC.Types.False ->
> let {
> sat_sqo :: (GHC.Types.Int, GHC.Types.Int)
> [LclId]
> sat_sqo = (b_sqg, a_sqh) } in
> CPR2.c_go2 sat_sqo l_sqj;
> GHC.Types.True -> CPR2.c_go2 wild2_sqk l_sqj
> }
> }
> }
> }
> }
> }
> end Rec }
> }}}
> I implemented this but the overhead of reanalysing a function at each
> occurrence site is prohibitive (with my changes paraffins took 2.5s to
> compile instead of 0.3s). It does fix the problem though.
>
> A scheme like in "stricterness more relevant" but with let-binding
> that is polymorphic in the use-site demand might be able to detect the
> extra strictness here. I think Stefan Holdermanns has a paper
> describing such a system. But this is anyway much harder to fix than
> my first infelicity.
New description:
Max writes: I've been trying to speed up the supercompiler, and in the
process
I've noticed some infelicities in the demand analyser that might be
worth looking at. One is #5949. The other is:
== Infelicity 2: a case where demand summarisation hurts us ==
I found practical examples where summarising the demand transfomer of
a function as a single strictness signature hurt us. The examples were
code like:
{{{
h :: (Int, Int) -> Int -> (Int, Int)
h x y = if y > 10
then x
else h (case h x 0 of (y1, y2) -> (y2, y1)) (y + 1)
}}}
If, at the innermost use of h, we re-analyse h in the demand
C(C(U(LL))) rather than just unleashing the vanilla DmdSig computed
from the demand C(C(S)) then we can unbox the pair argument. Currently
GHC just gives h the DmdType SU(L) which doesn't eliminate the
allocation of the pair at all.
This situation occurs in practice with code like:
{{{
c :: M.Map Int Int -> (Int, Int)
c m = M.foldrWithKey (\k v (a, b) -> if k + v > 2 then (a, b) else (b,
a)) (0, 1) m
}}}
The difference from my earlier example d is that I'm using the version
of `foldrWithKey` that doesn't call `seq`. Current GHC generates this
inner loop:
{{{
Rec {
CPR2.c_go2 [Occ=LoopBreaker]
:: (GHC.Types.Int, GHC.Types.Int)
-> Data.Map.Base.Map GHC.Types.Int GHC.Types.Int
-> (GHC.Types.Int, GHC.Types.Int)
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=DmdType U(L*)S,
Unf=OtherCon []]
CPR2.c_go2 =
\ (z_spW :: (GHC.Types.Int, GHC.Types.Int))
(ds_spU :: Data.Map.Base.Map GHC.Types.Int GHC.Types.Int) ->
case ds_spU of _ {
Data.Map.Base.Tip -> z_spW;
Data.Map.Base.Bin rb_sqq kx_sq2 x_sq9 l_sqj r_sq5 ->
case kx_sq2 of _ { GHC.Types.I# x1_sqc ->
case CPR2.c_go2 z_spW r_sq5 of wild2_sqk { (a_sqh, b_sqg) ->
case x_sq9 of _ { GHC.Types.I# y_sqd ->
case GHC.Prim.+# x1_sqc y_sqd of sat_sqp { __DEFAULT ->
case GHC.Prim.># sat_sqp 2 of _ {
GHC.Types.False ->
let {
sat_sqo :: (GHC.Types.Int, GHC.Types.Int)
[LclId]
sat_sqo = (b_sqg, a_sqh) } in
CPR2.c_go2 sat_sqo l_sqj;
GHC.Types.True -> CPR2.c_go2 wild2_sqk l_sqj
}
}
}
}
}
}
end Rec }
}}}
I implemented this but the overhead of reanalysing a function at each
occurrence site is prohibitive (with my changes paraffins took 2.5s to
compile instead of 0.3s). It does fix the problem though.
A scheme like in "stricterness more relevant" but with let-binding
that is polymorphic in the use-site demand might be able to detect the
extra strictness here. I think Stefan Holdermanns has a paper
describing such a system. But this is anyway much harder to fix than
my first infelicity.
--
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/6070#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list