[GHC] #8032: Worker-wrapper transform and NOINLINE trigger bad reboxing behavior
GHC
ghc-devs at haskell.org
Tue Jul 9 00:23:02 CEST 2013
#8032: Worker-wrapper transform and NOINLINE trigger bad reboxing behavior
--------------------------------------------+------------------------------
Reporter: ezyang | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.7
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime performance bug | Unknown/Multiple
Test Case: | Difficulty: Unknown
Blocking: | Blocked By:
| Related Tickets:
--------------------------------------------+------------------------------
Description changed by ezyang:
Old description:
> (Note: I've updated the ticket with a simpler test-case).
>
> NOINLINE and the worker-wrapper transform sometimes interact poorly to
> cause unnecessary extra reboxing.
>
> {{{
> module Gnam where
>
> data D = D Int
>
> foo k d@(D e) =
> let loop i
> | i < e = loop (i + 1)
> | otherwise = baz k d i
> in loop 0
> where {-# NOINLINE baz #-}
> baz k d i = k (d, i)
> }}}
>
> This results in the following STG:
>
> {{{
> Gnam.$wfoo
> :: forall t_alo.
> ((Gnam.D, GHC.Types.Int) -> t_alo) -> GHC.Prim.Int# -> t_alo
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType C(S)L,
> Unf=OtherCon []] =
> \r [w_sob ww_so3]
> let {
> e_so4 :: GHC.Types.Int
> [LclId, Unf=OtherCon []] =
> NO_CCS GHC.Types.I#! [ww_so3]; } in
> let {
> wild_so6 :: Gnam.D
> [LclId, Unf=OtherCon []] =
> NO_CCS Gnam.D! [e_so4];
> } in
> }}}
>
> This worker function needs to box its arguments so that they can be
> passed to baz. However, the only invocation of wfoo already had these
> arguments available:
>
> {{{
> Gnam.foo [InlPrag=INLINE[0]]
> :: forall t_alo.
> ((Gnam.D, GHC.Types.Int) -> t_alo) -> Gnam.D -> t_alo
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType C(S)U(U(L)),
> Unf=OtherCon []] =
> \r [w_son w1_soh]
> case w1_soh of _ {
> Gnam.D ww_sok ->
> case ww_sok of _ {
> GHC.Types.I# ww2_soo -> Gnam.$wfoo w_son ww2_soo;
> };
> };
> }}}
>
> The problem seems to lie in how the worker wrapper transformation
> operates. Before, the STG is:
>
> {{{
> Gnam.foo =
> \ (@ t_alr)
> (k_aeM [Dmd=Just C(S)] :: (Gnam.D, GHC.Types.Int) -> t_alr)
> (d_aeN [Dmd=Just U(U(L))] :: Gnam.D) ->
> case d_aeN of wild_X5 { Gnam.D e_aeO [Dmd=Just U(L)] ->
> letrec {
> loop_smj [Occ=LoopBreaker] :: GHC.Types.Int -> t_alr
> [LclId,
> Arity=1,
> Str=DmdType U(L) {aeM->C(S) aeO->U(L)},
> Unf=Unf{Src=<vanilla>, TopLvl=False, Arity=1, Value=True,
> ConLike=True, WorkFree=True, Expandable=True,
> Guidance=IF_ARGS [20] 112 0}]
> loop_smj =
> \ (i_aeU [Dmd=Just U(L)] :: GHC.Types.Int) ->
> case i_aeU of wild_alU { GHC.Types.I# x_alW [Dmd=Just L] ->
> case e_aeO of _ { GHC.Types.I# y_am0 [Dmd=Just L] ->
> case GHC.Prim.<# x_alW y_am0 of _ {
> GHC.Types.False ->
> baz_smn @ t_alr @ Gnam.D @ GHC.Types.Int k_aeM wild_X5
> wild_alU;
> GHC.Types.True -> loop_smj (GHC.Types.I# (GHC.Prim.+# x_alW
> 1))
> }
> }
> }; } in
> loop_smj lvl_smr
> }
> }}}
>
> Notice that wild_alU is being properly used in the result. After the
> worker wrapper transformation, foo is now:
>
> {{{
> Gnam.foo =
> \ (@ t_alp)
> (w_sn7 [Dmd=Just C(S)] :: (Gnam.D, GHC.Types.Int) -> t_alp)
> (w_sn8 [Dmd=Just U(U(L))] :: Gnam.D) ->
> case w_sn8 of w_sn8 { Gnam.D ww_sna ->
> case ww_sna of ww_sna { GHC.Types.I# ww_snc ->
> $wfoo_sng @ t_alp w_sn7 ww_snc
> }
> }
> }}}
>
> So it seems that we should also pass along the evaluated variables, in
> case they are used. There is a tradeoff here, in that we will require
> more arguments to the function than if we just reconstructed it. However,
> if we smarten up worker-wrapper so that it drops unused arguments, this
> could be a win when not all of the fields are used, e.g. if we add
> another field to D:
>
> {{{
> Gnam.foo [InlPrag=INLINE[0]]
> :: forall t_alp.
> ((Gnam.D, GHC.Types.Int) -> t_alp) -> Gnam.D -> t_alp
> [GblId,
> Arity=2,
> Caf=NoCafRefs,
> Str=DmdType C(S)U(U(L)L),
> Unf=OtherCon []] =
> \r [w_som w1_sof]
> case w1_sof of _ {
> Gnam.D ww_soj ww1_soo ->
> case ww_soj of _ {
> GHC.Types.I# ww3_son -> Gnam.$wfoo w_som ww3_son ww1_soo;
> };
> };
> }}}
>
> Now ww1_soo is passed, even though it is dead. I think there is a comment
> to this effect in the simplifier already.
New description:
(Note: I've updated the ticket with a simpler test-case).
NOINLINE and the worker-wrapper transform sometimes interact poorly to
cause unnecessary extra reboxing.
{{{
module Gnam where
data D = D Int
foo k d@(D e) =
let loop i
| i < e = loop (i + 1)
| otherwise = baz k d i
in loop 0
where {-# NOINLINE baz #-}
baz k d i = k (d, i)
}}}
This results in the following STG:
{{{
Gnam.$wfoo
:: forall t_alo.
((Gnam.D, GHC.Types.Int) -> t_alo) -> GHC.Prim.Int# -> t_alo
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=DmdType C(S)L,
Unf=OtherCon []] =
\r [w_sob ww_so3]
let {
e_so4 :: GHC.Types.Int
[LclId, Unf=OtherCon []] =
NO_CCS GHC.Types.I#! [ww_so3]; } in
let {
wild_so6 :: Gnam.D
[LclId, Unf=OtherCon []] =
NO_CCS Gnam.D! [e_so4];
} in
}}}
This worker function needs to box its arguments so that they can be passed
to baz. However, the only invocation of wfoo already had these arguments
available:
{{{
Gnam.foo [InlPrag=INLINE[0]]
:: forall t_alo.
((Gnam.D, GHC.Types.Int) -> t_alo) -> Gnam.D -> t_alo
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=DmdType C(S)U(U(L)),
Unf=OtherCon []] =
\r [w_son w1_soh]
case w1_soh of _ {
Gnam.D ww_sok ->
case ww_sok of _ {
GHC.Types.I# ww2_soo -> Gnam.$wfoo w_son ww2_soo;
};
};
}}}
The problem seems to lie in how the worker wrapper transformation
operates. Before, the STG is:
{{{
Gnam.foo =
\ (@ t_alr)
(k_aeM [Dmd=Just C(S)] :: (Gnam.D, GHC.Types.Int) -> t_alr)
(d_aeN [Dmd=Just U(U(L))] :: Gnam.D) ->
case d_aeN of wild_X5 { Gnam.D e_aeO [Dmd=Just U(L)] ->
letrec {
loop_smj [Occ=LoopBreaker] :: GHC.Types.Int -> t_alr
[LclId,
Arity=1,
Str=DmdType U(L) {aeM->C(S) aeO->U(L)},
Unf=Unf{Src=<vanilla>, TopLvl=False, Arity=1, Value=True,
ConLike=True, WorkFree=True, Expandable=True,
Guidance=IF_ARGS [20] 112 0}]
loop_smj =
\ (i_aeU [Dmd=Just U(L)] :: GHC.Types.Int) ->
case i_aeU of wild_alU { GHC.Types.I# x_alW [Dmd=Just L] ->
case e_aeO of _ { GHC.Types.I# y_am0 [Dmd=Just L] ->
case GHC.Prim.<# x_alW y_am0 of _ {
GHC.Types.False ->
baz_smn @ t_alr @ Gnam.D @ GHC.Types.Int k_aeM wild_X5
wild_alU;
GHC.Types.True -> loop_smj (GHC.Types.I# (GHC.Prim.+# x_alW
1))
}
}
}; } in
loop_smj lvl_smr
}
}}}
Notice that wild_alU is being properly used in the result. After the
worker wrapper transformation, foo is now:
{{{
Gnam.foo =
\ (@ t_alp)
(w_sn7 [Dmd=Just C(S)] :: (Gnam.D, GHC.Types.Int) -> t_alp)
(w_sn8 [Dmd=Just U(U(L))] :: Gnam.D) ->
case w_sn8 of w_sn8 { Gnam.D ww_sna ->
case ww_sna of ww_sna { GHC.Types.I# ww_snc ->
$wfoo_sng @ t_alp w_sn7 ww_snc
}
}
}}}
So it seems that we should also pass along the evaluated variables, in
case they are used. There is a tradeoff here, in that we will require more
arguments to the function than if we just reconstructed it. However, if we
smarten up worker-wrapper so that it drops unused arguments, this could be
a win when not all of the fields are used, e.g. if we add another field to
D:
{{{
Gnam.foo [InlPrag=INLINE[0]]
:: forall t_alp.
((Gnam.D, GHC.Types.Int) -> t_alp) -> Gnam.D -> t_alp
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=DmdType C(S)U(U(L)L),
Unf=OtherCon []] =
\r [w_som w1_sof]
case w1_sof of _ {
Gnam.D ww_soj ww1_soo ->
case ww_soj of _ {
GHC.Types.I# ww3_son -> Gnam.$wfoo w_som ww3_son ww1_soo;
};
};
}}}
Now ww1_soo is passed, even though it is only ever used to rebox the
value. I think there is a comment to this effect in the simplifier
already. Passing boxed value would require exactly the same number of
function arguments, but save on a heap allocation!
--
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8032#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list