[GHC] #12808: For primitive (Addr#) operations, Loop Invariant Code Flow not lifted outside the loop...
GHC
ghc-devs at haskell.org
Tue Nov 8 20:27:13 UTC 2016
#12808: For primitive (Addr#) operations, Loop Invariant Code Flow not lifted
outside the loop...
-------------------------------------+-------------------------------------
Reporter: GordonBGood | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 8.2.1
Component: Compiler | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by GordonBGood):
Replying to [comment:6 simonpj]:
> I'm failing to see how the code at the top lines up with the Cmm you are
> showing.
@simonpj, the Cmm code shown is for the first of the "cull" case loops in
the Haskell source. In the bottom "optimized" version, the register
initialization has been moved down inside the loop.
> Maybe show STG code too, and say how they match up? If we can do the
> floating in Core, that would be better!
Here is the STG code for the same cull loop (a recursive function), with
the indentation heavily reduced for display purposes here (output of
-ddump-stg from GHC version 8.0.1 on 64-bit Windows, lines 898 through 1132):
{{{
let {
cull_seCS [Occ=LoopBreaker]
:: GHC.Prim.Addr#
-> GHC.Prim.State#
GHC.Prim.RealWorld
-> (# GHC.Prim.Addr#,
GHC.Prim.State#
GHC.Prim.RealWorld #)
[LclId,
Arity=2,
Str=DmdType <S,U><S,U>,
Unf=OtherCon []] =
sat-only \r srt:SRT:[] [c#_seCT
sp#_seCU]
case
ltAddr# [c#_seCT
lmt1#_seCQ]
of
_ [Occ=Dead]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
0#
sp#_seCU]
of
_ [Occ=Dead]
{ (#,#) ipv8_seCX [Occ=Once]
ipv9_seCY [Occ=Once] ->
case
or# [ipv9_seCY
1##]
of
sat_seCZ
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
0#
sat_seCZ
ipv8_seCX]
of
sp1#_seD0 [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r1#_seCD
sp1#_seD0]
of
_ [Occ=Dead]
{ (#,#) ipv10_seD2 [Occ=Once]
ipv11_seD3 [Occ=Once] ->
case
or# [ipv11_seD3
2##]
of
sat_seD4
{ __DEFAULT ->
case
writeWord8OffAddr#
[c#_seCT
r1#_seCD
sat_seD4
ipv10_seD2]
of
sp3#_seD5 [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr#
[c#_seCT
r2#_seCF
sp3#_seD5]
of
_ [Occ=Dead]
{ (#,#) ipv12_seD7
[Occ=Once]
ipv13_seD8
[Occ=Once] ->
case
or#
[ipv13_seD8
4##]
of
sat_seD9
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r2#_seCF
sat_seD9
ipv12_seD7]
of
sp5#_seDa
[OS=OneShot]
{
__DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r3#_seCH
sp5#_seDa]
of
_
[Occ=Dead]
{
(#,#) ipv14_seDc [Occ=Once]
ipv15_seDd [Occ=Once] ->
case
or# [ipv15_seDd
8##]
of
sat_seDe
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r3#_seCH
sat_seDe
ipv14_seDc]
of
sp7#_seDf [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r4#_seCJ
sp7#_seDf]
of
_ [Occ=Dead]
{ (#,#) ipv16_seDh [Occ=Once]
ipv17_seDi [Occ=Once] ->
case
or# [ipv17_seDi
16##]
of
sat_seDj
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r4#_seCJ
sat_seDj
ipv16_seDh]
of
sp9#_seDk [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r5#_seCL
sp9#_seDk]
of
_ [Occ=Dead]
{ (#,#) ipv18_seDm [Occ=Once]
ipv19_seDn [Occ=Once] ->
case
or# [ipv19_seDn
32##]
of
sat_seDo
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r5#_seCL
sat_seDo
ipv18_seDm]
of
sp11#_seDp [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r6#_seCN
sp11#_seDp]
of
_ [Occ=Dead]
{ (#,#) ipv20_seDr [Occ=Once]
ipv21_seDs [Occ=Once] ->
case
or# [ipv21_seDs
64##]
of
sat_seDt
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r6#_seCN
sat_seDt
ipv20_seDr]
of
sp13#_seDu [OS=OneShot]
{ __DEFAULT ->
case
readWord8OffAddr# [c#_seCT
r7#_seCP
sp13#_seDu]
of
_ [Occ=Dead]
{ (#,#) ipv22_seDw [Occ=Once]
ipv23_seDx [Occ=Once] ->
case
or# [ipv23_seDx
128##]
of
sat_seDy
{ __DEFAULT ->
case
writeWord8OffAddr# [c#_seCT
r7#_seCP
sat_seDy
ipv22_seDw]
of
sp15#_seDz [OS=OneShot]
{ __DEFAULT ->
case
plusAddr# [c#_seCT
p#_seBQ]
of
sat_seDA
{ __DEFAULT ->
cull_seCS
sat_seDA
sp15#_seDz;
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
};
0# ->
(#,#) [c#_seCT
sp#_seCU];
};
} in
}}}
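For readers matching the dump back to source, here is a minimal hand-written sketch of the kind of loop this STG corresponds to. It is a reconstruction from the dump above, not the original code from the ticket: the names cullBytes and orByteAt#, the way the seven offsets are derived from a single stride r, and the choice of limit are all assumptions made only so the example is self-contained. The CPP branch is there because readWord8OffAddr# and writeWord8OffAddr# use Word8# instead of Word# from GHC 9.2 onwards.

```haskell
{-# LANGUAGE CPP, MagicHash, UnboxedTuples #-}
module Main (main) where

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Marshal.Utils (fillBytes)
import GHC.Exts
import GHC.IO (IO (..))

-- One read / or# / write triple from the STG: OR mask m into the byte at a + off.
-- (Hypothetical helper; the dump has these steps inlined.)
orByteAt# :: Addr# -> Int# -> Word# -> State# RealWorld -> State# RealWorld
orByteAt# a off m s0 =
  case readWord8OffAddr# a off s0 of
    (# s1, v #) ->
#if __GLASGOW_HASKELL__ >= 902
      writeWord8OffAddr# a off (wordToWord8# (word8ToWord# v `or#` m)) s1
#else
      writeWord8OffAddr# a off (v `or#` m) s1
#endif

-- Hypothetical driver: buffer, its length in bytes, loop step p, base offset r.
cullBytes :: Ptr Word8 -> Int -> Int -> Int -> IO ()
cullBytes (Ptr base#) (I# len#) (I# p#) (I# r#) = IO (\s# ->
  let -- Loop-invariant values, bound once outside the loop. These play the
      -- role of r1#..r7#, lmt1# and p# in the dump; deriving them as
      -- multiples of r# is an assumption of this sketch.
      r1# = r#
      r2# = r# *# 2#
      r3# = r# *# 3#
      r4# = r# *# 4#
      r5# = r# *# 5#
      r6# = r# *# 6#
      r7# = r# *# 7#
      -- Stop early enough that c# + r7# stays inside the buffer.
      lmt# = plusAddr# base# (len# -# r7#)
  in
  let -- The recursive loop: only c# and the state token vary per iteration.
      cull c# s0# =
        case c# `ltAddr#` lmt# of
          0# -> (# c#, s0# #)
          _  ->
            let s1# = orByteAt# c# 0#  1##   s0#
                s2# = orByteAt# c# r1# 2##   s1#
                s3# = orByteAt# c# r2# 4##   s2#
                s4# = orByteAt# c# r3# 8##   s3#
                s5# = orByteAt# c# r4# 16##  s4#
                s6# = orByteAt# c# r5# 32##  s5#
                s7# = orByteAt# c# r6# 64##  s6#
                s8# = orByteAt# c# r7# 128## s7#
            in cull (plusAddr# c# p#) s8#
  in case cull base# s# of
       (# _, sEnd# #) -> (# sEnd#, () #))

main :: IO ()
main = allocaBytes 64 $ \buf -> do
  fillBytes buf 0 64
  cullBytes buf 64 8 1
  bytes <- peekArray 64 buf
  -- Each group of 8 bytes should have received one distinct bit per byte.
  print (bytes == concat (replicate 8 [1, 2, 4, 8, 16, 32, 64, 128]))
```

With a step of 8 and a base offset of 1, the loop runs 8 times over the 64-byte buffer, so the check in main is expected to print True. Compiling something like this with -O2 -ddump-stg should give a dump of the same general shape as the one above, with the offsets and limit appearing as free variables of the loop rather than as values computed inside it.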
You can see that the STG code just reflects the original Haskell source
code: the faulty register initialization has not yet been moved down
inside the loop(s), so the problem does not originate here. It is
introduced in the Cmm optimization pass, and therefore it also affects the
NCG (although of course the NCG has other problems as well).
The easiest fix might be to turn on the appropriate LLVM loop invariant
code flow optimizations (if they would work), accepting that the fix would
then apply only to the LLVM backend.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12808#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler