[GHC] #8275: Loopification breaks profiling
GHC
ghc-devs at haskell.org
Mon Sep 16 14:58:10 CEST 2013
#8275: Loopification breaks profiling
----------------------------------------+----------------------------------
        Reporter:  jstolarek            |            Owner:  jstolarek
            Type:  bug                  |           Status:  new
        Priority:  highest              |        Milestone:
       Component:  Profiling            |          Version:  7.7
      Resolution:                       |         Keywords:
Operating System:  Unknown/Multiple     |     Architecture:  Unknown/Multiple
 Type of failure:  Building GHC failed  |       Difficulty:  Unknown
       Test Case:                       |       Blocked By:
        Blocking:  8298                 |  Related Tickets:
----------------------------------------+----------------------------------
Comment (by jstolarek):
Loopification triggers for fully-saturated (but not over-saturated!) tail
calls. So in this code:
{{{
f 0 = 4
f 1 = 5
f n = case f (n - 2) of
        4 -> 4
        5 -> f (n - 1)
}}}
It will trigger for the call to `f (n - 1)` in the second branch of the
case, but not for `f (n - 2)` in the case scrutinee. I checked, and this
code actually segfaults when compiled with `-prof -fprof-auto -rtsopts`
(assuming you add `main = print (f 5)` to the file), but only when
`f :: Integer -> Integer` and not when `f :: Int -> Int`. So I suspect the
bug might actually be hidden somewhere in the libraries, and I might be
looking at the wrong code.
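For convenience, the pieces mentioned above can be assembled into one file. This is only a sketch of the reproduction as described: the file name `Repro.hs` is hypothetical, and the comments marking which call is a loopification candidate restate the claim above rather than anything checked against the compiler.

```haskell
-- Repro.hs (hypothetical file name)
-- Build as described above: ghc -prof -fprof-auto -rtsopts Repro.hs

-- With this signature the profiled build is reported to segfault;
-- with f :: Int -> Int it is reported not to.
f :: Integer -> Integer
f 0 = 4
f 1 = 5
f n = case f (n - 2) of   -- scrutinee: not a tail call, no loopification
        4 -> 4
        5 -> f (n - 1)    -- saturated self tail call: loopification candidate

main :: IO ()
main = print (f 5)
```

In a build where nothing goes wrong, evaluating `f 5` by hand gives `4`, so that is the output to expect from a healthy binary.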
The idea behind loopification is to put parameters in local variables
(instead of global registers) and make a jump (instead of a call). The `f`
function begins like this in Cmm:
{{{
 cYp:
     _sUP::P64 = R2;
     _sUO::P64 = R1;
     if (%MO_UU_Conv_W32_W64(I32[era]) <= 0) goto cW1; else goto cVZ;
 cVZ:
     I64[R1 + 15] = I64[R1 + 15] & 1152921503533105152 |
                    %MO_UU_Conv_W32_W64(I32[era]) | 1152921504606846976;
     goto cW1;
 cW1:
     if (Sp - 104 < SpLim) goto cYq; else goto cYr;
 cYr:
     Hp = Hp + 40;
     if (Hp > HpLim) goto cYt; else goto cYs;
}}}
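As a source-level analogy only (loopification itself happens in the Cmm pipeline, not in Haskell source), the difference between "pass arguments and call" and "assign locals and jump" can be sketched as below. The function names `sumToCall` and `sumToLoop` are made up for illustration, and the `STRef`s merely stand in for Cmm local variables.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef, modifySTRef')

-- "Call" style: every iteration passes fresh arguments to the function,
-- much like loading R1/R2 and calling the entry code again.
-- Assumes n >= 0.
sumToCall :: Int -> Int -> Int
sumToCall acc 0 = acc
sumToCall acc n = sumToCall (acc + n) (n - 1)

-- "Loop" style: the parameters live in local mutable variables that are
-- overwritten in place, and control goes back to the top of the body.
-- Assumes n0 >= 0.
sumToLoop :: Int -> Int -> Int
sumToLoop acc0 n0 = runST $ do
  accV <- newSTRef acc0          -- stands in for a Cmm local
  nV   <- newSTRef n0            -- stands in for a Cmm local
  let loop = do
        n <- readSTRef nV
        if n == 0
          then readSTRef accV
          else do
            modifySTRef' accV (+ n)  -- overwrite the local, no argument passing
            writeSTRef nV (n - 1)
            loop                     -- plays the role of "goto"
  loop
```

Both versions compute the same result; the point is only where the arguments live between iterations.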
Without loopification, the tail call is a normal call:
{{{
 cYM:
     I64[CCCS + 72] = I64[CCCS + 72] + %MO_UU_Conv_W64_W64(6 - 2);
     I64[Hp - 40] = sat_sVa_info;
     I64[Hp - 32] = CCCS;
     I64[Hp - 24] = (%MO_UU_Conv_W32_W64(I32[era]) << 30) | 0;
     P64[Hp - 8] = _sUN::P64;
     P64[Hp] = _sUP::P64;
     _cXT::P64 = Hp - 40;
     R2 = _cXT::P64;
     R1 = _sUO::P64;
     Sp = Sp + 40;
     call f1_sUQ_info(R2, R1) args: 8, res: 0, upd: 8;
}}}
With loopification we get:
{{{
 cYM:
     I64[CCCS + 72] = I64[CCCS + 72] + %MO_UU_Conv_W64_W64(6 - 2);
     I64[Hp - 40] = sat_sV8_info;
     I64[Hp - 32] = CCCS;
     I64[Hp - 24] = (%MO_UU_Conv_W32_W64(I32[era]) << 30) | 0;
     P64[Hp - 8] = _sUL::P64;
     P64[Hp] = _sUP::P64;
     _cXT::P64 = Hp - 40;
     _sUP::P64 = _cXT::P64;
     goto cW2;
 cW2:
     if (Sp - 104 < SpLim) goto uZq; else goto uZp;
 uZq:
     Sp = Sp + 40;
     goto cYq;
 cYq:
     R2 = _sUP::P64;
     R1 = _sUO::P64;
     call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
 uZp:
     Sp = Sp + 40;
     goto cYr;
}}}
What might be surprising is that the value of `_sUO` is not set before
making the tail call, but that *seems* to be OK: it is only shuffled
between the stack and a local variable. Note also that the loopified call
doesn't jump directly to the second label, `cVZ`, but instead jumps to
`cYr`. In principle this is OK (we want to skip the stack check but not
the heap check), but TBH I can't tell whether that is correct in this
case, since I don't know what the magic numbers in `cVZ` do.
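A possible reading of those constants, offered as a sketch rather than a confirmed explanation: arithmetically, `1152921503533105152` is `2^60 - 2^30` (bits 30 to 59 set) and `1152921504606846976` is `2^60` (bit 60 set). That shape resembles the LDV (lag/drag/void) profiling word in the RTS, where the names would be `LDV_CREATE_MASK` and `LDV_STATE_USE`, but that identification is an assumption and is not stated anywhere in this ticket. The Haskell names below are made up for illustration.

```haskell
import Data.Bits ((.&.), (.|.), shiftL)
import Data.Word (Word64)

-- First constant in cVZ: bits 30..59 set.
-- (Assumed to correspond to the RTS's LDV_CREATE_MASK.)
createMask :: Word64
createMask = 1152921503533105152

-- Second constant in cVZ: bit 60 set.
-- (Assumed to correspond to the RTS's LDV_STATE_USE.)
stateUse :: Word64
stateUse = 1152921504606846976

-- Pure arithmetic check of the bit patterns; no RTS knowledge involved.
constantsDecoded :: Bool
constantsDecoded =
     createMask == (1 `shiftL` 60) - (1 `shiftL` 30)
  && stateUse   == 1 `shiftL` 60

-- The update cVZ performs on the word at R1 + 15, written out:
-- keep bits 30..59 of the old word, put the current era in the low
-- bits, and set bit 60.
recordUse :: Word64 -> Word64 -> Word64
recordUse era w = (w .&. createMask) .|. era .|. stateUse
```

Under that reading, `cVZ` only rewrites a bookkeeping word in the closure header and touches neither `_sUP` nor `_sUO`. The `<< 30` in the allocation code of `cYM` would then be the matching "created in this era" field.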
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8275#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler