[GHC] #8905: Function arguments are always spilled/reloaded if scrutinee is already in WHNF

GHC ghc-devs at haskell.org
Sun Mar 16 20:07:25 UTC 2014


#8905: Function arguments are always spilled/reloaded if scrutinee is already in
WHNF
--------------------------------------------+------------------------------
        Reporter:  tibbe                    |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:
       Component:  Compiler                 |          Version:  7.9
      Resolution:                           |         Keywords:
Operating System:  Unknown/Multiple         |     Architecture:
 Type of failure:  Runtime performance bug  |  Unknown/Multiple
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
                                            |  Related Tickets:
--------------------------------------------+------------------------------

Comment (by tibbe):

 To provide some motivation, I've included the full Cmm of the '''common'''
 path below. Every line that starts with `>` is what I would consider
 unnecessary spills/loads for this path:

 {{{
   c2wQ:  // stack check
       if ((Sp + -72) < SpLim) goto c2wR; else goto c2wS;
   c2wS:  // stack check success
 >     I64[Sp - 40] = PicBaseReg + block_c2my_info;  // return addr for eval
       R1 = R6;  // t
 >     I64[Sp - 32] = R2;  // spill: h
 >     I64[Sp - 24] = R3;  // spill: k
 >     P64[Sp - 16] = R4;  // spill: x
 >     I64[Sp - 8] = R5;  // spill: s
       Sp = Sp - 40;
       if (R1 & 7 != 0) goto c2my; else goto c2mz;  // eval check of t
   c2my:  // eval check succeeded
 >     _s2b1::I64 = I64[Sp + 8];  // reload: h
 >     _s2b2::I64 = I64[Sp + 16];  // reload: k
 >     _s2b3::P64 = P64[Sp + 24];  // reload: x
 >     _s2b4::I64 = I64[Sp + 32];  // reload: s
       switch [0 .. 4] (R1 & 7 - 1) {
           case 0 : goto c2wK;
           case 1 : goto c2wL;
           case 2 : goto c2wM;
           case 3 : goto c2wN;
           case 4 : goto c2wO;
       }
   c2wN:  // Full
       _s2cx::P64 = P64[R1 + 4];  // ary
       _s2cy::I64 = (_s2b1::I64 >> _s2b4::I64) & 15;  // i
       _s2cC::P64 = P64[(_s2cx::P64 + 24) + (_s2cy::I64 << 3)];  // st
       I64[Sp] = PicBaseReg + block_c2nJ_info;  // return addr
       R6 = _s2cC::P64;  // arg: st
       R5 = _s2b4::I64 + 4;  // arg: s + bitsPerSubkey
       R4 = _s2b3::P64;  // arg: x
       R3 = _s2b2::I64;  // arg: k
       R2 = _s2b1::I64;  // arg: h
       P64[Sp + 8] = _s2cC::P64;  // spill: st
       I64[Sp + 16] = _s2cy::I64;  // spill: i
       P64[Sp + 24] = _s2cx::P64;  // spill: ary
       P64[Sp + 32] = R1;  // spill: t (only used in uncommon branch)
       call $wpoly_go_info(R6,
                           R5,
                           R4,
                           R3,
                           R2) returns to c2nJ, args: 8, res: 8, upd: 8;
   c2nJ:
 >     _s2b6::P64 = P64[Sp + 32];  // reload: t (only used in uncommon branch)
       _s2cx::P64 = P64[Sp + 24];  // reload: ary
       _s2cy::I64 = I64[Sp + 16];  // reload: i
       _s2cE::P64 = R1;
       _s2cF::I64 = R1 == P64[Sp + 8];
       if (_s2cF::I64 != 1) goto c2nR; else goto c2AE;
   c2nR:  // heap check
       Hp = Hp + 176;
       if (Hp > I64[BaseReg + 856]) goto c2AB; else goto c2AA;
   c2AA:  // heap check success
       I64[Hp - 168] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
       I64[Hp - 160] = 16;
       I64[Hp - 152] = 17;
       _c2nT::I64 = Hp - 168;
       call MO_Memcpy(_c2nT::I64 + 24, _s2cx::P64 + 24, 128, 8);
       P64[(_c2nT::I64 + 24) + (_s2cy::I64 << 3)] = _s2cE::P64;
       I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
       I8[(_c2nT::I64 + 24) + ((I64[_c2nT::I64 + 8] << 3) + (_s2cy::I64 >> 7))] = 1 :: W8;
       I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_FROZEN0_info@GOTPCREL];
       I64[Hp - 8] = PicBaseReg + Full_con_info;
       I64[Hp] = _c2nT::I64;
       R1 = Hp - 4;
       Sp = Sp + 40;
       call (P64[Sp])(R1) args: 8, res: 0, upd: 8;  // return
 }}}

 The main body of this function, except for the recursive call, is in the
 last block, `c2AA`. Quite a bit of time is spent on "administrative"
 things. Also note that there are a bunch of static arguments passed around
 (`x`, `k`, and `h`). I will try to see what the Cmm looks like if I
 manually perform a static argument transform on this code.
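
 For readers unfamiliar with the static argument transformation, here is a
 minimal sketch of what it looks like at the source level. It is not the code
 from this ticket: `Trie`, `insertPlain`, `insertSAT`, `update` and `lookupT`
 are hypothetical names on a simplified trie that uses a list of 16 children
 and assumes distinct keys never share a hash. In `insertPlain`, `h`, `k` and
 `x` are passed unchanged on every recursive call; in `insertSAT` they are
 bound once and the local `go` only takes the arguments that vary (`s` and
 `t`). Whether this removes any of the spills/reloads marked above is exactly
 what remains to be checked in the generated Cmm.

```haskell
import Data.Bits (shiftR, (.&.))

-- Hypothetical, simplified hash trie (assumption: no hash collisions).
data Trie v
  = Empty
  | Leaf !Int !Int v   -- hash, key, value
  | Full [Trie v]      -- always exactly 16 children

bitsPerSubkey :: Int
bitsPerSubkey = 4

-- Same computation as `(h >> s) & 15` in the Cmm above.
index :: Int -> Int -> Int
index h s = (h `shiftR` s) .&. 15

-- Replace the i-th element of a list (stand-in for copying the array).
update :: Int -> a -> [a] -> [a]
update i y xs = take i xs ++ y : drop (i + 1) xs

-- Before: h, k and x are static but still passed on every recursive call.
insertPlain :: Int -> Int -> v -> Int -> Trie v -> Trie v
insertPlain h k x s t = case t of
  Empty -> Leaf h k x
  Leaf h' k' x'
    | h' == h   -> Leaf h k x
    | otherwise ->
        -- Push the old leaf one level down, then retry at the same level.
        insertPlain h k x s
          (insertPlain h' k' x' s (Full (replicate 16 Empty)))
  Full ary ->
    let i  = index h s
        st = insertPlain h k x (s + bitsPerSubkey) (ary !! i)
    in Full (update i st ary)

-- After: h, k and x are captured by the closure; go only takes s and t.
insertSAT :: Int -> Int -> v -> Int -> Trie v -> Trie v
insertSAT h k x = go
  where
    go s t = case t of
      Empty -> Leaf h k x
      Leaf h' k' x'
        | h' == h   -> Leaf h k x
        | otherwise ->
            go s (insertSAT h' k' x' s (Full (replicate 16 Empty)))
      Full ary ->
        let i  = index h s
            st = go (s + bitsPerSubkey) (ary !! i)
        in Full (update i st ary)

-- Lookup by hash, used only to compare the two versions.
lookupT :: Int -> Int -> Trie v -> Maybe v
lookupT h s t = case t of
  Empty -> Nothing
  Leaf h' _ x
    | h' == h   -> Just x
    | otherwise -> Nothing
  Full ary -> lookupT h (s + bitsPerSubkey) (ary !! index h s)
```

 Both versions build the same trie; the only difference is how many arguments
 the recursive call has to carry, which is the part that interacts with the
 register/stack traffic discussed in this comment.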

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8905#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler