[GHC] #9188: quot with a power of two is not optimized to a shift

Mon Jun 9 17:41:18 UTC 2014

#9188: quot with a power of two is not optimized to a shift
------------------------------+--------------------------------------------
       Reporter:  tibbe       |             Owner:
           Type:  bug         |            Status:  new
       Priority:  normal      |         Milestone:
      Component:  Compiler    |           Version:  7.8.2
       Keywords:              |  Operating System:  Unknown/Multiple
   Architecture:              |   Type of failure:  Runtime performance bug
  Unknown/Multiple            |         Test Case:
     Difficulty:  Unknown     |          Blocking:
     Blocked By:              |
Related Tickets:              |
------------------------------+--------------------------------------------
 The follow code:

 {{{
 module Test where

 f :: Int -> Int
 f n = n `quot` 2
 }}}

 results in the following core:

 {{{
 Test.f :: GHC.Types.Int -> GHC.Types.Int
 Test.f =
   \ (n_aeH :: GHC.Types.Int) ->
     case n_aeH of _ { GHC.Types.I# ww_aiQ ->
     GHC.Types.I# (GHC.Prim.quotInt# ww_aiQ 2)
     }
 }}}

 which in turn generates this Cmm

 {{{
 sjI_ret()
         { Just sjI_info:
                    const 0;
                    const 32;
         }
     cjX:
         Hp = Hp + 16;
         if (Hp > I64[BaseReg + 144]) goto ck3;
         _sjH::I64 = %MO_S_Quot_W64(I64[R1 + 7], 2);
         I64[Hp - 8] = PicBaseReg + GHC.Types.I#_con_info;
         I64[Hp + 0] = _sjH::I64;
         R1 = Hp - 7;
         Sp = Sp + 8;
         jump (I64[Sp + 0]); // [R1]
     ck1: jump (I64[BaseReg - 16]); // [R1]
     ck3:
         I64[BaseReg + 192] = 16;
         goto ck1;
 }
 }}}

 which finally ends up as this assembly:

 {{{
 sjI_info:
 _cjX:
         addq $16,%r12
         cmpq 144(%r13),%r12
         ja _ck3
         movl $2,%ecx
         movq 7(%rbx),%rax
         cqto
         idivq %rcx
         movq %rax,%rbx
         leaq GHC.Types.I#_con_info(%rip),%rax
         movq %rax,-8(%r12)
         movq %rbx,0(%r12)
         leaq -7(%r12),%rbx
         addq $8,%rbp
         jmp *0(%rbp)
 }}}

 Ideally this should have turned into a shift, not a division.
 `compiler/nativeGen/X86/CodeGen.hs` lacks any peephole optimizations for
 division.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9188>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler