[GHC] #13397: Optimise calls to tagToEnum#

GHC ghc-devs at haskell.org
Wed Mar 8 13:15:56 UTC 2017


#13397: Optimise calls to tagToEnum#
-------------------------------------+-------------------------------------
        Reporter:  simonpj           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonpj):

 I have committed these patches to branch `wip/spj-T13397`:
 {{{
 commit 43540c8c6b9e914f302c71213a71ab5c780be2ac
 Author: Simon Peyton Jones <simonpj at microsoft.com>
 Date:   Wed Mar 8 11:05:53 2017 +0000

     Improve code generation for conditionals

     This patch is in preparation for the fix to Trac #13397

     The code generator has a special case for
       case tagToEnum (a>#b) of
         False -> e1
         True  -> e2

     but it was not doing nearly so well on
       case a>#b of
         DEFAULT -> e1
         1#      -> e2

     This patch arranges to behave essentially identically in
     both cases.  In due course we can eliminate the special
     case for tagToEnum#, once we've completed Trac #13397.

     The changes are:

     * Make CmmSink swizzle the order of a conditional where necessary;
       see Note [Improving conditionals] in CmmSink

     * Hack the general case of StgCmmExpr.cgCase so that it uses
       NoGcInAlts for conditionals.  This doesn't seem right, but it's
       the same choice as the tagToEnum version. Without it, code size
       increases a lot (more heap checks).

       There's a loose end here.

     * Add comments in CmmOpt.cmmMachOpFoldM

 commit e49f3154a5ceb1894414f4635579aeb3aa84054f
 Author: Simon Peyton Jones <simonpj at microsoft.com>
 Date:   Wed Mar 8 10:26:47 2017 +0000

     Re-engineer caseRules to add tagToEnum/dataToTag

     See Note [Scrutinee Constant Folding] in SimplUtils

     * Add cases for tagToEnum and dataToTag. This is the main new
       bit.  It allows the simplifier to remove the pervasive uses
       of     case tagToEnum (a > b) of
                 False -> e1
                 True  -> e2
       and replace it by the simpler
              case a > b of
                 DEFAULT -> e1
                 1#      -> e2
       See Note [caseRules for tagToEnum]
       and Note [caseRules for dataToTag] in PrelRules.

     * This required some changes to the API of caseRules, and hence
       to code in SimplUtils.  See Note [Scrutinee Constant Folding]
       in SimplUtils.

     * Avoid duplication of work in the (unusual) case of
          case BIG + 3# of b
            DEFAULT -> e1
            6#      -> e2

       Previously we got
          case BIG of
            DEFAULT -> let b = BIG + 3# in e1
            3#      -> let b = 6#       in e2

       Now we get
          case BIG of b'
            DEFAULT -> let b = b' + 3# in e1
            3#      -> let b = 6#      in e2

     * Avoid duplicated code in caseRules

     A knock-on refactoring:

     * Move Note [Word/Int underflow/overflow] to Literal, as
       documentation to accompany mkMachIntWrap etc; and get
       rid of PrelRules.intResult' in favour of mkMachIntWrap
 }}}
 It's good stuff generally, so I'm quite keen to keep it.  It does indeed
 eliminate the annoying `tagToEnum#` stuff.
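
 The two rewrites described in the second commit message can be written out
 by hand in source Haskell.  The following is only a sketch, assuming
 `MagicHash` and the primops re-exported by `GHC.Exts`; the function names
 (`viaTagToEnum`, `viaRawTag`, `addBefore`, `addAfter`) and the string
 results standing in for `e1`/`e2` are made up for illustration.  It shows
 the shapes the simplifier starts from and arrives at, not the `caseRules`
 implementation itself.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), tagToEnum#, (+#), (>#))

-- Before: the Int# result of (>#) is converted to a Bool with
-- tagToEnum#, and the Bool is then scrutinised.
viaTagToEnum :: Int -> Int -> String
viaTagToEnum (I# a) (I# b) =
  case (tagToEnum# (a ># b) :: Bool) of
    False -> "e1"
    True  -> "e2"

-- After: the raw Int# comparison result is scrutinised directly;
-- 1# plays the role of True and the default alternative of False.
viaRawTag :: Int -> Int -> String
viaRawTag (I# a) (I# b) =
  case a ># b of
    1# -> "e2"
    _  -> "e1"

-- Scrutinee constant folding, before: match on (big +# 3#).
addBefore :: Int -> Int
addBefore (I# big) =
  case big +# 3# of
    6# -> 6
    b  -> I# b

-- After: match on big itself with the literal adjusted (6 - 3 = 3).
-- The default alternative rebuilds the sum from the case binder b',
-- so the scrutinee expression is not evaluated a second time.
addAfter :: Int -> Int
addAfter (I# big) =
  case big of
    3# -> 6
    b' -> I# (b' +# 3#)

main :: IO ()
main = do
  -- Both pairs of functions should agree on every input tried here.
  print (and [viaTagToEnum x y == viaRawTag x y | x <- [-2 .. 2], y <- [-2 .. 2]])
  print (map addBefore [0, 3, 10] == map addAfter [0, 3, 10])
```

 Compiling both versions with `-O -ddump-simpl` is one way to compare the
 Core that GHC produces for each shape.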

 I get the `nofib` results below.  There are some odd things happening,
 which is why I have not committed to HEAD.

 * I did not expect binary sizes to change, but they do wobble around a bit,
 with a net tiny increase.

 * I did not expect allocations to change.  I chased down the change in
 `knights`: it was due to increased closure sizes.  That in turn was due to
 better CSE, which is a good thing (it just makes more variables live).  So I
 think I'm ok with that. Allocations sometimes go down too.  Net zero.

 * There are some troubling increases in execution time. Notably, `n-body`
 really does run slower, repeatably. I think.  I have no idea why.  I think
 the C-- code is the same... but perhaps we are somehow generating worse
 assembly code.


 {{{
         Program           Size    Allocs   Runtime   Elapsed  TotalMem
 --------------------------------------------------------------------------------
            anna          -0.0%     -0.7%      0.16      0.16     +0.0%
            ansi          +0.1%     +0.0%      0.00      0.00     +0.0%
            atom          +0.2%     +0.0%     -2.1%     -2.1%     +0.0%
          awards          +0.2%     +0.0%      0.00      0.00     +0.0%
          banner          -0.0%     +0.0%      0.00      0.00     +0.0%
      bernouilli          +0.4%     -0.0%     -0.7%     -0.9%     +0.0%
    binary-trees          +0.5%     -0.0%     +4.6%     +4.7%     +0.0%
           boyer          +0.0%     +0.0%      0.06      0.06     +0.0%
          boyer2          -0.0%     +0.0%      0.01      0.01     +0.0%
            bspt          +0.0%     +0.0%      0.01      0.01     +0.0%
       cacheprof          +0.0%     -0.0%     +1.6%     +2.3%     -1.8%
        calendar          +0.0%     +0.0%      0.00      0.00     +0.0%
        cichelli          -0.0%     +0.0%      0.12      0.12     +0.0%
         circsim          +0.1%     -0.0%     +0.7%     +0.7%     +0.0%
        clausify          +0.2%     +0.0%      0.05      0.05     +0.0%
   comp_lab_zift          -0.0%     +0.0%     +0.3%     +0.2%     +0.0%
        compress          -0.0%     +0.0%     -0.7%     +0.4%     +0.0%
       compress2          +0.3%     +0.0%     +2.3%     +2.4%     +0.0%
     constraints          +0.0%     +0.0%     +3.2%     +3.2%     +0.0%
    cryptarithm1          -0.0%     +0.0%     -9.0%     -9.0%     +0.0%
    cryptarithm2          -0.0%     +0.0%      0.01      0.01     +0.0%
             cse          -0.0%     +0.0%      0.00      0.00     +0.0%
    digits-of-e1          +0.4%     +0.0%     +2.9%     +2.9%     +0.0%
    digits-of-e2          +0.3%     +0.0%     -2.1%     -2.1%     +0.0%
           eliza          -0.0%     +0.0%      0.00      0.00     +0.0%
           event          +0.0%     +0.0%     +0.3%     +0.3%     +0.0%
          exp3_8          +0.2%     +0.0%     +0.7%     +0.7%     +0.0%
          expert          +0.1%     +0.0%      0.00      0.00     +0.0%
  fannkuch-redux          -0.0%     -0.0%     -1.1%     -1.1%     +0.0%
           fasta          +0.0%     +0.0%     +0.5%     -0.2%     +0.0%
             fem          +0.4%     +0.0%      0.04      0.04     +0.0%
             fft          +0.2%     -0.4%      0.06      0.06     +0.0%
            fft2          +0.2%     -0.1%      0.08      0.08     +0.0%
        fibheaps          +0.0%     +0.0%      0.03      0.03     +0.0%
            fish          -0.0%     +0.0%      0.02      0.02     +0.0%
           fluid          +0.2%     +0.0%      0.01      0.01     +0.0%
          fulsom          +0.1%     +0.0%     +0.1%     +0.0%     +0.0%
          gamteb          +0.2%     +0.0%      0.07      0.07     +0.0%
             gcd          +0.3%     +0.0%      0.09      0.09     +0.0%
     gen_regexps          -0.0%     +0.0%      0.00      0.00     +0.0%
          genfft          -0.1%     -0.2%      0.06      0.06     +0.0%
              gg          +0.1%     +0.0%      0.02      0.02     +0.0%
            grep          -0.1%     +0.0%      0.00      0.00     +0.0%
          hidden          +0.4%     +0.0%     +2.8%     +2.9%     +0.0%
             hpg          +0.1%     -0.0%     -1.9%     -2.1%     +0.0%
             ida          +0.0%     +0.0%      0.10      0.10     +0.0%
           infer          -0.0%     +0.0%      0.10      0.10     +0.0%
         integer          +0.5%     +0.0%     +1.6%     +1.6%     +0.0%
       integrate          +0.2%     +0.0%     +4.8%     +5.0%     +0.0%
    k-nucleotide          +0.1%     -0.1%     -1.5%     -1.6%     +0.0%
           kahan          +0.2%     +0.0%     +1.6%     +1.6%     +0.0%
         knights          +0.2%     +1.3%      0.01      0.01     +0.0%
          lambda          +0.0%     +0.0%     +6.5%     +6.5%     +0.0%
      last-piece          -0.1%     +0.3%     +2.4%     +2.5%     +0.0%
            lcss          +0.0%     +0.0%     +2.7%     +2.7%     +0.0%
            life          +0.1%     +0.0%     +0.8%     +1.0%     +0.0%
            lift          -0.0%     +0.0%      0.00      0.00     +0.0%
       listcompr          -0.0%     +0.0%      0.18      0.18     +0.0%
        listcopy          -0.0%     +0.0%      0.19      0.19     +0.0%
        maillist          -0.0%     -0.0%      0.08      0.09     -5.3%
          mandel          +0.5%     +0.0%      0.13      0.13     +0.0%
         mandel2          -0.0%     +0.0%      0.01      0.01     +0.0%
         minimax          -0.0%     +0.0%      0.01      0.01     +0.0%
         mkhprog          -0.0%     +0.0%      0.00      0.00     +0.0%
      multiplier          +0.0%     +0.0%      0.19      0.19     +0.0%
          n-body          +0.2%     +0.0%    +14.6%    +14.6%     +0.0%
        nucleic2          +0.2%     +0.0%      0.11      0.11     +0.0%
            para          -0.0%     +0.0%     -1.5%     -1.5%     +0.0%
       paraffins          -0.1%     -0.1%      0.19      0.20     +0.0%
          parser          -0.6%     +0.0%      0.04      0.04     +0.0%
         parstof          -0.0%     +0.0%      0.01      0.01     +0.0%
             pic          -0.3%     +1.1%      0.01      0.01     +0.0%
        pidigits          +0.3%     +0.0%     -0.0%     -0.0%     +0.0%
           power          +0.2%     +0.0%     +2.0%     +2.2%     +0.0%
          pretty          +0.3%     +0.0%      0.00      0.00     +0.0%
          primes          +0.0%     +0.0%      0.11      0.11     +0.0%
       primetest          +0.4%     +0.0%      0.13      0.13     +0.0%
          prolog          +0.2%     +0.0%      0.00      0.00     +0.0%
          puzzle          -0.0%     +0.0%      0.20      0.20     +0.0%
          queens          +0.0%     +0.0%      0.02      0.02     +0.0%
         reptile          -0.1%     +0.0%      0.02      0.02     +0.0%
 reverse-complem          -0.0%     +0.0%     +2.4%     +2.4%     +0.0%
         rewrite          +0.0%     +0.0%      0.03      0.03     +0.0%
            rfib          +0.5%     +0.0%      0.03      0.03     +0.0%
             rsa          +0.4%     +0.0%      0.03      0.03     +0.0%
             scc          -0.0%     +0.0%      0.00      0.00     +0.0%
           sched          +0.0%     +0.0%      0.03      0.03     +0.0%
             scs          +0.0%     +0.8%     +7.6%     +7.5%     +0.0%
          simple          +0.1%     +0.0%     +4.9%     +5.0%     +0.0%
           solid          +0.2%     +0.0%      0.19      0.19     +0.0%
         sorting          -0.0%     +0.0%      0.00      0.00     +0.0%
   spectral-norm          +0.2%     +0.0%     +1.5%     +1.5%     +0.0%
          sphere          +0.0%     +0.0%      0.08      0.08     +0.0%
          symalg          +0.4%     +0.0%      0.01      0.01     +0.0%
             tak          +0.0%     +0.0%      0.02      0.02     +0.0%
       transform          -0.0%     +0.0%     -4.5%     -4.5%     +0.0%
        treejoin          -0.0%     +0.0%    +16.7%    +16.6%     +0.0%
       typecheck          -0.0%     +0.0%     -1.4%     -1.3%     +0.0%
         veritas          -0.1%     +0.0%      0.00      0.00     +0.0%
            wang          +0.2%     +0.0%      0.17      0.17     +0.0%
       wave4main          +0.0%     +0.0%     +1.9%     +1.7%     +0.0%
    wheel-sieve1          +0.0%     +0.0%     +1.5%     +1.5%     +0.0%
    wheel-sieve2          +0.0%     +0.0%     +0.6%     +0.6%     +0.0%
            x2n1          +0.1%     +0.0%      0.01      0.01     +0.0%
 --------------------------------------------------------------------------------
             Min          -0.6%     -0.7%     -9.0%     -9.0%     -5.3%
             Max          +0.5%     +1.3%    +16.7%    +16.6%     +0.0%
  Geometric Mean          +0.1%     +0.0%     +1.6%     +1.6%     -0.1%
 }}}
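
 When reading the summary rows, it may help to see how a geometric mean of
 percentage changes is formed.  The sketch below assumes that the summary
 is the geometric mean of the ratios new/old, and that rows printed as
 absolute times (such as `0.16`) carry no percentage and are left out;
 both points are assumptions about the reporting tool, not something
 stated in this message.  `geoMeanPct` is a hypothetical helper name.

```haskell
module Main (main) where

-- Geometric mean of percentage changes, returned as a percentage.
-- Each entry p stands for a change of p percent, i.e. a ratio 1 + p/100.
-- Assumption: the list is non-empty and every p is greater than -100.
geoMeanPct :: [Double] -> Double
geoMeanPct pcts = (exp (sum logs / fromIntegral (length logs)) - 1) * 100
  where
    logs = [log (1 + p / 100) | p <- pcts]

main :: IO ()
main = do
  -- A handful of Runtime entries copied from the table above
  -- (n-body, treejoin, cryptarithm1, binary-trees, atom); this is
  -- not the full column, so the result will not match the summary row.
  let sample = [14.6, 16.7, -9.0, 4.6, -2.1]
  print (geoMeanPct sample)
```

 Because the mean is taken over ratios, a few large outliers such as
 `n-body` and `treejoin` move it less than they would move an arithmetic
 mean of the same percentages.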

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13397#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

