[GHC] #10397: Compiler performance regression 7.6 -> 7.8 in elimCommonBlocks

GHC ghc-devs at haskell.org
Mon May 18 08:38:47 UTC 2015


#10397: Compiler performance regression 7.6 -> 7.8 in elimCommonBlocks
-------------------------------------+-------------------------------------
        Reporter:  TobyGoodwin       |                   Owner:
            Type:  bug               |                  Status:  merge
        Priority:  normal            |               Milestone:  7.10.2
       Component:  Compiler          |                 Version:  7.8.4
      Resolution:                    |                Keywords:  performance
Operating System:  Unknown/Multiple  |            Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |               Test Case:  see ticket
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:  Phab:D892
-------------------------------------+-------------------------------------

Comment (by nomeata):

 <rant>Geez, how do I print these `CmmBlocks`? The lack of ubiquitous
 `Outputable` or `Show` instances can be quite a time waster.... ah `import
 PprCmm ()` helps.</rant>

 Indeed, the hash function is simply not fine-grained enough. Including the
 uniques of local registers yields
 {{{
 Before:

 ghc-stage2 +RTS -t -p -RTS -B/home/jojo/build/haskell/ghc...

 total time  =       13.83 secs   (13831 ticks @ 1000 us, 1 p...
 total alloc = 14,684,289,032 bytes  (excludes profiling over...

 COST CENTRE      MODULE          %time %alloc

 elimCommonBlocks CmmPipeline      17.4   21.2
 SimplTopBinds    SimplCore        11.1   10.2
 regLiveness      AsmCodeGen        7.7    6.8
 pprNativeCode    AsmCodeGen        7.0    8.5
 RegAlloc         AsmCodeGen        5.9    7.1
 StgCmm           HscMain           5.6    5.3
 sink             CmmPipeline       5.3    4.7
 genMachCode      AsmCodeGen        3.5    3.0
 layoutStack      CmmPipeline       3.5    3.2
 do_block         Hoopl.Dataflow    2.7    1.5
 NativeCodeGen    CodeOutput        2.4    2.2
 postorderDfs     CmmUtils          2.3    2.0
 sequenceBlocks   AsmCodeGen        2.0    1.8

 After:

 ghc-stage2 +RTS -t -p -RTS -B/home/jojo/build/haskell/gh...

 total time  =       11.79 secs   (11791 ticks @ 1000 us, 1...
 total alloc = 11,894,920,976 bytes  (excludes profiling ove...

 COST CENTRE      MODULE          %time %alloc

 SimplTopBinds    SimplCore        12.7   12.6
 regLiveness      AsmCodeGen        9.4    8.4
 pprNativeCode    AsmCodeGen        8.3   10.5
 RegAlloc         AsmCodeGen        7.1    8.8
 StgCmm           HscMain           6.9    6.5
 sink             CmmPipeline       6.3    5.9
 genMachCode      AsmCodeGen        3.9    3.7
 layoutStack      CmmPipeline       3.9    4.0
 do_block         Hoopl.Dataflow    3.2    1.9
 postorderDfs     CmmUtils          2.9    2.4
 NativeCodeGen    CodeOutput        2.7    2.7
 elimCommonBlocks CmmPipeline       2.5    2.8
 sequenceBlocks   ...
 }}}
 which should now finally be sufficient. I will create a DR for validation
 and then push this.
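
 To illustrate why a coarse hash hurts here, the following is a minimal
 sketch using toy types. `Unique`, `Expr`, `Block`, `mix`, `coarseHash` and
 `fineHash` are all hypothetical names invented for this example and are not
 GHC's actual Cmm definitions or its real hash function. The coarse variant
 gives every local register the same hash, so blocks that differ only in
 their registers collide and must be compared in full; the finer variant
 feeds the register's unique into the hash.

```haskell
import Data.List (foldl')

-- Toy stand-ins for Cmm; illustrative only, not GHC's types.
newtype Unique = Unique Int deriving (Eq, Show)

data Expr
  = Lit Int
  | LocalReg Unique   -- a local register, identified by its unique
  | Add Expr Expr
  deriving (Eq, Show)

newtype Block = Block [Expr] deriving (Eq, Show)

-- Arbitrary mixing step chosen for this sketch.
mix :: Int -> Int -> Int
mix h x = (h * 31 + x) `mod` 1000003

-- Coarse hash: all local registers hash to the same constant.
coarseHash :: Block -> Int
coarseHash (Block es) = foldl' mix 7 (map go es)
  where
    go (Lit n)      = mix 1 n
    go (LocalReg _) = 2
    go (Add a b)    = mix (mix 3 (go a)) (go b)

-- Finer hash: the register's unique contributes to the hash.
fineHash :: Block -> Int
fineHash (Block es) = foldl' mix 7 (map go es)
  where
    go (Lit n)               = mix 1 n
    go (LocalReg (Unique u)) = mix 2 u
    go (Add a b)             = mix (mix 3 (go a)) (go b)

-- Two blocks that differ only in which local register they use.
b1, b2 :: Block
b1 = Block [Add (LocalReg (Unique 1)) (Lit 4)]
b2 = Block [Add (LocalReg (Unique 2)) (Lit 4)]
```

 With these definitions `coarseHash b1 == coarseHash b2` holds, while
 `fineHash b1` and `fineHash b2` differ, so only the coarse hash forces a
 full equality check between the two blocks.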

 I also replaced my fancy trie by a plain `Data.Map`. It turned out to be
 not performance critical, so let’s remove the custom code.
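
 The comment does not say what the trie was keyed on, so the following is
 only a sketch of the general idea under an assumption: that the structure's
 job is to bucket candidates by some ordered key, so that only members of
 the same bucket need a full comparison. `groupByKey` and `candidateGroups`
 are hypothetical helpers, not functions from GHC.

```haskell
import qualified Data.Map as Map

-- Bucket items by a key. Items within a bucket keep their input order,
-- because the combining function appends the new item to the old ones.
groupByKey :: Ord k => (a -> k) -> [a] -> Map.Map k [a]
groupByKey key xs = Map.fromListWith (flip (++)) [(key x, [x]) | x <- xs]

-- Only buckets with more than one member can contain duplicates.
candidateGroups :: Ord k => (a -> k) -> [a] -> [[a]]
candidateGroups key = filter ((> 1) . length) . Map.elems . groupByKey key
```

 For example, `candidateGroups length ["ab", "c", "de"]` returns
 `[["ab","de"]]`: the only pair sharing a key.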

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10397#comment:25>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

