[GHC] #15124: Improve block layout for the NCG

Sat Nov 17 10:28:17 UTC 2018

#15124: Improve block layout for the NCG
-------------------------------------+-------------------------------------
        Reporter:  AndreasK          |                Owner:  AndreasK
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:  8.8.1
       Component:  Compiler (NCG)    |              Version:  8.2.2
      Resolution:                    |             Keywords:  CodeGen
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #8082 #14672      |  Differential Rev(s):  Phab:D4726
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Andreas Klebinger <klebinger.andreas@…>):

 In [changeset:"912fd2b6ca0bc51076835b6e3d1f469b715e2760/ghc"
 912fd2b6/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="912fd2b6ca0bc51076835b6e3d1f469b715e2760"
 NCG: New code layout algorithm.

 Summary:
 This patch implements a new code layout algorithm.
 It has been tested for x86 and is disabled on other platforms.

 Performance varies slightly by CPU/machine but in general seems to be
 better by around 2%.
 Nofib shows only small differences of about +/- 0.5% overall, depending on
 flags/machine. Performance in other benchmarks improved significantly.

 The other benchmarks include at least those of: aeson, vector,
 megaparsec, attoparsec, containers, text and xeno.

 Three different CPUs were tested: Sandy Bridge (Xeon), Haswell and
 Skylake. All of them got faster, although to differing degrees.

 * Library benchmark results summarized:
   * containers: ~1.5% faster
   * aeson: ~2% faster
   * megaparsec: ~2-5% faster
   * xml library benchmarks: 0.2%-1.1% faster
   * vector-benchmarks: 1-4% faster
   * text: 5.5% faster

 On average GHC compile times go down, as the speedup from compiling GHC
 itself with the new layout outweighs the overhead introduced by the new
 layout algorithm.

 Things this patch does:

 * Move the code responsible for block layout into its own module.
 * Move the NcgImpl class into the NCGMonad module.
 * Extract a control flow graph (CFG) from the input Cmm.
 * Update this CFG to keep it in sync with changes during
   asm codegen. This has been tested on x64 but should also work on x86.
   Other platforms still use the old code layout.
 * Assign weights to the edges in the CFG based on type and limited static
   analysis. These weights are then used for block layout.
 * Once we have the final code layout, eliminate some redundant jumps.

   In particular, turn a sequence of:
       jne .foo
       jmp .bar
     .foo:
   into
       je .bar
     .foo:
       ..

 Test Plan: ci

 Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott

 Reviewed By: RyanGlScott

 Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton

 GHC Trac Issues: #15124

 Differential Revision: https://phabricator.haskell.org/D4726
 }}}
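
The jump elimination described in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch (which is Haskell inside the NCG); the tuple encoding of instructions, the `INVERT` table and the function name `invert_jump_over_jump` are all hypothetical and exist only for this example. It handles just the one pattern quoted above: a conditional jump over an unconditional jump, where the conditional target is the label that immediately follows.

```python
# Hypothetical instruction encoding, for illustration only:
#   ("label", name), ("jcc", cond, target), ("jmp", target), ("other", text)

# Each condition code mapped to its logical negation (small assumed subset).
INVERT = {"e": "ne", "ne": "e", "l": "ge", "ge": "l", "g": "le", "le": "g"}


def invert_jump_over_jump(instrs):
    """Rewrite  jcc C, L1 ; jmp L2 ; L1:   into   jcc !C, L2 ; L1:

    Only this single pattern is handled; everything else is copied as-is.
    """
    out = []
    i = 0
    while i < len(instrs):
        window = instrs[i:i + 3]
        if (len(window) == 3
                and window[0][0] == "jcc"
                and window[1][0] == "jmp"
                and window[2] == ("label", window[0][2])
                and window[0][1] in INVERT):
            cond = window[0][1]
            far_target = window[1][1]
            # One inverted conditional jump replaces the jcc/jmp pair.
            out.append(("jcc", INVERT[cond], far_target))
            # Skip the two jumps; the label is copied on the next iteration.
            i += 2
        else:
            out.append(instrs[i])
            i += 1
    return out


before = [
    ("jcc", "ne", ".foo"),
    ("jmp", ".bar"),
    ("label", ".foo"),
    ("other", ".."),
]
after = invert_jump_over_jump(before)
```

Here `after` holds the inverted jump `("jcc", "e", ".bar")` followed by the unchanged label and the rest of the block. The rewrite is only valid because `.foo` is the fall-through block in the final layout, which is why the commit applies it after block layout has been decided.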

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15124#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler