[GHC] #14667: Compiling a function with a lot of alternatives bottlenecks on insertIntHeap

GHC ghc-devs at haskell.org
Sat Jan 13 01:21:02 UTC 2018


#14667: Compiling a function with a lot of alternatives bottlenecks on
insertIntHeap
-------------------------------------+-------------------------------------
           Reporter:  niteria        |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Compile-time
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I have a module that looks like this:
 {{{
 module A10000 where

 data A = A
   | A00001
   | A00002
 ...
   | A10000

 f :: A -> Int
 f A00001 = 19900001
 f A00002 = 19900002
 ...
 f A10000 = 19910000
 }}}

 Compiling with `-s` gives:
 {{{
 [1 of 1] Compiling A10000           ( A10000.hs, A10000.o )

 A10000.hs:10004:1: warning:
     Pattern match checker exceeded (2000000) iterations in
     an equation for ‘f’. (Use -fmax-pmcheck-iterations=n
     to set the maximun number of iterations to n)
       |
 10004 | f A00001 = 19900001
       | ^^^^^^^^^^^^^^^^^^^...
   95,068,628,968 bytes allocated in the heap
   14,166,421,680 bytes copied during GC
      312,247,488 bytes maximum residency (19 sample(s))
        3,267,024 bytes maximum slop
              857 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0     35163 colls,     0 par    5.191s   5.170s     0.0001s
 0.0724s
   Gen  1        19 colls,     0 par    1.240s   1.236s     0.0650s
 0.1872s

   TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

   SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

   INIT    time    0.000s  (  0.000s elapsed)
   MUT     time   23.100s  ( 23.341s elapsed)
   GC      time    6.431s  (  6.405s elapsed)
   RP      time    0.000s  (  0.000s elapsed)
   PROF    time    0.000s  (  0.000s elapsed)
   EXIT    time    0.000s  (  0.001s elapsed)
   Total   time   29.532s  ( 29.748s elapsed)
 }}}

 With profiling enabled I get:
 {{{
   total time  =       22.67 secs   (22673 ticks @ 1000 us, 1 processor)
   total alloc = 36,143,653,320 bytes  (excludes profiling overheads)

 COST CENTRE     MODULE         SRC
 %time %alloc  ticks     bytes

 insertIntHeap   Hoopl.Dataflow
 compiler/cmm/Hoopl/Dataflow.hs:(450,1)-(454,38)      73.9   77.5  16746
 28001680176
 SimplTopBinds   SimplCore      compiler/simplCore/SimplCore.hs:770:39-74
 9.4    1.8   2136 647116224
 deSugar         HscMain        compiler/main/HscMain.hs:511:7-44
 2.6    4.2    599 1530056552
 pprNativeCode   AsmCodeGen
 compiler/nativeGen/AsmCodeGen.hs:(529,37)-(530,65)    2.3    2.9    524
 1030493864
 StgCmm          HscMain
 compiler/main/HscMain.hs:(1426,13)-(1427,62)          1.3    1.4    288
 497401376
 RegAlloc-linear AsmCodeGen
 compiler/nativeGen/AsmCodeGen.hs:(658,27)-(660,55)    0.9    1.1    201
 383789200
 tc_rn_src_decls TcRnDriver
 compiler/typecheck/TcRnDriver.hs:(494,4)-(556,7)      0.8    1.2    176
 441973152
 Parser          HscMain        compiler/main/HscMain.hs:(316,5)-(384,20)
 0.5    1.1    113 411448880
 }}}


 After a local patch that basically reverts
 https://phabricator.haskell.org/rGHC5a1a2633553 I get:
 {{{
 [1 of 1] Compiling A10000           ( A10000.hs, A10000.o )

 A10000.hs:10004:1: warning:
     Pattern match checker exceeded (2000000) iterations in
     an equation for ‘f’. (Use -fmax-pmcheck-iterations=n
     to set the maximun number of iterations to n)
       |
 10004 | f A00001 = 19900001
       | ^^^^^^^^^^^^^^^^^^^...
   15,162,268,144 bytes allocated in the heap
    4,870,184,600 bytes copied during GC
      323,794,936 bytes maximum residency (19 sample(s))
        3,074,056 bytes maximum slop
              886 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0       811 colls,     0 par    1.828s   1.821s     0.0022s
 0.0770s
   Gen  1        19 colls,     0 par    1.217s   1.213s     0.0638s
 0.1820s

   TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

   SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

   INIT    time    0.000s  (  0.000s elapsed)
   MUT     time    5.842s  (  6.144s elapsed)
   GC      time    3.046s  (  3.034s elapsed)
   RP      time    0.000s  (  0.000s elapsed)
   PROF    time    0.000s  (  0.000s elapsed)
   EXIT    time    0.000s  (  0.001s elapsed)
   Total   time    8.888s  (  9.179s elapsed)
 }}}

 For a file of smaller size with 1000 constructors, it still gives a 10%
 win.

 This example is artificial, but it looks like something that someone could
 write for a sparse enum that looks like this in C++:
 {{{
 enum access_t { read = 1, write = 2, exec = 4 };
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14667>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list