[GHC] #9669: Long compile time/high memory usage for modules with many deriving clauses

GHC ghc-devs at haskell.org
Mon Oct 6 12:11:31 UTC 2014


#9669: Long compile time/high memory usage for modules with many deriving clauses
-------------------------------------+-------------------------------------
       Reporter:  snoyberg           |                   Owner:
           Type:  bug                |                  Status:  new
       Priority:  normal             |               Milestone:
      Component:  Compiler           |                 Version:  7.8.3
       Keywords:                     |        Operating System:  Linux
   Architecture:  x86_64 (amd64)     |         Type of failure:  Compile-
     Difficulty:  Unknown            |  time performance bug
     Blocked By:                     |               Test Case:
Related Tickets:                     |                Blocking:
                                     |  Differential Revisions:
-------------------------------------+-------------------------------------
 I've seen many different people complain about this in various different
 contexts. One of the most common I personally see is when using Persistent
 with a large number of entities, which results in multi-minute build times
 for that single module. Usually, these reports have come from private
 repositories, which makes for a difficult repro. Also, the usage of
 Template Haskell in those modules tends to confuse the issue.

 So I'd like to report this issue from a completely separate project, with
 the guess that this represents an issue that is affecting many other
 users. Steps to reproduce:

 * `cabal unpack fpco-api-1.2.0.1`
 * `cd fpco-api-1.2.0.1`
 * `cabal install`
 * `ghc --make -isrc/library src/library/FP/API/Types.hs`
 * `touch src/library/FP/API/Types.hs`

 Then, to see the time and memory usage of compiling just the one module,
 run:

 * `time ghc -O2 --make -isrc/library src/library/FP/API/Types.hs +RTS -s`

 On my system (16GB RAM, quadcore i7), the results are:

 {{{
 $ time ghc -O2 --make -isrc/library src/library/FP/API/Types.hs +RTS -s
 [3 of 3] Compiling FP.API.Types     ( src/library/FP/API/Types.hs,
 src/library/FP/API/Types.o )
   51,846,533,568 bytes allocated in the heap
    6,799,246,288 bytes copied during GC
      389,297,584 bytes maximum residency (22 sample(s))
       11,330,472 bytes maximum slop
             1041 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0      2041 colls,     0 par    9.19s    9.19s     0.0045s
 0.0998s
   Gen  1        22 colls,     0 par    4.33s    4.33s     0.1969s
 0.5436s

   TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

   SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time   34.21s  ( 37.74s elapsed)
   GC      time   13.52s  ( 13.53s elapsed)
   EXIT    time    0.07s  (  0.07s elapsed)
   Total   time   47.81s  ( 51.34s elapsed)

   Alloc rate    1,515,662,937 bytes per MUT second

   Productivity  71.7% of total user, 66.8% of total elapsed

 gc_alloc_block_sync: 0
 whitehole_spin: 0
 gen[0].sync: 0
 gen[1].sync: 0

 real    0m51.375s
 user    0m50.864s
 sys     0m0.456s
 }}}

 By contrast, with optimizations turned off:

 {{{
 $ time ghc -O0 --make -isrc/library src/library/FP/API/Types.hs +RTS -s
 [3 of 3] Compiling FP.API.Types     ( src/library/FP/API/Types.hs,
 src/library/FP/API/Types.o )
   12,767,593,936 bytes allocated in the heap
    1,078,202,664 bytes copied during GC
      179,551,768 bytes maximum residency (13 sample(s))
        6,684,544 bytes maximum slop
              439 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0       623 colls,     0 par    1.24s    1.24s     0.0020s
 0.0651s
   Gen  1        13 colls,     0 par    0.76s    0.76s     0.0586s
 0.2397s

   TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

   SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time    8.28s  (  9.23s elapsed)
   GC      time    2.00s  (  2.00s elapsed)
   EXIT    time    0.05s  (  0.05s elapsed)
   Total   time   10.34s  ( 11.28s elapsed)

   Alloc rate    1,542,443,897 bytes per MUT second

   Productivity  80.6% of total user, 73.9% of total elapsed

 gc_alloc_block_sync: 0
 whitehole_spin: 0
 gen[0].sync: 0
 gen[1].sync: 0

 real    0m11.304s
 user    0m11.108s
 sys     0m0.172s
 }}}

 and -O1:

 {{{
 $ time ghc -O1 --make -isrc/library src/library/FP/API/Types.hs +RTS -s
 [3 of 3] Compiling FP.API.Types     ( src/library/FP/API/Types.hs,
 src/library/FP/API/Types.o )
   45,550,443,664 bytes allocated in the heap
    5,721,700,512 bytes copied during GC
      358,036,456 bytes maximum residency (21 sample(s))
        9,167,176 bytes maximum slop
              906 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0      1642 colls,     0 par    7.70s    7.70s     0.0047s
 0.1031s
   Gen  1        21 colls,     0 par    3.68s    3.69s     0.1756s
 0.4968s

   TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

   SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time   30.21s  ( 33.09s elapsed)
   GC      time   11.38s  ( 11.39s elapsed)
   EXIT    time    0.07s  (  0.07s elapsed)
   Total   time   41.68s  ( 44.55s elapsed)

   Alloc rate    1,507,571,106 bytes per MUT second

   Productivity  72.7% of total user, 68.0% of total elapsed

 gc_alloc_block_sync: 0
 whitehole_spin: 0
 gen[0].sync: 0
 gen[1].sync: 0

 real    0m44.577s
 user    0m44.164s
 sys     0m0.368s
 }}}

 Note that this module is essentially just 900 lines of data declarations,
 with a large number of derived classes, and a few manually written helper
 functions and instances.

 This high memory usage hasn't just resulted in user complaints: automated
 Jenkins and Travis jobs will often times fail without optimizations
 disabled, which can be problematic for proper testing and production code.
 In the case of fpco-api, I've worked around this by adding `-O0` to the
 cabal file, but it would be much nicer to not have to do that.

 For some other examples of complaints along these lines:

 * https://groups.google.com/d/msg/yesodweb/MX1bnOFu8Hc/VLQXhnvpIkMJ
 * https://groups.google.com/d/msg/yesodweb/XPWixNjuOnM/FN26bmkudgwJ

 I believe there are a few other threads discussing this, if it would be
 helpful. I did my testing on GHC 7.8.3 64-bit, Ubuntu 12.04.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9669>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list