[GHC] #5642: Deriving Generic of a big type takes a long time and lots of space

GHC ghc-devs at haskell.org
Thu Jun 9 00:35:02 UTC 2016


#5642: Deriving Generic of a big type takes a long time and lots of space
-------------------------------------+-------------------------------------
        Reporter:  basvandijk        |                Owner:  bgamari
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.3
      Resolution:                    |             Keywords:  deriving-
                                     |  perf, Generics
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:  T5642
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D2304
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by RyanGlScott):

 Per SPJ's request, I've reposted some sleuthing I did in
 [https://phabricator.haskell.org/D2304#67314 the comments] of Phab:D2304.

 I decided to test my changes on the 300-constructor sum type mentioned at
 the top of this ticket. I manually implemented a `Generic` instance for
 this datatype three times, and put each one in its own file:

 * [https://gist.github.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c
 #file-gen_v1-hs Gen_v1.hs], which contains a `Generic` instance as GHC
 derives it currently (without the changes in Phab:D2304)
 * [https://gist.github.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c
 #file-gen_v2-hs Gen_v2.hs], which is like `Gen_v1.hs` except that it
 factors out the topmost `M1` in `from`/`to` (i.e., with the changes in
 Phab:D2304)
 * [https://gist.github.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c
 #file-gen_v3-hs Gen_v3.hs], which is like `Gen_v1.hs` except that it both
 (1) factors out the topmost `M1` in `from`/`to` and (2) factors out common
 occurrences of `L1`/`R1` in `to`
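
 To make the three styles concrete, here is a minimal sketch on a
 hypothetical four-constructor type `T` (a stand-in for the 300-constructor
 type, not the contents of the gists). The names `RepT`, `fromV1`, `toV1`,
 `fromV2`, `toV2` and `toV3` are made up for illustration, and the
 conversions are written as plain functions rather than `Generic` instance
 methods so that all three styles can live in one file. It assumes GHC 8.0
 or later, where `GHC.Generics` exports the `MetaData`/`MetaCons` metadata
 constructors.

```haskell
{-# LANGUAGE DataKinds     #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- Hypothetical stand-in for the 300-constructor sum type.
data T = A | B | C | D
  deriving (Eq, Show, Enum, Bounded)

-- Hand-written representation type, shaped like a derived Rep:
-- one D1 wrapper around a balanced tree of (:+:) over C1 leaves.
type RepT =
  D1 ('MetaData "T" "Main" "main" 'False)
    (   (   C1 ('MetaCons "A" 'PrefixI 'False) U1
        :+: C1 ('MetaCons "B" 'PrefixI 'False) U1)
    :+: (   C1 ('MetaCons "C" 'PrefixI 'False) U1
        :+: C1 ('MetaCons "D" 'PrefixI 'False) U1))

-- v1 style: every alternative mentions the topmost M1 itself.
fromV1 :: T -> RepT x
fromV1 t = case t of
  A -> M1 (L1 (L1 (M1 U1)))
  B -> M1 (L1 (R1 (M1 U1)))
  C -> M1 (R1 (L1 (M1 U1)))
  D -> M1 (R1 (R1 (M1 U1)))

toV1 :: RepT x -> T
toV1 r = case r of
  M1 (L1 (L1 (M1 U1))) -> A
  M1 (L1 (R1 (M1 U1))) -> B
  M1 (R1 (L1 (M1 U1))) -> C
  M1 (R1 (R1 (M1 U1))) -> D

-- v2 style: the topmost M1 is factored out of the case expression.
fromV2 :: T -> RepT x
fromV2 t = M1 (case t of
  A -> L1 (L1 (M1 U1))
  B -> L1 (R1 (M1 U1))
  C -> R1 (L1 (M1 U1))
  D -> R1 (R1 (M1 U1)))

toV2 :: RepT x -> T
toV2 (M1 r) = case r of
  L1 (L1 (M1 U1)) -> A
  L1 (R1 (M1 U1)) -> B
  R1 (L1 (M1 U1)) -> C
  R1 (R1 (M1 U1)) -> D

-- v3 style: as v2, but `to` also shares the common L1/R1 prefixes
-- by matching one layer of the sum at a time.
toV3 :: RepT x -> T
toV3 (M1 r) = case r of
  L1 l -> case l of
    L1 (M1 U1) -> A
    R1 (M1 U1) -> B
  R1 r' -> case r' of
    L1 (M1 U1) -> C
    R1 (M1 U1) -> D
```

 All of these functions compute the same results; the styles differ only in
 how the source is written, which is what the measurements are meant to
 isolate.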

 I compiled each file with `ghc -O2 -v3 +RTS -s` and dumped the results to
 logs. Here are the highlights:

 *
 [https://gist.githubusercontent.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c/raw/4d4a0bc6f6aa834f7e95815ef7b79f46741f242d/Gen_v1.txt
 Gen_v1.txt]
   * `{terms: 16,282, types: 2,563,921, coercions: 639,012}`
   * `7,781,716,432 bytes allocated in the heap`
   * `Total   time    8.708s  (  8.719s elapsed)`
 *
 [https://gist.githubusercontent.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c/raw/4d4a0bc6f6aa834f7e95815ef7b79f46741f242d/Gen_v2.txt
 Gen_v2.txt]
   * `{terms: 16,288, types: 2,924,492, coercions: 9,950}`
   * `4,479,400,544 bytes allocated in the heap`
   * `Total   time    5.580s  (  5.590s elapsed)`
 *
 [https://gist.githubusercontent.com/RyanGlScott/e1ef13b59440200e696f5ca0c23dfb8c/raw/4d4a0bc6f6aa834f7e95815ef7b79f46741f242d/Gen_v3.txt
 Gen_v3.txt]
   * `{terms: 16,288, types: 2,924,492, coercions: 9,950}`
   * `4,016,934,848 bytes allocated in the heap`
   * `Total   time    4.990s  (  5.006s elapsed)`

 There is a huge difference between `v1` and `v2`, as suspected: coercions
 drop from 639,012 to 9,950, heap allocation falls by roughly 42%, and
 total time by roughly 36%. `v3` improves on `v2` by a further 10% or so in
 both allocation and time, but interestingly, `v3` reports exactly the same
 number of terms, types, and coercions as `v2`, so I'm not sure where that
 improvement is coming from.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5642#comment:39>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

