Slower Compilation on GHC 7.8.4 (vs. 7.6.3)

Simon Peyton Jones simonpj at microsoft.com
Wed Apr 15 08:06:14 UTC 2015


This is all very helpful Oz.  Thanks.

What would be even more helpful would be ticket(s) that focus on individual bad aspects, especially ones that are non-linear.  Eg “Here are modules with 500, 1000, 2000 declarations, and compile time goes up non-linearly”.

The two big ones seem to be:

·         Type declarations

·         TH-enabled derivations (whatever they are)

The simpler the better.   Eg safecopy probably has fewer dependencies than lens.  And maybe it’s nothing to do with safecopy, but any old derivation will do?

What’s ideal is to be able to turn up the size knob, with a profiled compiler, and see which bits of the compiler are going non-linear.

Thanks!

Simon

From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Ozgun Ataman
Sent: 14 April 2015 20:56
To: Simon Marlow
Cc: michael.xavier at soostone.com; ghc-devs at haskell.org
Subject: Re: Slower Compilation on GHC 7.8.4 (vs. 7.6.3)

Hello all,

Thank you for the comments and apologies that it took a bit of time to get back to you. My colleague Michael and I have taken a stab at a package demonstrating the slowdown. Our hypothesis was that the presence of type declarations, large values/literals, typeclass instantiations and engagement of TH caused at least some of the slowdown we were seeing in our app code. It turns out this was only partially true:

  *   500 random type declarations were a modest 10-25% slower.
  *   Simply instantiating types with a given class did not contribute that much to the slowdown (around 5% on top of type declarations)
  *   There was no notable slowdown on large values/literals (false hypothesis)
  *   TH-enabled derivations were over 100% slower (lens and safecopy used as test subjects)

     *   lens: ~100% slower
     *   safecopy: ~125% slower

  *   The slowdowns appeared to be super-linear in at least some cases. Doubling the declaration count to 1000 increased the % of the slowdown:

     *   Declarations: 30% slower
     *   lens TH slowdown: ~100% slower (stayed the same)
     *   safecopy TH slowdown: ~140% slower

I'd be happy to create a ticket if this is sufficient support material. It's hard to say these conclusively isolate *all* culprits for the full slowdown we observe in our codebase, but are hopefully at least a significant part.

Here's the package: https://dl.dropboxusercontent.com/u/58525/ghc-slowdown.tar.gz

Best,
Oz



On Tue, Apr 7, 2015 at 3:39 PM, Simon Marlow <marlowsd at gmail.com<mailto:marlowsd at gmail.com>> wrote:
Just a comment on this one point:

On 05/04/2015 19:01, Ozgun Ataman wrote:
  * With O0 and -j=N the overall wall-clock time is 40% higher, but the
    CPU time spent is a staggering 7.5X higher (!)

When looking at Haskell programs executing in parallel, it's normal to see a high CPU time, because the GC threads spin looking for work.  The extra CPU time is just time spent spinning, it doesn't imply that the compiler was doing extra work.  What matters is the overall wall-clock time - a worse wall-clock time indicates that something has gone wrong.

We should look at compile-time regression independently of performance issues with -j.  Parallel compilation is a new feature and mostly likely needs a lot of tuning.  Note that cabal's -j feature is different from GHC's (and is likely to give much more reliable improvements) because it runs multiple compiler instances in parallel.

Cheers,
Simon

  * With -O0, "cabal repl" load times have gone up by a staggering 2.5X (!)
      o 7.8.4: 3 min. 35 seconds
      o 7.6.3: 1 min. 2 seconds
  * Measuring compile times for individual modules, we see that those
    that are heavily loaded with lots of types, TH-facilitated type and
    typeclass derivations and those that contain large "blobs" of values
    directly at the top level now take much longer to compile:
      o We have 10 modules that each take over 10 seconds
      o We have 3 modules that each take over 35 seconds
  * Sidenote observation: In general, parallel builds with -j appear to
    cause a very significant overhead under the "system" part of timing:
      o Example with 7.6.3, O0: cabal build  140.57s user 13.25s system
        100% cpu 2:33.70 total
      o Example with 7.8.4, O0: cabal build  507.83s user 655.43s system
        549% cpu 3:31.59 total


The main hurt here has been the infeasibility of using "cabal repl /
ghci"  fast-feedback style development (via emacs, vim, command-line,
etc.), since:

  * Unlike "cabal build", "cabal repl" re-compiles from scratch with
    every invocation ":l App.Foo.Bar", which itself is the necessary
    first step when starting to hack on a module (we can then use :r to
    somewhat help the situation)
  * The slowdown is about 2.5X as stated above: What used to take a
    minute now takes 3.5 minutes.
  * cabal repl does not seem to benefit from -j at all
  * cabal repl 7.8.4 appears to be hurt particularly by the heavier modules
  * The heavier modules are often at the top of the compile tree (types,
    derivations, etc) and are practically loaded on the critical path
    all the time

Let me know if I'm missing anything here and any/all feedback is much
appreciated!

Cheers,
Oz


_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org<mailto:ghc-devs at haskell.org>
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150415/6263453f/attachment-0001.html>


More information about the ghc-devs mailing list