Where do I start if I would like to help improve GHC compilation times?

Mon Apr 10 22:47:25 UTC 2017

Alfredo Di Napoli <alfredo.dinapoli at gmail.com> writes:

> Hey Ben,
>
Hi Alfredo,

Sorry for the late response! The email queue from the weekend was a bit
longer than I would like.

> as promised I’m back to you with something more articulated and hopefully
> meaningful. I do hear you perfectly — probably trying to dive head-first
> into this without at least a rough understanding of the performance
> hotspots or the GHC overall architecture is going to do me more harm than
> good (I get the overall picture and I’m aware of the different stages of
> the GHC compilation pipeline, but that’s far from saying I’m proficient with
> the architecture as a whole). A couple of years ago I also read the GHC
> chapter in the “Architecture of Open Source Applications” book, but I don’t
> know how much that is still relevant. If it is, I guess I should refresh my
> memory.
>
It sounds like you have done a good amount of reading. That's great.
Perhaps skimming the AOSA chapter again wouldn't hurt, but otherwise
it's likely worthwhile diving in.

> I’m currently trying to move on 2 fronts; please advise if I’m a fool
> flogging a dead horse or if I have any hope of getting anything done ;)
>
> 1. I’m indeed trying to treat the compiler as a black box (as you
> advised), trying to build a sufficiently large program where GHC is not “as
> fast as I would like” (I know that’s a very lame definition of “slow”,
> hehe). In particular, I have built the stage2 compiler with the “prof”
> flavour as you suggested, and I have chosen 2 examples as a reference
> “benchmark” for performance; DynFlags.hs (which seems to have been
> mentioned multiple times as a GHC perf killer) and the highlighting-kate
> package as posted here: https://ghc.haskell.org/trac/ghc/ticket/9221 .

Indeed, #9221 would be a very interesting ticket to look at. The
highlighting-kate package is interesting in the context of that ticket
as it has a very large amount of parallelism available.

If you do want to look at #9221, note that the cost centre profiler may
not provide the whole story. In particular, it has been speculated that
the scaling issues may be due to either,

 * threads hitting a blackhole, resulting in blocking

 * the usual scaling limitations of GHC's stop-the-world GC

The eventlog may be quite useful for characterising these.
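
As a rough sketch of how such an eventlog might be captured: the helper
below simply appends `+RTS -l -RTS` to a compiler invocation. It assumes
the compiler binary was itself linked with `-eventlog`; otherwise the RTS
will reject `-l`. The function name `with_eventlog` and the `DRY_RUN`
variable are made up for illustration, as is the `Foo.hs` module used in
the note that follows.

```bash
#!/usr/bin/env bash

# Run a command with RTS event logging enabled.
# Assumption: the executable was linked with -eventlog.
# With DRY_RUN set to a non-empty value, only print the command line.
with_eventlog() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$* +RTS -l -RTS"
  else
    "$@" +RTS -l -RTS
  fi
}
```

Something like `with_eventlog ../ghc/inplace/bin/ghc-stage2 --make -j8 Foo.hs`
should then leave an eventlog file (usually named after the executable,
with an `.eventlog` suffix) in the current directory, which can be
inspected with ThreadScope or `ghc-events show`.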

> The idea would be to compile those with -v +RTS -p -hc -RTS enabled,
> look at the output from the .prof file AND the `-v` flag, find any
> hotspot, try to change something, recompile, observe diff, rinse and
> repeat. Do you think I have any hope of making progress this way? In
> particular, I think compiling DynFlags.hs is a bit of a dead end; I
> whipped up this buggy script, which escalated into a behemoth that is
> compiling pretty much half of the compiler once again :D
>
> ```
> #!/usr/bin/env bash
>
> # Compile the given module(s) with the profiled stage2 compiler,
> # pointing it at the compiler's own source and include directories.
> ../ghc/inplace/bin/ghc-stage2 --make -j8 -v +RTS -A256M -qb0 -p -h \
>   -RTS -DSTAGE=2 -I../ghc/includes -I../ghc/compiler -I../ghc/compiler/stage2 \
>   -I../ghc/compiler/stage2/build \
>   -i../ghc/compiler/utils:../ghc/compiler/types:../ghc/compiler/typecheck:../ghc/compiler/basicTypes \
>   -i../ghc/compiler/main:../ghc/compiler/profiling:../ghc/compiler/coreSyn:../ghc/compiler/iface:../ghc/compiler/prelude \
>   -i../ghc/compiler/stage2/build:../ghc/compiler/simplStg:../ghc/compiler/cmm:../ghc/compiler/parser:../ghc/compiler/hsSyn \
>   -i../ghc/compiler/ghci:../ghc/compiler/deSugar:../ghc/compiler/simplCore:../ghc/compiler/specialise \
>   -fforce-recomp -c "$@"
> ```
>
> I’m running it with `./dynflags.sh ../ghc/compiler/main/DynFlags.hs` but
> it’s taking a long time to compile (20+ mins on my 2014 Mac Pro) because it’s
> pulling in half of the compiler anyway :D I tried to reuse the .hi files
> from my stage2 compilation but I failed (GHC was complaining about
> interface file mismatch). Long story short, I don’t think it will be a
> very agile way to proceed. Am I right? Do you have any recommendations in
> that regard? Do I have any hope of compiling DynFlags.hs in a way which would
> make this perf investigation feasible?
>
What I usually do in this case is just take the relevant `ghc` command
line directly from the `make` output and execute it manually. I would
imagine your debug cycle would look something like,

 * instrument the compiler
 * build stage1
 * build the stage2 DynFlags using the stage1 compiler (using a saved command line)
 * think
 * repeat

This should only take a few minutes per iteration.
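
One way to script the "saved command line" step, as a minimal sketch: it
assumes the `ghc` invocation for DynFlags has been copied verbatim from
the `make` output into a file (the name `dynflags.cmd` below is
hypothetical, as is the helper name `rerun_saved`). The helper re-runs
that line, appends any extra flags given, and reports the wall-clock
time on stderr so iterations can be compared.

```bash
#!/usr/bin/env bash

# Re-run a command line previously saved from the build output.
# $1     : file containing the saved command line (hypothetical, e.g. dynflags.cmd)
# $2...  : extra arguments appended to the saved command
rerun_saved() {
  local cmd_file="$1"
  shift
  local cmd start end status
  cmd="$(cat "$cmd_file")"
  start=$(date +%s)
  # eval is used because the saved line is shell text taken from the build
  # log; extra arguments are joined with spaces and re-parsed by the shell.
  eval "$cmd" "$@"
  status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s" >&2
  return "$status"
}
```

For example, `rerun_saved dynflags.cmd -fforce-recomp` would repeat the
saved compilation and force the module to be rebuilt each time.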

> The second example (the highlighting-kate package) seems much more
> promising. It takes maybe 1-2 mins on my machine, which is enough to take a
> look at the perf output. Do you think I should follow this second lead? In
> principle any package with 50+ modules would do, I think (better if it has a
> lot of TH ;) ), but this seems like a start with a low barrier to entry.
>
> 2. The second path I’m exploring is simply to take a less holistic approach
> and try to dive into a performance ticket like the ones listed here:
> https://www.reddit.com/r/haskell/comments/45q90s/is_anything_being_done_to_remedy_the_soul/czzq6an/
> Maybe some are very specific, but it seems like fixing small things and
> moving forward could help give me an understanding of different sub-parts of
> GHC, which seems less intimidating than the black-box approach.
>
Do you have any specific tickets from these lists that you found
interesting?

> In conclusion, what do you think is the best approach, 1 or 2, both or
> none? ;)

I would say that it largely depends upon what you feel most comfortable
with. If you feel up for it, I think #9221 would be a nice, fairly
self-contained, yet high-impact ticket which would be worth spending a
few days diving further into.

Cheers,

- Ben
