How to better parallelize GHC build.

Karel Gardas karel.gardas at centrum.cz
Wed Apr 1 21:17:01 UTC 2015


Hi Thomas,

thanks for your suggestion, and also for the PR number. I've tried the
quick way (build.mk) and also benchmarked ghc by compiling ghc-cabal
manually. Here are the results:

-j1: 45s
-j2: 28s
-j3: 26s
-j4: 24s
-j5: 24s
-j6: 25s
-j6 -A32m: 23s
-j6 -A64m: 21s
-j6 -A128m: 23s

Real (wall-clock) time is reported. GHC compiles to i386 code on Solaris
11 and is located in /tmp, hence basically in RAM. The CPU is a
6-core/12-thread E5-2620.
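
To put the numbers above in perspective, here is a minimal sketch that
turns them into speedups relative to the -j1 run. The timings are copied
from the list above; the names `timings`, `speedups` and `best` are only
illustrative and not part of any GHC tooling.

```python
# Wall-clock timings in seconds, copied from the results above.
timings = {
    "-j1": 45,
    "-j2": 28,
    "-j3": 26,
    "-j4": 24,
    "-j5": 24,
    "-j6": 25,
    "-j6 -A32m": 23,
    "-j6 -A64m": 21,
    "-j6 -A128m": 23,
}


def speedups(timings, baseline="-j1"):
    """Return each configuration's speedup relative to the baseline run."""
    base = timings[baseline]
    return {flags: base / secs for flags, secs in timings.items()}


def best(timings):
    """Return the configuration with the lowest wall-clock time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    for flags, factor in speedups(timings).items():
        print(f"{flags:<12} {timings[flags]:>3}s  {factor:.2f}x")
    print("fastest:", best(timings))
```

Even the fastest configuration (-j6 -A64m) is only a bit more than twice
as fast as -j1 on a 6-core CPU.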

So the result is not that bad, but not that good either. Unfortunately,
this will probably not help me on my niagara, since I guess --make -jX
is a recent addition that is probably not present in 7.6.x, right? On
niagara I'm using a patched 7.6.x with a fixed SPARC NCG, and I expect
this single-threaded compiler to be faster than a multithreaded 7.10.1
building unregisterised (hence going through the C compiler...). Anyway,
I'll try to benchmark this tomorrow and will keep you posted.

Thanks!
Karel

On 04/ 1/15 12:34 PM, Thomas Miedema wrote:
> Hi Karel,
>
> could you try adding `-j8` to `SRC_HC_OPTS` for the build flavor you're
> using in `mk/build.mk`, and running `gmake -j8` instead of
> `gmake -j64`? A graph like the one you attached will likely look even
> worse, but the walltime of your build should hopefully be improved.
>
> The build system currently seems to rely entirely on `make` for
> parallelism. It doesn't exploit ghc's own parallel `--make` at all,
> unless you explicitly add `-jn` to `SRC_HC_OPTS`, with n>1 (this also
> sets the number of capabilities for the runtime system, so adding
> `+RTS -Nn` as well is not needed).
>
> Case study: One of the first things the build system does is build
> ghc-cabal and Cabal using the stage 0 compiler, through a single
> invocation of `ghc --make`. All the later make targets depend on that
> step to complete first. Because `ghc --make` is not instructed to build
> in parallel, using `make -j1` or `make -j100000` doesn't make any
> difference (for that step). I think your graph shows that there are
> many more such bottlenecks.
>
> You would have to find out empirically how to best divide your number of
> threads (32) between `make` and `ghc --make`. From reading this comment
> <https://ghc.haskell.org/trac/ghc/ticket/9221#comment:12> by Simon in
> #9221 I understand it's better not to call `ghc --make -jn` with `n`
> higher than the number of physical cores of your machine (8 in your
> case). Once you get some better parallelism, other flags like `-A` might
> also have an effect on walltime (see that ticket).
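
One rough way to pick starting points for that experiment is sketched
below. It assumes (and this is only a heuristic assumed here, not
something the build system enforces) that in the worst case every `make`
job runs a `ghc --make -jn` using n threads, so the product of the two
should stay within the hardware thread count. The function name
`candidate_splits` is made up for illustration.

```python
def candidate_splits(total_threads, physical_cores):
    """List (make_jobs, ghc_jobs) pairs to try.

    Assumption: ghc_jobs never exceeds the number of physical cores, and
    make_jobs * ghc_jobs never exceeds the number of hardware threads.
    """
    splits = []
    for ghc_jobs in range(1, physical_cores + 1):
        make_jobs = total_threads // ghc_jobs
        splits.append((make_jobs, ghc_jobs))
    return splits


if __name__ == "__main__":
    # T2000 from this thread: 32 hardware threads on 8 physical cores.
    for make_jobs, ghc_jobs in candidate_splits(32, 8):
        print(f"gmake -j{make_jobs}   with   SRC_HC_OPTS += -j{ghc_jobs}")
```

Each printed pair is just one point to benchmark. The `-j8` with
`gmake -j8` suggested above deliberately goes beyond this bound, so the
grid is a starting point for measurements rather than a rule.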
>
> -Thomas
>
> On Sat, Mar 7, 2015 at 11:49 AM, Karel Gardas <karel.gardas at centrum.cz
> <mailto:karel.gardas at centrum.cz>> wrote:
>
>
>     Folks,
>
>     first of all, I remember someone recently mentioned the issue of
>     decreased parallelism in the GHC build somewhere, but I can't find
>     it now. Sorry for that; if it was on this mailing list, I would
>     have used that thread instead.
>
>     Anyway, while working on the SPARC NCG I'm using a T2000, which
>     provides an 8-core/32-thread UltraSPARC T1 CPU. The property of this
>     machine is that it's really slow on single-threaded work. To squeeze
>     some performance from it, one really needs to push 32 threads of
>     work onto it. Now, it really hurts my nerves to see it lazily
>     building/running just one or two ghc processes. To verify this, I've
>     created a simple script that collects the number of ghc processes
>     over time and plots it as a graph. The result is in the attached
>     picture. The graph is the result of running:
>
>     gmake -j64
>
>     The average number of running ghc processes is 4.4 and the median
>     is 2. IMHO such a low number hurts build times not only on
>     something like a CMT SPARC machine, but also on, let's say, a
>     cluster of ARM machines using NFS, and on common engineering
>     workstations, which these days provide (IMHO!) around 8-16 cores
>     (and double that number of threads).
>
>     My naive idea(s) for fixing this issue are (I'm assuming no Haskell
>     file has unused imports here, but perhaps this may also be
>     investigated):
>
>     1) provide explicit dependencies which guide make to build in a more
>     optimal way
>
>     2) hack GHC's make depend to kind of compute the explicit
>     dependencies from (1) in an optimal way automatically
>
>     3) someone already mentioned using shake for building ghc. I don't
>     know shake but perhaps this is the right direction?
>
>     4) hack GHC to compile a needed hi file directly in memory if the hi
>     file is not (yet!) available (the issue here is how to get the
>     compile options right). Also, I don't know hi file semantics yet,
>     so bear with me on this.
>
>
>     Is there anything else which may be done to fix that issue? Is
>     someone already working on some of those (I mean the reasonable
>     ones from the list)?
>
>     Thanks!
>     Karel
>
>
>     _______________________________________________
>     ghc-devs mailing list
>     ghc-devs at haskell.org <mailto:ghc-devs at haskell.org>
>     http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>


