[GHC] #9221: (super!) linear slowdown of parallel builds on 40 core machine
GHC
ghc-devs at haskell.org
Sat Sep 13 23:13:15 UTC 2014
#9221: (super!) linear slowdown of parallel builds on 40 core machine
-------------------------------------+-------------------------------------
Reporter: carter | Owner:
Type: bug | Status: new
Priority: high | Milestone: 7.10.1
Component: Compiler | Version: 7.8.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets: #910
Differential Revisions: |
-------------------------------------+-------------------------------------
Comment (by gintas):
I think I know what's going on here. If you look at parUpsweep in
compiler/main/GhcMake.hs, its argument n_jobs is used in two places: one
is the initial value of the par_sem semaphore used to limit
parallelization, and the other is a call to setNumCapabilities. The latter
seems to be the cause of the slowdown.
Note that setNumCapabilities is only invoked if the previous count of
capabilities was 1. I used that to control for both settings
independently, and it turns out that the runtime overhead is mostly
independent of the semaphore value and highly influenced by capability
count.
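
To make the two knobs concrete, here is a minimal sketch that takes the job limit and the capability count as separate arguments. It is not the parUpsweep code: runLimited and its parameters are hypothetical names, and unlike GHC it sets the capability count whenever it differs from the current one. It needs the threaded RTS (-threaded) for setNumCapabilities to have any effect.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities, setNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)
import Control.Monad (forM, when)

-- | Run all actions with at most 'jobs' of them in flight at once, on
-- 'caps' capabilities. Hypothetical helper for experimenting; this is
-- not how GhcMake is written.
runLimited :: Int -> Int -> [IO a] -> IO [a]
runLimited jobs caps actions = do
  old <- getNumCapabilities
  -- Knob 1: the number of capabilities the RTS schedules (and GCs) on.
  when (caps /= old) $ setNumCapabilities caps
  -- Knob 2: the semaphore that limits how many jobs run concurrently.
  sem <- newQSem jobs
  vars <- forM actions $ \act -> do
    var <- newEmptyMVar
    -- Sketch only: an exception in 'act' leaves 'var' empty, so the
    -- takeMVar below would block on it.
    _ <- forkIO $ bracket_ (waitQSem sem) (signalQSem sem) (act >>= putMVar var)
    return var
  results <- mapM takeMVar vars
  -- Restore the previous capability count (skipped if anything threw).
  setNumCapabilities old
  return results
```

Calling runLimited 4 4 and runLimited 4 16 on the same list of actions varies only the capability count, which is the kind of comparison described above.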
I ran some experiments on a 16-CPU VM (picked a larger one deliberately to
make the differences more pronounced). Running with jobs=4 & caps=4, a
test took 37s walltime, jobs=4 & caps=16 took 51s, jobs=4 & caps=32 took
114s (344s of MUT and 1021s of GC!). The figures are very similar for
jobs=16 and jobs=64. See attached log for more details (-sstderr output).
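
For a rough sense of scale, the figures above can be turned into ratios. This is only arithmetic on the quoted numbers, assuming the MUT and GC figures are CPU seconds summed over all capabilities; the function names are made up for illustration.

```haskell
-- Figures quoted above for the jobs=4 & caps=32 run.
mutSeconds, gcSeconds :: Double
mutSeconds = 344
gcSeconds  = 1021

-- | Fraction of total CPU time (MUT + GC) spent in the garbage collector.
gcShare :: Double -> Double -> Double
gcShare mut gc = gc / (mut + gc)

-- | Walltime of a run relative to a baseline run.
slowdown :: Double -> Double -> Double
slowdown baseline t = t / baseline
```

Here gcShare mutSeconds gcSeconds comes out at roughly 0.75, and slowdown 37 114 at roughly 3.1, i.e. about three quarters of the CPU time went to GC and the caps=32 run took about three times as long as the caps=4 run.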
It looks like the runtime GC is just inefficient when running with many
capabilities, even if many physical cores are available. I'll try a few
experiments to verify that this is a general pattern that is not specific
to the GhcMake implementation.
Logic and a few experiments indicate that it does not help walltime to set
the number of jobs (semaphore value) higher than the number of
capabilities, so there's not much we can do about those two parameters in
the parUpsweep implementation other than capping n_jobs at some constant
(probably <= 8).
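
A sketch of what such a cap could look like, assuming the constant 8 from the guess above. Both maxUsefulJobs and capJobs are hypothetical names, not existing GHC identifiers, and the right constant would have to come from measurements.

```haskell
-- | Hypothetical upper bound; the value 8 is only the guess made above.
maxUsefulJobs :: Int
maxUsefulJobs = 8

-- | Clamp a requested -j value to the range [1, maxUsefulJobs] before it
-- is used for both the semaphore and the capability count.
capJobs :: Int -> Int
capJobs requested = max 1 (min maxUsefulJobs requested)
```

With this, a request for 40 jobs would be reduced to 8, while a request for 4 would pass through unchanged.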
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9221#comment:9>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler