[commit: ghc] master: Default +RTS -qn to the number of cores (6c47f2e)
git at git.haskell.org
git at git.haskell.org
Sun Oct 9 22:55:17 UTC 2016
Repository : ssh://git@git.haskell.org/ghc
On branch : master
Link : http://ghc.haskell.org/trac/ghc/changeset/6c47f2efa3f8f4639f375d34f54c01a60c9a1a82/ghc
>---------------------------------------------------------------
commit 6c47f2efa3f8f4639f375d34f54c01a60c9a1a82
Author: Simon Marlow <marlowsd at gmail.com>
Date: Sun Oct 9 18:20:53 2016 -0400
Default +RTS -qn to the number of cores
Setting a -N value that is too large has a dramatic negative effect on
performance, but the new -qn flag can mitigate the worst of the effects
by limiting the number of GC threads.
So now, if you don't explcitly set +RTS -qn, and you set -N larger than
the number of cores (or use setNumCapabilities to do the same), we'll
default -qn to the number of cores.
These are the results from nofib/parallel on my 4-core (2 cores x 2
threads) i7 laptop, comparing -N8 before and after this change.
```
------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
------------------------------------------------------------------------
blackscholes +0.0% +0.0% -72.5% -72.0% +9.5%
coins +0.0% -0.0% -73.7% -72.2% -0.8%
mandel +0.0% +0.0% -76.4% -75.4% +3.3%
matmult +0.0% +15.5% -26.8% -33.4% +1.0%
nbody +0.0% +2.4% +0.7% 0.076 0.0%
parfib +0.0% -8.5% -33.2% -31.5% +2.0%
partree +0.0% -0.0% -60.4% -56.8% +5.7%
prsa +0.0% -0.0% -65.4% -60.4% 0.0%
queens +0.0% +0.2% -58.8% -58.8% -1.5%
ray +0.0% -1.5% -88.7% -85.6% -3.6%
sumeuler +0.0% -0.0% -47.8% -46.9% 0.0%
------------------------------------------------------------------------
Min +0.0% -8.5% -88.7% -85.6% -3.6%
Max +0.0% +15.5% +0.7% -31.5% +9.5%
Geometric Mean +0.0% +0.6% -61.4% -63.1% +1.4%
```
Test Plan: validate, nofib/parallel benchmarks
Reviewers: niteria, ezyang, nh2, austin, erikd, trofi, bgamari
Reviewed By: trofi, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2580
GHC Trac Issues: #9221
>---------------------------------------------------------------
6c47f2efa3f8f4639f375d34f54c01a60c9a1a82
docs/users_guide/runtime_control.rst | 3 ++-
rts/Schedule.c | 15 ++++++++++++---
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/docs/users_guide/runtime_control.rst b/docs/users_guide/runtime_control.rst
index 5226d6d..0ffb1d8 100644
--- a/docs/users_guide/runtime_control.rst
+++ b/docs/users_guide/runtime_control.rst
@@ -467,7 +467,8 @@ performance.
.. rts-flag:: -qn <x>
- :default: the value of ``-N``
+ :default: the value of ``-N`` or the number of CPU cores,
+ whichever is smaller.
:since: 8.2.1
.. index::
diff --git a/rts/Schedule.c b/rts/Schedule.c
index 611d704..3cbfc0e 100644
--- a/rts/Schedule.c
+++ b/rts/Schedule.c
@@ -1531,6 +1531,7 @@ scheduleDoGC (Capability **pcap, Task *task USED_IF_THREADS,
uint32_t gc_type;
uint32_t i;
uint32_t need_idle;
+ uint32_t n_gc_threads;
uint32_t n_idle_caps = 0, n_failed_trygrab_idles = 0;
StgTSO *tso;
rtsBool *idle_cap;
@@ -1561,9 +1562,17 @@ scheduleDoGC (Capability **pcap, Task *task USED_IF_THREADS,
gc_type = SYNC_GC_SEQ;
}
- if (gc_type == SYNC_GC_PAR && RtsFlags.ParFlags.parGcThreads > 0) {
- need_idle = stg_max(0, enabled_capabilities -
- RtsFlags.ParFlags.parGcThreads);
+ // If -qn is not set and we have more capabilities than cores, set the
+ // number of GC threads to #cores. We do this here rather than in
+ // normaliseRtsOpts() because here it will work if the program calls
+ // setNumCapabilities.
+ n_gc_threads = RtsFlags.ParFlags.parGcThreads;
+ if (n_gc_threads == 0 && enabled_capabilities > getNumberOfProcessors()) {
+ n_gc_threads = getNumberOfProcessors();
+ }
+
+ if (gc_type == SYNC_GC_PAR && n_gc_threads > 0) {
+ need_idle = stg_max(0, enabled_capabilities - n_gc_threads);
} else {
need_idle = 0;
}
More information about the ghc-commits
mailing list