Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)
Sergei Trofimovich
slyich at gmail.com
Wed Aug 6 20:40:36 UTC 2014
On Wed, 6 Aug 2014 22:15:34 +0300
Sergei Trofimovich <slyfox at gentoo.org> wrote:
> On Wed, 06 Aug 2014 09:30:45 +0200
> Joachim Breitner <mail at joachim-breitner.de> wrote:
>
> > Hi,
> >
> > the attached commit seems to have regressed the scs nofib benchmark by
> > ~3%:
> > http://ghcspeed-nomeata.rhcloud.com/timeline/?ben=nofib/time/scs&env=1#/?exe=2&base=2+68&ben=nofib/time/scs&env=1&revs=50&equid=on
>
> That's a test of compiled binary performance, not the compiler, right?
>
> > The graph unfortunately is in the wrong order, as the tool gets confused
> > by timezones and by commits with identical CommitDate, e.g. due to
> > rebasing. This needs to be fixed, I manually verified that the commit
> > below is the first that shows the above-noise-level-increase of runtime.
> >
> > (Other benchmarks seem to be unaffected.)
> >
> > Is this regression expected and intended or unexpected? Is it fixable?
> > Or is is this simply inexplicable?
>
> The graph looks mysterious (18 ms bump). Bencmark does not use haskell threads at all.
>
> I'll try to reproduce degradation locally and will investigate.
I think I know what happens. According to perf the benchmark spends 34%+
of time in garbage collection ('perf record -- $args'/'perf report'):
27,91% test test [.] evacuate
9,29% test test [.] s9Lz_info
7,46% test test [.] scavenge_block
And the whole benchmark runs a tiny bit more than 300ms.
It is exactly in line with major GC timer (0.3s).
If we run
$ time ./test inverter 345 10n 4u 1>/dev/null
multiple times there is heavy instability in there (with my patch reverted):
real 0m0.319s
real 0m0.305s
real 0m0.307s
real 0m0.373s
real 0m0.381s
which is +/- 80ms drift!
Let's try to kick major GC earlier instead of running right at runtime
shutdown time:
$ time ./test inverter 345 10n 4u +RTS -I0.1 1>/dev/null
real 0m0.304s
real 0m0.308s
real 0m0.302s
real 0m0.304s
real 0m0.308s
real 0m0.306s
real 0m0.305s
real 0m0.312s
which is way more stable behaviour.
Thus my theory is that my changed stepped from
"90% of time 1 GC run per run"
to
"90% of time 2 GC runs per run"
--
Sergei
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140806/db7182b8/attachment.sig>
More information about the ghc-devs
mailing list