[GHC] #15999: Stabilise nofib runtime measurements
GHC
ghc-devs at haskell.org
Thu Dec 13 09:47:05 UTC 2018
#15999: Stabilise nofib runtime measurements
-------------------------------------+-------------------------------------
Reporter: sgraf | Owner: (none)
Type: task | Status: new
Priority: normal | Milestone: ⊥
Component: NoFib benchmark | Version: 8.6.2
suite |
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #5793 #9476 | Differential Rev(s): Phab:D5438
#15333 #15357 |
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by sgraf):
Replying to [comment:2 osa1]:
> - I wonder if it'd be better to run the process multiple times, instead
of
> running the `main` function multiple times in the program. Why? That
way we
> know GHC won't fuse or somehow optimize the `replicateM_ 100` call in
the
> program, and we properly reset all the resources/global state (both
the
> program's and the runtime system's, e.g. weaks pointers, threads,
stable
> names). It just seems more reliable.
The whole point of this patch is that we iterate ''within'' the process,
so that GC paramerisation doesn't affect performance (counted
instructions, even) on the same program in 'unforeseen' ways, like the
discontinuous, non-monotonic paraffins example. There we would expect that
increasing nursery always leads to higher productivity, because the GC has
to run less often. That clearly isn't the case, due to effects outlined in
https://ghc.haskell.org/trac/ghc/ticket/5793#comment:38.
My hypothesis here is that fixing the productivity curve above has the
following effect on benchmarking a patch that changes allocations:
Basically, we fix the GC paramerisation and vary allocations instead of
fixing allocations and varying GC parameters. For a fixed GC
parameterisation (which is always the case when running NoFib), we would
expect an optimisation that reduces allocations by 10% to lead to less GC
time, thus higher productivity. Yet, in
https://ghc.haskell.org/trac/ghc/ticket/9476#comment:55, there is a
regression in total allocations by 18.6%, while counted instructions
improve by 11.7% (similar results when measuring actual runtime). As the
thread reveals, I had to do quite some digging to find out why
(https://ghc.haskell.org/trac/ghc/ticket/9476#comment:71): The program
produces more garbage, leading to more favorable points at which GC is
done. We don't want that to dilute our comparison!
This is also the reason why iterating the same program by restarting the
whole process is impossible: GC paramerisation is deterministic (rightly
so, IMO) and restarting the process will only measure the same dilusion
over and over.
On the other hand, iterating the logic $n times from within the program
leads to different GC states at which the actual benchmark logic starts,
thus leading to a more uniform distribution at the points in the program
when GC happens.
For every benchmark in the patch, I made sure that `replicateM_ 100` is
actually a sensible thing to do. Compare that to the current version of
[https://github.com/ghc/nofib/blob/f87d446b4e361cc82f219cf78917db9681af69b3/spectral/awards/Main.hs#L68
awards] where that is not the case: GHC will float out the actual
computation and all that is measured is `IO`. In paraffins I used
`relicateM_ 100` (or `forM_ [1..100] $ const`, rather, TODO), because the
inner action has a call to `getArgs`, the result of which is used to build
up the input data. GHC can't know that the result of `getArgs` doesn't
change, so it can't memoize the benchmark result and this measures what we
want. In other cases I actually had to use `forM_` and make use of the
counter somewhere in generating the input data.
> - You say "GC wibbles", but I'm not sure if these are actually GC
wibbles. I
> just checked paraffins: it doesn't do any IO (other than printing the
> results), and it's not even threaded (does not use threaded runtime,
does not
> do `forkIO`). So I think it should be quite deterministic, and I think
any
> wibbles are due to OS side of things. In other words, if we could have
an OS
> that only runs `paraffins` and nothing else I think the results would
be quite
> deterministic.
>
> Of course this doesn't change the fact that we're getting non-
deterministic
> results and we should do something about it, I'm just trying to
understand the
> root cause here.
The numbers are deterministic, but they are off in the sense above. By GC
wibble, I mean that varying tiny parameters of the program or the GC has
huge, non-monotonic, 'discontinuous' impact on the perceived performance,
which makes the benchmark suite unsuited to evaluate any optimisation that
changes allocations.
Replying to [comment:3 osa1]:
> I'm also wondering if having a fixed iteration number is a good idea.
What if,
> in 5 years, 100 iterations for paraffins is not enough to get reliable
numbers?
> I think `criterion` also has a solution for this (it does different
number of
> repetitions depending on results).
The point of iterating 100 times is not to adjust runtime so that we
measure something other than System overhead (type 1 in
https://ghc.haskell.org/trac/ghc/ticket/5793#comment:38). It's solely to
get smoother results in the above sense. There's still the regular
fast/norm/slow mechanism which are there to tune for specific runtimes,
which are quite likely to vary in the future.
Of course, we could just as well increment 100 to 243 for even smoother
results instead of modifying the actual input problem. But having it fixed
means that the curve for fast is as smooth as slow, which is a good thing,
I think. It means we can measure counted instructions on fast without
having to worry too much about said dilusions.
Replying to [comment:4 osa1]:
> Here's a library that can be useful for this purpose:
http://hackage.haskell.org/package/bench
Ah, yes. That could indeed be worthwhile as a replacement for the current
runner, but only if it would support measuring more than just time.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15999#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list