[GHC] #15999: Stabilise nofib runtime measurements
GHC
ghc-devs at haskell.org
Fri Dec 21 13:42:40 UTC 2018
#15999: Stabilise nofib runtime measurements
-------------------------------------+-------------------------------------
Reporter: sgraf | Owner: (none)
Type: task | Status: new
Priority: normal | Milestone: ⊥
Component: NoFib benchmark suite | Version: 8.6.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #5793 #9476 #15333 #15357 | Differential Rev(s): Phab:D5438
Wiki Page: |
-------------------------------------+-------------------------------------
Description changed by sgraf:
Old description:
> With Phab:D4989 (cf. #15357) having hit `nofib` master, there are still
> many benchmarks that are unstable. I identified three causes for
> instability in https://ghc.haskell.org/trac/ghc/ticket/5793#comment:38.
> With system overhead mostly out of the equation, there are still two
> related tasks left:
>
> 1. Identify benchmarks with GC wibbles. Plan: Look at counted
> instructions while varying heap size with just one generation. A wibbling
> benchmark should have quite diverse sampled maximum residency (as opposed
> to a microbenchmark, which should have quite stable instruction count).
>
> Then fix these by iterating `main` 'often enough'. Maybe look at total
> bytes allocated for that, we want this to be monotonically declining as
> the initial heap size grows.
> 2. Now, all benchmarks should have stable instruction count. If not,
> maybe there's another class of benchmarks I didn't identify yet in #5793.
> Of these benchmarks, there are a few, like `real/eff/CS`, that still have
> highly unstable runtimes. Fix these 'microbenchmarks' by hiding them
> behind a flag.
New description:
With Phab:D4989 (cf. #15357) having landed on `nofib` master, many
benchmarks are still unstable in one way or another. I identified three
causes of instability in
https://ghc.haskell.org/trac/ghc/ticket/5793#comment:38. With system
overhead mostly out of the equation, two related tasks remain:
1. Identify benchmarks with GC wibbles. Plan: Look at how the productivity
rate changes as the gen 0 heap (nursery) size increases. A GC-sensitive
benchmark should have a non-monotonic or discontinuous
productivity-rate-over-nursery-size curve. Then fix these by iterating
`main` often enough for the curve to become smooth and monotone.
2. After that, all benchmarks should have a monotonically decreasing
instruction count for increasing nursery sizes. If not, maybe there's
another class of benchmarks I didn't identify yet in #5793. Of these
benchmarks, there are a few, like `real/eff/CS`, whose runtimes are still
highly sensitive to code layout. Fix these 'microbenchmarks' by hiding
them behind a flag.
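
The two checks above could be sketched roughly as follows. This is only an
illustration, not part of nofib: the function names, the tolerance and jump
thresholds, and the sample numbers are all made up, and it assumes that
"productivity rate" means the mutator share of total time (as reported by
`+RTS -s`), sampled once per nursery size (`+RTS -A<size>`).

```python
from typing import List, Sequence, Tuple

# (nursery size in bytes, measured value); the value is either a
# productivity rate in [0, 1] or an instruction count.
Sample = Tuple[int, float]


def monotone_violations(samples: Sequence[Sample],
                        increasing: bool = True,
                        tolerance: float = 0.0) -> List[Tuple[int, int]]:
    """Return the adjacent nursery-size pairs where the curve moves in the
    wrong direction by more than `tolerance` (hypothetical helper)."""
    ordered = sorted(samples)
    sign = 1.0 if increasing else -1.0
    bad = []
    for (size_a, val_a), (size_b, val_b) in zip(ordered, ordered[1:]):
        if sign * (val_b - val_a) < -tolerance:
            bad.append((size_a, size_b))
    return bad


def largest_jump(samples: Sequence[Sample]) -> float:
    """Largest absolute change between adjacent samples, used here as a
    crude stand-in for 'discontinuous'."""
    ordered = sorted(samples)
    return max((abs(b[1] - a[1]) for a, b in zip(ordered, ordered[1:])),
               default=0.0)


def is_gc_sensitive(productivity: Sequence[Sample],
                    tolerance: float = 0.01,
                    max_jump: float = 0.10) -> bool:
    """Task 1: flag a benchmark whose productivity curve is non-monotonic
    or jumps. Both thresholds are assumptions, not tuned values."""
    return (bool(monotone_violations(productivity, True, tolerance))
            or largest_jump(productivity) > max_jump)


MIB = 1 << 20
# Invented sample data, for illustration only.
smooth = [(1 * MIB, 0.80), (2 * MIB, 0.84), (4 * MIB, 0.87), (8 * MIB, 0.89)]
wibbly = [(1 * MIB, 0.80), (2 * MIB, 0.70), (4 * MIB, 0.88), (8 * MIB, 0.86)]
instructions = [(1 * MIB, 1000.0), (2 * MIB, 950.0), (4 * MIB, 970.0)]

print(is_gc_sensitive(smooth))   # False
print(is_gc_sensitive(wibbly))   # True
# Task 2: instruction counts should decrease as the nursery grows.
print(monotone_violations(instructions, increasing=False))
```

A benchmark flagged by `is_gc_sensitive` would be a candidate for iterating
`main` more often; one that still shows violations in the instruction-count
check afterwards would fall into the "another class" case of task 2.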
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15999#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler