RTS changes affect runtime when they shouldn’t

Joachim Breitner mail at joachim-breitner.de
Wed Sep 20 16:11:05 UTC 2017


While keeping an eye on the performance numbers, I notice a pattern
where basically any change to the RTS makes some benchmarks go up or
down by a significant percentage. Recent example: a commit which
exposed an additional secure modular power function in integer (and
which should really not affect any of our test cases) causes these
changes:

Benchmark name      prev    change      now
nofib/time/FS       0.434   -  4.61%    0.414   sec
nofib/time/VS       0.369   + 15.45%    0.426   sec
nofib/time/scs      0.411   -  4.62%    0.392   sec
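(As a sanity check, the percentage column is consistent with the raw
timings; a quick verification in Python, with the values taken from
the table above:)

```python
# Verify that each "change" entry matches (now - prev) / prev * 100
# for the three benchmark rows quoted above.
rows = [
    ("nofib/time/FS",  0.434, 0.414),
    ("nofib/time/VS",  0.369, 0.426),
    ("nofib/time/scs", 0.411, 0.392),
]

for name, prev, now in rows:
    change = (now - prev) / prev * 100
    print(f"{name}: {change:+.2f}%")
# nofib/time/FS: -4.61%
# nofib/time/VS: +15.45%
# nofib/time/scs: -4.62%
```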

The new effBench benchmarks (FS, VS) are particularly often
affected, but also old friends like scs, lambda, integer…

In a case like this I can see that the effect is spurious, but it
really limits our ability to properly evaluate changes to the compiler
– in some cases it makes us cheer about improvements that are not
really there, in other cases it makes us hunt for ghosts.

Does anyone have a solid idea what is causing these differences? Are
they specific to the builder for perf.haskell.org, or do you observe
them as well? And what can we do here?

For the measurements in my thesis I switched to measuring instruction
counts (using valgrind) instead. These are much more stable, require
only a single NoFibRun, and the machine does not have to be otherwise
quiet. Should I start using these on perf.haskell.org? Or would we lose
too much by not tracking actual running times any more?
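(For context: one way to get such counts is to run the benchmark under
valgrind's cachegrind tool and read the instruction-fetch total ("I
refs") it prints on stderr. A minimal parsing sketch follows; the
sample line is illustrative of cachegrind's output shape, not taken
from an actual run:)

```python
import re

def parse_instruction_count(cachegrind_stderr: str) -> int:
    """Extract the total instruction count from cachegrind's
    'I refs' summary line (digits are comma-separated)."""
    m = re.search(r"I\s+refs:\s+([\d,]+)", cachegrind_stderr)
    if m is None:
        raise ValueError("no 'I refs' line found in cachegrind output")
    return int(m.group(1).replace(",", ""))

# Illustrative stderr fragment of the shape cachegrind prints:
sample = "==1234== I   refs:      1,234,567"
print(parse_instruction_count(sample))  # → 1234567
```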


Joachim Breitner
  mail at joachim-breitner.de