[Haskell-cafe] Benchmarking and Garbage Collection

Thu Mar 4 13:16:58 EST 2010

Hi,

I'm looking at benchmarking several different concurrency libraries 
against each other.  The benchmarks involve things like repeatedly 
sending values between two threads.  I'm likely to use Criterion for the 
task.

However, one thing I've found is that the libraries have noticeably 
different behaviour in terms of the amount of garbage created.  
Criterion has an option to perform GC between each benchmark, but I 
think that the benchmark is only fair if it takes into account the GC 
time for each system; it doesn't seem right for two systems to be 
counted as equal if the times to get the results are the same, but then 
one has to spend twice as long as the other in GC afterwards.  Here are
some options I've considered:

* I could make a small change to Criterion to include the time for 
performing GC in each benchmark run, but I worry that running the GC so
often is also unrepresentative (might 100 small GCs take a lot longer
than one large GC of the same data?) -- it may also add a lot of
overhead to quick benchmarks, but I can avoid that problem.

* Alternatively, I could run the GC once at the end of all the runs, 
then apportion the cost equally to each of the benchmark times (so if 
100 benchmarks require 0.7s of GC, add 0.007s to each time) -- but if GC 
is triggered somewhere in the middle of the runs, that upsets the 
strategy a little.

* I guess a further alternative is to make each benchmark a whole 
program (and decently long), then just time the whole thing, rather than 
using Criterion.
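
To make the first two options concrete, here is a minimal hand-rolled
sketch.  It does not use Criterion's API at all; the names timed,
timeWithGC, timeAmortised and pingPong are made up for illustration, and
it assumes a base library recent enough to provide GHC.Clock (4.11 or
later).  The workload is a stand-in for "repeatedly sending values
between two threads", not any particular library's benchmark.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import GHC.Clock (getMonotonicTime)
import System.Mem (performGC)

-- | Wall-clock seconds taken by an action (hypothetical helper).
timed :: IO a -> IO Double
timed act = do
  start <- getMonotonicTime
  _ <- act
  end <- getMonotonicTime
  return (end - start)

-- | Option 1: force a major GC inside the timed region of every run.
timeWithGC :: IO a -> IO Double
timeWithGC act = timed (act >> performGC)

-- | Option 2: time n runs plus one GC at the end, then share the total
-- equally between the runs.  Any GC triggered in the middle of the runs
-- is simply included in the total.
timeAmortised :: Int -> IO a -> IO Double
timeAmortised n act = do
  performGC  -- start from a collected heap so earlier garbage is not charged
  total <- timed (replicateM_ n act >> performGC)
  return (total / fromIntegral n)

-- | Stand-in workload (an assumption): send n values from the calling
-- thread to a forked thread through an MVar, then wait for it to finish.
pingPong :: Int -> IO ()
pingPong n = do
  box <- newEmptyMVar
  done <- newEmptyMVar
  _ <- forkIO (replicateM_ n (takeMVar box) >> putMVar done ())
  mapM_ (putMVar box) [1 .. n]
  takeMVar done

main :: IO ()
main = do
  perRun <- timeWithGC (pingPong 10000)
  amortised <- timeAmortised 100 (pingPong 10000)
  putStrLn ("GC after every run:         " ++ show perRun ++ " s")
  putStrLn ("GC amortised over 100 runs: " ++ show amortised ++ " s per run")
```

Note that performGC forces a major collection, so the first figure
probably overstates the GC cost a library would incur when the runtime
is left to collect on its own schedule.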

Has anyone run into these issues before, and can anyone offer an opinion 
on what the best option is?

Thanks,

Neil.