[Haskell-cafe] proposal: HaBench, a Haskell Benchmark Suite

Sun Jan 28 03:51:27 EST 2007

Hi,

> Following up on the threads on haskell and haskell-cafe, I'd like
> to gather ideas, comments and suggestions for a standardized Haskell
> Benchmark Suite.
>
> The idea is to gather a bunch of programs written in Haskell, and  
> which are representative for the Haskell community (i.e. apps,  
> libraries, ...). Following the example of SPEC (besides the fact  
> that the SPEC benchmarks aren't available for free), we would like  
> to build a database containing performance measurements for the  
> various benchmarks in the suite. Users should be able to submit  
> their results. This will hopefully stimulate people to take  
> performance into account when writing a Haskell program/library,  
> and will also serve as a valuable tool for further optimizing both  
> applications written in Haskell and the various Haskell compilers  
> out there (GHC, jhc, nhc, ...).
>
> This thread is meant to gather people's thoughts on this subject.
> Which programs should we consider for the first version of the  
> Haskell benchmark suite?
> How should we standardize them, and make them produce reliable
> performance measurements?
> Should we only use hardware performance counters, or also do more
> thorough analysis such as data locality studies, ...?
> Are there any papers available on this subject (I know about the
> paper which is being written as we speak for ICFP, which uses PAPI
> as a tool)?

I think that we should have, as David Roundy pointed out, a  
restriction to code that is actually used frequently. However, I  
think we should make a distinction between micro-benchmarks, which
test some specific item, and real-life benchmarks. With
micro-benchmarks, the wrong conclusions may be drawn because, e.g.,
code or data can be completely cached, there are no TLB misses after
startup, etc. I think that if somebody is interested in knowing how
Haskell performs, and whether to use it for their development, it is
nice to know that, e.g., Data.ByteString performs as well as C, but it
would be even nicer to see that large, real-life apps can reach that same
performance. There is more to the Haskell runtime than simply  
executing application code, and these things should also be taken  
into account.

Also, I think that having several compilers for the benchmark set is
a good idea, because, afaik, each can come with a different runtime
system as well. We know that in Java, the VM can have a significant
impact on how a program behaves on the microprocessor. I think that
Haskell may have similar issues.

Also, similar to SPEC CPU, it would be nice to have input sets for
each benchmark that is included in the set. Furthermore, I think
that we should provide a rigorous analysis of the benchmarks on as
many platforms as is feasible. See, e.g., the analysis done for the
DaCapo Java benchmark suite, published at OOPSLA 2006.

-- Andy