[Hs-Generics] Generic programming comparison suite

Mon Feb 12 03:30:24 EST 2007

Hi everyone, I have been working on an initial benchmarking suite for  
generic programs, based on the programs that Stephanie, James, Johan  
and others have committed to the repository. Most of the ideas here
were collected from the discussion on the list, so thanks to all!

We plan to use this as the starting point for the suite that will
allow us to compare the different approaches.

The main ideas of this benchmarking suite are the following:
  * The benchmark suite consists of a set of tests in the form of  
generic programs, e.g. gshow/gread, gmap and geq.
  * Each test consists of a set of modules written by different  
users, i.e. library writers, power users and end users.
  * Each test tries to solve a particular problem for the end-user,  
using the different generic programming approaches.
  * The end-user code for a test (the application of a specialized
generic function to a data type) is the same across approaches. API
differences (e.g. class-based approaches vs. approaches that take a
type representation argument) are hidden behind a wrapper that the
end-user code calls.
  * The rest of the code (generic function definition) for a test  
usually varies across approaches.
  * There will be one test per generic function proposed on the
mailing list. However, we will not include a test that gives the same
information as an earlier test for every approach, because such a
test cannot be used to distinguish approaches. For example, a generic
zip test won't be included if it doesn't say anything more than a
generic map.
  * The comparison results will use the wiki template.
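
As an illustration of the "same end-user code behind a wrapper" idea, here is a minimal single-file sketch. It is not taken from the repository: the Tree type, the names geqRep, GEq, equalTreeLIGD, equalTreeClass and testGEq are all hypothetical, and in the real suite each approach would presumably live in its own module exporting one common name. The sketch defines generic equality once in an LIGD-like style (explicit type representation) and once in a class-based style, then runs the same end-user test against both wrappers.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- End-user data type (hypothetical example).
data Tree = Leaf | Node Tree Int Tree

-- Structural view of Tree, shared by both approaches in this sketch.
fromTree :: Tree -> Either () (Tree, (Int, Tree))
fromTree Leaf         = Left ()
fromTree (Node l x r) = Right (l, (x, r))

toTree :: Either () (Tree, (Int, Tree)) -> Tree
toTree (Left ())            = Leaf
toTree (Right (l, (x, r)))  = Node l x r

-- Approach 1: explicit type representation argument (LIGD-like).
data Rep a where
  RInt  :: Rep Int
  RUnit :: Rep ()
  RSum  :: Rep a -> Rep b -> Rep (Either a b)
  RProd :: Rep a -> Rep b -> Rep (a, b)
  RType :: Rep b -> (a -> b) -> (b -> a) -> Rep a  -- isomorphism with a structure type

geqRep :: Rep a -> a -> a -> Bool
geqRep RInt          x        y        = x == y
geqRep RUnit         ()       ()       = True
geqRep (RSum ra _)   (Left x)  (Left y)  = geqRep ra x y
geqRep (RSum _ rb)   (Right x) (Right y) = geqRep rb x y
geqRep (RSum _ _)    _        _        = False
geqRep (RProd ra rb) (x1, y1) (x2, y2) = geqRep ra x1 x2 && geqRep rb y1 y2
geqRep (RType r from _) x     y        = geqRep r (from x) (from y)

rTree :: Rep Tree
rTree = RType (RSum RUnit (RProd rTree (RProd RInt rTree))) fromTree toTree

-- Approach 2: class-based dispatch on the structure type.
class GEq a where
  geq :: a -> a -> Bool

instance GEq Int where geq = (==)
instance GEq ()  where geq () () = True

instance (GEq a, GEq b) => GEq (Either a b) where
  geq (Left x)  (Left y)  = geq x y
  geq (Right x) (Right y) = geq x y
  geq _         _         = False

instance (GEq a, GEq b) => GEq (a, b) where
  geq (x1, y1) (x2, y2) = geq x1 x2 && geq y1 y2

instance GEq Tree where
  geq x y = geq (fromTree x) (fromTree y)

-- Wrappers: both expose the same type, hiding the API difference.
equalTreeLIGD :: Tree -> Tree -> Bool
equalTreeLIGD = geqRep rTree

equalTreeClass :: Tree -> Tree -> Bool
equalTreeClass = geq

-- End-user test code: identical for every approach, parameterised
-- only by the wrapper it is given.
testGEq :: (Tree -> Tree -> Bool) -> [Bool]
testGEq eq = [eq t1 t1, eq t1 t2]
  where
    t1 = Node Leaf 1 Leaf
    t2 = Node Leaf 2 Leaf

main :: IO ()
main = do
  print (testGEq equalTreeLIGD)   -- prints [True,False]
  print (testGEq equalTreeClass)  -- prints [True,False]
```

Because testGEq only sees a function of type Tree -> Tree -> Bool, any approach that can supply such a wrapper can be checked against the same expected result.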

I would like to emphasize the decision to use the same end-user
code in the different versions of a test. This ensures that, for
example, all variants of equality really do the same thing for the
end user. This becomes more useful when you consider Stephanie's
foldTree example versus the reduce function. Though they feel
similar, these functions would be used in different ways by the end
user, since they have different ways to specialise their behaviour.
This suggests that these two functions should not be compared against
each other; instead, they are two different test cases.

I think that requiring the reuse of end-user code gives a nice  
criterion for telling which things are comparable, and it will be  
useful should we become interested in comparing performance as well.

The tests that will become part of the suite are:
  * gshow/gread
  * greduce (generic reduce function)
  * foldTree
  * raise aka paradise aka update
  * generic map
  * generic equality
  * serialization (although similar to gshow/gread, the output is
influenced by the structure of the data type, which might tell us
interesting things)
  * gmin (though I am not sure how much more this tells us than gread)

The tests that will not become part of the suite are:
  * generic zip. What it would test is already covered: generic map
is already a function defined on kinds other than * , and generic
equality already takes two arguments (which is difficult to define in
some approaches).

Tests I plan to look into:
  * Oleg's int2float

You can get a "branch" of the generics repository by typing

 > darcs get http://www.cs.uu.nl/~alexey/repos/generics/

The testsuite is in the "comparison" directory.

To run the tests, type:

 > runghc tests.hs

The end-user code/drivers are in "comparison/TestGEq",
"comparison/TestGShow", etc.

The approach-specific code is in "comparison/LIGD/",
"comparison/SYB1_2", etc.

At the moment the SYB gshow code prints parentheses around nullary
constructors, so you will get a test failure, since LIGD doesn't. This is
something that the original code already did and I plan to fix it.
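
To make the parenthesisation difference concrete, here is a minimal sketch of a generic show that adds parentheses only around constructors that have arguments. It is not the code from the repository: it uses only Data.Data from base, and the names Tree, gshowsNoParens and gshowNoParens are hypothetical. Whether the outermost constructor application should be parenthesised is also an assumption here.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Hypothetical example type.
data Tree = Leaf | Node Tree Int Tree
  deriving (Data)

-- Generic show: a constructor with no arguments is printed bare,
-- a constructor with arguments is wrapped in parentheses.
gshowsNoParens :: Data a => a -> ShowS
gshowsNoParens t
  | null args = showString name
  | otherwise = showChar '('
              . showString name
              . foldr (.) id (map (showChar ' ' .) args)
              . showChar ')'
  where
    name = showConstr (toConstr t)
    -- Recursively show every immediate subterm.
    args = gmapQ gshowsNoParens t

gshowNoParens :: Data a => a -> String
gshowNoParens x = gshowsNoParens x ""

main :: IO ()
main = do
  putStrLn (gshowNoParens Leaf)                 -- prints Leaf
  putStrLn (gshowNoParens (Node Leaf 1 Leaf))   -- prints (Node Leaf 1 Leaf)
```

A version that always emits the parentheses would print (Leaf) for the first value, which is the kind of mismatch that makes the gshow test fail across approaches.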

I guess that the best license would be the BSD license that Manuel  
proposed, if the original authors agree, of course.

We would like to push it to the official repository eventually, but  
meanwhile we would like to hear your comments.

Cheers,

Alexey