[Haskell-cafe] [5/16] SBM: Support scripts and scriptlets
Peter Firefly Brodersen Lund
firefly at vax64.dk
Sat Dec 22 04:17:03 EST 2007
Some of the scripts warrant a closer look.
'make zipdata'
creates a nice tarball with all the data necessary to recreate a report AND
to merge that report together with other reports, possibly with rescaled
bar charts. Very handy.
All the files in the tarball are inside the 'ghc-measurements/' directory, so
there is less risk of things going wrong when unpacking the tarball.
The names of the benchmarks are put in ghc-measurements/progs, mainly to
ensure they end up in the right order when regenerating and merging reports.
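The single-directory layout is easy to verify before unpacking. A minimal sketch in Python (not part of the SBM scripts, which are Perl and shell); the helper names stray_members and check_tarball are hypothetical, and it assumes member names are stored without a leading './':

```python
import tarfile

ROOT = "ghc-measurements"  # directory name taken from the text above


def stray_members(names, root=ROOT):
    """Return the member names that are not inside the root directory."""
    return [n for n in names if n != root and not n.startswith(root + "/")]


def check_tarball(path):
    """Open a tarball and list any members that live outside the root directory."""
    with tarfile.open(path) as tar:
        return stray_members(tar.getnames())
```

An empty list from check_tarball() means that everything unpacks below 'ghc-measurements/'.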
tools/genreport.pl [list of benchmarks to put in the report]
It doesn't parse the command line in any way because life is too short for
command-line parsing. Instead, it is controlled via (too many) environment
variables.
ASCII         - set to avoid using UTF-8 for bar charts and the "per mille"
                character.
NOSRC         - the tool normally creates */*.srctimemem files containing the
                source code for each benchmark with bar charts for time/mem
                appended to the end. Setting this variable switches that off
                (necessary when regenerating and merging reports).
EXCLUDE       - disregard some of the benchmarks on the command line. Why is
                this necessary? Because it makes regenerating and merging
                reports easier. And because I was too lazy to filter the
                command line in tools/regenreport.sh and tools/merge.pl.
FINDMAX       - used by tools/merge.pl when rescaling. Outputs max time and
                max mem to stdout instead of the normal report.
MAX_FILEWIDTH - used by tools/merge.pl to make merged reports look nice
MAX_TIME,
MAX_PEAKMEM   - used by tools/merge.pl when rescaling
Note that, strictly speaking, there is a bug in the script(s): they conflate
the width of the time/mem measurements represented as numbers (which you
always want to take into account when merging) with MAX_TIME/MAX_PEAKMEM
(which you only care about when rescaling).
[FIXED now - 2007-12-21]
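To illustrate the "environment variables instead of command-line parsing" approach, here is a minimal sketch in Python rather than the Perl of the real tools/genreport.pl. The variable names come from the list above. Everything else is an assumption made for the example: that "set" means "present in the environment", that EXCLUDE holds whitespace-separated benchmark names, and the helper names read_settings and select_benchmarks.

```python
import os


def read_settings(env=None):
    """Collect report settings from environment variables (names from the post)."""
    env = os.environ if env is None else env

    def number(name):
        # Assumption: an unset or empty variable means "no forced maximum".
        value = env.get(name, "")
        return float(value) if value else None

    return {
        "ascii": "ASCII" in env,      # flag variables: presence is enough
        "nosrc": "NOSRC" in env,
        "findmax": "FINDMAX" in env,
        # Assumption: EXCLUDE is a whitespace-separated list of names.
        "exclude": set(env.get("EXCLUDE", "").split()),
        "max_filewidth": int(env["MAX_FILEWIDTH"]) if env.get("MAX_FILEWIDTH") else None,
        "max_time": number("MAX_TIME"),
        "max_peakmem": number("MAX_PEAKMEM"),
    }


def select_benchmarks(args, settings):
    """Keep the command-line order, dropping anything listed in EXCLUDE."""
    return [name for name in args if name not in settings["exclude"]]
```

Passing a plain dict instead of os.environ makes the sketch easy to try out, for example read_settings({"ASCII": "1", "EXCLUDE": "b c"}).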
tools/regenreport.sh
unpacks a measurement tarball into a tmp directory and runs
tools/genreport.pl to generate the report.
Takes care not to disturb the normal files.
tools/merge.pl [tarballs]
Uses tools/regenreport.sh on each tarball in turn to generate a report which
it reads in and stores on a benchmark-by-benchmark basis. At the end,
it combines all the pieces it cut out of the original report(s)
into a brand-spanking new, merged report.
Even the headers and the platforminfo at the top of each report are cut out
and stored in data structures until they get spit out again at the end.
The reading magic is in the state machine in gather(). It is not as bad as
it looks. Some of the complications arise from marking repeated benchmark
names as ' -- ', which improves the readability of the merged reports
immensely. Another part of the complications arises because not all tarballs
contain the exact same benchmarks! A benchmark that is missing from a tarball
gets a nice 'n/a' instead of numbers and a bar. And finally, the benchmarks
should be
in the right order. That is trickier than it sounds...
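The two presentation rules described above (repeated names shown as ' -- ', missing results shown as 'n/a') can be sketched as follows. This is Python, not the state machine in gather(), and it sidesteps the hard parts: it assumes the reports have already been parsed into one dictionary per tarball and that a combined benchmark order is already known. The function name merge_rows and the row layout are hypothetical.

```python
def merge_rows(order, reports):
    """Build merged report rows.

    order   -- benchmark names in the order they should appear
    reports -- list of (label, {benchmark: measurement_text}), one per tarball
    """
    rows = []
    for name in order:
        for index, (label, results) in enumerate(reports):
            # Show the benchmark name once; mark the repeats with ' -- '.
            shown = name if index == 0 else " -- "
            # A tarball without this benchmark gets 'n/a' instead of numbers.
            rows.append((shown, label, results.get(name, "n/a")))
    return rows
```

For two tarballs A and B where only A contains "queens", the rows for that benchmark come out as ("queens", "A", ...) followed by (" -- ", "B", "n/a").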
When rescaling, tools/regenreport.sh is first run once for each tarball with
the FINDMAX environment variable set. This results in tools/regenreport.sh
outputting the maximum filename width, time, and peakmem for each tarball.
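The idea behind rescaling is that every bar in the merged report is drawn against the same maximum, so bars from different tarballs are comparable. A minimal Python sketch of that idea, assuming the per-tarball FINDMAX output has already been parsed into (filewidth, time, peakmem) tuples; the output format of the real script is not shown here, and the names global_max and bar, the bar width of 20, and the fill characters are all made up for the example.

```python
def global_max(per_tarball_max):
    """Combine the (filewidth, time, peakmem) maxima reported for each tarball."""
    widths, times, mems = zip(*per_tarball_max)
    return max(widths), max(times), max(mems)


def bar(value, max_value, width=20, ascii_only=True):
    """Draw a bar whose length is proportional to value / max_value."""
    if max_value <= 0:
        return ""
    fill = "#" if ascii_only else "\u2588"  # full block when UTF-8 is allowed
    return fill * round(width * value / max_value)
```

With maxima (10, 1.5, 300) and (14, 4.0, 120) the shared scale is (14, 4.0, 300), so a 1.0 s run is drawn as a quarter-length bar in every report.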
tools/merge.pl is itself controlled via environment variables:
ASCII         - use ASCII instead of UTF-8
RESCALE       - sometimes you want to rescale and sometimes you don't
MAX_FILEWIDTH - if you want to force a specific width
MAX_TIME,
MAX_PEAKMEM   - if you want to force a specific max
-Peter