[Haskell-cafe] [5/16] SBM: Support scripts and scriptlets

Peter Firefly Brodersen Lund firefly at vax64.dk
Sat Dec 22 04:17:03 EST 2007


Some of the scripts warrant a closer look.

'make zipdata'
  creates a nice tarball with all the data necessary to recreate a report AND
  to merge that report together with other reports, possibly with rescaled
  bar charts.  Very handy.

  All the files in the tarball are inside the 'ghc-measurements/' directory, so
  there is less risk of things going wrong when unpacking the tarball.

  The names of the benchmarks are put in ghc-measurements/progs, mainly to
  ensure they end up in the right order when regenerating and merging reports.

tools/genreport.pl [list of benchmarks to put in the report]
  It doesn't parse the command-line in any way because life is too short for
  command-line parsing.  Instead, it is controlled via (too many) environment
  variables.

  ASCII - set to avoid using UTF-8 for bar charts and the "per mille" character.
  NOSRC - the tool normally creates */*.srctimemem files containing the source
	  code for each benchmark with bar charts for time/mem appended to the
	  end.  Setting this variable switches that off (necessary when
	  regenerating and merging reports).
  EXCLUDE - disregard some of the benchmarks on the command line.  Why is this
	  necessary?  Because it makes regenerating and merging reports easier.
	  And because I was too lazy to filter the command line in
	  tools/regenreport.sh and tools/merge.pl.

  FINDMAX - used by tools/merge.pl when rescaling.  Outputs max time and max
	    mem to stdout instead of the normal report.
  MAX_FILEWIDTH - used by tools/merge.pl to make merged reports look nice
  MAX_TIME,
  MAX_PEAKMEM - used by tools/merge.pl when rescaling

  Note that strictly speaking, there is a bug in the script(s): they conflate
  the width of the time/mem measurements represented as numbers (which you
  always want to take into account when merging) with MAX_TIME/MAX_PEAKMEM
  (which you only care about when rescaling).
  [FIXED now - 2007-12-21]
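
  The environment-variable control described above can be sketched roughly as
  follows.  This is an illustration in Python, not the Perl of the actual
  tool.  The helper names (read_settings, select_benchmarks) are hypothetical,
  and two details are assumptions: that a flag counts as "set" whenever the
  variable is present, and that EXCLUDE holds a whitespace-separated list of
  benchmark names.

```python
import os


def _optional(env, name, convert):
    """Return convert(value) if the variable is set and non-empty, else None."""
    raw = env.get(name)
    return convert(raw) if raw else None


def read_settings(env=os.environ):
    """Collect the report settings from environment variables.

    Assumption: presence of a flag variable switches it on, and EXCLUDE is
    a whitespace-separated list of benchmark names.
    """
    return {
        "ascii": "ASCII" in env,
        "nosrc": "NOSRC" in env,
        "findmax": "FINDMAX" in env,
        "exclude": set(env.get("EXCLUDE", "").split()),
        "max_filewidth": _optional(env, "MAX_FILEWIDTH", int),
        "max_time": _optional(env, "MAX_TIME", float),
        "max_peakmem": _optional(env, "MAX_PEAKMEM", float),
    }


def select_benchmarks(argv, settings):
    """Drop excluded benchmarks while keeping the command-line order."""
    return [name for name in argv if name not in settings["exclude"]]
```

  Passing a plain dictionary instead of os.environ makes the sketch easy to
  try out without touching the real environment.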

tools/regenreport.sh
  unpacks a measurement tarball into a tmp directory and runs
  tools/genreport.pl to generate the report.
  Takes care not to disturb the normal files.
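
  The unpack-into-a-scratch-directory idea can be sketched like this.  It is
  a Python illustration rather than the actual shell script, the function
  names are hypothetical, and the format of the progs file (names separated
  by whitespace or newlines) is an assumption.

```python
import os
import tarfile
import tempfile


def unpack_measurements(tarball_path):
    """Unpack a measurement tarball into a fresh temporary directory.

    Working in a scratch directory leaves the normal files untouched.
    Returns the path of the unpacked 'ghc-measurements' directory.
    """
    tmpdir = tempfile.mkdtemp(prefix="measurements-")
    with tarfile.open(tarball_path) as tar:
        tar.extractall(tmpdir)
    return os.path.join(tmpdir, "ghc-measurements")


def read_benchmark_names(measurement_dir):
    """Read the benchmark names from the progs file, preserving their order.

    Assumption: names are separated by whitespace or newlines.
    """
    with open(os.path.join(measurement_dir, "progs")) as handle:
        return handle.read().split()
```

  The caller is responsible for removing the temporary directory once the
  report has been generated.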

tools/merge.pl [tarballs]
  Uses tools/regenreport.sh on each tarball in turn to generate a report, which
  it reads in and stores on a benchmark-by-benchmark basis.  At the end, it
  combines all the pieces it cut out of the original report(s) into a
  brand-spanking new, merged report.

  Even the headers and the platforminfo at the top of each report are cut out
  and stored in data structures until they get spit out again at the end.

  The reading magic is in the state machine in gather().  It is not as bad as
  it looks.  Some of the complications arise from marking repeated benchmark
  names as ' -- ', which improves the readability of the merged reports
  immensely.  Others arise because not all tarballs contain the exact same
  benchmarks!  A benchmark that is missing from a tarball gets a nice 'n/a'
  instead of numbers and a bar.  And finally, the benchmarks should be in the
  right order.  That is trickier than it sounds...
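
  The merging rules can be illustrated with a small Python sketch.  It is not
  the state machine in gather(); it only shows the output conventions, under
  two assumptions: that a merged report prints one line per tarball for each
  benchmark, with the name shown on the first line and ' -- ' on the rest,
  and that benchmarks are ordered by first appearance across the tarballs
  (the real ordering logic may well differ).  The merge_rows name and the
  row layout are hypothetical.

```python
def merge_rows(reports):
    """Merge per-tarball results into one list of report rows.

    reports: list of (label, {benchmark: (time, peakmem)}) in tarball order.
    Returns rows of (shown_name, label, time_text, peakmem_text).
    """
    # Assumed ordering rule: first appearance across all tarballs.
    order = []
    for _, results in reports:
        for name in results:
            if name not in order:
                order.append(name)

    rows = []
    for name in order:
        for index, (label, results) in enumerate(reports):
            # Show the name once, then mark the repeats.
            shown = name if index == 0 else " -- "
            if name in results:
                time, peakmem = results[name]
                rows.append((shown, label, f"{time:.2f}", str(peakmem)))
            else:
                # This tarball has no measurement for the benchmark.
                rows.append((shown, label, "n/a", "n/a"))
    return rows
```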

  When rescaling, tools/regenreport.sh is first run once for each tarball with
  the FINDMAX environment variable set.  This results in tools/regenreport.sh
  outputting the maximum filename width, time, and peakmem for each tarball.

  ASCII - use ASCII instead of UTF-8
  RESCALE - sometimes you want to rescale and sometimes you don't
  MAX_FILEWIDTH - if you want to force a specific width
  MAX_TIME,
  MAX_PEAKMEM   - if you want to force a specific max
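
  The two-pass rescaling can be sketched as follows: first collect the
  per-tarball maxima, then draw every bar against the overall maximum so
  that bars from different tarballs are comparable.  This is a Python
  illustration with hypothetical names; the bar width of 40 cells and the
  fill characters are assumptions, not what the real scripts use.

```python
def global_max(per_tarball_max):
    """Combine per-tarball (filewidth, time, peakmem) maxima into one triple."""
    widths, times, mems = zip(*per_tarball_max)
    return max(widths), max(times), max(mems)


def bar(value, max_value, width=40, ascii_only=False):
    """Draw a bar whose length is proportional to value / max_value."""
    fill = "#" if ascii_only else "\u2588"  # assumed fill characters
    if max_value <= 0:
        return ""
    cells = round(width * value / max_value)
    return fill * cells
```

  For example, with per-tarball maxima of (10, 1.5, 100) and (12, 0.5, 300),
  every time bar would be scaled against 1.5 and every memory bar against
  300.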

-Peter

