[Haskell-cafe] [4/16] SBM: How to use the Makefile (how to run benchmarks etc.)

Sat Dec 22 04:17:01 EST 2007

Introduction
------------
Most of the smarts of the benchmark harness are in the Makefile.
If you want to rerun the benchmarks (or a single benchmark), or look at the
intermediate code for a benchmark, its I/O trace, its memory consumption, the
time spent, and so on, then you use the Makefile.

There are some support scripts in shell and Perl (and two C programs) that the
Makefile uses to do its job.  There are also some that you, the user, will want
to interact with directly.

The benchmarks are only expected to work on Linux.  They have been tested on
SuSE 8.2 (from 2003, with a 2.4 kernel), Ubuntu 7.04, and Ubuntu 7.10.

Quick howto
-----------

 make phase1	-- compiles, generates test files, measures memory use.
		   Safe to run on a busy machine if there's no active memory
		   pressure.
 make phase2	-- timing runs.  NOT safe to run on a busy machine.
		   Should be run in runlevel 1 (= no X, no daemons, single-user
		   mode) for best measurements.
		   Outputs report at end.  This is where you check the quality
		   of the measurements.  If you don't like them, run 'make
		   redophase2' (or delete the low-quality .time and .stat files
		   and run 'make phase2' again).
 make zipdata	-- make a tarball with all the measurements, suitable for
		   emailing or putting on a website.

The Makefile will beep after phase 1 and 2.

The above performs a "NORMAL" run, which is fine during development if you
want to see if you nailed a performance bug.  It runs reasonably fast (about
43 seconds on my Athlon64).

If you want better measurements, you should use:

 make TESTKIND=THOROUGH phase1 phase2 zipdata

This will use a 150MB data file instead of a 15MB one, and it will run each
timing measurement 6 times instead of 4.  In both cases the first run is
thrown away.

If you don't want to use single-user mode, you can improve the measurements
by piping the output to a file (or by running the test from the console)
instead of involving a terminal and an X server.  A screen update may kick in
in the middle of a timing run and disturb things, if for no other reason than
by polluting the CPU caches.

Filesystem layout
-----------------
The benchmarks are in:
  hs/*.hs
  c/*.c
  hand/*.s

  hand/*.hs and hand/*.c are not compiled.  The two *.hs files are the
  originals from which the tweaked assembly code has been derived.  The two
  *.c files are sketches of how the MMX tweaks work (because MMX code by itself
  can be a bit off-putting).

These are the support scripts:
  tools/genfiles.pl	-- generate the test input files.

  tools/cutmem.pl
  tools/cutpid.pl	-- both are used to disentangle the outputs of strace
                           and pause-at-end (see below).  I combine strace,
                           memory info, and +RTS -sstderr into a single run to
                           save time.  This means that things end up in fewer
                           files than I'd like.

  tools/cut.pl		-- cut out main loop from disassembly ('make discut')

  tools/stat.pl		-- looks at all timings for a single benchmark and
  			   calculates average and standard deviation and "time
  			   slack", that is the discrepancy between user+sys and
  			   real.  It optionally throws away the first run.

  tools/eatmem.c	-- allocates a chunk of memory and makes damn sure
  			   it really is in RAM!
  tools/pause-at-end.c	-- part of a hack that copies /proc/self/maps and
			   /proc/self/status to stderr just before a benchmark
			   exits.

  tools/iosummary.pl	-- takes an strace and sums up the I/O
  tools/genreport.pl	-- generate a nice report with bar charts.
			   Takes way too many options in the form of
			   environment variables.

  tools/regenreport.sh	-- regenerates the report from ANY measurement tarball.

  tools/merge.pl	-- merge data from many measurement tarballs, with or
  			   without rescaling.
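
To make the numbers reported by tools/stat.pl more concrete, here is a minimal
sketch in Python of that kind of calculation.  It is not the actual Perl
script.  The function name, the input format (one (real, user, sys) tuple per
run, in seconds), averaging over the real time, the use of the sample standard
deviation, and defining "time slack" as real - (user + sys) are all
assumptions made for illustration.

```python
import math


def summarize_timings(runs, drop_first=True):
    """Summarize timing runs for one benchmark (hypothetical helper).

    runs: list of (real, user, sys) tuples in seconds, one per run.
    Returns (mean_real, stddev_real, mean_slack).
    """
    if drop_first and len(runs) > 1:
        # Treat the first run as a warm-up and throw it away.
        runs = runs[1:]

    reals = [real for real, _user, _sys in runs]
    n = len(reals)
    mean = sum(reals) / n

    # Sample standard deviation (divide by n - 1); assumed, not confirmed
    # to be what stat.pl uses.
    if n > 1:
        variance = sum((x - mean) ** 2 for x in reals) / (n - 1)
    else:
        variance = 0.0
    stddev = math.sqrt(variance)

    # "Time slack": wall-clock time not accounted for by user + sys time.
    slacks = [real - (user + sys) for real, user, sys in runs]
    mean_slack = sum(slacks) / n

    return mean, stddev, mean_slack
```

A large standard deviation or a large slack suggests that something else was
running during the measurement, which is the kind of low-quality result you
would want to redo.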

Generated files:
  hs/*.core hs/*.stg hs/*.cmm hs/*.s	-- intermediate code
  hs/*.hi		-- "Haskell Interface"
  */*.o			-- object code
  */*  (the files in $(HSPROGS) $(CPROGS) $(HANDPROGS)) -- programs
  */*.dis */*.discut	-- disassembled programs (and inner loops)
  */*.doc		-- source + intermediate code + inner loops + timings

  */*.mem		-- output from '+RTS -sstderr' + /proc/self/status +
			   /proc/self/maps + output from /usr/bin/time (where
			   the number of minor page faults is the most
			   interesting datum)
  */*.strace		-- complete strace, taken together with */*.mem
  */*.iotrace		-- only I/O operations from the strace (read/write/
			   select)
  */*.iosum		-- summary of I/O operations
  */*.time		-- time measurements
  */*.stat		-- average + std.dev. + "time slack"
  */*.srctimespace	-- source code + time/mem barchart (in ASCII)

  sysinfo		-- description of the platform (uname, ghc, gcc, etc)
  platforminfo		-- short description of the platform
  report.txt [8-16K]
  docs       [1MB]	-- sysinfo + all */*.doc concatenated
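
As an illustration of how a summary like */*.iosum can be derived from an
strace, here is a minimal sketch in Python.  It is not tools/iosummary.pl.
The function name, the per-(call, fd) grouping, and the assumption that the
trace lines have the usual strace shape "read(3, ..., 4096) = 4096" are mine,
and select calls are simply skipped.

```python
import re
from collections import defaultdict

# Matches lines such as:   read(0, "abc"..., 4096) = 4096
# An optional leading pid (as printed by 'strace -f') is tolerated.
IO_LINE = re.compile(r'^(?:\d+\s+)?(read|write)\((\d+),.*\)\s+=\s+(-?\d+)')


def summarize_io(lines):
    """Sum up read/write calls per (syscall, fd) pair (hypothetical helper).

    Returns a dict mapping (call, fd) to {"calls": int, "bytes": int}.
    """
    summary = defaultdict(lambda: {"calls": 0, "bytes": 0})
    for line in lines:
        m = IO_LINE.match(line)
        if not m:
            continue  # not a read/write line (e.g. select, mmap, ...)
        call = m.group(1)
        fd = int(m.group(2))
        ret = int(m.group(3))
        entry = summary[(call, fd)]
        entry["calls"] += 1
        if ret > 0:
            # A negative return value is an error, not a byte count.
            entry["bytes"] += ret
    return dict(summary)
```

Dividing bytes by calls for each file descriptor gives the average block
size, which is a quick way to spot a benchmark that does many tiny reads or
writes.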

Makefile targets
----------------
This is taken from 'make help':

  phase1 -- preparation + measurements that can run in background
  phase2 -- measurements that should run on unloaded machine
  redophase2 -- rerun phase2

  doc, [ASCII=1] report, lastreport - reports
  zipdata -- zip up measurements (to ghc-measurements.tar.gz)

  prog,core,stg,cmm,asm,dis,discut
     -- compile, compile to core/stg/cmm/asm, disassemble, cut out main loop
  time,stat,mem,strace,iotrace,iosum,cache
     -- measure run-time, GHC heap + OS mem, syscalls, I/O patterns, cache

  cleartime, clean, distclean -- delete measurements etc

  TESTKIND=(SMOKETEST,NORMAL,THOROUGH), defaults to NORMAL
  STRACE=OLD, defaults to NEW

Necessary tools
---------------
Perl, sed, /usr/bin/time, bash (doesn't have to be the default shell as long as
it is in PATH), strace, GNU Make, objdump (from the binutils package), gcc.

Other things that could come in handy:
  A console and/or terminal that understands UTF-8.
  A less that understands UTF-8.
  An editor that understands UTF-8.
  A %!&# printing program that understands both UTF-8 and fonts.  A2ps
    doesn't do UTF-8.  Uniprint used 1) a proportional font which 2) didn't
    even have all the fractional-width blocks.  U2ps used a font with none of
    the block characters.  I ended up resorting to gedit's print function :(

That's enough for this email.

-Peter