profiling mutator time per function vs. GC time

Janis Voigtlaender
Fri, 08 Nov 2002 15:52:54 +0100

Simon Marlow wrote:
> If you run a profiled program with +RTS -sstderr, the time breakdown
> includes an extra category, PROF, which counts the time spent in the
> heap profiler.  The amount of GC time consumed by the profiled program
> will indeed be different from the unprofiled program, because of
> profiling overheads - there's no way around this, I'm afraid.  But you
> may find that the ratio of mutator to GC time in the profiled program is
> similar to the unprofiled program (I'd be interested to know whether
> this is/is not the case).

It seems I stumbled on a situation where this is not the case. The
following is a simplified version of my program:

> module Main where
> f xs = fst (go xs (id,id)) []
>	 where go (x:xs) ys = go xs ((\(rw,rl) -> (\zs->x:rl zs,\zs->rw (x:zs))) ys)
>              go []     ys = ys
> main = do print (let r = (map f (replicate 3000 (replicate 3000 0)))
>	           in  r==r)
Without profiling support, I get:

~> ghc-5.04.1 -O ratioGC.lhs 
~> ./a.out +RTS -sstderr
  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    8.18s  (  8.20s elapsed)
  GC    time    2.84s  (  3.13s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   11.02s  ( 11.33s elapsed)

When compiled for profiling, I get:

~> ghc-5.04.1 -O ratioGC.lhs -prof -auto-all
~> ./a.out +RTS -sstderr
  INIT  time    0.01s  (  0.01s elapsed)
  MUT   time   19.94s  ( 21.38s elapsed)
  GC    time   17.96s  ( 19.70s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   37.91s  ( 41.09s elapsed)

So, the MUT vs. GC time ratio varies much with the program compiled for
profiling support.

My actual aim is to compare different implementations of f with respect
to both runtime for expression evaluation, and induced amount of GC. To
measure really just the runtime for f (in contrast to the test frame
with generation of test data and consumption via the forced (==)-check),
I wanted to use cost centre profiling, but the big change in GC
behaviour forbids this. Adding the GC time from the first run (without
profiling) and the mutator time for cost centre f from a profiled run is
also not an option, because the time attributed to f by the profiler is
bigger than the overall runtime of the unprofiled program, so the scale
differs completely.

I guess I will settle for more accurate total runtimes (including GC
time) without cost centre profiling, and hence without seperation of
algorithm from test infrastructure.


Janis Voigtlaender