[Haskell-cafe] Meaning abbreviations stat file GHC

Ron iampure at gmail.com
Sun Jan 14 18:01:22 EST 2007


2007/1/14, Kirsten Chevalier <catamorphism at gmail.com>:
> [redirecting to ghc-users]
>
> On 1/13/07, Ron <iampure at gmail.com> wrote:
> > Dear,
> >
> > I made a profile[1] of a test program:
> > Where can I find documentation for the meaning of everything mentioned
> > below? Or alternatively, can anyone explain them?
> >
> > Where can I see the effect of using the -xt option in this profile?
> >
> > Ron
> >
> > [1]
> > /Main +RTS -p -s -xt -hc
> > 1,372,408,024 bytes allocated in the heap
> > 121,255,600 bytes copied during GC (scavenged)
> >   6,584,692 bytes copied during GC (not scavenged)
> >   2,768,896 bytes maximum residency (68 sample(s))
> >
> >        2649 collections in generation 0 (  1.11s)
> >          68 collections in generation 1 (  0.49s)
> >
> >           6 Mb total memory in use
> >
> >   INIT  time    0.00s  (  0.00s elapsed)
> >   MUT   time    5.97s  (  6.63s elapsed)
> >   GC    time    1.60s  (  1.88s elapsed)
> >   RP    time    0.00s  (  0.00s elapsed)
> >   PROF  time    0.19s  (  0.20s elapsed)
> >   EXIT  time    0.00s  (  0.00s elapsed)
> >   Total time    7.76s  (  8.71s elapsed)
> >
> >   %GC time      20.6%  (21.5% elapsed)
> >
> >   Alloc rate    229,946,873 bytes per MUT second
> >
> >   Productivity  76.9% of total user, 68.5% of total elapsed
> >
>
> I don't think that the format of this file is documented anywhere
> (though it should be), but this information is really meant for GHC
> implementors. Have you looked at the chapter on profiling in the GHC
> manual yet?
> http://www.haskell.org/ghc/docs/latest/html/users_guide/profiling.html
Yes, I spent quite some time on it, and I have some experience with
cost centres from another project I worked on.

> There's also a very sketchy intro to profiling in the GHC Commentary:
> http://hackage.haskell.org/trac/ghc/wiki/Commentary/Profiling
> and you're welcome to improve it.
The ticky option was documented as being for implementors, IIRC. The
.stat file seemed to be intended for mere mortals, but maybe I
misunderstood.
>
> If the above documentation doesn't answer your questions, feel free to
> reply to this mailing list with more specific questions; it might help
> to explain exactly what you're trying to find out about your program's
> behavior.
Ok, the real question is that I want to measure the exact space and
time behaviour of a few implementations of algorithms for _scientific_
purposes (i.e. "experimental algorithmic research"). Measuring time in
a lazy language is somewhat hard, since you want to count neither the
time it takes to create the initial data structures nor the time it
takes to print the result.

With some help, a scheme was devised to solve this along these lines:

main = do
  f <- readFile "someFileWithData.dat"
  let g = buildDataStructure f
  print (g == g)            -- force graph construction
  let result = computeAnswer g
  print (result == result)  -- force the complete result

By repeating the various steps several times, it should be possible to
derive how long computeAnswer actually takes, since that's the number
I am interested in.
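As a sketch of one way to make that timing scheme concrete: the helper below forces the result fully (using the deepseq package, an assumption on my part; the scheme above only used (==) for forcing) and measures CPU time around just the computation. The names timeIt and the sample workload are purely illustrative, not part of the original scheme.

```haskell
import Control.DeepSeq (NFData, deepseq)
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Time a computation whose input is assumed to be fully evaluated
-- already. The result is forced with deepseq inside the timed
-- region, so no unevaluated thunks escape the measurement.
timeIt :: NFData b => (a -> b) -> a -> IO (b, Double)
timeIt f input = do
  start  <- getCPUTime
  result <- evaluate (f input)   -- force to WHNF
  result `deepseq` return ()     -- force completely
  end    <- getCPUTime
  -- getCPUTime reports picoseconds; convert to seconds
  return (result, fromIntegral (end - start) / 1e12)

main :: IO ()
main = do
  (r, secs) <- timeIt (sum . map (* 2)) [1 .. 100000 :: Int]
  print r
  print (secs >= 0)
```

Running the input list through its own forcing step first (as the scheme above does with print (g == g)) keeps the construction time out of the measurement.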

A similar thing can be done for space usage, I think, but since I
don't understand everything (and I mean everything) that is in the
.stat file, there's little point in starting the endeavour.

As I already mentioned, I wasn't able to see what the -xt option
changed in the .stat file, but it's important that I measure the
total amount of memory the expression uses (thus including the stack).

The real question thus is: how do I measure the time and space of a
single expression reliably and correctly, where all arguments to that
expression are already fully evaluated? In case you think that doing
the measurements this way is wrong, feel free to tell me how it is
usually done.

I am also a little unsure about how laziness works in, for example,
the State monad (building up giant unevaluated thunks is something I
want to avoid), or what difference it makes when I wrap monads in
monad transformers (e.g. MaybeT).
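To illustrate the thunk build-up in the lazy State monad, here is a small sketch (assuming the mtl package's Control.Monad.State; the counting example is mine, not from the thread): the lazy version chains a million (+1) thunks onto the state, while forcing each intermediate state with ($!) keeps the state a plain evaluated Int.

```haskell
import Control.Monad (replicateM_)
import Control.Monad.State

-- Lazy: each modify wraps another (+1) thunk around the state.
-- Forcing the final state evaluates the whole chain at once,
-- which can overflow the stack for large n.
countLazy :: Int -> Int
countLazy n = execState (replicateM_ n (modify (+ 1))) 0

-- Strict: ($!) forces each intermediate state before it is
-- stored, so no thunk chain ever accumulates.
countStrict :: Int -> Int
countStrict n = execState (replicateM_ n (get >>= \s -> put $! s + 1)) 0

main :: IO ()
main = print (countStrict 1000000)
```

The same concern applies under a transformer stack: wrapping State in another monad does not change whether the state itself is forced, only ($!) (or a strict variant of modify, where available) does that.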

In fact, I would like to be able to predict before the program runs
what its space behaviour will be, but maybe I am asking for too much.
I would like to _reason_ about the space usage, such that even when
GHC has optimisation turned off, I still _know_ and not _hope_ it uses
only a certain amount of memory, for example.  A natural question to
ask would be whether there's something wrong with Haskell if a user
can't do this.

I also wonder where the thunks I mentioned earlier actually end up in
memory (i.e. heap or stack), but that's just curiosity at the moment.

I read half of the STG-machine paper once, but I can't remember the
precise contents.

Regards, Ron
