Confused by profiling
Simon Marlow
simonmar@microsoft.com
Fri, 14 Sep 2001 16:53:27 +0100
> I am confused by the GHC profiling output. I have put some abstracts
> below to illustrate my confusion.
>=20
> I compiled my code with:
> HFLAGS=3D"-Wall -package lang -prof -auto-all" hmake -ghc main
> and then ran it with:
> ./main -t +RTS -p < test5.hs
>=20
> Here is some of main.prof:
> individual inherited
> COST CENTRE MODULE entries %time %alloc =20
> %time %alloc
>=20
> [...]
> parse Parser 1 0.0 0.0 =20
> 81.4 62.3
> p_module Parser 0 0.0 0.0 =20
> 81.4 62.3
> <|> PC_base 69517 20.3 5.7 =20
> 81.4 62.3
> [...]
> p_aexp_list Parser 0 0.0 0.0 =20
> 0.0 0.3
> con2tag_Token# Tokens 880 0.0 0.0 =20
> 0.0 0.0
> pIfThing PC_base 896 0.0 0.0 =20
> 0.0 0.0
> <$ PC 0 0.0 0.0 =20
> 0.0 0.3
> <$> PC 0 0.0 0.0 =20
> 0.0 0.3
> pSucceed PC_base 896 0.0 0.3 =20
> 0.0 0.3
> <*> PC_base 896 0.0 0.0 =20
> 0.0 0.0
> <*> PC_base 896 0.0 0.0 =20
> 0.0 0.0
> p_aexp_tuple Parser 0 0.0 0.0 =20
> 0.0 0.3
> [...]
>=20
> and here is the code for p_aexp_list:
>=20
> p_aexp_list :: Parser Token Aexp
> p_aexp_list =3D (Aexp_list . get_rights)
> <$ pIfThing ((=3D=3D) (Special '['))
> <*> pSList False p_comma p_exp
> <* pIfThing ((=3D=3D) (Special ']'))
>=20
> Now basically my problem is how things under p_aexp_list can=20
> be entered
> 896 times if p_aexp_list is entered 0 times - am I reading it wrong?
Hmmm. I suspect it may be to do with the fact that p_aexp_list isn't a
function, it's a CAF (constant applicative form - basically a top-level
expression). The "entries" field counts the number of times an SCC
annotation is executed, which corresponds more to "calls" than entries.
I must admit I haven't thought too hard about this.
It might be helpful to see what happens to the code inside the compiler:
p_aexp_list =3D {-# SCC "p_aexp_list" #-} ...
this SCC annotation gets executed once, when p_aexp_list is first
entered, and you'll probably find it elsewhere on the profile under one
of the CAF entries. The first time it is entered, p_aexp_list reduces
to a function closure with the SCC annotation attached to it, so
subsequent enters don't actually execute the SCC they just append it to
the current cost-centre-stack.
Perhaps the act of appending an SCC annotation when entering a function
closure should bump the entry counts for all the cost centres in the
stack. That might yield more reasonable-looking figures.
Cheers,
Simon