how much can %alloc in profiling be trusted

Simon Peyton-Jones simonpj@microsoft.com
Thu, 10 Jul 2003 10:37:56 +0100


It should be accurate.  Hard to say more without investigating your
program in detail.

Try -ddump-simpl (with your profiling flags) to see how your function
looks just before code generation.

Simoin

| -----Original Message-----
| From: glasgow-haskell-users-admin@haskell.org
[mailto:glasgow-haskell-users-admin@haskell.org]
| On Behalf Of Hal Daume
| Sent: 09 July 2003 23:17
| To: glasgow-haskell-users@haskell.org
| Subject: how much can %alloc in profiling be trusted
|=20
| In my program, I get the following time/allocation information for a
| call to my cosine function:
|=20
|                              individual      inherited
| COST CENTRE   no. entries  %time  %alloc  %time  %alloc
| cosine               2721  43.1   74.3    43.1   74.3
|=20
| this is a shocking amount of allocation, considering the definition of
| the function:
|=20
| cosine :: Vector -> Vector -> Double
| cosine v1 v2 =3D
|   case dot v1 v2 0 of
|     n | n <=3D 0    -> 0
|       | otherwise -> n / (size v1 * size v2)
|   where
|     dot [] _  n =3D n
|     dot _  [] n =3D n
|     dot ((x,xv):xs) ((y,yv):ys) n =3D
|       case x `compare` y of
|         EQ -> dot xs ys $! (xv * yv + n)
|         LT -> dot xs ((y,yv):ys) n
|         GT -> dot ((x,xv):xs) ys n
| --    size =3D sqrt . sum . map (square . snd)
|     size l =3D sqrt $! size' l 0
|     size' [] n =3D n
|     size' ((_,x):xs) n =3D size' xs $! n + x*x
|=20
| where Vector =3D [(Int,Double)] is a sparse vector representation.  =
This
| was even higher (moderately) until I switched from the old to the new
| definition of size listed there.
|=20
| You can't blame this on the fact that the two vectors are being passed
| lazily either: they're being read strictly from a file and even seq'd
| before being put into the list.  Specifically, we have (v0 is passed
in
| to the function from top-level):
|=20
|   ... do
|     v <- readList_H h
|     return (cosine v0 v, w)
|=20
| where readList_H is defined as:
|=20
| readList_H h =3D do
|   b <- FastIO.isEOF h
|   c <- FastIO.fscanfChar h
|   if c /=3D ' '
|     then return []
|     else do
|       b <- FastIO.isEOF h
|       i <- FastIO.fscanfInt h
|       FastIO.fscanfChar h
|       v <- FastIO.fscanfFloat h
|       rest <- readList_H h
|       return ((force (force i,force $ floatToDouble v)) : rest)
|   where force x =3D x `seq` x
|=20
| as far as i can tell, all the list allocation should be happening
here.
| (the forces were not there in the beginning -- I added them later but
it
| changed nothing.)
|=20
|  - Hal
|=20
| --
|  Hal Daume III                                   | hdaume@isi.edu
|  "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume
| _______________________________________________
| Glasgow-haskell-users mailing list
| Glasgow-haskell-users@haskell.org
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users