profiling experience

Serge D. Mechveliani mechvel at botik.ru
Wed Dec 6 05:21:18 EST 2006


People, 

I describe below my small experience with the  time profiling in 
ghc-6.6.
To make sure, consider first the following simple example:

-------------------------------------------------------
import List (sort)

main = {-# SCC "main" #-}
       let n = 5*10^4  :: Int
       in
       putStr (shows (f [1 .. n]) "\n")

f :: [Int] -> Int
f xs =
       {-# SCC "f" #-}
       case xs of []    -> 0
                  x: ys -> (g x) + (h x) + (f ys)
                  where
                  g x = {-# SCC "g" #-} last $ sort [0 .. (rem x 10)]
                  h x = {-# SCC "h" #-} last $ sort [0 .. (rem x 20)]
-------------------------------------------------------

Making:   ghc -O -prof --make Main
Running:  ./Main +RTS -p -M100m -RTS

This produces the report (its indentation shows which center is called 
from which):

--------------------------------------------------------------------
	total time  =        3.00 secs   (60 ticks @ 50 ms)
	total alloc = 130,608,300 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc

h                              Main                  65.0   67.3
g                              Main                  33.3   30.9
main                           Main                   1.7    1.5
                                                                                               individual    inherited
COST CENTRE              MODULE                                               no.    entries  %time %alloc   %time %alloc

MAIN                     MAIN                                                   1           0   0.0    0.0   100.0  100.0
 main                    Main                                                 162           1   0.0    0.0     0.0    0.0
 CAF                     Main                                                 156           3   0.0    0.0   100.0  100.0
  main                   Main                                                 163           0   1.7    1.5   100.0  100.0
   f                     Main                                                 164       50001   0.0    0.3    98.3   98.5
    h                    Main                                                 166       50000  65.0   67.3    65.0   67.3
    g                    Main                                                 165       50000  33.3   30.9    33.3   30.9
 CAF                     GHC.Handle                                           109           3   0.0    0.0     0.0    0.0
--------------------------------------------------------------------

To my mind, these numbers look all right: like I expected.
Only I do not so firmly understand what is CAF, a 
"constant application form", a data which has cost, is assigned cost,
but is out of the stack of calls.

Is this profiling all right?

Then, I need to find the main time eater in a certain concrete 
(and complex) example. Its profiling looks strange.
I set SCC in all valuable inner loops in the modules  Main.hs
and  LemmaSearch.hs. 
It shows 
---------------------------------------------
           LemSearchList +RTS -p -M200m -RTS

        total time  =      112.00 secs   (2240 ticks @ 50 ms)
        total alloc = 3,813,365,740 bytes  (excludes profiling overheads)

COST CENTRE                    MODULE               %time %alloc

CAF                            Main                  95.5   96.1
splitDisjunctScheme            LemmaSearch            2.5    2.2
addEquationsWithReduction      LemmaSearch            1.3    1.4
---------------------------------------------

My idea was that some of the loops inside  LemmaSearch  must take 
almost all time.
Instead, it shows that some  CAF  (anonymous, to me) in  Main  takes 
almost everything, and among places which I marked with SCC, the two 
are shown as 2.5% and 1.4%, and all the rest shown as  zero:
---------------------------------------------------
MAIN                     MAIN                                                  \ 1           0   0.0    0.0   100.0  100.0

 CAF                     Main                                                 2
80          57  96.0   96.1    99.9   99.9
...
---------------------------------------------------

The whole user project is build under  -prof
("auto-all" skipped). The test Main.hs is built also under  -prof.
SCC are set only inside the functions  Main.main  and 
LemmaSearch.searchLemmataInInitial.
What may consitute this strange CAF cost of 96% ?

Kirsten Chevalier <catamorphism at gmail.com>
wrote
> I didn't look at your code all that carefully, but did you build the
> GHC libraries with "-prof -auto-all"? (Not just "-prof".) If you don't
> build the libraries with "-auto-all", then cost centres won't get
> inserted for library functions, and if it's really a standard library
> function that's taking all of that time, the profiling report won't
> indicate that.

I made ghc-6.6 from the official source in a standard way:

  ./configure ...;  make;  make install

Does this presume that this also generates the .p.o GHC library versions
for -prof -auto-all ?

Generally, I do not guess, so far, what is happening.

Regards,

-----------------
Serge Mechveliani
mechvel at botik.ru


More information about the Glasgow-haskell-users mailing list