difficult profiling example
Serge D. Mechveliani
mechvel at botik.ru
Sat Dec 9 08:19:26 EST 2006
This is again on the time profiling in ghc-6.6.
Who could, please, guess what is happening?
------------------------------------------------------------------------
main = putStr (showsList newProveds "\n")
where
...
((_, newProveds), _) =
{-# SCC "main.searchLemmataInInitial" #-}
searchLemmataInInitial opts rc rcPerLemma initState
searchLemmataInInitial proverOpts rc rcPerLemma searchState =
(let ((state', newProveds), rcRem) =
---------- {-# SCC "searchLemmata" #-}
searchLemmata ((searchState, []), rc)
cost = lsisCost searchState -- data field selection
in
((state' {lsisCost = cost + rc - rcRem}, newProveds), rcRem)
----------
-- only newProveds is used in the test in `main'
)
where
...
searchLemmata extState@ ((searchState, newProveds), rc) =
{-# SCC "searchLemmata 2" #-}
(case splitCs
of
... <complex definitions>
)
where
... <complex definitions>
--------------------------------------------------------------------
ghc-6.6 is built with -prof -auto-all,
so that the ghc library functions are shown as the cost centers too
(I doubt of whether this can help me).
Making: ghc $compilationOptions -prof --make Main
Running ./Main +RTS -p -M200m -RTS
(this is for -Onot,
because -O takes too long, and commonly produces a similar report for
this kind of example).
In this test, newProveds occurs [],
and it is printed out as the only result.
My user's SCC centers are set only at the three points shown above.
The profiling shows
----------------------------------------------------------------
CAF Main 2\60 89 0.0 0.0 99.9 99.9
...
main.searchLemmataInInitial Main \ 347 1 24.1 26.8 99.8 99.8
...
searchLemmata 2 Main 3\49 114 1.3 1.3 4.2 3.6
...
searchLemmata Main 3\48 1 0.0 0.0 0.0 0.0
...
------------------------------------------------------------
What I expected
---------------
1. "main.searchLemmataInInitial" inherits 99% of time.
2. searchLemmata is a loop inside searchLemmataInInitial
which does everything, so it inherits 99% of time
(it has SCC too).
What is reported
----------------
6 lines for the SCC "main.searchLemmataInInitial"
-- while in reality it is called only once.
I do not know why they are so many, but let it be.
Exactly one of these lines is "non-zero":
main.searchLemmataInInitial Main
347 1 24.1 26.8 99.8 99.8
Its individual time % is 24.1, inherited 99.8.
A single line for "searchLemmata" (main loop).
searchLemmata Main
348 1 0.0 0.0 0.0 0.0
-- entered 1 time from `main',
(but the function searchLemmata calls itself many times -- see
"searchLemmata 2").
A single line for "searchLemmata 2"
(its SCC is in the beginning of the definition of searchLemmata):
searchLemmata 2 Main
349 114 1.3 1.3 4.2 3.6
It is probable that it is entered 114 times.
But its inherited time 4.2 % is un-probable, it must be near 100%.
Because, who has taken a great cost of
time(searchLemmataInInitial) - time(searchLemmata)
?
This difference must be close to zero, because it is defined
let ((..., newProveds), ...) = searchLemmata ((searchState, []), rc)
...
in ((..., newProveds), ...),
and only newProveds is used in the test in `main'.
What computation remains after searchLemmata ?
Only extracting snd . fst from a pair of pairs.
The result is [], so `shows' and the output cost almost zero.
So, the profiling says actually that this snd . fst costs 99%.
I do not understand this profiling, so far.
It also shows many GHC and Haskell library centers, like this:
----------------------------------------------------------
total time = 228.40 secs (4568 ticks @ 50 ms)
total alloc = 4,272,700,464 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
main.searchLemmataInInitial Main 24.1 26.8
. GHC.Base 7.6 4.9
insert Data.Set 6.2 1.4
foldlStrict Data.Set 5.6 0.0
hedgeUnion Data.Set 5.4 6.3
map GHC.Base 5.2 6.5
trim Data.Set 4.6 0.0
...
and also about 6000 of expanded lines.
----------------------------------------------------------
And this all does not explain why the inherited time % for
"searchLemmata 2" is 4.2 instead of 99.
Probably, compiling GHC under -auto-all does not help my aim,
but neither it seems creating extra difficulties.
Another curious point is, for example, that Base.map is said to
take a considerable time when entered from (.) (composition).
In my user program, this is applied as compose $ map shows xs
for the need of showing a list -- probably, the only usage.
And it costs zero in this example, because nothing is shown except
[].
Maybe, the GHC library uses it.
My program contains many places with foldl, which I am going, maybe,
to replace with foldr. But I do not think that this will change
essentially the profiling for this particular example.
Who can, please, tell what I may be missing?
Thank you in advance,
-----------------
Serge Mechveliani
mechvel at botik.ru
More information about the Glasgow-haskell-users
mailing list