Profiling Optimised Code
Ashley Yakeley
ashley@semantic.org
Sun, 03 Aug 2003 01:31:47 -0700
The profiling report from my program, compiled with GHC 6.0 using -O
-fvia-C on Mac OS X, isn't making a lot of sense. The line with the most
time, 30.1%, has an entry count of zero, as do most of the other lines:
exprLetMap Org.Org.Semantic.HScheme.Interpret.LambdaExpression 1939 0 0.3 0.0 30.3 0.0
liftF2 Org.Org.Semantic.HBase.Category.Functor 1940 0 0.0 0.0 30.1 0.0
liftF1 Org.Org.Semantic.HBase.Category.Functor 1941 0 30.1 0.0 30.1 0.0
It's also a fairly unlikely candidate for time consumption. liftF1 is
defined as simply equal to fmap, which presumably doesn't have a
cost-centre because it's defined in the standard libraries. But the
instance of fmap that it calls is fairly trivial.
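To make the shape of the code concrete, here is a minimal sketch rather
than the real source. Only liftF1 being equal to fmap comes from the
description above; the Wrap type, unwrap, and the explicit SCC
annotation are hypothetical, added purely for illustration.

```haskell
module Main where

-- As described above: liftF1 is just fmap under another name.
-- The explicit SCC annotation is an assumption for illustration;
-- it is ignored unless the program is compiled for profiling.
liftF1 :: Functor f => (a -> b) -> f a -> f b
liftF1 f x = {-# SCC "liftF1" #-} fmap f x

-- Hypothetical trivial functor, standing in for the real instance.
newtype Wrap a = Wrap a

instance Functor Wrap where
  fmap f (Wrap a) = Wrap (f a)

-- Hypothetical helper to get the value back out.
unwrap :: Wrap a -> a
unwrap (Wrap a) = a

main :: IO ()
main = print (unwrap (liftF1 (+ 1) (Wrap (41 :: Int))))
```

Nothing in a definition of this shape looks expensive, which is why
the 30.1% figure is surprising.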
Am I misinterpreting the report, or would I be better off profiling an
unoptimised build of the program to see where the slow bits are? I have
successfully improved code based on reports from unoptimised programs
from GHC 5.*, since those reports made much more sense.
--
Ashley Yakeley, Seattle WA