Profiling Optimised Code

Ashley Yakeley ashley@semantic.org
Sun, 03 Aug 2003 01:31:47 -0700


The profiling report from my program, compiled with GHC 6.0 using -O 
-fvia-C on Mac OS X, isn't making a lot of sense. The line with the most 
individual time, 30.1%, has an entry count of zero, as do most of the 
other lines:

    exprLetMap  Org.Org.Semantic.HScheme.Interpret.LambdaExpression  1939  0   0.3  0.0  30.3  0.0
     liftF2     Org.Org.Semantic.HBase.Category.Functor              1940  0   0.0  0.0  30.1  0.0
      liftF1    Org.Org.Semantic.HBase.Category.Functor              1941  0  30.1  0.0  30.1  0.0

liftF1 is also a fairly unlikely candidate for time consumption. It is 
defined as simply equal to fmap, which presumably doesn't have a 
cost-centre of its own because it's defined in the standard libraries. 
But the instance of fmap that it calls is fairly trivial.
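
For illustration, here is a minimal sketch of the situation described 
above, assuming liftF1 really is nothing more than fmap. The caller 
incrementAll and the cost-centre name "incrementAll_fmap" are 
hypothetical and are not taken from HBase or HScheme. A manual SCC 
annotation at the call site is one way to check which caller the time 
should be attributed to when the automatically generated cost-centres 
are hard to read:

```haskell
module Main (main) where

-- Assumption: liftF1 is defined as simply equal to fmap.
liftF1 :: Functor f => (a -> b) -> f a -> f b
liftF1 = fmap

-- Hypothetical caller, not from the original program.
-- The SCC pragma names a cost-centre for this one expression, so the
-- work done by fmap here is reported under "incrementAll_fmap".
incrementAll :: [Int] -> [Int]
incrementAll xs = {-# SCC "incrementAll_fmap" #-} liftF1 (+ 1) xs

main :: IO ()
main = print (sum (incrementAll [1 .. 100000]))
```

Built with profiling enabled (-prof, plus -auto-all on GHC of this 
era or -fprof-auto on later versions) and run with +RTS -p, the 
report should contain a line for the named cost-centre. Comparing that 
line between an optimised and an unoptimised build may show whether 
the optimiser has moved the cost somewhere unexpected.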

Am I misinterpreting the report, or would I be better off profiling an 
unoptimised program to see where the slow bits are in that? I have 
successfully improved code based on reports from unoptimised programs 
compiled with GHC 5.*, since those reports made much more sense.

-- 
Ashley Yakeley, Seattle WA