New core profiling mode

Douglas Wilson douglas.wilson at
Tue Nov 14 22:34:25 UTC 2017

Hi ghc-devs,

I've been working on a new mode of adding cost-centres to programs and I'd
like to ask some questions and solicit some feedback. The code is here;
it works, provided one enables -fprof-core on all modules.

I've recently been trying to pick some low-hanging fruit in GHC
performance. A common frustration was the difference between profiled and
non-profiled builds. Often I thought I had found a problem in the profiled
build, only to find it was optimized away in the non-profiled build. Several
times the issue was tail calls not happening in profiled builds.
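
As an illustration only (this program is not from the patch or from #12893),
a strict accumulating loop like the one below is the kind of code where one
might compare the two builds. The names sumTo and go are made up for the
example, and whether the profiled build actually loses the tail call here
depends on the GHC version and flags, so treat it as a starting point for
inspecting the output of -ddump-prep with and without -prof -fprof-auto.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hypothetical example: a strict, tail-recursive sum of 1..n.
-- In an optimized non-profiled build the local loop 'go' is expected to
-- compile to a tight loop; a cost-centre placed on 'go' may change that.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go :: Int -> Int -> Int
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = print (sumTo 1000000)
```

Compiling once with "ghc -O2 -ddump-prep" and once with
"ghc -O2 -prof -fprof-auto -ddump-prep" and diffing the two dumps is one way
to see how much the inserted cost-centres change the program.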

To solve this problem I've been working on a new way of inserting
cost-centres: adding them to core after simplification (currently at the
end of corePrepPgm) rather than adding them to HsSyn before simplification.
This makes it harder to map cost-centres into source code (you currently
have to use -ddump-prep), but in exchange you are profiling the same core
program as the non-profiled build.

I intend to investigate whether I can use SourceNotes to create SrcSpans
for the generated cost-centres, to somewhat alleviate the need to inspect
dumped core.

There are several new flags:

-fprof-core: Enables the aforementioned mode. This is mutually exclusive
  with -fprof-auto etc.

-fprof-core-drop-ticks: Non-user ticks are dropped from unfoldings (though I
  don't know how to do this yet).

-fprof-core-tick-binds: ticks are inserted around the RHS of bindings
  (except top-level unlifted bindings).

-fprof-core-tick-cases: ticks are inserted around the scrutinees of cases.

-fprof-core-tick-alts: ticks are inserted around Alt expressions (unless
  there is only one).
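
To make the three tick-placement flags concrete, here is a minimal sketch on
a toy expression type. Everything in it (Expr, Opts, insertTicks, the label
strings) is hypothetical and is not GHC's Core or the code in the patch; it
only models where a tick would be wrapped, and it ignores the exception for
top-level unlifted bindings.

```haskell
module TickSketch where

-- Toy expression type for illustration only; this is NOT GHC's Core.
data Expr
  = Var String
  | Lit Int
  | App Expr Expr
  | Let String Expr Expr          -- let v = rhs in body
  | Case Expr [(String, Expr)]    -- scrutinee, (pattern name, alternative)
  | Tick String Expr              -- cost-centre label around an expression
  deriving (Eq, Show)

-- Hypothetical stand-ins for the three -fprof-core-tick-* flags.
data Opts = Opts
  { tickBinds :: Bool
  , tickCases :: Bool
  , tickAlts  :: Bool
  }

-- Wrap an expression in a tick only when the flag is on.
tickIf :: Bool -> String -> Expr -> Expr
tickIf on label e = if on then Tick label e else e

-- Walk the expression and wrap binding RHSs, case scrutinees and case
-- alternatives according to the options. Alternatives are only ticked
-- when the case has more than one of them.
insertTicks :: Opts -> Expr -> Expr
insertTicks opts = go
  where
    go (App f x) = App (go f) (go x)
    go (Let v rhs body) =
      Let v (tickIf (tickBinds opts) ("bind:" ++ v) (go rhs)) (go body)
    go (Case scrut alts) =
      Case (tickIf (tickCases opts) "scrut" (go scrut))
           [ (p, tickIf (tickAlts opts && length alts > 1) ("alt:" ++ p) (go e))
           | (p, e) <- alts ]
    go (Tick l e) = Tick l (go e)
    go e = e

-- let x = 1 in case x of { A -> 2; B -> 3 }
example :: Expr
example = Let "x" (Lit 1) (Case (Var "x") [("A", Lit 2), ("B", Lit 3)])
```

With all three options on, insertTicks wraps the RHS of x, the scrutinee,
and both alternatives of example; with all of them off it returns the
expression unchanged.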

Some questions:

I need to strip (probably only non-user) ticks out of unfoldings before
they are substituted into a module that uses -fprof-core. Where is the
right place to do this? I need inlining to proceed exactly as if the ticks
were not present, however I don't want to strip ticks when the unfoldings
are created as other modules may still need them.
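
The stripping step itself can be sketched on a toy expression type. The
types and the function stripAutoTicks below are hypothetical and are not
GHC's Core or its unfolding machinery; the sketch only shows the intended
behaviour (drop automatically inserted ticks, keep user ones) and says
nothing about where in the compiler it should run.

```haskell
module StripTicks where

-- Who introduced a tick: the user (e.g. an SCC pragma) or the compiler.
data Origin = User | Auto
  deriving (Eq, Show)

-- Toy expression type for illustration only; this is NOT GHC's Core.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | Tick Origin String Expr
  deriving (Eq, Show)

-- Remove every compiler-inserted tick, leaving user ticks in place.
stripAutoTicks :: Expr -> Expr
stripAutoTicks (Tick Auto _ e) = stripAutoTicks e
stripAutoTicks (Tick User l e) = Tick User l (stripAutoTicks e)
stripAutoTicks (App f x)       = App (stripAutoTicks f) (stripAutoTicks x)
stripAutoTicks (Lam v b)       = Lam v (stripAutoTicks b)
stripAutoTicks e@(Var _)       = e
```

Applying this to a copy of the unfolding at the point of use, rather than
when the unfolding is created, would leave the ticks intact for modules
that still need them.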

Is the end of corePrepPgm the right place to insert the cost-centres? I
chose it because it can't affect any core optimizations if it's last, but
perhaps it could be earlier, or perhaps it needs to act on Stg?

Do you have any examples of programs for which existing profiling tools are
inadequate due to how cost-centres affect simplification? There is an
example in #12893 but something self-contained would be great!

Doug Wilson