heap profiling

Evan Laforge qdunkan at gmail.com
Tue Jun 15 20:52:43 EDT 2010


I've been trying to profile my program and having some trouble.  I
imagine other people have stumbled across these same issues, so
hopefully I can benefit from someone else's experience.  In return, if
I can figure out some of these things, I'll put it up at
http://www.haskell.org/haskellwiki/How_to_profile_a_Haskell_program or
maybe make make a heap profiling page linked from that.

Firstly a few miscellaneous questions:

When running '-hc -hblag,drag' it works for a little while and then
stops.  The app is still hung, but cpu has gone to 0%.  The disk is
also idle, so I don't think it's swapping.  According to -S, all
garbage collection has stopped too.  It's apparently due to something
about this particular profile, since reducing the amount of data it
handles just results in a sooner hang.  This same combination works
with other profiles, so apparently something the code is doing is
locking up.  Has anyone else seen this?  Any tips on how to
troubleshoot where it's getting stuck, doing what?  If it sounds like
a ghc bug I can try to trim down the size and submit a ticket.  GHC
6.12.1 on OSX.

The image link from
http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/profiling.html
is broken, which makes it a little harder to understand the
documentation.

-s stats say GC time is 46%, productivity is 32%.  That's pretty bad,
right?  And where is the remaining 22% going?

The ghc manual says lag is "time between creation and first use" and
drag is "time between last use and last reference is dropped".  I
think I understand drag: it means you haven't touched the data for a
while before it goes out of scope or its retainer goes out of scope.
So for instance:

stuff = do
  st <- get
  x1 st
  x2

Does this mean that 'st' will be dragging through 'x2' as it would in
an imperative language?  I gather the usual haskell use is when
combined with laziness: 'Map.insertWith (+) "a" 42' will keep the old
value of "a" around until you look "a" up and force it, since the (+)
won't be applied until then.  Same with foldl.  Are there some other
classic examples of drag?  Searching for "haskell dragging problem"
doesn't yield much to do with memory use.

Lag I'm not so sure about.  How is something created before it's used?

And... what's INHERENT_USE?  And what about VOID?  How can an object
be created but never used?  What triggered its creation?



So, the main question:

I have a program that runs some computation in a monad stack before
extracting the final result, a list, and passing it on.  When run
under the heap profiler, there's a huge spike in that phase, which I
think should be mostly bogus usage, since the final output is so
relatively small.  When I run under -hb I see big bands for LAG and
DRAG.

According to -hd the top 3 users are:

<mtl-1.1.0.2:Control.Monad.Writer.Lazy.sat_sltc>
(,)
D:Monad

This is kind of puzzling to me... first of all I've never seen an
explanation for sat_* closure descriptors, and second of all, why does
it not show up in the .prof file at all?  I switched to Writer.Strict
and the drag disappeared, which helped, but the lag is still there,
and the top -hd is now

<mtl-1.1.0.2:Control.Monad.Writer.Strict.sat_soa1>
State
stg_ap_2_upd_info

(the top -hy is "*", which I gather means "don't know").  And what's
"stg_ap_2_upd_info"?  The top item accounts for 70% of the memory
usage.

One obvious candidate for the lag is Writer's data (DList Log.Msg) is
collecting and only being forced at the end of the computation, but
there is no logging in this section and in any case should never be
30M of it!  -hc is not helpful since every monadic operation is
charged a little bit, -hr is similarly unhelpful (top retainer is
MANY... hah).

So what exactly is this sat_*?  Where is the memory going?  I guess it
doesn't have an SCC since it doesn't show up in the .prof output.  Is
there some place I can manually put an SCC?  I was able to fix the
drag just by guessing at a strict writer, but the lag is still around.
 Is there another spot in Writer's >>= that could be accumulating?
What's *in* that giant mountain of lag?


More information about the Glasgow-haskell-users mailing list