Looking for advice on profiling
duncan.coutts at worc.ox.ac.uk
Tue Nov 9 12:04:02 EST 2004
On Tue, 2004-11-09 at 14:45, Simon Marlow wrote:
> On 09 November 2004 12:54, Duncan Coutts wrote:
> > When I do time profiling, the big cost centres come up as putByte and
> > putWord. When I profile for space it shows the large FiniteMaps
> > dominating most everything else. I originally guessed from that that
> > the serialisation must be forcing loads of thunks which is why it
> > shows up so highly on the profile. However even after doing the
> > deepSeq before serialisation, it takes a great deal of time, so I'm
> > not sure what's going on.
> let's get the simple things out of the way first: make sure you're
> compiling Binary with -O -funbox-strict-fields (very important). When
> compiling for profiling, don't compile Binary with -auto-all, because
> that will add cost centres to all the small functions and really skew
> the profile. I find this is a good rule of thumb when profiling: avoid
> -auto-all on your low-level libraries that you hope to be inlined a lot.
Ok, I was missing -funbox-strict-fields. I'll try that.
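
For reference, a minimal sketch of what that flag acts on. The Cursor type
and its functions are hypothetical, made up for illustration; they are not
taken from the Binary library. The point is only that strict (!) fields are
the ones -funbox-strict-fields can store directly in the constructor when
compiling with optimisation.

```haskell
{-# OPTIONS_GHC -funbox-strict-fields #-}
module Main where

-- Hypothetical write cursor, NOT a type from the Binary library.
-- Both fields are strict (!), so with -funbox-strict-fields and -O the two
-- Ints can be stored unboxed in the constructor instead of behind pointers.
data Cursor = Cursor !Int !Int   -- current offset, buffer size

-- Move one byte forward, stopping at the end of the buffer.
advance :: Cursor -> Cursor
advance (Cursor off size)
  | off < size = Cursor (off + 1) size
  | otherwise  = Cursor off size

-- Offset reached after n calls to advance.
offsetAfter :: Int -> Cursor -> Int
offsetAfter n c = case iterate advance c !! n of
  Cursor off _ -> off

main :: IO ()
main = print (offsetAfter 1000 (Cursor 0 4096))
```

The same effect can be requested per field with an UNPACK pragma; the flag
just applies it to every strict field in the module.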
> You say your instances are created using DrIFT - I don't think we ever
> modified DrIFT to generate the right kind of instances for the Binary
> library in GHC, so are you using the instances designed for the nhc98
> binary library? If so, make sure your instances are using put_ rather
> than put, because the former will allow binary output to run in constant
> stack space.
It's using put_
> Are you using BinMem, or BinIO?
> > The retainer profiling again shows that the FiniteMaps are holding on
> > to most stuff.
> > A major problem no doubt is space use. For the large gtk/gtk.h, when I
> > run with +RTS -B to get a beep every major garbage collection, the
> > serialisation phase beeps continuously while the file grows.
> > Occasionally it seems to freeze for 10s of seconds, not doing any
> > garbage collection and not doing any file output but using 100% CPU,
> > then it carries on outputting and garbage collecting furiously. I
> > don't know how to work out what's going on when it does that.
> I agree with Malcolm's conjecture: it sounds like a very long major GC
> > I don't understand how it can be generating so much garbage when it is
> > doing the serialisation stuff on a structure that has already been
> > fully deepSeq'ed.
> Yes, binary output *should* do zero allocation, and binary input should
> only allocate the structure being created. The Binary library is quite
> heavily tuned so that this is the case (if you compile with profiling
> and -auto-all, it will almost certainly break this property, though).
Yes, it's much better with optimisations. I'll try the
-funbox-strict-fields and report back.
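
For reference, a rough sketch of the alternative to -auto-all on low-level
code: one manual cost centre around the whole phase, none on the small
functions. The names checksumStep and serialise are hypothetical stand-ins,
not functions from the Binary library.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for a small low-level writer such as putByte.
-- It has no cost centre of its own, so GHC remains free to inline it.
checksumStep :: Int -> Int -> Int
checksumStep acc b = (acc * 31 + b) `mod` 65521

-- A single manual cost centre around the whole phase. When built with
-- -prof (and without -auto-all), the profile attributes the time here
-- rather than to every small helper.
serialise :: [Int] -> Int
serialise xs = {-# SCC "serialise" #-} (foldl' checksumStep 0 xs)

main :: IO ()
main = print (serialise [1 .. 100000])
```

Built with -O -prof and run with +RTS -p, the time for the fold should be
reported under the single "serialise" cost centre.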