[Haskell-cafe] Re: Performance Tuning & darcs (a real shootout?)

Simon Marlow simonmar at microsoft.com
Tue Jan 24 04:55:35 EST 2006


Hi Jason,

Jason Dagit wrote:

> After almost two weeks of poking at darcs doing various benchmarks and 
> profiles I've realized that optimizing Haskell programs is no easy 
> task.  I've been following the advice of numerous people from the 
> haskell irc channel and learned a lot about darcs in the process.  I've 
> also been using this nifty library that Ian created for this purpose to 
> get a measure for the non-mmap memory usage: 
> http://urchin.earth.li/darcs/ian/memory
> 
> Potentially useful information about darcs:
> 1) Uses a slightly modified version of FastPackedStrings.
> 2) Can use mmap or not to read files (compile time option).
> 
> =Experiments and Findings=
> I have a summary of some of my experimentation with darcs here:
> http://codersbase.com/index.php/Darcs_performance

You can get a quick picture of heap usage with +RTS -Sstderr, by the 
way.  To find out what's actually in that heap, you'll need heap 
profiling (as you know).
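
In case it helps, the incantation is roughly as follows (the build line
is schematic; darcs' own build system will have its own way of turning
on -prof, and the patch file name is just a stand-in):

  ghc --make -O -prof -auto-all Main.hs -o darcs
  ./darcs apply big.patch +RTS -Sstderr -hc -p
  hp2ps -c darcs.hp

-Sstderr prints a line per GC including how much data is live, -hc
writes a heap profile (darcs.hp) broken down by cost centre, which
hp2ps turns into a PostScript graph, and -p gives an ordinary
time/allocation profile in darcs.prof.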

> Basically what I have found is that the read of the original file does 
> not cause a spike in memory usage, nor does writing the patch.  This 
> would seem to imply that it's during application of the patch that the 
> memory spikes.  Modifying darcs to read the patch file and print just 
> the first line of the patch causes some interesting results.  The memory 
> usage according to Ian's memory tool stays very low, at about 150kb max, 
> but requesting the first line of the patch appears to make darcs read 
> the entire patch!  Darcs will literally grind away for, say, 30 minutes 
> to just print the first line.
> 
> On a side note, I've tried turning off mmap and running some of the above 
> experiments.  Ian's tool reports the same memory usage, and top still 
> reports large amounts of memory used.  Does ghc use mmap to allocate 
> memory instead of malloc?  Even if it does this shouldn't be a problem 
> for Ian's tool as long as it maps it anonymously.

Yes, GHC's heap is mmap()'d anonymously.  You really need to find out 
whether the leaked memory was mmap()'d by GHC's runtime or by darcs 
itself; +RTS -Sstderr or profiling will tell you about GHC's memory usage.

> =Questions=
> So far I've been tracking this performance problem by reading the output 
> of ghc --show-iface and --ddump-simpl for strictness information, using 
> the ghc profiler (although that makes already bad performance much 
> worse), Ian's memory tool, and a lot of experiments and guess work with 
> program modifications.  Is there a better way?

I'd start by using heap profiling to track down what the space leak 
consists of, and hopefully to give you enough information to diagnose 
it.  Let's see some heap profiles!

Presumably the space leak is just as visible with smaller patches, so 
you don't need the full 300M patch to investigate it.

I don't usually resort to -ddump-simpl until I'm optimising the inner 
loop; use profiling to find out where the inner loops actually *are* first.
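
For the time profile, compiling with -prof -auto-all gives every
top-level function a cost centre automatically, and you can add
{-# SCC #-} pragmas by hand to split a single function up further.
A toy, self-contained example (the file and cost-centre names are
invented, nothing to do with darcs' real code):

  module Main (main) where

  -- Compile:  ghc -O -prof -auto-all SccDemo.hs
  -- Run:      ./SccDemo +RTS -p
  -- The resulting SccDemo.prof attributes time and allocation to the
  -- two cost centres below, plus the automatic per-function ones.
  main :: IO ()
  main = print (({-# SCC "sumPart" #-} sum [1 .. 1000000 :: Int])
              + ({-# SCC "lenPart" #-} length (replicate 1000000 ())))

The .prof file then tells you where the time and allocation actually
land, which is usually a better guide than staring at simplifier output.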

> Are there tools or techniques that can help me understand why the memory 
> consumption peaks when applying a patch?  Is it foolish to think that 
> lazy evaluation is the right approach?

Since you asked: I've never been that keen on mixing laziness and I/O, 
and your experiences have strengthened that conviction.  If you want 
strict control over resource usage, laziness is always going to be 
problematic.  Sure, it's great when you can get it right: the code is 
shorter and runs in small constant space.  But can you guarantee that 
it'll still have the same memory behaviour with the next version of the 
compiler?  With a different compiler?
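
To make that concrete, here's the sort of lazy code I mean (a made-up
sketch, not darcs' actual source):

  import System.Environment (getArgs)

  -- Lazy version: in the best case only the first chunk of each file
  -- is ever read before the answer is printed.
  printFirstLine :: FilePath -> IO ()
  printFirstLine f = do
    s <- readFile f
    putStrLn (head (lines s))

  main :: IO ()
  main = mapM_ printFirstLine =<< getArgs

It looks like it should read almost nothing, but the moment something
else forces the rest of s (a parser that is stricter than you thought,
a length, holding on to s for a later pass), the whole file ends up in
the heap, and nothing in the types tells you it happened - which sounds
rather like what you're seeing when darcs reads the first line of a patch.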

If you want guarantees about resource usage, which you clearly do, then 
IMHO you should just program the I/O explicitly and avoid laziness. 
It'll be a pain in the short term, but a win in the long term.
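
By "explicitly" I mean something along these lines: read the patch in
fixed-size chunks and finish with each chunk before asking for the
next, so peak memory is bounded by the buffer rather than the file.
Only a sketch, of course - plain copying stands in for whatever work
apply really has to do per chunk, and the file names are made up:

  import System.IO
  import Foreign (allocaBytes)

  -- Process a file in 64k chunks; memory use is bounded by the buffer
  -- no matter how large the file is.
  processInChunks :: FilePath -> FilePath -> IO ()
  processInChunks from to =
      withBinaryFile from ReadMode $ \hIn ->
      withBinaryFile to WriteMode  $ \hOut ->
      allocaBytes chunkSize $ \buf ->
        let loop = do
              n <- hGetBuf hIn buf chunkSize
              if n == 0
                then return ()
                else hPutBuf hOut buf n >> loop
        in loop
    where
      chunkSize = 64 * 1024

  main :: IO ()
  main = processInChunks "big.patch" "big.patch.out"   -- made-up names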

> I'm looking for advice or help in optimizing darcs in this case.  I 
> guess this could be viewed as a challenge for people that felt like the 
> micro benchmarks of the shootout were unfair to Haskell.  Can we 
> demonstrate that Haskell provides good performance in the real-world 
> when working with large files?  Ideally, darcs could easily work with a 
> patch that is 10GB in size using only a few megs of ram if need be and 
> doing so in about the time it takes to read the file once or twice and gzip 
> it.

I'd love to help you look into it, but I don't really have the time. 
I'm happy to help out with advice where possible, though.

Cheers,
	Simon


