[Haskell-cafe] Re: Space and time leaks

Thu Oct 4 12:36:15 EDT 2007

Ronald Guida wrote:
> 
> Now for the hard questions.
> 1. How do I go about detecting space and time leaks?
> 2. Once I find a leak, how do I fix it?
> 3. Are there any programming techniques I can use to avoid leaks?
> 
I find it hard to believe I'll write anything you do not already know,
  but I had a similar problem and it was not hard to fix (despite
  some stories that it is very hard in Haskell).
If you use GHC then skim over this; otherwise never mind.
Also check the GHC User's Guide. The options are described there, not
  very well, but better than nothing.

add 1)
   Here are a few options I found most useful.
   compile with: -prof -auto-all
   run with: +RTS -p -hc -RTS
     ... to see what functions are creating leaks and which functions
     take the most time.
     Check out the <app>.prof and <app>.hp (the numbers in <app>.hp
     correspond to stack traces in <app>.prof (the no. column)).
   run with: +RTS -hr -RTS
     ... to see what is still keeping references to your data.
     The stack traces corresponding to the numbers in <app>.hp are in
     <app>.prof. The things keeping the references are typically
     routines which create closures and pass them around in
     their results instead of forcing the computation and returning the
     processed data (which is hopefully much smaller than the input
     data processed).
   run with: +RTS -s -RTS
     ... and check your <app>.stat to see whether your time problem is
     actually a space problem leading to very poor GC performance.
   use hp2ps to look at the <app>.hp files
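
   To try these options out, a small program with a known leak helps.
   The one below is only a sketch: the file name Leak.hs and the
   function leakyMean are made up for illustration. It sums a list
   lazily, so the heap profile should show the accumulator growing.

```haskell
-- Leak.hs: a hypothetical test program, not taken from a real project.
-- Build and run it with the options listed above, for example:
--   ghc -prof -auto-all Leak.hs -o leak   (newer GHCs use -fprof-auto)
--   ./leak +RTS -p -hc -RTS
--   hp2ps leak.hp
module Main (main) where

-- Lazy accumulation: each step stores unevaluated (s + x) and (n + 1)
-- thunks in the pair instead of numbers, so memory use grows with the
-- length of the input.
leakyMean :: [Double] -> Double
leakyMean xs = total / fromIntegral count
  where
    (total, count) = foldl step (0, 0 :: Int) xs
    step (s, n) x = (s + x, n + 1)

main :: IO ()
main = print (leakyMean [1 .. 1000000])
```

   How pronounced the leak looks will depend on the GHC version and
   on the optimisation level.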

add 2)
   Add ! (strictness annotations) to the fields of your data types,
   all the way from the result data structure down to the final/leaf
   data structures which will hold the processed data.
   This assumes you do not need laziness in those places. If you do
   (e.g. so that you do not compute data fields which are rarely
   used, or when you actually require it for efficient processing (like
   foldr with a function nonstrict in its second arg)) then you need to
   use seq or preferably $! on the code paths which need to be strict
   and leave the rest of the code paths lazy. The idea is that you need
   strictness somewhere so that your huge input data is compacted to
   the small output data on the fly instead of at the very end when you
   ask for some result.
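
   A minimal sketch of both approaches, strict fields and $!. The
   names Stats, addSample, mean and sumTo are made up for illustration.

```haskell
module Main (main) where

import Data.List (foldl')

-- Strict fields (!): constructing a Stats value evaluates both
-- components, so no chain of pending additions can hide inside it.
data Stats = Stats !Double !Int

addSample :: Stats -> Double -> Stats
addSample (Stats s n) x = Stats (s + x) (n + 1)

-- foldl' forces the accumulator at every step; the strict fields make
-- that reach the numbers themselves, not just the constructor.
mean :: [Double] -> Double
mean xs = case foldl' addSample (Stats 0 0) xs of
  Stats s n -> s / fromIntegral n

-- When the data type has to stay lazy, force on the code path instead.
-- ($!) evaluates its argument to weak head normal form before the call.
sumTo :: Int -> Int
sumTo limit = go 0 1
  where
    go acc i
      | i > limit = acc
      | otherwise = (go $! acc + i) (i + 1)

main :: IO ()
main = do
  print (mean [1 .. 1000000])
  print (sumTo 1000000)
```

   In both cases the input is reduced to a couple of numbers while it
   is being consumed, which is the "compact on the fly" idea above.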

add 3)
   There is some material on the Haskell wiki. Search for stack
   overflow, and for the page about tail recursion and when it is a
   "red herring". I just searched with Google and there is plenty to
   read.
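
   As I understand the "red herring" point, it is roughly this: being
   tail recursive does not by itself save you, what matters is when
   the accumulator gets evaluated. A small sketch (the names sumLazy,
   sumStrict and anyEven are made up for illustration):

```haskell
module Main (main) where

import Data.List (foldl')

-- foldl is tail recursive, but without optimisation it only builds the
-- nested thunk (((0 + 1) + 2) + ...); evaluating that thunk at the end
-- is what needs the deep stack.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- foldl' evaluates the accumulator at each step, so it runs in
-- constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- foldr is not tail recursive, yet with a function that is lazy in its
-- second argument it can stop early, even on an infinite list.
anyEven :: [Int] -> Bool
anyEven = foldr (\x rest -> even x || rest) False

main :: IO ()
main = do
  print (sumLazy [1 .. 100])
  print (sumStrict [1 .. 1000000])
  print (anyEven [1 ..])
```

   Whether sumLazy actually runs out of stack on a large input depends
   on the GHC version, the stack limit and the optimisation level.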

Peter.