[Haskell-cafe] Re: Space and time leaks
Peter Hercek
phercek at gmail.com
Thu Oct 4 12:36:15 EDT 2007
Ronald Guida wrote:
>
> Now for the hard questions.
> 1. How do I go about detecting space and time leaks?
> 2. Once I find a leak, how do I fix it?
> 3. Are there any programming techniques I can use to avoid leaks?
>
I find it hard to believe I'll write something you do not know, but
I had a similar problem and it was not hard to fix (despite
some stories that it is very hard in Haskell).
If you use GHC then skim over this; otherwise ignore it.
Also check the GHC User's Guide. The options are described there, not
very well, but better than nothing.
add 1)
Here are a few options I found most useful.
compile with: -prof -auto-all
run with: +RTS -p -hc -RTS
... to see which functions are creating leaks and which functions
take the most time.
Check out <app>.prof and <app>.hp. The numbers in <app>.hp
correspond to the stack traces in <app>.prof (the no. column).
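If you want something small to try these flags on, here is a minimal
sketch of a deliberately leaky program. The file name and the function
name mean are made up for illustration; they are not from any real
code base.

```haskell
-- Leak.hs (hypothetical file name): a deliberately leaky program.
-- Compile with -prof -auto-all and run with +RTS -p -hc -RTS as
-- described above, then look at the resulting .prof and .hp files.
module Main (main) where

-- The list is shared between sum and length, so the first traversal
-- cannot let go of the cells it has already visited: the whole list
-- stays on the heap until the second traversal is done.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = print (mean [1 .. 1000000])
```

In the heap profile, most of the memory should be attributed to the
cost centre stack that builds the list.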
run with: +RTS -hr -RTS
... to see what is still keeping references to your data
The stack traces corresponding to the numbers in <app>.hp are in
<app>.prof. What keeps the references is typically a routine
that creates closures and passes them around in its results
instead of forcing the computation and returning the
processed data (which is hopefully much smaller than the input
data processed).
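A minimal sketch of what such a routine can look like, and one way to
force the computation before returning. The names lazyStats and
strictStats are hypothetical, chosen only for this example.

```haskell
module Main (main) where

-- Returns a pair of unevaluated closures. Each closure refers to xs,
-- so the whole input list is retained for as long as the pair is
-- passed around unevaluated.
lazyStats :: [Int] -> (Int, Int)
lazyStats xs = (length xs, sum xs)

-- Forces both components before the pair is returned. Once the caller
-- has the pair, nothing in it refers to xs any more.
strictStats :: [Int] -> (Int, Int)
strictStats xs =
  let n = length xs
      s = sum xs
  in n `seq` s `seq` (n, s)

main :: IO ()
main = do
  let input = [1 .. 1000000] :: [Int]
  print (strictStats input)
  print (lazyStats input)
```

In a retainer profile one would expect the lazy version to show up as
the retainer of the input list for as long as its result stays
unevaluated.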
run with: +RTS -s -RTS
... and check your <app>.stat to see if your time problem is not
actually a space problem leading to very poor GC performance.
use hp2ps to look at the <app>.hp files
add 2)
Add ! (strictness annotations) to the fields of your data types along
the path from the result data structure down to the final/leaf data
structures which will keep the processed data.
This is provided you do not need laziness in some places. If you do
(e.g. so that you do not compute data fields which are mostly not
used, or when you actually require it for efficient processing, like
foldr with a function nonstrict in its second arg) then you need to
use seq or preferably $! on the code paths which need to be strict and
leave the rest of the code paths lazy. The idea is that you need
strictness somewhere so that your huge input data is compacted to the
small output data on the fly instead of at the very end when you ask
for some result.
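A minimal sketch of these three ingredients together: strict fields, $!
on a strict code path, and a foldr that stays lazy. The type Stats and
all function names are made up for illustration.

```haskell
module Main (main) where

import Data.List (foldl')

-- Small result structure. The ! on each field means the field is
-- evaluated whenever a Stats value itself is evaluated.
data Stats = Stats
  { count :: !Int
  , total :: !Double
  }

-- One step of the reduction over the input.
step :: Stats -> Double -> Stats
step (Stats n t) x = Stats (n + 1) (t + x)

-- foldl' forces the accumulator at every step; the strict fields then
-- force the counters too, so no chain of thunks builds up.
summarize :: [Double] -> Stats
summarize = foldl' step (Stats 0 0)

-- The same idea with $! in hand-written recursion: the accumulator is
-- evaluated before the recursive call is made.
sumStrict :: [Double] -> Double
sumStrict = go 0
  where
    go acc []       = acc
    go acc (x : xs) = (go $! acc + x) xs

-- Laziness kept where it helps: the function passed to foldr is
-- nonstrict in its second argument, so the fold can stop early,
-- even on an infinite list.
anyNegative :: [Double] -> Bool
anyNegative = foldr (\x rest -> x < 0 || rest) False

main :: IO ()
main = do
  let s = summarize [1 .. 1000000]
  print (total s / fromIntegral (count s))
  print (sumStrict [1 .. 1000000])
  print (anyNegative (cycle [1, 2, -3]))
```

The large input list is consumed as it is produced, and only the small
Stats value or the running sum is kept.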
add 3)
There is something on the Haskell wiki. Search for stack overflow
and something about tail recursion and when it is a "red herring".
I just searched with Google and there is plenty of material.
Peter.