[Haskell-cafe] Profiling help (Warning: Euler spoilers)
James Jackson
j.e.m.jackson at gmail.com
Wed Mar 3 21:47:13 EST 2010
I have written the program below, which solves problem 14 from Project
Euler, which asks us to find the hailstone number below 1 million that
takes the longest to get to 1. The program solves the problem using
dynamic programming using a Data.Map. It completes in under a minute
(barely), but allocates a ridiculous amount of memory and actually
runs slower than the naive one-liner it was supposed to replace. I
have an intuition on what the problem is, but would like to verify it
by profiling.
Unfortunately, I don't know enough about how profiling works in GHC to
diagnose the problem. Hopefully the Cafe can help.
Here's the code.
scratch.hs: This is the main program, which just calls the
problemFourteen function.
The EP datatype just contains an int representing the problem number
and an (IO ()) which prints out the solution
---------------------------------------------------------------------------------------------------------------------------
import Euler1to20 -- contains problems 1 through 20
import Euler -- Contains various utility functions including the EC
type which represents Euler Problems
main = proc
where (EP num proc) = problemFourteen
---------------------------------------------------------------------------------------------------------------------------
Euler1to20.hs: contains the following lines, among others
---------------------------------------------------------------------------------------------------------------------------
problemFourteen = EP 14 $ print $ (key, fromJust $ findIndex (==key) keys)
where (keys, results) = runState (mapM findChainLength [1..10^6])
$ M.fromList [(1,1)]
key = maximum keys
findChainLength::Integer->State (M.Map Integer Integer) Integer
findChainLength n = do
m <- get
case M.lookup n m of
Nothing -> let nextLink = if even n then div n 2 else 3*n+1
in do
len <- liftM (1+) (findChainLength nextLink)
m2 <- get
m2 `seq` put (M.insert n len m2)
return len
Just n'-> return n'
---------------------------------------------------------------------------------------------------------------------------
My guess is that the problem is in the recursive call to
findChainLength, which is not in tail position. I think that this is
causing the code to keep too many old copies of the map around.
Before I go rewriting everything, however, I'd like to verify that
this is indeed the problem, so the next step is to compile and run
with profiling turned on. After quickly looking at the Real World
Haskell chapter on profiling, I do the following.
Compiling and running:
---------------------------------------------------------------------------------------------------------------------------
ghc -prof --make -auto-all -fforce-recomp scratch.hs
[1 of 3] Compiling Euler ( Euler.hs, Euler.o )
[2 of 3] Compiling Euler1to20 ( Euler1to20.hs, Euler1to20.o )
[3 of 3] Compiling Main ( scratch.hs, scratch.o )
Linking scratch ...
./scratch +RTS -sstderr -hc -p -K100M
(525,837798)
9,503,907,112 bytes allocated in the heap
42,785,764,376 bytes copied during GC (scavenged)
15,697,610,688 bytes copied during GC (not scavenged)
421,548,032 bytes maximum residency (193 sample(s))
17787 collections in generation 0 (650.00s)
193 collections in generation 1 (1748.37s)
782 Mb total memory in use
INIT time 0.00s ( 0.00s elapsed)
MUT time 18.68s ( 19.19s elapsed)
GC time 2398.37s (2401.47s elapsed)
RP time 0.00s ( 0.00s elapsed)
PROF time 50.48s ( 50.50s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 2467.53s (2471.17s elapsed)
%GC time 97.2% (97.2% elapsed)
Alloc rate 508,742,769 bytes per MUT second
Productivity 0.8% of total user, 0.8% of total elapsed
---------------------------------------------------------------------------------------------------------------------------
I can see from the above that the maximum resident memory at one time
is 782Mb, which seems excessive, since I don't expect the Map to
contain more than around 2M entries. Also, I had to use "-K 100M" to
increase the stack space or the program bombs out with a stack
overflow exception.
Since I compiled with profiling information, running the program
produced the following scratch.prof:
---------------------------------------------------------------------------------------------------------------------------
Wed Mar 3 19:41 2010 Time and Allocation Profiling Report (Final)
scratch +RTS -sstderr -hc -p -K100M -RTS
total time = 18.68 secs (934 ticks @ 20 ms)
total alloc = 5,327,238,992 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
findChainLength Euler1to20 93.8 90.5
CAF Euler1to20 5.2 8.0
problemFourteen Euler1to20 1.0 1.5
individual inherited
COST CENTRE MODULE
no. entries %time %alloc %time %alloc
MAIN MAIN
1 0 0.0 0.0 100.0 100.0
CAF Main
212 2 0.0 0.0 0.0 0.0
main Main
218 1 0.0 0.0 0.0 0.0
CAF GHC.Handle
164 4 0.0 0.0 0.0 0.0
CAF Euler1to20
139 21 5.2 8.0 100.0 100.0
findChainLength Euler1to20
220 3168610 93.8 90.5 93.8 90.5
problemFourteen Euler1to20
219 1 1.0 1.5 1.0 1.5
CAF Control.Monad.State.Lazy
137 1 0.0 0.0 0.0 0.0
---------------------------------------------------------------------------------------------------------------------------
The above tells me that almost all the CPU time and memory allocation
happens in findChainLength, but that isn't really a surprise.
I think the next step is to start adding cost centers to get a more
detailed profile, but here is where I hit the end of my ghc profiling
knowledge. Can anyone help with some general strategies for how to
use the profiling tools in GHC to track this down? Of course once the
culprit has been found, I have to figure out what to *do* about it,
but one thing at a time...
James
More information about the Haskell-Cafe
mailing list