[Haskell-cafe] Memory allocation in extractor function (newbie
question)
Alexander Kireyev
alexander.kireyev+haskell.cafe at gmail.com
Mon Apr 7 06:55:57 EDT 2008
Hello,
While trying to write a program for the countPaths Code Jam problem I
ran into what seems to me as a weird behaviour in terms of memory
allocation.
The task is to count the number of way you can spell a certain "word"
by walking some path on a board of letters.
Being a newbie I started with a very straightforward recursive
approach, which is not effective, but still helps to improve
understanding of Haskell basics. Quite naturally I quickly hit the
point where the program becomes too slow, so I tried to profile and
optimize the code.
I used 3 data types:
data GridEntry = GridEntry {letter :: Char, neighbors :: [Point]}
data Area = Area {letters :: (Array (Int, Int) GridEntry), width ::
Int, height :: Int}
data Point = Point { x :: Int, y :: Int} deriving (Show, Eq)
With the recursive approach I had to retrieve GridEntries (a letter in
a certain point of grid together with the list of its neighbours on
the grid) a lot. Quite surprisingly I saw that this operation was
overly expensive in my program. The most time was spent in these 2
functions:
countPathsFrom :: Area -> Int -> String -> Point -> Int
countPathsFrom _ count [] _ = count
countPathsFrom area count (l:[]) point =
count +
if (letter (letterAt area point) == l)
then 1
else 0
countPathsFrom area count (l:ls) point =
let gridPt = letterAt area point
points = neighbors gridPt
in
if (letter gridPt == l)
then countForPoints area ls count points
else count
letterAt :: Area -> Point -> GridEntry
letterAt area (Point x y) = letters area ! (x, y)
The profiling log (./paths +RTS -P) shows the following time/space
behaviour for them:
countPathsFrom Main
181 11830476 37.8 52.3 59.5 82.2 28 1325013312
letterAt Main
184 11830476 21.6 29.9 21.6 29.9 16 757150464
What puzzles me here is how the function letterAt generates 64 bytes
of garbage per call. Since there are no side-effects in this section
of program - and noone can damage the 'area' structure - I would
expect the compiler/runtime to use a reference to the data in the
'area' structure when returning values from letterAt. Instead it seems
to make a copy of GridEntry and return that. The question here is
whether (why?) this behaviour is normal and if it is possible to
optimize the code in such cases.
Thanks in advance,
-Alexander
More information about the Haskell-Cafe
mailing list