[Haskell-beginners] Memory usage problem
Sami Liedes
sliedes at cc.hut.fi
Sun Mar 21 21:06:08 EDT 2010
On Sun, Mar 21, 2010 at 08:28:23PM -0400, Patrick LeBoutillier wrote:
> I'm no profiling expert, but I have a few questions though:
>
> - What is the size (in bytes) of your input file?
The input file contains 31,129,639 bytes, with 710,355 lines and 27,954
individual packages/records.
In fact it seems that even if the input stream repeats only the line
" a" (that is, a space and the letter a) infinitely, the program
eventually eats all memory. That's interesting.
So I guess the lines are read in but then held in memory unprocessed,
and laziness defers any further work until the entire file has been
read, because nothing demands it sooner? Or something.
I tried using $! in readRecordFields like
    readRecordFields lines = (mapMaybe (readField $!) rl, rest) where
        (rl,rest) = getOneRecordLines lines
but that didn't help either... I guess I don't fully understand $!.
Does it force the entire computation below it to finish? To achieve
sane memory behavior, it would seem necessary to parse the lines
before they're all read, and then mapMaybe in readRecordFields would
throw out the Nothings. After that I believe it should all run in a
constant amount of memory when every line starts with a space.
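For reference on the $! question: as far as I understand it, f $! x only evaluates x to weak head normal form (its outermost constructor) just before calling f, and only once the result of the whole application is itself demanded. A minimal sketch of that behaviour follows; parseField is a hypothetical stand-in, not the readField from my program, and the other names are made up for illustration.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for readField: drop lines that begin with a space.
parseField :: String -> Maybe String
parseField line@(c:_) | c /= ' ' = Just line
parseField _ = Nothing

-- ($!) forces only the outermost constructor of its argument, so the
-- undefined elements are never touched while length walks the spine.
whnfOnly :: Int
whnfOnly = length $! [undefined, undefined]

-- The section (parseField $!) is just \x -> parseField $! x. Nothing is
-- forced until some consumer demands the elements of the mapMaybe result.
notDemanded :: Int
notDemanded = fst (1, mapMaybe (parseField $!) [undefined])

-- Demanding the result does force each line, which fails on undefined.
demanded :: Int
demanded = length (mapMaybe (parseField $!) [undefined])

main :: IO ()
main = do
  print whnfOnly
  print notDemanded
  r <- try (evaluate demanded) :: IO (Either SomeException Int)
  putStrLn (either (const "forced: exception") show r)
```

So wrapping the function in a $! section changes how each line is evaluated once it is looked at, but not when the list as a whole gets consumed.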
> - Also, does memory usage improve if you remove the "sort"?
Yes. Then it only takes a few megabytes, regardless of how large the
file is.
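That fits with how sort has to behave: it cannot produce its first element before it has seen the last input element, so everything it is given stays live until then, whereas a strict left fold can discard each line as it goes. A rough sketch of the contrast, with made-up function names and not taken from my program:

```haskell
import Data.List (foldl', sort)

-- Streaming consumer: each line can be garbage-collected once counted,
-- so a long lazy input does not need to be resident all at once.
countLong :: [String] -> Int
countLong = foldl' (\n l -> if length l > 1 then n + 1 else n) 0

-- sort must read its entire input before yielding even one element,
-- so the whole list is held in memory at that point.
smallest :: [Int] -> [Int]
smallest = take 1 . sort

main :: IO ()
main = do
  print (countLong (replicate 1000000 " a"))
  print (smallest [3, 1, 2])
  -- Laziness alone is fine with unbounded input when nothing needs it all:
  print (take 3 (map (* 2) [1 :: Int ..]))
```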
Thanks,
Sami