[Haskell-beginners] Memory usage problem

Sami Liedes sliedes at cc.hut.fi
Sun Mar 21 21:06:08 EDT 2010


On Sun, Mar 21, 2010 at 08:28:23PM -0400, Patrick LeBoutillier wrote:
> I'm no profiling expert, but I have a few questions though:
> 
> - What is the size (in bytes) of your input file?

The input file contains 31,129,639 bytes, with 710,355 lines and 27,954
individual packages/records.

In fact it seems that even if the input stream repeats only the line
" a" (that is, a space and the letter a) infinitely, the program
eventually eats all memory. That's interesting.

So I guess the lines are read in but then held in memory unprocessed,
because laziness defers the parsing until the entire file has been
read and nothing demands the results earlier? Or something.

I tried using $! in readRecordFields like

readRecordFields lines = (mapMaybe (readField $!) rl, rest) where
  (rl,rest) = getOneRecordLines lines

but that didn't help either... I guess I don't fully understand $!.
Does it force the entire computation below it to finish? To achieve
sane memory behavior, it would seem necessary to parse the lines
before they're all read, and then mapMaybe in readRecordFields would
throw out the Nothings. After that I believe it should all run in a
constant amount of memory when every line starts with a space.
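
For reference, a minimal sketch of what $! does and does not force. The
names throwsWhenForced, lazyApp, strictApp, shallowApp and deferredCount
are made up for illustration and are not part of the program discussed
here. f $! x evaluates x only to weak head normal form (the outermost
constructor), and only at the moment the result of the application is
itself demanded; it does not force "the entire computation below it".

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: True if evaluating the value to weak head
-- normal form (WHNF) throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False

-- ($) leaves the argument alone; const never looks at it.
lazyApp :: Int
lazyApp = const 1 $ (undefined :: [String])

-- ($!) evaluates the argument to WHNF first, so this throws when demanded.
strictApp :: Int
strictApp = const 1 $! (undefined :: [String])

-- WHNF is only the outermost constructor: one cons cell is enough,
-- the undefined element inside it is never touched.
shallowApp :: Int
shallowApp = const 1 $! (undefined : [] :: [String])

-- (Just $!) is the section \x -> Just $! x. Inside map it forces
-- nothing until an element of the result is demanded, and length
-- only walks the spine, so the undefined element is never evaluated.
deferredCount :: Int
deferredCount = length (map (Just $!) [undefined, "x"])
```

Read this way, (readField $!) in the definition above is the section
\x -> readField $! x: each line is brought to WHNF just before
readField is applied to it, but that only happens when mapMaybe gets
around to demanding that result, so it does not change when the lines
are processed.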

> - Also, does memory usage improve if you remove the "sort"?

Yes. Then it only takes a few megabytes, regardless of how large the
file is.
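
A small sketch of why the sort matters, using a made-up list
(partialInput) in place of the real input and a hypothetical helper
(throwsWhenForced): Data.List.sort cannot produce its first element
before it has seen the last one, whereas a map can hand over each
element as soon as it is available. Whether this fully accounts for
the memory growth in the actual program is an assumption.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (sort)

-- Hypothetical helper: True if evaluating the value to WHNF throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False

-- Three elements followed by a tail that fails as soon as it is
-- demanded; the error stands in for "input that has not been read yet".
partialInput :: [Int]
partialInput = 3 : 1 : 2 : error "rest of input demanded"

-- map streams: the head is available without touching the tail.
streamedHead :: Int
streamedHead = head (map (* 2) partialInput)

-- sort has to traverse the whole list before it can yield the
-- smallest element, so this demands the failing tail.
sortedHead :: Int
sortedHead = head (sort partialInput)
```

Forcing streamedHead succeeds, while forcing sortedHead runs into the
error in the tail. In a real program the consequence is that
everything handed to sort is alive at the same time, so memory grows
with the input no matter how lazily the lines were read.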

Thanks,

	Sami

	