[Haskell-beginners] Writing huge result of `show` to file results in out of memory

Thomas Hallgren hallgren at chalmers.se
Sun Oct 30 10:45:34 UTC 2016


Hi,

I just tried the following simple example:

	main = writeFile "numbers" (show [1..])

If I load it and run it in ghci (version 8.0.1), the memory use quickly grows to
several gigabytes, but if I compile it with ghc -O --make, and run the
executable, it runs in constant space (a few megabytes) for a long time.

So two observations:

1) Whether you get a space leak can depend on how the code is compiled.
Compilers have to take care to avoid space leaks when generating code.

2) I don't think there is any reason why show functions should leak in general,
so if you also see the space leak when running properly compiled code, the
leak is probably coming from the functions that build the data structure. For
example, even though show [1..n] can run in constant space, show (reverse
[1..n]) will use space proportional to the length of the list. With laziness, it
could be that the data structure isn't built until you show it, so that is when
you notice the space leak. Also, if you use the data structure for something
else after writing the file, the entire data structure will of course be kept in
memory.
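
To make the second observation concrete, here is a minimal sketch of the three
cases. The function names are made up for illustration, and the comments
describe the behaviour I would expect from code compiled with ghc -O; they are
not measurements.

```haskell
-- Expected to stream: nothing else refers to the list, so each cell can be
-- garbage collected once its digits have been written.
writeAscending :: FilePath -> Int -> IO ()
writeAscending path n = writeFile path (show [1 .. n])

-- Expected not to stream: reverse cannot produce its first element until it
-- has walked the whole input, so all n elements are live at the same time.
writeDescending :: FilePath -> Int -> IO ()
writeDescending path n = writeFile path (show (reverse [1 .. n]))

-- Expected to retain the list: xs is used again after the write, so the
-- whole list has to stay in memory while the file is being written.
writeThenSum :: FilePath -> Int -> IO Int
writeThenSum path n = do
  let xs = [1 .. n]
  writeFile path (show xs)
  return $! sum xs
```

One way to compare them is to call each one from main with a large n, link the
program with -rtsopts, and run it with +RTS -s to see the maximum residency.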

Thomas H


On 2016-10-27 10:53, Mahdi Dibaiee wrote:
> Hi,
> 
> So I have a data instance which contains a few big matrices, and I want to save
> my instance to a file so I can `read` it back later
> to avoid the long computation every time (I'm training a recurrent neural network).
> 
> First attempt, the simplest method:
> 
>   writeFile "rnn" (show dataInstance)
> 
> It starts to take all of the memory and then bails out with an `out-of-memory` error.
> 
> So I wrote a function to write the string chunk by chunk, without buffering,
> here is the code:
> https://github.com/mdibaiee/sibe/blob/728df02fbdd6f134af107c098f5477094c61ea76/examples/recurrent.hs#L52-L64
> 
> Copy/pasted from the link:
> 
>   saveRecurrent :: FilePath -> String -> Int -> IO ()
>   saveRecurrent path str chunkSize = do
>     handle <- openFile path AppendMode
>     hSetBuffering handle NoBuffering
>     loop handle str
>     hClose handle
>     where
>       loop _ [] = return ()
>       loop handle s = do
>         hPutStr handle $ take chunkSize s
>         hFlush handle
>         loop handle $ drop chunkSize s
> 
> But it doesn't work either; I still get `out-of-memory` errors. From what I
> understand, this should work, but it doesn't.
> I asked on IRC and someone said "Show is not lazy /enough/". If that's the case,
> I would appreciate an explanation of that.
> 
> Thanks,
> Mahdi