[Haskell-beginners] space leak
Daniel Fischer
daniel.is.fischer at web.de
Mon Feb 15 12:07:16 EST 2010
On Monday 15 February 2010 16:44:51, Uchida Yasuo wrote:
> Hello,
>
> I came across the following space leak problem today.
> How can I fix this?
> (Tested on Mac OS X 10.5.8, GHC 6.10.3)
>
> -- test.hs
> module Main where
>
> import System
> import qualified Data.ByteString.Lazy.Char8 as L
>
> main = do args <- getArgs
>           let n = read $ args !! 0
>           cs <- L.getContents
>           let !a = L.take n cs

The problem is here: the bang pattern does less than you probably think.
The definition of lazy ByteStrings is

    data ByteString = Empty | Chunk {-# UNPACK #-} !S.ByteString ByteString

so when you write

    let !a = L.take n cs

you only force the outermost constructor (Empty if cs is empty, otherwise
Chunk start rest). Since cs is not empty, it's a Chunk, and that forces the
first chunk of the ByteString, which will be as long as the prefix that
stdin delivers immediately, but at most the default chunk size (normally
32K or 64K, minus two words for bookkeeping).

If n is larger than a) the default chunk size or b) what L.getContents got
immediately[*], the rest of a is still an unevaluated thunk referring to the
rest of cs, so a holds on to (almost) the entire input and you have a bad
memory leak.
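
A minimal sketch of how little the bang pattern forces, assuming a
bytestring version in which L.fromChunks builds its result lazily. The
helper firstChunkThenBomb is made up for this illustration: its tail raises
an error as soon as anything evaluates it. The bang pattern should go
through without touching that tail, whereas L.length has to walk every
chunk and therefore hits the error.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import Data.Int (Int64)
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper (not part of the original program): one real chunk
-- followed by a tail that throws as soon as it is evaluated.
firstChunkThenBomb :: L.ByteString
firstChunkThenBomb = L.fromChunks (S.pack "abc" : error "tail was forced")

-- Returns True if computing the length of the prefix raised the error
-- hidden in the tail, i.e. if L.length went past the first chunk.
tailForcedByLength :: IO Bool
tailForcedByLength = do
  -- Forces only the outermost Chunk constructor and its strict first field.
  let !a = L.take 10 firstChunkThenBomb
  -- L.length walks all chunks of a, so it has to evaluate the tail.
  r <- try (evaluate (L.length a)) :: IO (Either ErrorCall Int64)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  forced <- tailForcedByLength
  putStrLn ("L.length reached the tail: " ++ show forced)
```

If the bang pattern evaluated more than the first chunk, the error would
escape before the try and the program would abort instead of printing.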

Fix: force a to be completely evaluated, e.g.

    let !a = L.take n cs
        !l = L.length a

Once the length has been evaluated, a no longer keeps any reference to the
rest of cs, so everything past the prefix can be garbage collected as it is
consumed.
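
Put together, the fix might look like the sketch below. It is written
against a current GHC rather than the 6.10.3 from the question, so it
assumes the BangPatterns pragma and imports getArgs from System.Environment
instead of the old System module. The helper name forcedPrefix is
hypothetical, introduced only to keep the forcing in one place.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.Int (Int64)
import System.Environment (getArgs)
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: take the first n bytes and evaluate them fully,
-- so the result keeps no thunk pointing at the rest of the input.
forcedPrefix :: Int64 -> L.ByteString -> L.ByteString
forcedPrefix n cs =
  let !a    = L.take n cs
      !_len = L.length a   -- walking all chunks of a evaluates it completely
  in a

main :: IO ()
main = do
  args <- getArgs
  let n = read $ args !! 0
  cs <- L.getContents
  let !a = forcedPrefix n cs
  -- cs is consumed lazily here; a no longer refers to its unread rest.
  mapM_ (print . L.length) $ L.lines cs
  print a
```

Note that a still shares its chunks with the beginning of cs, so the first
n bytes stay in memory until a has been printed. That is the intended cost.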

[*] How long the first chunk is depends, in this pipeline, on scheduling,
the number of available cores/CPUs, and the OS buffer size.

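
If you want to see what the chunks look like on your machine, something
along these lines should do. It is only a diagnostic sketch; chunkSizes is a
hypothetical helper built on L.toChunks, and the number 5 is arbitrary.

```haskell
module Main where

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L

-- Hypothetical helper: the length of each strict chunk, in order.
chunkSizes :: L.ByteString -> [Int]
chunkSizes = map S.length . L.toChunks

main :: IO ()
main = do
  cs <- L.getContents
  -- Only the first few chunks are demanded, so the program does not need
  -- to read all of stdin before printing.
  print (take 5 (chunkSizes cs))
```

Feeding it the output of gen through a pipe a few times should show whether
the first chunk varies from run to run.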
>           mapM_ (print . L.length) $ L.lines cs
>           print a
>
>
> -- gen.hs
> module Main where
>
> main = do putStrLn $ take 1000000 $ cycle "foo"
>           main
>
>
> These are compiled with the following options:
>
> $ ghc --make -O2 test
> $ ghc --make -O2 gen
>
> The memory usage seems to depend on the argument(=17000) passed.
> On my MacBook(Core2 Duo 2.0GHz), 16000 works fine.