[Haskell-beginners] space leak
Uchida Yasuo
kg6y_ucd at yahoo.co.jp
Mon Feb 15 13:02:13 EST 2010
Oh, what a relief! Thank you for your clear explanation!
--- Daniel Fischer wrote:
> Am Montag 15 Februar 2010 16:44:51 schrieb Uchida Yasuo:
> > Hello,
> >
> > I came across the following space leak problem today.
> > How can I fix this?
> > (Tested on Mac OS X 10.5.8, GHC 6.10.3)
> >
> > -- test.hs
> > module Main where
> >
> > import System
> > import qualified Data.ByteString.Lazy.Char8 as L
> >
> > main = do args <- getArgs
> >           let n = read $ args !! 0
> >           cs <- L.getContents
> >           let !a = L.take n cs
>
> This last line is the problem. The bang pattern does less than you probably
> think.
> The definition of lazy ByteStrings is
>
> data ByteString = Empty | Chunk {-# UNPACK #-} !S.ByteString ByteString
>
> so when you write
>
> let !a = L.take n cs
>
> you force only the outermost constructor (null cs ? Empty : Chunk start rest).
> Since cs is not empty, it's a Chunk, and that forces the first part of the
> ByteString, which will be as long as the prefix which stdin immediately
> delivers, but at most the default chunk size (32K or 64K, normally [minus two
> words for bookkeeping]).
>
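> If n is larger than a) the default chunk size or b) what L.getContents got
> immediately[*], the rest of a is an unevaluated thunk that still refers to
> the remainder of cs. Through that thunk, a holds on to (almost) the entire
> input and you have a bad memory leak.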
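
The point about the bang pattern can be checked with a small experiment. This is only a sketch under stated assumptions: the names input, survives, whnfOnly and whnfAndLength are made up for illustration, and the input is built by hand with L.fromChunks so that everything after the first chunk is an error value. Anything that looks past the first chunk raises an exception, which makes it visible how far each version evaluates.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Exception (SomeException, try)
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical input: the first chunk is "abc", everything after it is bottom.
-- Any attempt to look past the first chunk raises an exception.
input :: L.ByteString
input = L.fromChunks (S.pack "abc" : error "looked past the first chunk")

-- Run an action and report whether it finished without raising an exception.
survives :: IO a -> IO Bool
survives act = do
  r <- try act
  return $ case r of
    Left e  -> const False (e :: SomeException)
    Right _ -> True

-- Bang pattern only: evaluates the result of L.take to its first constructor.
whnfOnly :: L.ByteString -> IO ()
whnfOnly bs = do
  let !a = L.take 5 bs
  return ()

-- Bang pattern plus length: walks every chunk of the taken prefix.
whnfAndLength :: L.ByteString -> IO ()
whnfAndLength bs = do
  let !a = L.take 5 bs
      !_len = L.length a
  return ()

main :: IO ()
main = do
  ok1 <- survives (whnfOnly input)
  ok2 <- survives (whnfAndLength input)
  putStrLn ("bang pattern only:        tail untouched = " ++ show ok1)
  putStrLn ("bang pattern plus length: tail untouched = " ++ show ok2)
```

The first line should report True and the second False: with the bang pattern alone, the part of the prefix after the first chunk is never evaluated, so it stays behind as a thunk.
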
>
> Fix: force a to be completely evaluated, e.g.
>
> let !a = L.take n cs
> !l = L.length a
>
> Evaluating the length forces every chunk of a, so a no longer keeps any
> reference to cs and the rest of the input can be garbage collected.
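
For reference, a complete program with this fix applied might look like the sketch below. The original test.hs is only quoted in part, so several things here are assumptions: the helper name takePrefix, the fallback of 0 when no argument is given, and the last two lines of main, which stand in for whatever the real program does with the rest of the input. It imports System.Environment for getArgs instead of the old Haskell 98 System module.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.Int (Int64)
import System.Environment (getArgs)
import qualified Data.ByteString.Lazy.Char8 as L

-- Take the first n bytes and force every chunk of the result before
-- returning, so the prefix does not keep a reference to the rest of cs.
-- (takePrefix is a hypothetical helper name, not part of the original.)
takePrefix :: Int64 -> L.ByteString -> IO L.ByteString
takePrefix n cs = do
  let !a = L.take n cs
      !_len = L.length a
  return a

main :: IO ()
main = do
  args <- getArgs
  -- Assumption: treat a missing argument as 0 instead of crashing on (!! 0).
  let n = case args of
            (x:_) -> read x
            []    -> 0
  cs <- L.getContents
  a <- takePrefix n cs
  -- Hypothetical remainder of the program: consume all of the input,
  -- then use the prefix that was taken earlier.
  print (L.length cs)
  L.putStr a
```

Run as, for example, ./test 100000 < bigfile. Because a is fully evaluated before the rest of the input is consumed, only the first n bytes should remain live while L.length cs walks the stream.
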
>
> [*] How long the first chunk is depends, in this pipeline, on scheduling,
> the number of available cores/CPUs, and the OS buffer size.
__
Regards,
Yasuo Uchida