[Haskell-beginners] space leak

Uchida Yasuo kg6y_ucd at yahoo.co.jp
Mon Feb 15 13:02:13 EST 2010


Oh, what a relief! Thank you for your clear explanation!

--- Daniel Fischer wrote:
> Am Montag 15 Februar 2010 16:44:51 schrieb Uchida Yasuo:
> > Hello,
> >
> > I came across the following space leak problem today.
> > How can I fix this?
> > (Tested on Mac OS X 10.5.8, GHC 6.10.3)
> >
> > -- test.hs
> > module Main where
> >
> > import System
> > import qualified Data.ByteString.Lazy.Char8 as L
> >
> > main = do args <- getArgs
> >           let n = read $ args !! 0
> >           cs <- L.getContents
> >           let !a = L.take n cs
> 
> The problem is this line. The bang pattern does less than you probably think.
> The definition of lazy ByteStrings is
> 
> data ByteString = Empty | Chunk {-# UNPACK #-} !S.ByteString ByteString
> 
> , so when you write
> 
> let !a = L.take n cs
> 
> , you force only the outermost constructor (Empty if cs is empty, otherwise
> Chunk start rest). Since cs is not empty, it is a Chunk, and that forces only
> the first chunk of the ByteString, which is as long as the prefix that stdin
> delivers immediately, but at most the default chunk size (normally 32K or
> 64K, minus two words for bookkeeping).
> 
> If n is larger than a) the default chunk size or b) what L.getContents got 
> immediately[*], the tail of a is still an unevaluated call to L.take on the 
> rest of cs, so a holds on to (almost) the entire input and you have a bad 
> memory leak.
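
A minimal sketch of the point above, not taken from the original program: it builds a two-chunk lazy ByteString whose second chunk raises an error when touched, then forces L.take only to weak head normal form, as a bang pattern does. The names twoChunks and whnfOnly are hypothetical, and the behaviour assumes that L.take leaves the tail of its result unevaluated, as described above.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)
import Data.Int (Int64)
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical input: two chunks, where touching the second one
-- raises an error. This makes it visible when the tail is forced.
twoChunks :: L.ByteString
twoChunks = L.fromChunks [S.pack "ab", error "second chunk was forced"]

-- Force the prefix only to weak head normal form, like `let !a = ...`.
-- This inspects the first Chunk constructor and nothing beyond it.
whnfOnly :: Bool
whnfOnly = L.take 5 twoChunks `seq` True

main :: IO ()
main = do
  -- Succeeds if the second chunk is left untouched.
  print whnfOnly
  -- L.length has to walk every chunk of the prefix, so it reaches
  -- the second chunk and the error is raised (and caught here).
  r <- try (evaluate (L.length (L.take 5 twoChunks)))
         :: IO (Either SomeException Int64)
  putStrLn (either (const "length forced the second chunk")
                   (const "length did not reach the second chunk")
                   r)
```

In the real program the second chunk is not an error but the rest of stdin, which is why the unevaluated tail keeps cs alive.
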
> 
> Fix: force a to be completely evaluated, e.g.
> 
>     let !a = L.take n cs
>         !l = L.length a
> 
> Evaluating the length forces every chunk of a, so a no longer keeps a 
> reference to cs and the rest of the input can be garbage collected.
> 
> [*] How long the first chunk is depends, in this pipeline, on scheduling, 
> the number of available cores/CPUs, and the OS buffer size.
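
For the archive, here is a sketch of the fix as a complete program. It is written against a current GHC (System.Environment instead of System, with the BangPatterns extension enabled). The helper takeForced, the default of 16 bytes, and everything after the prefix is taken are assumptions for illustration, since the rest of the original test.hs is not shown above.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as L
import System.Environment (getArgs)

-- Hypothetical helper: take the first n bytes and force all of them.
-- L.length walks every chunk of the prefix, so once it has run the
-- prefix no longer contains an unevaluated L.take over the input.
takeForced :: Int64 -> L.ByteString -> L.ByteString
takeForced n cs = L.length a `seq` a
  where
    a = L.take n cs

main :: IO ()
main = do
  args <- getArgs
  let n = case args of
            (x:_) -> read x
            _     -> 16        -- assumed default, not in the original
  cs <- L.getContents
  let !a = takeForced n cs
  -- Assumed remainder of the program: consume the rest of the input,
  -- then use the prefix that was kept.
  print (L.length (L.drop n cs))
  L.putStr a
```

Run it as, for example, ./test 100000 < bigfile and compare memory use with the version that only has the bang pattern.
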

__
Regards,
Yasuo Uchida

