[Haskell-cafe] Re: Space usage problems
Simon Marlow
simonmar at microsoft.com
Wed Jan 11 05:36:47 EST 2006
Ian Lynagh wrote:
> On Tue, Jan 10, 2006 at 04:44:33PM +0000, Ian Lynagh wrote:
>
>>readChunks :: FirstMonad String
>>readChunks = do xs <- get
>> if null xs then return []
>> else do let (ys, zs) = foo xs
>> put zs
>> rest <- readChunks
>> return (ys ++ rest)
>
>
> It looks like changing this let to a case fixes this example, but at the
> time I'd experimented with that there must have been other issues
> clouding the effect, such as the following.
>
> Foo1 (attached) uses large amounts of memory whereas Foo2 (also
> attached) runs in a little constant space. The difference is only
> changing this:
>
> else do chunk <- case foo xs of
> (ys, zs) ->
> do put zs
> return ys
> chunks <- readChunks
> return (chunk ++ chunks)
>
> to this:
>
> else case foo xs of
> (ys, zs) ->
> do put zs
> chunks <- readChunks
> return (ys ++ chunks)
I had great difficulty understanding this, but I think I do now. It's a
bit easier to understand if you inline the monads away. Foo1 translates
to this:
bar [] = ([],[])
bar (x:xs) = let (zs,ys) = bar xs in (x:zs,ys)
readChunks [] = ([],[])
readChunks xs = let (ys,zs) = bar xs
(chunks,rest) = readChunks zs in
(ys ++ chunks, rest)
and Foo2:
readChunks [] = ([],[])
readChunks xs = case bar xs of
(zs,ys) -> let (chunks,rest) = readChunks ys in
(zs ++ chunks, rest)
This is pretty much what GHC ends up with when you give -O (actually it
turns some of the tuples into unboxed tuples, but that's not important).
We can see in Foo1 that chunks is a thunk holding on to zs, which is a
thunk that holds on to xs, so you never get to release xs until the
whole result list (ys) is traversed. GHC's lazy tuple optimisation
doesn't kick in, because neither chunks nor rest are evaluated.
However, it's not so clear why Foo2 is better. chunks holds on to ys,
the second of the pair returned by bar. In fact, ys will point to a
chain of thunks that looks like this:
ys = snd (_, snd (_, snd (_, snd (_, snd ...))))
every time GC runs, it can completely eliminate this list via the
well-known lazy tuple optimisation. Unfortunately it doesn't
*completely* eliminate the list, because of a shortcoming in our
implementation, actually reported earlier by Ian Lynagh with a very
similar program :-) Fortunately in this example we do seem to be
reducing enough of the list to eliminate the space leak, though.
My suggestion: don't use the lazy state monad if you can help it.
Cheers,
Simon
More information about the Haskell-Cafe
mailing list