Why is there a space leak here?

Tom Moertel tom-list-haskell@moertel.com
Tue, 05 Jun 2001 15:05:52 -0400


Alastair David Reid wrote:
> 
> Executive summary: David's program has an incredibly subtle space leak
> in it (or I'm being incredibly dumb).  I encourage the honchos (and
> would be honchos) to have a look.  Users of other compilers might give
> it a shot too.

> David Bakin wrote:
> 
> Why is there a space leak in foo1 but not in foo2?

The reason that foo1 "leaks" space is because the middle of v grows
faster than its head.  So taking elements from v causes its in-memory
footprint to grow.  To see why this is the case, evaluate foo1 by hand:

> -- This has a space leak, e.g., when reducing (length (foo1 1000000))
> foo1 m
>   = take m v
>     where
>       v = 1 : flatten (map triple v)
>       triple x = [x,x,x]

Focusing on just v for now, and letting f = flatten for notation
purposes, we have

(1) v = 1 : f (map triple v)

(2)   = { unwrap v }
        1 : f (map triple (1 : f (map triple v)))

(3)   = { eval map }
        1 : f (triple 1 : map triple (f (map triple v)))

(4)   = { eval triple }
        1 : f ([1,1,1] : map triple (f (map triple v)))

(5)   = { eval f (= flatten = foldr (++) []) }
        1 : 1 : 1 : 1 : f (map triple (f (map triple v))))

In order to expose elements 2-4 of v, we had to evaluate v to the extent
that the overall expression held in memory *grew*.  Notice how in (1) we
had a single (f (map triple ...)) expression in the tail of v but in (5)
there are two such expressions, nested.

Continuing further, if we want to expose the 5th-7th elements of v, we
have to expand the expression yet even more.   Noticing that the (f (map
triple v)) subexpression in (5) is identical to the tail of (1), we can
apply the same expansion that we derived in (1)-(5) to yield

(6)   = { repeat (1)-(5) for f (map triple v) in (5) }
        1 : 1 : 1 : 1 :
            f (map triple (1 : 1 : 1 :
                f (map triple (
                    f (map triple v)))))))

(7)   = { eval map }
        1 : 1 : 1 : 1 :
            f (triple 1 : map triple (
                f (map triple (
                    f (map triple v))))))))

(8)   = { eval triple }
        1 : 1 : 1 : 1 :
            f ([1,1,1] : map triple (
                f (map triple (
                    f (map triple v))))))))

(9)   = { eval f }
        1 : 1 : 1 : 1 : 1 : 1 : 1 :
            f (map triple (
                f (map triple (
                    f (map triple v)))))))))

Notice how in (9) we have three nested (f (map triple (...)))
expressions in the tail of v whereas in (5) we had only two and in (1)
we had but one?

Now you can see why foo1 has a space "leak":  In order to take the Nth
element of v, v's definition must be expanded to the point where there
are 1+(N+1)/3 (f (map triple (...))) subexpressions in the tail of v
*that will never be reached*.  In other words, v's "middle" grows faster
than its head, ensuring that take will never consume the tail.  Taking
elements from the head only makes the middle grow larger.  The more your
take, the larger it grows.

So the problem isn't Hugs but rather the definition of v, which grows
faster than it can be consumed.

Cheers,
Tom