[Haskell-cafe] Computing lazy and strict list operations at the
same time
Andrew Pimlott
andrew at pimlott.net
Fri Jul 14 16:10:02 EDT 2006
On Mon, Jun 19, 2006 at 05:50:13PM +0100, Duncan Coutts wrote:
> On Mon, 2006-06-19 at 17:03 +0100, Jon Fairbairn wrote:
> > il [] = error "foo"
> > il [x] = ([], x)
> > il (x:xs) = cof x (il xs)
> > where cof x ~(a,b) = (x:a, b)
> > -- !
>
> From a quick test, it looks like none of our suggested solutions
> actually work in constant space.
>
> main = interact $ \s ->
> case il s of
> (xs, x) -> let l = length xs
> in l `seq` show (l,x)
I was hoping to have enlightenment served to me, but since nobody has
explained this, I took a crack at it. I still can't explain it, but I
got some data that maybe somebody else will understand. My code:
initlast :: [a] -> ([a], a)
initlast [x] = ([], x)
initlast (x:xs) = let (init, last) = initlast xs
in (x:init, {-# SCC "last" #-} last)
lenshow n (_:xs) last = let n1 = n+1 in n1 `seq` lenshow n1 xs last
lenshow n [] last = show (n,last)
main = interact $ \s -> case initlast s of
(xs, x) -> lenshow 0 xs x
lenshow is just "show (length xs, x)", written out so I can tweak it
later. This exhibits the runaway space usage with a large input that
Duncan described. If you throw away "last" in lenshow and just "show
n", it runs in constant space.
It seems that the reference to "last" that I annotated as a cost center
is holding a chain of trivial thunks--trivial because "last" is just
being copied from the result of the recursive call to initlast. I
thought maybe I could get rid of them by returning an unboxed tuple from
initlast, but this turned out to make no difference.
Profiling gave a couple hints. Retainer set profiling (-hr) showed the
retainer set holding all the memory was
{<Main.last,Main.initlast,Main.main,Main.CAF>}
I think this confirms that last holding a chain of thunks. I'm still
surprised that ghc doesn't see that they're trivial. It feels like it
should be an easy optimization.
Constructor and type profiling (-hd and -hy) both show the memory held
by stg_ap_1_upd_info. I don't know what that means.
Most frustrating, I can't find any work around: No matter how I tried
to write initlast, it had the same leak when consumed this way. (NB:
functional implementations only need apply.) Granted, I can't think of
any good reason to code in this style, but it's hard for me to accept
that it should be impossible.
Finally, here is a (silly) version that doesn't leak:
initlast :: [a] -> ([a], [a])
initlast [x] = ([], [x])
initlast (x:xs) = let (init, last) = initlast xs
in (x:init, undefined:last)
lenshow n (_:xs) (_:ls) = let n1 = n+1 in n1 `seq` lenshow n1 xs ls
lenshow n [] [last] = show (n,last)
This is the first case I recall in which adding more constructors makes
a space leak go away, because it gives me something to force (vis
"_:ls). With the original implementation, there was no way to "partly
force" last, one thunk at a time. Have others used this technique?
Andrew
More information about the Haskell-Cafe
mailing list