[Haskell-cafe] Why doesn't this consume all the computer's memory?

Tom Ellis tom-lists-haskell-cafe-2017 at jaguarpaw.co.uk
Mon Nov 5 22:16:11 UTC 2018


It can't be as simple as you make out.  The semantics of ViewPatterns cannot
be such that

   (length -> x, length -> y) = partitionEithers $ repeat (Left ())

means 

   x = length . fst $ partitionEithers $ repeat (Left ())
   y = length . snd $ partitionEithers $ repeat (Left ())

It must surely mean

   (px, py) = partitionEithers $ repeat (Left ())
   x = length px
   y = length py
  
With that interpretation my by-hand evaluations show a space leak, see
below.  The only way I can reconcile this with the observed behaviour is
that GHC's garbage collector can "see through" simple case statements.  I
vaguely remember reading something like this.

Can any GHC developer confirm?

Tom




main = print x >> print y
  where
    (length -> x, length -> y) = partitionEithers $ repeat (Left ())

main = 
    case (partitionEithers (repeat (Left ()))) of (px, py) ->
        let x = length px
            y = length py
        in print x >> print y

(let z = ([], []))

main = 
    case (foldr go z (repeat (Left ()))) of (px, py) ->
        ...

main = 
    case (foldr go z (Left () : repeat (Left ()))) of (px, py) ->
        ...

main = 
    case (go (Left ()) (foldr go z (repeat (Left ()))) of (px, py) ->
        ...

main = 
    case (let t = foldr go z (repeat (Left ()))
          in (fst t, snd t)) of (px, py) ->
        ...

t = foldr go z (repeat (Left ()))
main = 
    case (fst t, snd t) of (px, py) ->
        ...

t = foldr go z (repeat (Left ()))
main = 
    case (fst t, snd t) of (px, py) ->
        ...

t = foldr go z (repeat (Left ()))
main = let x = length (fst t)
           y = length (snd t)
       in print x >> print y
       
t = foldr go z (repeat (Left ()))
main = let x = length (fst t)
           y = length (snd t)
       in print x >> print y

(omitting a few steps ...)

t = go (Left ()) z (foldr go z (repeat (Left ())))
main = let x = length (fst t)
           y = length (snd t)
       in print x >> print y

t = let t2 = foldr go z (repeat (Left ()))
    in (fst t2, snd t2)
...

t = (fst t2, snd t2)
t2 = foldr go z (repeat (Left ()))
...

t = (fst t2, snd t2)
t2 = (fst t3, snd t3)
t3 = foldr go z (repeat (Left ()))
...

t = (fst t3, snd t2)
t2 = (fst t3, snd t3)
t3 = foldr go z (repeat (Left ()))
...


On Mon, Nov 05, 2018 at 02:25:50PM -0500, Tyson Whitehead wrote:
> I believe I actually figured it out.  There is not buildup because y
> is just forever bound to
> 
> y = length . snd $ paritionEithers $ repeat (Left ())
> 
> I guess the thing to realize is that this function will traverse the
> list twice.  That is, what I wrote is essentially
> 
> x = length . fst $ paritionEithers $ repeat (Left ())
> y = length . snd $ paritionEithers $ repeat (Left ())
> 
> where both x and y independently traverse the entire list repeating
> any work that needs to be done to generate the elements.
> 
> Thanks!  -Tyson
> On Mon, 5 Nov 2018 at 14:00, Tyson Whitehead <twhitehead at gmail.com> wrote:
> >
> > I would expect the following to consume all the computer's memory and
> > die due to a buildup of lazy pattern matches for the `y` value.
> >
> > ```
> > import Data.Either
> >
> > main = print x >> print y
> >   where
> >     (length -> x, length -> y) = paritionEithers $ repeat (Left ())
> > ```
> >
> > That is, `partitionEithers` is
> >
> > ```
> > partitionEithers :: [Either a b] -> ([a],[b])
> > partitionEithers = foldr go ([],[])
> >    where
> >      go (Left  x) ~(xs,ys) = (x:xs,ys)
> >      go (Right y) ~(xs,ys) = (xs,y:ys)
> > ```
> >
> > and, in the -ddump-simpl we see the `go Left` branch returns a thunk
> > on both the right and left sides that hold onto the evaluation of
> > (x:xs,ys) as we would expect
> >
> > ```
> > Left x_aqy ->
> >   (GHC.Types.:
> >     @ a_a1q8 x_aqy
> >   (case ds1_d1rO of { (xs_aqz, ys_aqA) -> xs_aqz }),
> >    case ds1_d1rO of { (xs_aqz, ys_aqA) -> ys_aqA });
> > ```
> >
> > Our code keeps generating more and more of these thunks as the
> > left-hand side chases down the infinite list of `Left ()` values, and
> > the machine cannot let go of them because, as far as it knows, we are
> > going to reach the end sometime and then need the right-hand side.
> >
> > Thus I expect it would consume all the memory and crash.  But it
> > doesn't.  It just sits there forever consuming 100% CPU at a constant
> > memory limit.  This means my mental model is defective and I'm unable
> > to properly reason about the space usage of my programs.
> >
> > Could someone please enlighten me as to were I'm missing?  Is there
> > some sort of optimization going on here?  When can it be depend on?


More information about the Haskell-Cafe mailing list