Space leak? Was: [Haskell] Queues / Lists with unbound tails

Mon May 10 15:55:50 EDT 2004

> > initQ n       = foldr (.) id $ take n $ repeat (Nothing:)
> > pushQ front q = tail . q . (front:)
> 
> Why doesn't repeated pushing give a space leak?  As far as I can see, tail
> never gets a "complete" list on which it can act, so repeated pushing will
> construct a function with more and more "tails" at the "front" and more
> and more "fronts" at the "tail".

well, tail gets a complete list whenever the back of the queue is inspected.
And since tail doesn't actually care about anything but the first (:) in a list,
it could easily be applied to a list containing (relatively) free variables deeper
inside: (\x->tail (_:x)) --> (\x->x). It's only that Haskell implementations have
tended to shy away from terms with relatively free variables..

So, unfortunately, naive implementations will not share the evaluation of 
the tail applications between inspections of the back of the queue and the 
construction of the new queue. That's why I recommended performance
testing, as we don't have enough information for complexity arguments.

You said you only wanted to use queues via a single combined operation:

  shiftQ front q = let q' = \rest->q (front: rest)
                   in (head (q' []), \rest-> tail (q' rest) )

using Integer instead of Maybe Integer (and modulo all kinds of errors;-), 
assuming demand for the head (back end), and a suitable operational 
semantics of Haskell (..), we get something like this reduction sequence:

  shiftQ 1 ( (0:) . (0:) . (0:) )
  ->
  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (head (q' []), \rest-> tail (q' rest) )

  -> (1) {here we copy q', even though there are further
        reductions in the function body that could be 
        shared, including tail applications in shifted queues..}

  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (head (\rest-> ( (0:) . (0:) . (0:) ) (1: rest) [])
     ,\rest-> tail (q' rest) )
  ->
  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (head ( ( (0:) . (0:) . (0:) ) (1: []) ) 
     ,\rest-> tail (q' rest) )
  ->
  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (head ( (0:) ( ( (0:) . (0:) ) (1: []) ) ) 
     ,\rest-> tail (q' rest) )
  ->
  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (head ( 0: ( ( (0:) . (0:) ) (1: []) ) ) 
     ,\rest-> tail (q' rest) )

  -> (2) {all reductions between (1) and (2) are parametric
        in rest, so could have been shared..}

  let q' = \rest-> ( (0:) . (0:) . (0:) ) (1: rest)
  in (0, \rest-> tail (q' rest) )
  =
  (0, \rest-> tail ((\rest->((0:).(0:).(0:)) (front:rest)) rest) )

The new queue in the snd part of the result still has the tail
application, so, as you point out, these will pile up for successive
shiftQs  (it would be quite possible to share the evaluation of tail, 
before substituting for q', but I don't think any current Haskell 
implementation is doing that). GHC does some full-lazyness stuff, 
but I doubt that will solve the issue ("full" lazyness does not 
mean optimal number of reductions..).

Another way to share the tail reductions would be by forcing 
them before returning the new queue:

  (0, \rest-> tail ((\rest->((0:).(0:).(0:)) (front:rest)) rest) )
  ->
  (0, \rest-> tail ( ((0:).(0:).(0:)) (front:rest)) )
  ->
  (0, \rest-> tail ( (0:) ( ((0:).(0:)) (front:rest) ) ) ) 
  ->
  (0, \rest-> tail ( 0: ( ((0:).(0:)) (front:rest) ) ) ) 
  ->
  (0, \rest-> ( ((0:).(0:)) (front:rest) ) ) 

Such a "strict" abstraction (that evaluates its body to whnf) would
often be useful, but again, I don't think it is supported.

> Maybe I'm being mislead by the types?  My reasoning is that tail has type
> [a] -> [a], but is applied to something that is never type [a], but rather
> type [a] -> [a].  Is that rubbish?  Is it wrong to think of [a] -> [a] as
> a single thing (the first "[a]" is what is being pushed on the queue, not
> the queue itself, so it's not OK to simply treat is as a suitable argument
> for tail to operate on).

the type of queue's here is [a]->[a], but with the implicit constraint that the
lenght of the partial list inside remains constant. tail is _composed_ with
such queues and some (front:), to keep that invariant. tail is _applied_
to perfectly normal lists, as far as the definition of tail is concerned
(which will only inspect the top (:) of the list).

Cheers,
Claus