Why is there a space leak here?

Carl R. Witty cwitty@newtonlabs.com
11 Jun 2001 15:58:14 -0700


"S. Alexander Jacobson" <alex@shop.com> writes:

> On 6 Jun 2001, Carl R. Witty wrote:
> 
> > "S. Alexander Jacobson" <alex@shop.com> writes:
> >
> > > For example w/ foldl:
> > >
> > > foldl + 0 [1..10000]
> > > foldl (+) ((+) 0 1) [2..10000]
> > > foldl (+) ((+) ((+) 0 1) 2) [3..10000]
> > >
> > > Can't the implementation notice that each iteration leads to a
> > > larger closure and, if it is running out of space go ahead an just
> > > evaluate (+) 0 1?
> >
> > It's complicated.  You can't (in general) know whether application of
> > a function will increase or decrease the space used.  If you were
> > running out of space, would you just search the whole unevaluated
> > program graph for reductions which somehow seemed "likely" to reduce
> > the space used?  Would you add such reduction nodes to some global
> > list at the time they were created?
> 
> I'm not clear why you can't in general notice that you are using
> more space after function application than before.  I it hard to see why a
> program couldn't do the analysis I just did on foldl.

I wasn't worried about foldl; you assumed that (+) 0 1 got smaller if
you carried out the application.  Even for (+) on Integer, this is not
guaranteed (for large integers, if something else happens to be
holding on to the summands, evaluating the addition can increase total
space usage).

> You could accumulate statistics on funtions that increase/decrease space
> used at runtime and evaluate those that do reduce space used...

Right, that's the sort of thing I meant about "likely" above.  But how
do you find such function applications in the global program graph, if
you seem to be running low on space?  (And you also need to realize
that some functions might usually have "small" outputs, and sometimes
have "large" outputs.)

> > One portable way to implement a memoizing function in Haskell (if the
> > domain of the function is countable) is to lazily build a data
> > structure that contains the results of the function on every possible
> > argument.  Then you evaluate the portions of the data structure that
> > you need; the result on each argument is only evaluated once.  This
> > probably would count as a "growing expression", and it's certainly
> > possible that the function on some arguments would be bottom.
> 
> I don't think I understood this.  Can you clarify?

Let me know if JCAB's response wasn't enough here.

Carl Witty