Wadler space leak

Simon Marlow marlowsd at gmail.com
Mon Nov 8 08:28:46 EST 2010


On 07/11/2010 17:47, Jan Christiansen wrote:
>
> On 02.11.2010, at 10:20, Simon Marlow wrote:
>
>> It's not really a question of priority, rather that we don't know of a
>> good way to fix it!
>
> I would not have guessed that there exists a Haskell related problem
> that cannot immediately be fixed by the ghc headquarters ; )
>
> If I understand correctly, the problem is to keep the selectors
> recognizable while still performing optimizations that might "destroy"
> the selector structure. In Bertram's example the resulting expression
> looked as follows.
>
>> l : case q of (_, ys) -> case ys of
>> [] -> []
>> _:zs -> splitBy' p zs
>
> Is it correct that the selector is not recognizable in this case because
> the right hand side fo the outermost case expression is not a simple
> variable but a case expression? It is probably quite naive to assume
> that the problem can be solved by looking for a structure like
>
> case q of
> (_,ys) -> e
>
> where ys is a free variable in e. There are probably cases where further
> optimizations prevent this. Or do I miss the real problem in identifying
> selectors?

The problem is that a "selector" has to be an expression of the form

   case x of
     C y1 .. yn -> yi

for some i.  The RTS contains pre-compiled selectors for values of i up 
to 16.  The garbage collector recognises selectors where x is already 
evaluated, and replaces them with the value.

So you can't do this for an arbitrary expression without generalising 
the mechanism.  Perhaps you could annotate an arbitrary thunk to say 
something like

   I select field N from free variable x

and then the GC could null out all fields except for N, as long as x was 
not required elsewhere.  But this has other problems - apart from being 
a lot more complicated in terms of what information we attach to thunks 
and what the GC has to do, it doesn't immediately eliminate the 
constructor, only the unreferenced fields, so you could still get 
certain kinds of leak.

There's another approach in Jan Sparud's paper here:

http://portal.acm.org/citation.cfm?id=165196

although it's not clear that this interacts very well with inlining 
either, and it has a suspicious-looking side-effecting operation.  It 
also looks like it creates a circular reference between the thunk and 
the selectors, which might hinder optimisations, and would probably also 
make things slower (by adding extra free variables to the thunk).

I haven't given it a great deal of thought though, maybe it could be 
made to work in GHC.

Cheers,
	Simon


More information about the Glasgow-haskell-users mailing list