Wadler space leak
Simon Marlow
marlowsd at gmail.com
Mon Nov 8 08:28:46 EST 2010
On 07/11/2010 17:47, Jan Christiansen wrote:
>
> On 02.11.2010, at 10:20, Simon Marlow wrote:
>
>> It's not really a question of priority, rather that we don't know of a
>> good way to fix it!
>
> I would not have guessed that there exists a Haskell related problem
> that cannot immediately be fixed by the ghc headquarters ; )
>
> If I understand correctly, the problem is to keep the selectors
> recognizable while still performing optimizations that might "destroy"
> the selector structure. In Bertram's example the resulting expression
> looked as follows.
>
>> l : case q of (_, ys) -> case ys of
>> [] -> []
>> _:zs -> splitBy' p zs
>
> Is it correct that the selector is not recognizable in this case because
> the right hand side fo the outermost case expression is not a simple
> variable but a case expression? It is probably quite naive to assume
> that the problem can be solved by looking for a structure like
>
> case q of
> (_,ys) -> e
>
> where ys is a free variable in e. There are probably cases where further
> optimizations prevent this. Or do I miss the real problem in identifying
> selectors?
The problem is that a "selector" has to be an expression of the form
case x of
C y1 .. yn -> yi
for some i. The RTS contains pre-compiled selectors for values of i up
to 16. The garbage collector recognises selectors where x is already
evaluated, and replaces them with the value.
So you can't do this for an arbitrary expression without generalising
the mechanism. Perhaps you could annotate an arbitrary thunk to say
something like
I select field N from free variable x
and then the GC could null out all fields except for N, as long as x was
not required elsewhere. But this has other problems - apart from being
a lot more complicated in terms of what information we attach to thunks
and what the GC has to do, it doesn't immediately eliminate the
constructor, only the unreferenced fields, so you could still get
certain kinds of leak.
There's another approach in Jan Sparud's paper here:
http://portal.acm.org/citation.cfm?id=165196
although it's not clear that this interacts very well with inlining
either, and it has a suspicious-looking side-effecting operation. It
also looks like it creates a circular reference between the thunk and
the selectors, which might hinder optimisations, and would probably also
make things slower (by adding extra free variables to the thunk).
I haven't given it a great deal of thought though, maybe it could be
made to work in GHC.
Cheers,
Simon
More information about the Glasgow-haskell-users
mailing list