Wadler space leak
Max Bolingbroke
batterseapower at hotmail.com
Tue Nov 9 07:50:21 EST 2010
On 9 November 2010 07:58, Duncan Coutts <duncan.coutts at googlemail.com> wrote:
> This proposal is mentioned favourably by Jörgen Gustavsson David Sands
> in [1] (see section 6, case study 6). They mention that there is a
> formalisation in Gustavsson's thesis [2]. That may say something about
> inlining, since that's just the kind of transformation they'd want to
> show is a space improvement.
During development of my supercompiler, I noticed that a feature like
this one (explicit thunk update in the intermidate language) would
enable more deforestation in rare circumstances.
First step is to build an alternative operational semantics for Core
which includes explicit thunk update. First, we add to the term syntax
something of the form "update x v e" where:
* x is a variable
* v is either
1) A syntactic value
2) Or a variable pointing to something that is certainly evaluated,
by which we mean that the corresponding binding is either:
a) The "wildcard" binder of a case expression (or equivalently
the variable bound by its default alternative)
b) A let with a value RHS
* e is a term
We give this new syntax this operational semantics in the standard
Sestoft abstract machine for call by need:
< H | update x v e | K > --> < H, x |-> v | e | K >
We also need to change the variable rule in the standard semantics from:
< H, x |-> e | x | K > --> < H | e | update x, K >
To:
< H, x |-> e | x | K > --> < H | e | K >
So now plain "let" only expresses code sharing, not work sharing. You
can then desugar Haskell lets as follows:
DESUGAR[let x = e1 in e2] :: Haskell
= let x = (case DESUGAR[e1] of FRESH -> update x FRESH FRESH) in
DESUGAR[e2] :: Core
i.e. Haskell lets are work shared.
The additional power we get in this scheme is that the v and e in
"update x v e" needn't have the same semantics. To see why this is
useful, consider this Haskell program:
"""
let x = case f x of A y -> A (g y)
B z -> B (h z)
C -> C
in case x of A y -> y
B z -> z
C -> 10
"""
In desugared form (I've omitted the update frames for the thunk in the
argument of A/B since its irrelevant for this example), we get:
"""
let x = case (case f x of A y -> A (g y)
B z -> B (h z)
C -> C) of x_fresh -> update x x_fresh x_fresh
in case x of A y -> y
B z -> z
C -> 10
"""
Now we could transform our original program as follows by doing
case-of-case ***through the update frame for x***. First, push the
update frame into the case branches:
"""
let x = case f x of A y -> case A (g y) of x_fresh -> update x x_fresh x_fresh
B z -> case B (h z) of x_fresh -> update x x_fresh x_fresh
C -> case C of x_fresh -> update x x_fresh x_fresh
in case x of A y -> y
B z -> z
C -> 10
"""
Now move the case consuming of "x" into the case branches, then into
the inner case expressions, then into the argument of the "update"
construct. We end up with:
"""
let x = error "Black hole: x"
in case f x of A y -> case A (g y) of x_fresh -> update x x_fresh
(case x_fresh of A y -> y
B z -> z
C -> 10)
B z -> case B (h z) of x_fresh -> update x x_fresh
(case x_fresh of A y -> y
B z -> z
C -> 10)
C -> case C of x_fresh -> update x x_fresh
(case x_fresh of A y -> y
B z -> z
C -> 10)
"""
Note that I've had to keep around a binding for "x" so that the
"update x" has something to update, and in case "f" is strict in its
argument. Now we can do some trivial simplifications to get this code:
"""
let x = error "Black hole: x"
in case f x of A y -> let y' = g y
x_fresh = A y'
in update x x_fresh y'
B z -> let z' = h z
x_fresh = B z'
in update x x_fresh z'
C -> let x_fresh = C
in update x x_fresh 10
"""
Now we can spot that the "update x" in the C branch writes memory that
cannot be read, since "x" cannot escape through the literal 10. So we
can transform as follows:
"""
let x = error "Black hole: x"
in case f x of A y -> let y' = g y
x_fresh = A y'
in update x x_fresh y'
B z -> let z' = h z
x_fresh = B z'
in update x x_fresh z'
C -> 10
"""
None of the other transformations increase allocation, and this last
one reduces allocation by 1 in the case that f x == C -- so we have
deforested the C. What's more, we avoid any case scrutinisation on the
A and B constructors built in the branches of the original case,
though we still have to allocate them since f might have closed over
x.
AFAIK it is impossible to achieve this in any other way without either
risking work duplication in some form or introducing new allocations
that make the deforestation of C moot.
Of course, it may not be worth having all this extra mechanism just to
get a bit more deforestation/optimisation in esoteric situations like
this one. However, since it also lets us get a better story for
avoiding space leaks arising from pattern bindings it almost looks
worth the complexity cost...
Cheers,
Max
More information about the Glasgow-haskell-users
mailing list