Questions about sharing

Simon Marlow simonmar@microsoft.com
Mon, 10 Dec 2001 10:43:55 -0000


> My understanding is that GHC tries to have it both ways.  Here's my
> understanding of how it works (implementors should probably chime in
> if my memory is faulty or out of date):
>
> 1) There is a single, canonical copy of every nullary constructor.
>   This canonical copy is used wherever possible---just like a
>   top-level constant.
>
> 2) Updating a thunk with an indirection makes it expensive to obtain
>   the thunk's value.  The extra indirection does not require
>   allocation---the only reason we need indirections at all is to
>   overwrite memory that previously held a thunk.  The real problem is
>   that it takes time to chase down indirections once they exist.
>
>   Therefore when a thunk evaluates to a nullary constructor, it is
>   overwritten directly.  This effectively creates another copy of the
>   nullary constructor.
>
> 3) When the GC runs, instead of copying these newly-created nullary
>   constructors, it replaces them with the canonical copy.
>
>   [The GC also eliminates indirections, and thus helps us no matter
>   what we do in 2) above]

Actually we don't do (2) and (3) - update in place only happens in very
limited conditions nowadays, namely when we know we're returning to an
update frame and the constructor being returned is not nullary and also
has no pointer fields (the latter restriction is to avoid complications
due to generational GC).

In The Olden Days (<= 3.02) we used a return-in-registers policy to
avoid heap-allocating return values when they were about to be used once
and thrown away, and this also enabled update-in-place in certain
circumstances.  However, the whole thing was terribly complicated to
implement, so now we have standard heap returns but we also do analysis
(previously CPR analysis, now incorporated into the new demand analysis)
to discover when a value can be returned on the stack instead of the
heap.
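
As a rough illustration of the idea (a hand-written sketch, not the
compiler's actual output), the transformation that such an analysis
enables can be imitated by hand with GHC's UnboxedTuples extension.
The names divModBoxed, wDivMod and divModWrapped are made up for this
example: the worker returns the components of the pair directly, and a
small wrapper rebuilds the boxed pair only for callers that really
need one.

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Ordinary version: the result is a boxed pair, which is naively
-- allocated on the heap even if the caller takes it apart at once.
divModBoxed :: Int -> Int -> (Int, Int)
divModBoxed n d = (n `div` d, n `mod` d)

-- Hypothetical hand-written "worker": returns the two components as an
-- unboxed tuple, so no pair constructor is built for the result.
wDivMod :: Int -> Int -> (# Int, Int #)
wDivMod n d = (# n `div` d, n `mod` d #)

-- Hypothetical "wrapper": re-boxes the result.  If it is inlined at a
-- call site that immediately scrutinises the pair, the pair
-- construction can be cancelled against the case expression.
divModWrapped :: Int -> Int -> (Int, Int)
divModWrapped n d = case wDivMod n d of
  (# q, r #) -> (q, r)
{-# INLINE divModWrapped #-}

main :: IO ()
main = do
  print (divModBoxed 17 5)
  print (divModWrapped 17 5)
```

Both calls print (3,2); the only intended difference is where the
result lives while it is being returned.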

Cheers,
	Simon