Weak reference semantics - why does a dead weak ref keep its value alive?

Tue May 27 08:29:08 UTC 2014

On 24/05/2014 01:11, Luite Stegeman wrote:
>
>     In particular, the variant of weak reference you suggest is the
>     /ephemeron/ semantics in Hayes.  Their reachability rule is:
>
>          The value field of an ephemeron is reachable if both (a) the
>          ephemeron (weak pointer object) is reachable, and (b) the key is
>          reachable.
>
>
> Actually it's not the same, since I think the finalizer should still be
> run if the weak pointer object is unreachable (and it should run when
> the key becomes unreachable).
>
> The implementation would indeed need to keep some reference to the key
> and finalizer around after the weak pointer becomes unreachable, perhaps
> on some weak pointers list, but the same goes for GHC's semantics. The
> only difference is that the value (which might in turn make a whole
> bunch of other data reachable) would not have to be retained.
>
> I haven't been able to think of any issues with considering the value
> unreachable here, so I'm still puzzled as to why GHC's semantics would
> be preferable. It doesn't look like it would complicate implementation
> too much either.

So the semantics is currently:

   w <- mkWeak k v f

  1. v is reachable if k is reachable
  2. f is reachable if k is reachable
  3. when k becomes unreachable, the finalizer f is run
  4. deRefWeak w returns
     - Nothing, if k is not reachable
     - Just v, otherwise

In your proposal I think you would change the first one to

  1. v is reachable if both k and w are reachable

Arguably this makes sense, because as you say, v is accessed via w, and 
what's the point of making v reachable if you can't access it? It's just 
a space leak.
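
To make the difference concrete, here is a small pure model of the two reachability rules. It is only a sketch under stated assumptions, not how the GHC collector works: the names Obj, Heap, WeakPtr, Rule, closure, kept and reachable are all made up for illustration, objects are just strings, and the fixpoint loop merely restates rules 1 and 2 above. The example heap has a reachable key k, an unreachable weak pointer object w, and a value v that points at some other data called big.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical model: objects are names, the heap maps an object to the
-- objects it points to strongly.
type Obj = String
type Heap = Map.Map Obj [Obj]

-- A weak pointer object w with key k, value v and finalizer f.
data WeakPtr = WeakPtr { weakObj, weakKey, weakVal, weakFin :: Obj }

data Rule = Current | Proposed

-- Transitive closure of strong edges, starting from a set of objects.
closure :: Heap -> Set.Set Obj -> Set.Set Obj
closure heap = go Set.empty . Set.toList
  where
    go seen [] = seen
    go seen (o : rest)
      | o `Set.member` seen = go seen rest
      | otherwise = go (Set.insert o seen) (Map.findWithDefault [] o heap ++ rest)

-- What one weak pointer keeps alive, given the objects known to be live.
kept :: Rule -> Set.Set Obj -> WeakPtr -> [Obj]
kept rule live w
  | not (weakKey w `Set.member` live) = []          -- k dead: nothing retained
  | otherwise = case rule of
      Current -> [weakVal w, weakFin w]             -- v and f follow k
      Proposed
        | weakObj w `Set.member` live -> [weakVal w, weakFin w]
        | otherwise -> [weakFin w]                  -- w dead: keep only f

-- Iterate until no weak pointer adds anything new.
reachable :: Rule -> Heap -> [WeakPtr] -> [Obj] -> Set.Set Obj
reachable rule heap weaks roots = go (closure heap (Set.fromList roots))
  where
    go live
      | live' == live = live
      | otherwise = go live'
      where
        extra = Set.fromList (concatMap (kept rule live) weaks)
        live' = closure heap (Set.union live extra)

exampleHeap :: Heap
exampleHeap = Map.fromList [("v", ["big"])]

exampleWeak :: WeakPtr
exampleWeak = WeakPtr { weakObj = "w", weakKey = "k", weakVal = "v", weakFin = "f" }

main :: IO ()
main = do
  -- Only k is a root; w itself has been dropped.
  print (Set.toList (reachable Current exampleHeap [exampleWeak] ["k"]))
  -- prints ["big","f","k","v"]
  print (Set.toList (reachable Proposed exampleHeap [exampleWeak] ["k"]))
  -- prints ["f","k"]
```

In this model the current rule retains v and everything hanging off it even though nothing can call deRefWeak on w any more, while the proposed rule keeps only k and f. If w is added to the roots, both rules give the same answer.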

My only worry is how hard this is to implement.  Rather than considering 
w as unconditionally reachable (which is what we do now), you would have 
to track its reachability, and only consider v reachable when both k and 
w are reachable.  A weak pointer object where only k was reachable would 
probably need to be put in a semi-dead (zombie?) state, so that we would 
still run the finalizer when k becomes unreachable.  I suspect all this 
might be more complicated to implement, but maybe there's a simpler way 
that I'm missing.
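
For what it's worth, one possible reading of the zombie idea is a per-weak-pointer state machine that is stepped once per GC. This is a rough sketch only, not a design for the RTS: WeakState, Action, gcStep and run are hypothetical names, and the two Bool arguments stand for "k was found reachable" and "w was found reachable" in that collection.

```haskell
-- Hypothetical states of a weak pointer under the proposed rule.
data WeakState = Live | Zombie | Dead
  deriving (Eq, Show)

-- What the collector would do with this weak pointer in one GC.
data Action
  = RetainValueAndFinalizer
  | RetainFinalizerOnly
  | RunFinalizer
  | NoAction
  deriving (Eq, Show)

-- One GC for a single weak pointer.
-- Arguments: is k reachable, is w reachable, state before this GC.
gcStep :: Bool -> Bool -> WeakState -> (WeakState, Action)
gcStep _     _    Dead = (Dead, NoAction)                 -- already finalized
gcStep False _    _    = (Dead, RunFinalizer)             -- k died: run f
gcStep True  True Live = (Live, RetainValueAndFinalizer)  -- k and w both alive
gcStep True  _    _    = (Zombie, RetainFinalizerOnly)    -- w gone: drop v, keep f
-- A zombie never becomes Live again: once w is unreachable it stays so,
-- and v may already have been collected.

-- Step a weak pointer, starting Live, through a sequence of GCs.
run :: [(Bool, Bool)] -> [(WeakState, Action)]
run = go Live
  where
    go _ [] = []
    go s ((k, w) : rest) =
      let r@(s', _) = gcStep k w s
      in r : go s' rest

main :: IO ()
main =
  -- k and w alive, then w dropped for two GCs, then k dropped.
  mapM_ print (run [(True, True), (True, False), (True, False), (False, False)])
```

The extra cost compared with what we do now is visible in the model: the collector has to learn whether w is reachable before it decides what to do with v, and it has to keep zombies somewhere so that f still runs when k dies.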

Cheers,
Simon