[Haskell-cafe] Can reallyUnsafePtrEquality give false positives?

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Thu Nov 23 12:12:39 UTC 2017


Brandon Allbery wrote:
> Actually, I'd say that it can, but only if used incorrectly. Which is why
> it's reallyUnsafe.
> 
> The OP is correct in that a gc at the wrong time can lead to spurious
> positives.

There will be no such GCs. Consider the type:

  reallyUnsafePtrEquality# :: a -> a -> Int#

The operation takes two *Haskell values*, represented by pointers to the
heap, and compares those pointers. (See also Carter Schonwald's mail.)
The same code would also be allowed to dereference those pointers to
find the tags of the corresponding heap objects, check whether they are
constructors, and if so, read the corresponding values from the heap. In
other words, if the identity of heap objects could be confused at this
point, then perfectly ordinary compiled Haskell code could go wrong as
well. The compiler and RTS ensure that the pointers that the pointer
comparison sees are valid pointers for the Haskell values. (The story
would be different if reallyUnsafePtrEquality# were implemented in terms
of anyToAddr#.)

So there are no false positives; if reallyUnsafePtrEquality# x y
returned 1#, then x and y had a common corresponding heap object at the
time. There is plenty of room for surprises and unsafety though:

- Even if x and y compare as equal once, this may change later on. For
  example, with multiple threads running in parallel, a thunk may be
  evaluated several times, resulting in several results on the heap.

- Extending the previous scenario, if purity is violated in the
  evaluation of the initially shared thunk, then x and y may become
  distinct values later on.

- Even without parallelism, it is possible that after evaluating y,
  x points to an indirection on the heap and y to the evaluated value,
  resulting in distinct pointers (at least until the next GC).

- Bottoms also complicate the picture. For example, if one uses pointer
  equality as an optimization in an Eq instance, one may find x == y by
  pointer equality once, just to later find that x == y bottoms out.
  The opposite scenario is also possible. So it's quite easy to violate
  purity this way.

These scenarios can be mitigated by evaluating to WHNF first.

[Note: No spurious positives by GCs]

In current GHC, all garbage collections (even minor ones) stop execution
of all Haskell threads; this is done by tweaking the heap check, and
Haskell threads are not interrupted between heap checks.

There is a design and an experimental implementation of "local garbage
collections" in ghc, from 2011:

  https://ghc.haskell.org/trac/ghc/blog/new-gc-preview
  https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/local-gc.pdf

In this case, garbage collections of nurseries would run in parallel to
other Haskell threads. However, even in this design,
reallyUnsafePtrEquality# would not have false positives. To quote from
the paper,

  "the key invariant is that the processor that owns the local heap has
  exclusive access to its contents."

(In one design there are actually two local heaps; there is a second,
"sticky" heap that contains objects that other processors might see;
however, these objects are pinned.)

Cheers,

Bertram


More information about the Haskell-Cafe mailing list