[GHC] #913: instance Ord (StableName a)
GHC
ghc-devs at haskell.org
Tue May 15 20:35:42 UTC 2018
#913: instance Ord (StableName a)
-------------------------------------+-------------------------------------
Reporter: ekarttun | Owner: (none)
Type: feature request | Status: new
Priority: normal | Milestone: 6.10 branch
Component: libraries/base | Version: 6.4.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by ekmett):
Up until now the fact that you can't care about the arbitrary ordering
that the `StableName`s have has been viewed as a feature. Pragmatically,
if you need to use them as keys in a container `unordered-containers`
should properly treat them as `Hashable`, and rather explicitly considers
the container "unordered".
That said, one could argue that the same reasoning should apply to
`Data.Unique` in that it should be `Hashable`/`Eq` only, and not provide
the `Ord` instance it already does, so to your point there is at least an
appearance of inconsistency between these two positions.
There is some difference between those two scenarios though.
It is worth noting that nothing in our current API says a `StableName#`
gets any sort of actual _unique_ integer assigned to it during creation.
It just lives on the heap. Pretending we can gives me pause. You can't
compare pointers to them using an ordering, any more than you can compare
`MutVar#`'s for `Ord` because GC can change the relative position of them,
and we don't want to inflate them with a superfluous costs to support an
operation that isn't fundamental to their existence.
Now, we do have a `stableNameToInt#` operation, but it rather carefully
doesn't claim to produce a unique name. Mind you it doesn't make this
claim largely due to concerns about GC happening in the meantime, but
we've had this degree of freedom for a long time, and could abuse it more
in the internals without anybody actually being able to care.
@simonmar's initial suggestion that we can use the `Ord` on the
`stableNameToInt#` used by `hashStableName` to compare `StableName#`s
seems a little racy to me: the integer isn't guaranteed to be unique due
to the fact that `StableName` IDs can be reused, but due to the fact that
I could construct two stable names, converting one to an integer through
`stableNameToInt#`, have a gc happen in which the now 'dead' `StableName`
gets recollected, have someone else get assigned to the same id, then
force another stablename that was constructed in the meantime, which might
resolve to the same id. While it seems like a reasonably far fetched
scenario that that might go wrong, the chain of reasoning to prove that it
always goes right is delicate. e.g. I don't see what prevents you from
constructing the second stable name using an `unsafePerformIO` in between
the unwrapping of the first and unwrapping the second, it would be
constructed by the act of forcing the `StableName` data constructor. Do I
think this will really happen in user code? Maybe not, but the fact that I
could even see a path towards it, and Jan-Willem's experience report,
combined with my previous efforts along these lines building the `intern`
library and basically giving up on the cleanup once problems get large
enough due to issues of this sort, gives me a great deal of pause.
But there isn't really any particular reason other than the current
implementation why these Ids are small unique numbers.
In a different world, they could just be heap objects with some ID created
during initialization, say, based on their initial allocation address.
They could then be "moved" like any other heap object. The current API
doesn't require uniqueness of keys, so this would pass everything about
the `StableName` API, much like how my
[https://github.com/ekmett/unique/blob/master/src/Control/Concurrent/Unique.hs#L38
Control.Concurrent.Unique] matches the shape of the `Data.Unique` API, but
can only supply `Hashable`, not `Ord`, as the integer of initial
allocation location associated with each `Unique` value could be shared
across several such `Unique` values. This has the benefit of avoiding
_any_ coordination during allocation of these identifiers across threads
and so scales better than our current `Data.Unique` system.
I could see a world in which we'd want to refactor the internals of
`System.Mem.StableName` at some point to offer a more efficient
construction. The current API offers a lot of possible implementation
approaches. Adding an `Ord` constraint limits those choices, and reasoning
about its soundness takes us into relatively dubious territory as seen
above.
On a more theoretical basis, to phrase this another way, not everything
that can be broken into equivalence classes breaks up into _ordered_
equivalence classes. If these were merely unique objects on the heap, then
merely being able to compare these for equality vs. having the power to
sort them is analogous to the problem of 'pointer discrimination'
mentioned by Fritz Henglein in
[https://pdfs.semanticscholar.org/f425/7af9221ca7fe21dc84a049a8545a28a874ae.pdf
Generic Top-down Discrimination for Sorting and Partitioning in Linear
Time].
All of these issues taken together leaves me inclined to let the old
`wontfix` status of this issue stand.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/913#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list