unique identifiers as a separate library
isaacdupree at charter.net
Thu Dec 18 12:55:20 EST 2008
Sebastian Fischer wrote:
> On Dec 17, 2008, at 10:54 AM, Sebastian Fischer wrote:
>> Would it be possible to put everything concerned with unique
>> identifiers in GHC into a separate package on Hackage?
> I have wrapped up (a tiny subset of) GHC's uniques into the package
> `uniqueid` and put it on Hackage:
> The main difference is due to my fear of depending on the foreign
> function `genSymZh` which I replaced by a global counting IORef.
That carries its own risk. Maybe you should NOINLINE it?
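A minimal sketch of the NOINLINE suggestion. The names `counter` and `nextInt` here are illustrative and may not match the package's source. Without the pragma, GHC is free to inline the `unsafePerformIO` expression at each use site, which could create more than one IORef.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global counter. NOINLINE keeps GHC from duplicating
-- the unsafePerformIO call, so there is exactly one IORef.
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)
{-# NOINLINE counter #-}

-- Not thread-safe: the read and the write are separate steps.
nextInt :: IO Int
nextInt = do
  n <- readIORef counter
  writeIORef counter (n + 1)
  return n
```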
Potential code criticisms / suggestions for it as a library:
Unboxed: so it only works on GHC, even though other
implementations have unsafe IO too. In theory, strictness
annotations should be able to achieve the same efficiency.
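As a rough illustration of the strictness-annotation alternative, here is a hypothetical `Id` type with a strict boxed field in place of an unboxed `Int#`. This is standard Haskell, so it is not tied to GHC. Whether it matches the unboxed version's performance depends on the compiler, though GHC can usually unpack such a field when optimising.

```haskell
-- Hypothetical portable representation: a strict Int field
-- instead of an unboxed Int#.
data Id = Id !Int
  deriving (Eq, Ord, Show)

-- Evaluating the resulting Id also evaluates the Int inside it,
-- so no chain of (+ 1) thunks builds up behind the constructor.
succId :: Id -> Id
succId (Id n) = Id (n + 1)
```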
"Char" is supposed to represent a Unicode character -- but
this code honours that only on some platforms:
With a 64-bit Int#, it does.
With a 32-bit Int#, it assumes the Char fits in 8 bits
(ASCII and a little more).
If Int# (or Int) can be as narrow as 30 bits (as Haskell 98
permits), it is even less correct.
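The arithmetic behind these cases can be sketched as follows. It assumes the Char is stored as a tag above a 24-bit counter, which is what the 2^24 figure suggests. The function names and the explicit `wordBits` parameter are made up for illustration, and signedness is ignored.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (ord)

-- Assumed layout: the tag character sits above a 24-bit counter.
counterBits :: Int
counterBits = 24

-- Pack a tag and a counter into a word of the given width.
-- Bits that do not fit in the word are silently lost.
pack :: Int -> Char -> Int -> Integer
pack wordBits c n = (tag .|. low) .&. wordMask
  where
    tag      = toInteger (ord c) `shiftL` counterBits
    low      = toInteger n .&. (2 ^ counterBits - 1)
    wordMask = 2 ^ wordBits - 1

-- Recover whatever survived of the tag's code point.
unpackTag :: Integer -> Int
unpackTag w = fromInteger (w `shiftR` counterBits)
```

With a 64-bit word, `unpackTag (pack 64 '\955' 5)` gives back 955. With 32 bits only the low 8 bits of the code point survive (187). With 30 bits even `'a'` (97) comes back as 33.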
Is it really even a necessary part of the design? The only
way you provide to extract it or depend on its value is
indirectly via the "Show" instance. In any case, its
presence limits you to at most 2^24 (about 16 million) IDs
before problems happen. Without it the limit would be in
the billions, which is still not great but is at least
somewhat larger. (Applications that are long-running or
deal with huge amounts of data could be affected.)
unsafeDupableInterleaveIO: this "Dupable" was safe for GHC
to use because GHC is single-threaded. Is it safe in a
library setting? I guess likewise, the IORef global
variable wouldn't be thread-safe... but this one isn't even
safe between separate runs of initIdSupply. On the other
hand, thread-safety probably makes it much less efficient.
(If you can find a way to use atomic integer CPU
instructions, it might not be too bad. Otherwise, use
per-thread counters, or just document how unsafe it is.)
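One way to make the counter itself thread-safe is `atomicModifyIORef'` from `Data.IORef` in current versions of base. The sketch below uses hypothetical names and passes the counter explicitly. It only addresses the counter; it does not make unsafeDupableInterleaveIO safe under multiple threads.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Hypothetical helper: allocate a fresh counter starting at zero.
newCounter :: IO (IORef Int)
newCounter = newIORef 0

-- Atomically return the current value and store its successor,
-- so two threads can never be handed the same number.
nextInt :: IORef Int -> IO Int
nextInt ref = atomicModifyIORef' ref (\n -> (n + 1, n))
```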
unsafePerformIO: it's not totally necessary here. Its only
function is to make IDs generated by different runs of
initIdSupply be distinct. So it could, anyway, probably be
refactored to only use unsafePerformIO global-ness once per
initIdSupply and just use unsafeInterleaveIO within (where
currently nextInt is called).
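One possible reading of this refactoring, as a sketch rather than the package's actual code: the only unsafePerformIO is the one that creates the global counter, and the supply tree is built lazily with unsafeInterleaveIO, drawing a number inside IO each time a node is forced. The `IdSupply` shape and the names `globalCounter`, `idOf` and `splitSupply` are assumptions, and the Char tag is left out.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafeInterleaveIO, unsafePerformIO)

-- Assumed shape: an infinite binary tree of identifiers.
data IdSupply = IdSupply Int IdSupply IdSupply

-- The single use of unsafePerformIO: one counter shared by all
-- supplies, so different runs of initIdSupply stay distinct.
globalCounter :: IORef Int
globalCounter = unsafePerformIO (newIORef 0)
{-# NOINLINE globalCounter #-}

-- Each node draws its number when it is first forced.
initIdSupply :: IO IdSupply
initIdSupply = go
  where
    go = unsafeInterleaveIO $ do
      n <- atomicModifyIORef' globalCounter (\i -> (i + 1, i))
      l <- go
      r <- go
      return (IdSupply n l r)

idOf :: IdSupply -> Int
idOf (IdSupply n _ _) = n

splitSupply :: IdSupply -> (IdSupply, IdSupply)
splitSupply (IdSupply _ l r) = (l, r)
```

Unlike the "Dupable" variant, unsafeInterleaveIO guards against the deferred action being run twice by different threads, at some cost in speed.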