unique identifiers as a separate library

Isaac Dupree isaacdupree at charter.net
Thu Dec 18 12:55:20 EST 2008


Sebastian Fischer wrote:
> On Dec 17, 2008, at 10:54 AM, Sebastian Fischer wrote:
> 
>> Would it be possible to put everything concerned with unique 
>> identifiers in GHC into a separate package on Hackage?
> 
> 
> I have wrapped up (a tiny subset of) GHC's uniques into the package 
> `uniqueid` and put it on Hackage:

Thanks!

> The main difference is due to my fear of depending on the foreign 
> function `genSymZh` which I replaced by a global counting IORef.

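That carries its own risk.  Maybe you should mark the IORef 
NOINLINE, so that inlining cannot duplicate the 
unsafePerformIO call and create more than one counter?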
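
A minimal sketch of what I mean, assuming the counter is a 
top-level IORef created with unsafePerformIO. The name 
globalCounter and the body of nextInt are my guesses, not 
code taken from the package:

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global counter (name assumed, not from uniqueid).
-- NOINLINE keeps GHC from inlining the definition, which could
-- otherwise duplicate the unsafePerformIO call and yield
-- several independent IORefs.
{-# NOINLINE globalCounter #-}
globalCounter :: IORef Int
globalCounter = unsafePerformIO (newIORef 0)

-- Return the current value and advance the counter.
-- Assumed shape only; this version is not thread-safe.
nextInt :: IO Int
nextInt = do
  n <- readIORef globalCounter
  writeIORef globalCounter (n + 1)
  return n
```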

Potential code criticisms / suggestions for it as a library:

Unboxed types: because of these it only works on GHC, even 
though other implementations have unsafe IO too.  In theory, 
strictness annotations should be able to achieve the same 
efficiency.
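
For illustration, a portable representation might look 
roughly like this. The type and function names are 
hypothetical, and whether GHC really produces the same code 
as for the hand-unboxed version would need to be checked:

```haskell
-- Hypothetical portable identifier: a boxed Int behind a strict
-- field. The UNPACK pragma is only a hint; a compiler that does
-- not understand it may ignore it.
data Id = Id {-# UNPACK #-} !Int
  deriving (Eq, Ord, Show)

-- Extract the underlying number.
idToInt :: Id -> Int
idToInt (Id n) = n

-- The strict field forces n + 1 when the new Id is evaluated,
-- so no chain of suspended additions builds up.
succId :: Id -> Id
succId (Id n) = Id (n + 1)
```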

"Char" is supposed to represent a Unicode character, but 
this code handles it inconsistently:
For 64-bit Int#, a full Unicode Char fits.
For 32-bit Int#, it assumes the Char fits in 8 bits 
(ASCII and a little more).
If Int# (or Int) can be 30-bit (as Haskell 98 permits), 
correctness suffers even more.
Is the Char really a necessary part of the design?  The only 
way you provide to extract it or depend on its value is 
indirectly via the "Show" instance.  In any case, its 
presence costs you: with a 32-bit Int# you get at most 2^24 
(about 16 million) IDs before problems happen, whereas the 
billions available without it are still not a great limit 
but at least somewhat larger.  (Applications that are 
long-running or deal with huge amounts of data could be 
affected.)
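
To make the arithmetic concrete, here is a sketch of the 
kind of packing I am assuming for the 32-bit case: 8 bits 
for the Char above 24 bits for the counter. I use Word32 and 
made-up names (pack, unpack) so that it behaves the same on 
any machine; it is not the package's actual code:

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word32)

-- Assumed layout of one 32-bit word:
-- top 8 bits = Char tag, low 24 bits = counter.
pack :: Char -> Word32 -> Word32
pack c n = (fromIntegral (ord c) `shiftL` 24) .|. (n .&. 0x00FFFFFF)

-- Recover the tag and the counter from a packed word.
unpack :: Word32 -> (Char, Word32)
unpack w = (chr (fromIntegral (w `shiftR` 24)), w .&. 0x00FFFFFF)

-- The counter silently wraps after 2^24 values, so two
-- different counter values end up as the same identifier.
wrapsAround :: Bool
wrapsAround = pack 'a' (2 ^ (24 :: Int)) == pack 'a' 0

-- A Char above code point 255 loses its high bits:
-- U+03BB (955) comes back as U+00BB (187).
truncatedTag :: Char
truncatedTag = fst (unpack (pack '\955' 0))
```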

unsafeDupableInterleaveIO: the "Dupable" variant was safe 
for GHC to use because GHC is single-threaded.  Is it safe 
in a library setting?  I guess, likewise, the global IORef 
wouldn't be thread-safe... but this one isn't even safe 
between separate runs of initIdSupply.  On the other hand, 
thread-safety probably makes it much less efficient.  (If 
you can find a way to use atomic integer CPU instructions, 
it might not be too bad; or else use per-thread counters; 
or just document how unsafe it is.)
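
As one possible thread-safe variant, a counter can be bumped 
with atomicModifyIORef' from Data.IORef (available in 
current versions of base). This is only a sketch: the names 
nextIntAtomic and demo are made up, and I have not measured 
what it costs compared to the unsynchronized version:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Atomically return the current value and advance the counter,
-- so two threads can never receive the same number.
nextIntAtomic :: IORef Int -> IO Int
nextIntAtomic ref = atomicModifyIORef' ref (\n -> (n + 1, n))

-- Hypothetical check: several threads draw from one counter.
-- With atomic updates the final value equals the total number
-- of draws.
demo :: IO Int
demo = do
  ref <- newIORef 0
  done <- newEmptyMVar
  let workers = 4
      perWorker = 1000
  replicateM_ workers $ forkIO $ do
    replicateM_ perWorker (nextIntAtomic ref)
    putMVar done ()
  replicateM_ workers (takeMVar done)  -- wait for every worker
  readIORef ref
```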

unsafePerformIO: it's not strictly necessary here.  Its only 
function is to make IDs generated by different runs of 
initIdSupply distinct.  So it could probably be refactored 
to rely on unsafePerformIO global-ness only once per 
initIdSupply and to use just unsafeInterleaveIO within 
(where nextInt is currently called).
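
One possible reading of that refactoring, as a rough sketch: 
the only unsafePerformIO left is the one that creates the 
shared counter, and the lazy supply tree is built with 
unsafeInterleaveIO around ordinary IO. The IdSupply type 
here is a simplified stand-in (boxed Int, no Char tag), and 
idOf and split are hypothetical names:

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafeInterleaveIO, unsafePerformIO)

-- Simplified stand-in for the supply: an ID plus two
-- lazily built sub-supplies.
data IdSupply = IdSupply Int IdSupply IdSupply

-- The only use of unsafePerformIO: one counter shared by every
-- supply, so IDs from different runs stay distinct.
{-# NOINLINE globalCounter #-}
globalCounter :: IORef Int
globalCounter = unsafePerformIO (newIORef 0)

-- Build the infinite tree on demand. Each node draws its number
-- in plain IO, deferred by unsafeInterleaveIO until the node is
-- first inspected.
initIdSupply :: IO IdSupply
initIdSupply = go
  where
    go = unsafeInterleaveIO $ do
      n <- atomicModifyIORef' globalCounter (\i -> (i + 1, i))
      l <- go
      r <- go
      return (IdSupply n l r)

-- The ID stored at the root of a supply.
idOf :: IdSupply -> Int
idOf (IdSupply n _ _) = n

-- Two independent sub-supplies.
split :: IdSupply -> (IdSupply, IdSupply)
split (IdSupply _ l r) = (l, r)
```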

-Isaac


More information about the Glasgow-haskell-users mailing list