unique identifiers as a separate library
iavor.diatchki at gmail.com
Tue Dec 23 16:56:50 EST 2008
Sigh. I think that Isaac is quite right. Even though I think that it
would be quite rare for multiple threads to share the same name supply
in practise, I would really rather have safe code and not have to
think about it. So... I have reverted to the non-dupable versions by
default. I also added an "unsafeNewIntSupply" that uses that dupable
primitives, for those who like living on the edge :) Thanks to
Sebastian for the bench marking program! The performance numbers for
the current version on hackage (0.4) are as follows:
With safe primitives value-supply is about 7 times slower, with the
unsafe ones it is about 2 times slower (a bit less actually). Even
though this seems like a big difference, the actual time it takes to
generate names is very small: I have to generate about 10 million
names to get reliable measurements, so I am not sure if the difference
matters in practise.
If anyone has further ideas, please chime in.
On Tue, Dec 23, 2008 at 8:30 AM, Isaac Dupree <isaacdupree at charter.net> wrote:
> Iavor Diatchki wrote:
>> - It uses unsafeDupableInterleaveIO to avoid double locking,
> in particular,
>> gen r = unsafeDupableInterleaveIO
>> $ do v <- unsafeDupableInterleaveIO (genSym r)
>> ls <- gen r
>> rs <- gen r
>> return (Node v ls rs)
> where is the double locking? We want referential transparency...
> e.g. suppose you use newNumSupply to create a thunk for a Gen; when
> evaluated, it will run unsafeDupableInterleaveIO. You send that thunk off
> to two different other threads. Then those threads decide to evaluate it
> (say, enough to get the first genSym'd value) and happen to make a race
> condition, so it becomes two separate IO computations. Therefore one of them
> runs atomicModifyIORef, and the other one runs atomicModifyIORef, and so
> they get two different values out.
> Node 0 (...) (...)
> Node 1 (...) (...)
> when it's suppose to be the very same Gen data structure.
> so, am I right or wrong about what the perils of unsafeDupableInterleaveIO
> I could see changing (unsafe[Dupable]InterleaveIO (genSym r)) to (genSym r),
> to halve the number of unsafeInterleaveIOs needed if we assume that most of
> the time a node is evaluated in order to get a value... but it's hard to see
> a good way to make *fewer* InterleaveIOs than there are genSym'd values.
> (possible, but hard, and really depends on the relative expenses/risks of
> locking, of computing the next number, and of using up the "address space"
> of all possible Ints for example). Maybe the outer InterleaveIO could
> "strictly" make a few levels of Nodes (with lazy genSym'd values this time)
> before "interleaving" again, to reduce the amount of interleaving from the
> non-semantics-changing side.
More information about the Glasgow-haskell-users