unique identifiers as a separate library
Isaac Dupree
isaacdupree at charter.net
Tue Dec 23 18:22:46 EST 2008
Hi again Iavor,
A couple performance ideas if you want to test them:
unsafeInterleaveIO is cheap until you need to evaluate its
result. So how about this, I think it makes there be 1/3 as
many "structural" unsafeInterleaveIO's, so if it took "2"
amount of time on unsafeInterleaveIO:ing before, it should
take "1.33" time on it after this: and just a bit more
time/memory to construct Nodes that might not be used.
gen r = unsafeInterleaveIO $ do
v <- unsafeInterleaveIO (genSym r)
n1 <- gen r; n2 <- gen r; n3 <- gen r; n4
<- gen r
return (Node v1 (Node v2 n1 n2) (Node v3
n3 n4))
I also feel tempted to apply the
static-argument-transformation manually,
where
gen r = gen'
where
gen' = unsafeInterleaveIO $ do
v <- unsafeInterleaveIO (genSym r)
n1 <- gen'; n2 <- gen' --etc.
return (Node ...)
or similar
which I guess is safe because this is only
unsafeInterleaveIO, not unsafePerformIO? Dunno if it'd be
speed-beneficial though.
version 0.4:
> genericNewSupply :: b -> (IORef b -> IO a) -> IO (Supply a)
> genericNewSupply start genSym = gen =<< newIORef start
> where gen r = unsafeInterleaveIO
> $ do ls <- gen r
> rs <- gen r
> return (Node (unsafePerformIO (genSym r)) ls rs)
Why unsafePerformIO, was it faster?(i'd guess slower
actually, as unsafePerformIO is NOINLINE..) It's
considerably less safe than unsafeInterleaveIO! For
example, do the static-argument-transformation above, then
float out the unsafePerformIO because it's the same
expression each time through gen', and suddenly the all the
"unique" values are all the same!
we can make this value-supply very good ultimately :-)
also, I might call "unsafeNewIntSupply" something more
specific, like "unthreadsafeNew...", or the more obscure but
conventional "dupable" description-word. Did it help
specializing that to Int, i.e. why not
"unsafeGenericNewSupply"? because I can imagine a simple
data that's not an Int, where you'd still want to avoid the
thread-safety overhead. Also, your implementation of it
could be more efficient: it doesn't need to do locking, so I
suggest modifyIORef rather than atomicModifyIORef (Actually
you'll have to use readIORef >>= writeIORef >> return,
instead, because modifyIORef has a different type than
atomicModifyIORef). Possible refactor: All the functions
***GenSym r = atomicModifyIORef r (some expression that
doesn't mention r); doing the "[atomic]ModifyIORef r" could
be the caller's responsibility instead, and e.g. listGenSym
(a:as) = (as,a).
in fact, for lists (as you get a incomplete-pattern-match
warning there, but you know the list is always infinite,
because you made it with "iterate"), you could instead use
an infinite-list type, Data.Stream from package "Stream"[*];
as Stream is not a sum type (it only has one possible
constructor: Cons), it might even be a bit more efficient!
[*]
http://hackage.haskell.org/packages/archive/Stream/0.2.6/doc/html/Data-Stream.html
thanks for your effort! and especially for measuring the
performance timing!
-Isaac
More information about the Glasgow-haskell-users
mailing list