unique identifiers as a separate library

Iavor Diatchki iavor.diatchki at gmail.com
Wed Dec 24 15:15:55 EST 2008


Hi,
Thanks for the feedback!  Suggestions implemented in your daily
value-supply release :)
Happy Holidays!
-Iavor

On Wed, Dec 24, 2008 at 7:46 AM, Isaac Dupree <isaacdupree at charter.net> wrote:
> Iavor Diatchki wrote:
>>>
>>>  Also, your
>>> implementation of it could be more efficient: it doesn't need to do
>>> locking,
>>> so I suggest modifyIORef rather than atomicModifyIORef (Actually you'll
>>> have
>>> to use readIORef >>= writeIORef >> return, instead, because modifyIORef
>>> has
>>> a different type than atomicModifyIORef).
>>
>> I don't think that that's quite right.  I was thinking that it should
>> be OK to use different supplies that share a reference in different
>> threads.  So, for example, split the supply and pass each version to a
>> new thread.  This should be OK with the dupable primitives because the
>> thunks would be evaluated in different threads.  However, we still
>> need the synchronize access to the reference, or we'll get incorrect
>> values.
>
> That's true (if you're careful enough about which thread is evaluating the
> splitting -- the existing split* functions are actually *not* strict enough
> for this, and they probably shouldn't be either, so an additional warning
> that the user may have to add some `seq`s or `evaluate`s might be needed).
>  Anyway, if atomicModifyIORef isn't that big of an overhead, then no problem
> :-)
>
>>> in fact, for lists (as you get a incomplete-pattern-match warning there,
>>> but
>>> you know the list is always infinite, because you made it with
>>> "iterate"),
>>> you could instead use an infinite-list type, Data.Stream from package
>>> "Stream"[*]; as Stream is not a sum type (it only has one possible
>>> constructor: Cons), it might even be a bit more efficient!
>>
>> You are right but the benefit is small enough that I don't think it
>> warrants a dependency on another package.
>
> I'm going to try to argue to convince you that it's entirely appropriate to
> use Data.Stream :-)
>
> From my somewhat mathematical point of view, that is.
>
> 1. Data.Stream is a small module that implements a very well-known,
> practically standard (though under-used), and simple data type.  All you
> really need from it is import Data.Stream(Stream(Cons)), plus (iterate,
> streamToList) if you're keeping your current interface.  It's nothing you
> should be afraid to depend on, if you're using its concept (which you are).
>  It's much smaller than the 'containers' package, which similarly you don't
> use but if you needed a Map or something you obviously should.
>
> 2. The more compelling argument, that it took me a good night's sleep to
> think of:
>
> Supply is an infinite binary tree.  Stream is an infinite unary tree (er,
> more commonly called infinite "list"). They're both codata.  They're both
> comonads.  (although they don't go so far as to depend on category-extras to
> provide an instance.)  The primary function/purpose of Supply is an
> *efficient* way to turn a Stream into a Supply.  I daresay it would, even,
> be more fundamental to provide interface
> newSupply      :: Stream a -> IO (Supply a)
> than
> newSupply      :: a -> (a -> a) -> IO (Supply a)
> (although it might be worth keeping both interfaces? mainly for
> compatibility, since one's just a Data.Stream.iterate away from the other
> and your haddocks could say so)
> In fact, it's terribly annoying to turn a (Stream a) into a (a, a -> a) --
> in fact I don't think it can be done in general (you can turn it into a (b,
> b -> (a, b)) though a-la unfold, with b = Stream a) -- so you should
> definitely provide the above (Stream a -> IO (Supply a)) interface.
> (Although if you're cowardly enough about the extra dependency, I guess you
> could make it the riskier ([a] -> IO (Supply a)), risk being if the user
> provides a finite list... Sorry for conditionally insulting you; it seems a
> horribly underhanded thing for me to do :-)
>
> Likewise, it would be nice for
> split          :: Supply a -> Stream (Supply a)
> ...and then you would not even depend on Data.List anymore!
>
> (I don't happen to think the arguments that Data.List is better than Stream
> for definitely-infinite lists are very convincing; except possibly that List
> will be more up-to-date with respect to stream-fusion optimizations, and
> even then, value-supply doesn't actually rely on any of those optimizations;
> it actually does keep around the Stream (or List), or in the case of
> Num/Enum, it doesn't use one in the first place.)
>
> On the other hand, I still want the Stream-based interfaces, but my initial
> argument isn't even that necessary: consider implementing the current
> signature:
> newSupply      :: a -> (a -> a) -> IO (Supply a)
> currently by:
> newSupply x f   = genericNewSupply (iterate f x) listGenSym
> but we don't need to use a list at all, it could be like:
> newSupply x f   = genericNewSupply x (\a -> (f a, a))
> (with atomicallyModifyIORef added as appropriate depending on your
> refactorings)
>
>
> P.S. more code cleanup, if you didn't notice it:
>
> 0.4:
>>
>> -- XXX: Is the atomic necessary?
>> import Data.IORef(IORef,newIORef,atomicModifyIORef)
>
> yes the atomic is necessary, so you can update the comment to say why?
> (because multiple threads might be reading from a Supply-structure created
> from the same newSupply run, and we don't want memory corruption etc.)
>
>> import System.IO.Unsafe(unsafePerformIO,unsafeInterleaveIO)
>>
>> #if __GLASGOW_HASKELL__ >= 608
>> import GHC.IOBase(unsafeDupableInterleaveIO,unsafeDupablePerformIO)
>> #else
>> ...
>> unsafeDupablePerformIO :: IO a -> a
>> unsafeDupablePerformIO = unsafePerformIO
>
> various "PerformIO" variants to delete (I'm absurdly pleased that we
> implemented this without needing any)
>
> -Isaac
>


More information about the Glasgow-haskell-users mailing list