[Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
Alexander Kjeldaas
alexander.kjeldaas at gmail.com
Tue May 17 22:49:27 CEST 2011
On 16 May 2011 21:31, <dm-list-haskell-cafe at scs.stanford.edu> wrote:
> At Mon, 16 May 2011 10:56:02 +0100,
> Simon Marlow wrote:
> >
> > Yes, it's not actually documented as far as I know, and we should fix
> > that. But if you think about it, sequential consistency is really the
> > only sensible policy: suppose one processor creates a heap object and
> > writes a reference to it in the IORef, then another processor reads the
> > IORef. The writes that created the heap object must be visible to the
> > second processor, otherwise it will encounter uninitialised memory and
> > crash. So sequential consistency is necessary to ensure concurrent
> > programs can't crash.
> >
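To make that concrete, a minimal sketch of the publication pattern Simon
describes: one thread allocates a value and publishes a reference to it
through an IORef, and another thread follows whatever reference it reads.
If the stores that initialised the value were not visible to the reader by
the time the reference is, the reader would chase a pointer into
uninitialised memory; that is the crash being ruled out.

import Control.Concurrent (forkIO, threadDelay)
import Data.IORef

main :: IO ()
main = do
  ref <- newIORef Nothing
  -- Publisher: build a heap object, then write a reference to it.
  _ <- forkIO $ writeIORef ref (Just (3 :: Int, 4 :: Int))
  -- Consumer: if it sees the reference, it must also be able to see the
  -- object's contents, otherwise (+) would be applied to garbage.
  _ <- forkIO $ do
         mv <- readIORef ref
         case mv of
           Just (a, b) -> print (a + b)
           Nothing     -> return ()
  threadDelay 100000
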
> > Now perhaps it's possible to have a relaxed memory model that provides
> > the no-crashes guarantee but still allows IORef writes to be reordered
> > (e.g. some kind of causal consistency). That might be important if
> > there is some processor architecture that provides that memory model, but
> > as far as I know there isn't.
>
> Actually, in your heap object example, it sounds like you only really
> care about preserving program order, rather than write atomicity.
> Thus, you can get away with less-than-sequential consistency and not
> crash.
>
> The x86 is an example of a relaxed memory model that provides the
> no-crashes guarantee you are talking about. Specifically, the x86
> deviates from sequential consistency in two ways:
>
> 1. A load can finish before an earlier store to a different memory
> location. [intel, Sec. 8.2.3.4]
>
> 2. A thread can read its own writes early. [intel, 8.2.3.5]
>
> [Section references are to the intel architecture manual, vol 3a:
> http://www.intel.com/Assets/PDF/manual/253668.pdf]
>
> One could imagine an implementation of IORefs that relies on the fact
> that pointer writes are atomic and that program order is preserved to
> avoid mutex overhead for most calls. E.g.:
>
> struct IORef {
>     spinlock_t lock;    /* Only ever used by atomicModifyIORef */
>     HaskellValue *val;  /* Updated atomically because pointer-sized
>                          * writes are atomic */
> };
>
> HaskellValue *
> readIORef (struct IORef *ref)
> {
>     return ref->val;
> }
>
> void
> writeIORef (struct IORef *ref, HaskellValue *val)
> {
>     /* Note that if *val was initialized in the same thread, then by
>      * the time another CPU sees ref->val, it will also see the
>      * correct contents of *ref->val, because stores are seen in a
>      * consistent order by other processors [intel, Sec. 8.2.3.7].
>      *
>      * If *val was initialized in a different thread, then since this
>      * thread has seen it, other threads will too, because x86
>      * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
>      */
>     ref->val = val;
> }
>
> /* modifyIORef is built out of readIORef and writeIORef */
>
> HaskellValue *
> atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
> {
>     HaskellValue *result;
>     spinlock_acquire (&ref->lock);
>
>     result = modifyIORef (ref, f);
>
>     spinlock_release (&ref->lock);
>     return result;
> }
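The pseudocode leaves modifyIORef undefined. In Haskell terms, the plain,
non-atomic read-modify-write that the spinlock serialises could be sketched
roughly as follows (nonAtomicModify is a hypothetical name; the
pair-returning shape mirrors atomicModifyIORef's argument type):

import Data.IORef

-- Non-atomic read-modify-write: read the old value, apply f, write the
-- new value back, and return f's second component.  Nothing here
-- synchronises, which is why atomicModifyIORef wraps it in a spinlock.
nonAtomicModify :: IORef a -> (a -> (a, b)) -> IO b
nonAtomicModify ref f = do
  old <- readIORef ref
  let (new, result) = f old
  writeIORef ref new
  return result
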
>
> This is actually how I assumed IORefs worked. But then consider the
> following program:
>
> import Control.Concurrent (forkOS, threadDelay)
> import Control.Monad (unless)
> import Data.IORef
>
> maybePrint :: IORef Bool -> IORef Bool -> IO ()
> maybePrint myRef yourRef = do
>   writeIORef myRef True
>   yourVal <- readIORef yourRef
>   unless yourVal $ putStrLn "critical section"
>
> main :: IO ()
> main = do
>   r1 <- newIORef False
>   r2 <- newIORef False
>   forkOS $ maybePrint r1 r2
>   forkOS $ maybePrint r2 r1
>   threadDelay 1000000
>
> Under sequential consistency, the string "critical section" should be
> output at most once. However, with the above IORef implementation on
> x86, since a read can finish before a write to a different location,
> both threads might see False for the value of yourVal.
>
> To prevent this deviation from sequential consistency, you would need
> to do something like stick an MFENCE instruction at the end of
> writeIORef, and that would slow down the common case where you don't
> care about sequential consistency. In fact, I would argue that if you
> care about S.C., you should either be using atomicModifyIORef or
> MVars.
>
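Following that suggestion, here is a sketch of the same exchange with the
write routed through atomicModifyIORef. Assuming atomicModifyIORef is
implemented with a full-barrier read-modify-write (for example a locked
compare-and-swap), the later read of yourRef cannot be reordered before the
store of True, so at most one thread should print:

import Control.Concurrent (forkOS, threadDelay)
import Control.Monad (unless)
import Data.IORef

maybePrintSC :: IORef Bool -> IORef Bool -> IO ()
maybePrintSC myRef yourRef = do
  -- Under the assumption above, the store of True happens inside a locked
  -- read-modify-write, which a later load cannot pass.
  atomicModifyIORef myRef (\_ -> (True, ()))
  yourVal <- readIORef yourRef
  unless yourVal $ putStrLn "critical section"

main :: IO ()
main = do
  r1 <- newIORef False
  r2 <- newIORef False
  _ <- forkOS $ maybePrintSC r1 r2
  _ <- forkOS $ maybePrintSC r2 r1
  threadDelay 1000000

An MVar-based version would give the same guarantee by construction, at the
cost of paying for synchronisation on every access.
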
MFENCE is apparently slower than a locked add; see
http://blogs.oracle.com/dave/entry/instruction_selection_for_volatile_fences
So an MFENCE in writeIORef would make it slower than atomicModifyIORef while
still giving weaker guarantees. Not a good combination.
Alexander
> Can you explain what actually happens inside the real IORef
> implementation?
>
> As an aside, these days one sees a lot of hand-wringing over the fact
> that CPU clock rates have been flat for a while and the only way to
> get more performance is through parallelism. "How are we going to
> teach programmers to write concurrent code when it's so hard to write
> and debug?" I've heard numerous people ask.
>
> Haskell could be a major step in the right direction, since in the
> absence of mutable variables it's impossible to have data races. (You
> can still have deadlock and other kinds of race condition, such as the
> one in maybePrint above with my sketched IORef implementation, but data
> races are by far the most pernicious concurrency problems.) Of course, the
> key to making Haskell useful in a parallel setting is that things like
> the memory model have to be fully specified...
>
> Thanks,
> David
>