[Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
Alexander Kjeldaas
alexander.kjeldaas at gmail.com
Tue May 17 22:49:27 CEST 2011
On 16 May 2011 21:31, <dm-list-haskell-cafe at scs.stanford.edu> wrote:
> At Mon, 16 May 2011 10:56:02 +0100,
> Simon Marlow wrote:
> >
> > Yes, it's not actually documented as far as I know, and we should fix
> > that. But if you think about it, sequential consistency is really the
> > only sensible policy: suppose one processor creates a heap object and
> > writes a reference to it in the IORef, then another processor reads the
> > IORef. The writes that created the heap object must be visible to the
> > second processor, otherwise it will encounter uninitialised memory and
> > crash. So sequential consistency is necessary to ensure concurrent
> > programs can't crash.
> >
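To make that concrete, a minimal sketch of the publication pattern Simon
describes: one thread allocates a value and publishes a reference to it
through an IORef, and another thread follows whatever reference it reads.
If the stores that initialised the value were not visible to the reader by
the time the reference is, the reader would chase a pointer into
uninitialised memory; that is the crash being ruled out.

import Control.Concurrent (forkIO, threadDelay)
import Data.IORef

main :: IO ()
main = do
  ref <- newIORef Nothing
  -- Publisher: build a heap object, then write a reference to it.
  _ <- forkIO $ writeIORef ref (Just (3 :: Int, 4 :: Int))
  -- Consumer: if it sees the reference, it must also be able to see the
  -- object's contents, otherwise (+) would be applied to garbage.
  _ <- forkIO $ do
         mv <- readIORef ref
         case mv of
           Just (a, b) -> print (a + b)
           Nothing     -> return ()
  threadDelay 100000
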
> > Now perhaps it's possible to have a relaxed memory model that provides
> > the no-crashes guarantee but still allows IORef writes to be reordered
> > (e.g. some kind of causal consistency). That might be important if
> > there is some processor architecture that provides that memory model, but
> > as far as I know there isn't.
>
> Actually, in your heap object example, it sounds like you only really
> care about preserving program order, rather than write atomicity.
> Thus, you can get away with less-than-sequential consistency and not
> crash.
>
> The x86 is an example of a relaxed memory model that provides the
> no-crashes guarantee you are talking about. Specifically, the x86
> deviates from sequential consistency in two ways:
>
> 1. A load can finish before an earlier store to a different memory
> location. [intel, Sec. 8.2.3.4]
>
> 2. A thread can read its own writes early. [intel, 8.2.3.5]
>
> [Section references are to the intel architecture manual, vol 3a:
> http://www.intel.com/Assets/PDF/manual/253668.pdf]
>
> One could imagine an implementation of IORefs that relies on the fact
> that pointer writes are atomic and that program order is preserved to
> avoid mutex overhead for most calls. E.g.:
>
> struct IORef {
>     spinlock_t lock;    /* Only ever used by atomicModifyIORef */
>     HaskellValue *val;  /* Updated atomically because pointer-sized
>                          * writes are atomic */
> };
>
> HaskellValue *
> readIORef (struct IORef *ref)
> {
>     return ref->val;
> }
>
> void
> writeIORef (struct IORef *ref, HaskellValue *val)
> {
>     /* Note that if *val was initialized in the same thread, then by
>      * the time another CPU sees ref->val, it will also see the
>      * correct contents of *ref->val, because stores are seen in a
>      * consistent order by other processors [intel, Sec. 8.2.3.7].
>      *
>      * If *val was initialized in a different thread, then since this
>      * thread has seen it, other threads will too, because x86
>      * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
>      */
>     ref->val = val;
> }
>
> /* modifyIORef is built out of readIORef and writeIORef */
>
> HaskellValue *
> atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
> {
>     HaskellValue *result;
>     spinlock_acquire (&ref->lock);
>
>     result = modifyIORef (ref, f);
>
>     spinlock_release (&ref->lock);
>     return result;
> }
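The pseudocode leaves modifyIORef undefined. In Haskell terms, the plain,
non-atomic read-modify-write that the spinlock serialises could be sketched
roughly as follows (nonAtomicModify is a hypothetical name; the
pair-returning shape mirrors atomicModifyIORef's argument type):

import Data.IORef

-- Non-atomic read-modify-write: read the old value, apply f, write the
-- new value back, and return f's second component.  Nothing here
-- synchronises, which is why atomicModifyIORef wraps it in a spinlock.
nonAtomicModify :: IORef a -> (a -> (a, b)) -> IO b
nonAtomicModify ref f = do
  old <- readIORef ref
  let (new, result) = f old
  writeIORef ref new
  return result
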
>
> This is actually how I assumed IORefs worked. But then consider the
> following program:
>
> import Control.Concurrent (forkOS, threadDelay)
> import Control.Monad (unless)
> import Data.IORef
>
> maybePrint :: IORef Bool -> IORef Bool -> IO ()
> maybePrint myRef yourRef = do
>   writeIORef myRef True
>   yourVal <- readIORef yourRef
>   unless yourVal $ putStrLn "critical section"
>
> main :: IO ()
> main = do
>   r1 <- newIORef False
>   r2 <- newIORef False
>   forkOS $ maybePrint r1 r2
>   forkOS $ maybePrint r2 r1
>   threadDelay 1000000
>
> Under sequential consistency, the string "critical section" should be
> output at most once. However, with the above IORef implementation on
> x86, since a read can finish before a write to a different location,
> both threads might see False for the value of yourVal.
>
> To prevent this deviation from sequential consistency, you would need
> to do something like stick an MFENCE instruction at the end of
> writeIORef, and that would slow down the common case where you don't
> care about sequential consistency. In fact, I would argue that if you
> care about S.C., you should either be using atomicModifyIORef or
> MVars.
>
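Following that suggestion, here is a sketch of the same exchange with the
write routed through atomicModifyIORef. Assuming atomicModifyIORef is
implemented with a full-barrier read-modify-write (for example a locked
compare-and-swap), the later read of yourRef cannot be reordered before the
store of True, so at most one thread should print:

import Control.Concurrent (forkOS, threadDelay)
import Control.Monad (unless)
import Data.IORef

maybePrintSC :: IORef Bool -> IORef Bool -> IO ()
maybePrintSC myRef yourRef = do
  -- Under the assumption above, the store of True happens inside a locked
  -- read-modify-write, which a later load cannot pass.
  atomicModifyIORef myRef (\_ -> (True, ()))
  yourVal <- readIORef yourRef
  unless yourVal $ putStrLn "critical section"

main :: IO ()
main = do
  r1 <- newIORef False
  r2 <- newIORef False
  _ <- forkOS $ maybePrintSC r1 r2
  _ <- forkOS $ maybePrintSC r2 r1
  threadDelay 1000000

An MVar-based version would give the same guarantee by construction, at the
cost of paying for synchronisation on every access.
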
MFENCE is apparently slower than a locked add; see
http://blogs.oracle.com/dave/entry/instruction_selection_for_volatile_fences
So an MFENCE in writeIORef would make it slower than atomicModifyIORef while
still giving weaker guarantees. Not a good combination.
Alexander
> Can you explain what actually happens inside the real IORef
> implementation?
>
> As an aside, these days one sees a lot of hand-wringing over the fact
> that CPU clock rates have been flat for a while and the only way to
> get more performance is through parallelism. "How are we going to
> teach programmers to write concurrent code when it's so hard to write
> and debug?" I've heard numerous people ask.
>
> Haskell could be a major step in the right direction, since in the
> absence of mutable variables it's impossible to have data races. (You
> can still have deadlock and other kinds of race condition, such as the
> one in maybePrint above with my sketched IORef implementation, but data
> races are by far the most pernicious concurrency problems.) Of course, the
> key to making Haskell useful in a parallel setting is that things like
> the memory model have to be fully specified...
>
> Thanks,
> David
>