MVar semantics: proposal

Thu Mar 30 18:50:06 EST 2006

Based on discussion, I'd like to offer up this concrete proposal when it
comes to MVars, IORefs, and what we can guarentee when it comes to their
interaction with concurrency.

My wording can use some cleaning up but I want to get this out there
while it is fresh in mind  and before I forget the subtleties that led
me to this formulation.

its primary goals are
 * let us reason sanely about and provide the tools needed to write concurrent
   programs.
 * not break SMP scalability. programs we write now should work on
   systems of the future.

some definitions

action - a memory read/write either MVar or IORef that is theoretically
observable by a different thread, not necessarily atomic or ordered.

sequence point - exactly and only the actions putMVar and takeMVar

in a given thread, for some sequence of actions before a sequence point,
and some sequence of actions after it. another particular thread may
observe those actions in an arbitrary order, at an arbitrary time in the
future, except actions may not cross the sequence point boundry.

for example

thread 1 :  A B SP C
thread 2 may observe A B C  or B A C, but not A C B. 

another formulation would be that if any action after the sequence point
is observable, then all actions before it are guarenteed to be
observable in the current thread.

putMVar and takeMVar are atomic.

this is the _one and only_ guarentee when it comes to different threads
observing the same memory locations. in particular, IORefs cannot safely
be used except in a single-threaded manner.

This is a _pairwise guarentee_ between specific threads, for
threads 1,2,and 3 where thread 1 is writing to memory. threads 2 and 3
may observe completly different orderings but each is independently
subject to the sequence point constraint. You cannot reason about the
observable behavior of a group of threads except insofar as what you can
infer by applying this rule pairwise to the members of the group. 

sequence points are _not_ atomic global events. for the case above

thread 1: A B SP C

thread 2 might see A B C  before thread 3 even sees A. however both are
guarenteed to see A and B before they see C. just because one thread
reaches and passes a sequence point, it does not mean any other thread
has done so. 

how this relates to IORefs:

IORefs do not introduce sequence points nor are operations on them
guarenteed to be atomic. given the sequence point constraint above, the
following rule is needed for safety

IORefs must be used in a single threaded manner, when IORefs are
protected by an MVar, the MVar _must_ be locked by the same thread that
is using the IORefs. so no locking an MVar and then telling another
thread to go ahead and modify some IORefs as the other thread may not
have even observed the 'locking' sequence point yet.

I believe this lets us sanely reason about concurrent accesses to memory
and won't hamper SMP scalable implementations of haskell.

notes:

We should drop atomicModifyIORef since we have MVars, for architectures
with only a test and set instruction and no atomic exchange, supporting
atomicModifyIORef would entail the same overhead as MVars.
atomicModifyIORef also cannot (easily) be implemented on implementations
that use update-in-place rather than indirections for thunk updates.

IORefs are not even guarenteed to be atomic. so, if thread 1 writes A
then B to an IORef, thread 2 is not even guarenteed to see A or B when
it reads it. it may see an intermediate value or even crash the system.
if however thread 1 creates a sequence point after writing to the IORef
and thread 2 waits for that sequence point to be visible (by say waiting
on an MVar) then it may safely read the IORef.

if you have not guessed, sequence points correspond exactly to where
memory barriers need to be placed in the instruction stream or the much
more expensive bus locks on architectures that don't support true memory
barriers (cough. x86.). Using the weaker memory barrier semantics for
sequence points is a concious decision as it is vital vital for
reasonable SMP scaling and global atomic sequence points can be
simulated when needed with MVars between an arbitrary number of threads
by pairwise sequenceing them. (perhaps we can provide some library
routines to make this easy if it turns out to be a common thing people
want?)

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈