[Haskell-cafe] Haskell memory model (was iterIO-0.1)

Wed May 18 10:56:22 CEST 2011

On 17/05/2011 00:44, dm-list-haskell-cafe at scs.stanford.edu wrote:

>>    But I've never heard anyone claim that a prerequisite to Haskell being
>> useful as a parallel programming language is a well-defined memory
>> model.  I think there's a couple of reasons for that:
>>
>>     - deterministic parallel programming models (e.g. Strategies,
>>       monad-par) don't care about memory models.  These are the
>>       first port of call for parallel programming.
>
> Okay, well, I make this claim as a C/C++ programmer more used to
> writing low-level/kernel code than functional code.  So I'm thinking
> less of things like deterministic scientific codes and more along the
> lines of network servers processing lots of messages and other
> asynchronous events happening in a non-deterministic order anyway.
>
> I think several programming patterns would be useful in Haskell that
> require some understanding of the memory model.  One that particularly
> jumps to mind is the read-copy-update (RCU) pattern for frequently
> accessed but seldom updated data (configuration state, routing tables,
> etc.)
 >
> As you've described them, IORefs are well suited to such a pattern
> because reads are very cheap and updates happen through an atomic
> pointer write.  But if the documentation doesn't say that readIORef is
> supposed to be cheap (or imply so by mentioning that readIORef exposes
> the underlying hardware's memory consistency), then there's no way to
> tell that IORefs are suitable for RCU, so people may think they have
> to do something uglier using peek and poke.

Ok.  I'm not sure how feasible RCU is with IORefs, or even whether it's 
necessary.  After all, the usual pattern of having readers use readIORef 
while writers use atomicModifyIORef gives the RCU cost model (zero 
overhead for readers, expensive writes) with far less complexity. 
Garbage collection does the job of reclamation automatically.  Have I 
missed something here?

A slight improvement over this scheme is to use TVar with readTVarIO for 
the readers (no transaction involved), and transactions for the writers. 
  This greatly increases the scope of what a writer can do, since they 
can perform an update on a bunch of state at the same time.

The situation is complicated somewhat by generational garbage 
collection, which can create weird performance artifacts when mutation 
is involved.

> Actually:
>
> 	http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Concurrent-MVar.html
>
> There's nothing in the documentation for MVars that says anything
> about sequential consistency.

That's true, though I think most readers would assume sequential 
consistency in the absence of any statements to the contrary (obviously 
you are a counter example ;-).

 > If you take my example from the
> previous email, replace writeIORef with (\p v ->  modifyMVar_ p $
> return v), replace all other occurrences of IORef with MVar, nothing
> in the docs suggests you won't see the "critical section" message
> printed twice.

There is an operational semantics in the Concurrent Haskell paper that 
does not admit the behaviour you describe, but I'll add something to the 
docs to that effect.

> Presumably modifyMVar must take a spinlock.  Moreover, to be correct
> on x86, the spinlock must execute an LFENCE after acquiring the lock
> and an SFENCE prior to releasing the lock.  But does readMVar acquire
> the spinlock, or is it optimized to take advantage of pointer-sized
> writes being atomic?

That's a good point - readMVar cannot be optimised to avoid the lock. In 
fact, readMVar is just

   readMVar m = do x <- takeMVar m; putMVar m x; return x

although there have been suggestions that we should make it atomic.  If 
we were to do so, it would still have to use a barrier to avoid reordering.

> Systems have memory models for a reason; you can't get away from them
> entirely for all applications.  Haskell's strength, I think, is in
> making sure that 99+% of code can't possibly depend on the memory
> model.  For functional and ST code, you don't even need to look at the
> code to know that this is true--safety is guaranteed by the types
> (modulo some unsafe stuff I hope will be even easier to detect with
> ghc 7.2...).  But for that last little bit of tricky code, the best
> you can do is specify the behavior of the building blocks and maybe
> provide some useful architecture-independent wrappers for things
> (e.g., abstracted memory barriers).

Agree 100%.

Cheers,
	Simon