Semantics of IORefs in GHC

Mon Mar 14 09:06:58 UTC 2016

Madan

I’m glad you enjoyed the awkward squad paper.

I urge you to write to the Haskell Café mailing list and/or ghc-devs.  Lots of smart people there.  Ryan Newton is working on this kind of stuff; I’ve cc’d him.

But my rough answer would be: IORefs are really only meant for single-threaded work.  Use STM for concurrent communication.
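
A minimal sketch of the "use STM" suggestion above; it is not taken from the original message. It assumes the `stm` package that ships with GHC, and `handoff` and `Pair` are hypothetical names chosen for illustration. One thread publishes a value through a `TVar` and another blocks until it appears.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM (TVar, atomically, newTVarIO, readTVar, retry, writeTVar)

-- Hypothetical payload, used only for illustration.
data Pair = Pair Int Int

-- Publish a value from a forked thread and wait for it in the caller.
-- Both sides go through 'atomically', so the hand-off happens inside
-- STM transactions rather than through a bare mutable cell.
handoff :: IO (Int, Int)
handoff = do
  slot <- newTVarIO Nothing :: IO (TVar (Maybe Pair))
  _ <- forkIO $ atomically $ writeTVar slot (Just (Pair 1 2))
  -- 'retry' blocks this transaction until 'slot' is written.
  Pair a b <- atomically $ readTVar slot >>= maybe retry pure
  pure (a, b)

main :: IO ()
main = handoff >>= print  -- prints (1,2)
```

The reader side needs no polling loop: `retry` suspends the transaction until the `TVar` changes.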

That’s not to say that we are done!  The Haskell community doesn’t have many people like you, who care about the detail of the memory model.  So please do help us :).   For example, perhaps we could guarantee a simple sequential memory model without much additional cost?

Simon

From: Madan Musuvathi
Sent: 11 March 2016 19:35
To: Simon Peyton Jones <simonpj at microsoft.com>
Subject: Semantics of IORefs in GHC

Dear Simon,
I really enjoyed reading your awkward squad paper <http://research.microsoft.com/en-us/um/people/simonpj/papers/marktoberdorf/mark.pdf>. Thank you for writing such an accessible paper.

My current understanding is that the implementation of IORefs in GHC breaks the simple semantics you develop in this paper. In particular, by not inserting sufficient fences around reads and writes of IORefs, a Haskell program is exposed to the weak-memory-consistency effects of the underlying hardware and possibly the backend C compiler. As a result, the monadic bind operator no longer has the simple semantics of sequential composition. Is my understanding correct?

This is very troublesome as this weaker semantics can lead to unforeseen consequences even in pure functional parts of a program. For example, when a reference to an object is passed through an IORef to another thread, the latter thread is not guaranteed to see the updates of the first thread. So, it is quite possible for some (pure functional) code to be processing objects with broken invariants or partially-constructed objects. In the extreme, this could lead to type-unsafety unless the GHC compiler is taking careful precautions to avoid this. (Many of these problems are unlikely to show up on x86 machines, but will be common on ARM.)

I am sure the GHC community is addressing these problems one way or the other. But, my question is WHY?  Why can’t GHC tighten the semantics of IORefs so that the bind operation simply means sequential composition? Given that Haskell has a clean separation between pure functional parts and “awkward” parts of the program, the overheads of these fenced IORefs should be acceptable.
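
As a rough illustration of what a "fenced" publication through an IORef could look like today (a sketch, not part of the original message): `Data.IORef` in base exports `atomicWriteIORef`, which its documentation describes as imposing a reordering barrier. `Msg` and `publishAndRead` are hypothetical names. The reader still uses plain `readIORef`; whether that is enough on weakly ordered hardware is exactly the question raised in this thread.

```haskell
import Control.Concurrent (forkIO, yield)
import Data.IORef (atomicWriteIORef, newIORef, readIORef)

-- Hypothetical message type, used only for illustration.
data Msg = Msg Int Int

-- The writer builds a value and publishes it with 'atomicWriteIORef';
-- the reader polls the IORef until the value shows up, then sums the fields.
publishAndRead :: IO Int
publishAndRead = do
  ref <- newIORef Nothing
  _ <- forkIO $ atomicWriteIORef ref (Just (Msg 20 22))
  let wait = do
        m <- readIORef ref
        case m of
          Just (Msg a b) -> pure (a + b)
          Nothing        -> yield >> wait  -- let the writer run, then retry
  wait

main :: IO ()
main = publishAndRead >>= print  -- prints 42
```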

My coauthors and I wrote a recent SNAPL article <http://research.microsoft.com/apps/pubs/default.aspx?id=252150> about this problem for other (“less-beautiful” :)) imperative languages like C# and Java. I really believe we should support sequential composition in our programming languages.

madan
