[Haskell-cafe] Relaxing atomicity of STM transactions

Wed Sep 29 05:39:00 EDT 2010

Hi Tom,

You wrote that you are interested in the programming experience with relaxed atomicity. What you are asking for are the ideas behind Twilight STM, described in these papers:

> http://proglang.informatik.uni-freiburg.de/projects/syncstm/techreport2010twilight.pdf
(brief summary of the underlying ideas + rationale. Newer version available upon request)

> http://www.cs.uu.nl/research/techreps/repo/CS-2010/2010-020.pdf
(about a nicer API for Twilight using Haskell + formalization. Search in the document for "twilight")

For experimental purposes, we have implementations in C, Java and Haskell. Annette already replied to this thread with a link to our Haskell implementation. 

The main idea is to change (only) the commit protocol of a transaction. The transaction body executes speculatively and without locks, as always: depending on the STM's semantics, it may restart when it reads a value that has been modified since the start of the transaction or, with a multi-versioning STM, it may read data from older memory snapshots. Which semantics you choose here is of no real concern, as long as the transaction reaches its commit point having read data from a consistent memory snapshot. At the commit point, however, we do something special: the programmer can provide code to manually validate the transaction using a special API.

At the commit point, we first put the transaction in an irrevocable state. This ensures that as soon as the programmer's validation code gives an OK, we can publish the outcome of the transaction. We then validate the transaction (without restarting) to discover inconsistencies. Finally, the validation code is executed to inspect these inconsistencies.
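
As a rough sketch of these three steps, here is a toy model. It is not the Twilight API: every name in it (Cell, ReadEntry, Stale, Verdict, commitWith) is made up for illustration, a single global lock stands in for the irrevocable state, and cells hold only Int values.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical model: a cell holds (version, value).
type Cell = IORef (Int, Int)

-- What the transaction body observed when it read a cell.
data ReadEntry = ReadEntry { cell :: Cell, seenVersion :: Int, seenValue :: Int }

-- A read found to be stale at commit time, together with the fresh value.
data Stale = Stale { staleEntry :: ReadEntry, freshValue :: Int }

-- The decision taken by the programmer's validation code.
data Verdict = Publish [(Cell, Int)] | GiveUp

commitWith :: MVar () -> [ReadEntry] -> ([Stale] -> IO Verdict) -> IO Bool
commitWith commitLock readLog validate =
  withMVar commitLock $ \_ -> do
    -- Step 1: holding the lock models "irrevocable": no other commit can interleave.
    -- Step 2: validate without restarting, collecting the inconsistencies.
    stale <- concat <$> mapM check readLog
    -- Step 3: the validation code inspects them and decides.
    verdict <- validate stale
    case verdict of
      GiveUp -> return False
      Publish writes -> do
        forM_ writes $ \(c, v) -> modifyIORef' c (\(ver, _) -> (ver + 1, v))
        return True
  where
    check e = do
      (ver, val) <- readIORef (cell e)
      return [Stale e val | ver /= seenVersion e]

-- Usage: the body read 10 at version 0, then a concurrent commit changed the input.
demo :: IO (Bool, Bool, (Int, Int))
demo = do
  lock <- newMVar ()
  input <- newIORef (0, 10)
  output <- newIORef (0, 0)
  let readLog = [ReadEntry input 0 10]
  modifyIORef' input (const (1, 11))
  let strict stale = return (if null stale then Publish [(output, 20)] else GiveUp)
      lenient _ = return (Publish [(output, 20)])
  a <- commitWith lock readLog strict   -- behaves like an ordinary STM: rejects
  b <- commitWith lock readLog lenient  -- relaxed: publishes despite the stale read
  final <- readIORef output
  return (a, b, final)
```

The only difference between the two commits in demo is the validation function, which is the point: the decision about stale reads moves from the runtime to the programmer.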

To easily inspect these inconsistencies, you can tag all reads in the transactional body with region identifiers. In the validation code, you can then query whether a region has become inconsistent, and what the new values are. These new values are again taken from a consistent memory snapshot, which is important!
In the example that you provided, this would mean that you tag the reference you want to read loosely with a special region. In the validation code, you can then simply ignore its inconsistency or, more likely, inspect the new value to check that it does not differ too much from the value actually read.
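
To make the region idea concrete, here is a small pure sketch. Again, this is only a model under assumptions, not the Twilight API: the names (TaggedRead, Snapshot, changedIn, regionChanged, validate) and the region labels "strict" and "loose" are invented, and inconsistency is detected by comparing values rather than versions to keep it short.

```haskell
-- Hypothetical names throughout; this models the idea only.
type Var = String
type Region = String

-- One logged read: the region it was tagged with and the value seen.
data TaggedRead = TaggedRead { region :: Region, var :: Var, seen :: Int }

-- The consistent snapshot taken at commit time.
type Snapshot = [(Var, Int)]

-- Reads in a region whose value differs in the snapshot, paired with the new value.
changedIn :: Snapshot -> [TaggedRead] -> Region -> [(TaggedRead, Int)]
changedIn snap readLog r =
  [ (rd, new)
  | rd <- readLog
  , region rd == r
  , Just new <- [lookup (var rd) snap]
  , new /= seen rd
  ]

-- Has anything read under this region changed?
regionChanged :: Snapshot -> [TaggedRead] -> Region -> Bool
regionChanged snap readLog = not . null . changedIn snap readLog

-- Validation for the loose-read case: the "strict" region must be untouched,
-- while the "loose" region may drift by at most the given tolerance.
validate :: Int -> Snapshot -> [TaggedRead] -> Bool
validate tolerance snap readLog =
  not (regionChanged snap readLog "strict")
    && all withinTolerance (changedIn snap readLog "loose")
  where
    withinTolerance (rd, new) = abs (new - seen rd) <= tolerance

-- Example read log: one strictly read limit and one loosely read sensor.
exampleLog :: [TaggedRead]
exampleLog = [TaggedRead "strict" "limit" 100, TaggedRead "loose" "sensor" 20]
```

With exampleLog and a tolerance of 2, a snapshot in which only the sensor moved from 20 to 21 is accepted, while one in which it jumped to 25, or in which the limit changed at all, is rejected.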

Additionally, since the transaction is irrevocable, it is possible to safely do I/O in the validation code, and you can change the outcome of the transaction slightly based on the new values inspected, i.e. repair the inconsistencies. These ideas provide a more fundamental way to deal with relaxed atomicity than just reading "loosely" from a variable. With a loose read alone, you have no clue how outdated the values you read are, for example.
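
A repair step could look roughly like the following. This is a hypothetical illustration (Pending and repair are invented names, not part of any Twilight implementation): the body has prepared a write derived from a value it read, and the validation code redoes only that last cheap step on the fresh value instead of restarting the whole transaction.

```haskell
-- The outcome prepared by the transaction body: the value it read
-- and the write it derived from that value.
data Pending = Pending { readValue :: Int, plannedWrite :: Int }
  deriving (Eq, Show)

-- If the input changed, recompute the write from the fresh value;
-- otherwise keep the prepared outcome unchanged.
repair :: (Int -> Int) -> Int -> Pending -> Pending
repair step fresh p
  | fresh == readValue p = p
  | otherwise = Pending fresh (step fresh)

-- For an increment: the body read 5 and planned to write 6, but the
-- variable holds 7 at commit time, so the repaired write becomes 8.
repairedIncr :: Pending
repairedIncr = repair (+ 1) 7 (Pending 5 6)
```

Unlike a loose read, which would publish 6 and silently overwrite the concurrent updates, the repaired transaction knows exactly how stale its input was and can account for it.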

Hope this is useful to you,
Arie

> Thanks for the responses, but I think I should explain a bit more.
> I'm not interested in being able to read the live value of a TVar at
> any arbitrary time (via unsafeIOToSTM).  But rather I would like
> looslyReadTVar to have exactly the same semantics as readTVar, except
> that the STM runtime would not reject the transaction if the TVar is
> modified by another transaction before the atomic commit takes place.
> 
> Also, as I would be implementing something similar in Atom, I'm not
> necessarily interested in a Haskell implementation, but rather if the
> programming experience is elevated by these alternative semantics.
> 
> For example:
> 
> incr :: TVar Int -> STM ()
> incr a = looslyReadTVar a >>= writeTVar a . (+ 1)
> 
> decr :: TVar Int -> STM ()
> decr a = readTVar a >>= writeTVar a . subtract 1
> 
> If incr and decr were atomically started at the same time with the
> same TVar, decr would be rejected if incr completed first, but not the
> other way around.  The initial reaction may be that this seriously
> breaks the atomicity of STM, but there may be cases where this could
> be useful.  For instance, it allows computationally expensive
> transactions to complete, even if their inputs are constantly being
> modified.  In the embedded domain, this could be a fault monitor that
> reads a bunch of constantly changing sensors.