[Haskell] Re: [Haskell-cafe] SimonPJ and Tim Harris explain STM - video

Thu Nov 23 17:17:49 EST 2006

(Dropping Haskell at haskell.org)

Hi,

We've not yet looked at I/O in detail in Haskell, but there's a paper from a few years back where I experimented with ways of integrating I/O with an earlier implementation of atomic blocks in Java.

http://research.microsoft.com/~tharris/papers/2005-scp.pdf

The basic idea is to provide a way for a transaction to call into transaction-aware libraries.  The libraries can register callbacks that run if the transaction commits (to actually perform any "O") and callbacks that run if it aborts (to re-buffer any "I" that the transaction has consumed).  In addition, a library providing access to another transactional abstraction (e.g. a database supporting transactions) can perform a two-phase commit, so that the memory transaction and the database transaction either both commit or both abort.
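
For a rough feel of the two kinds of callback in Haskell terms, here is a minimal sketch. It is not the Java API from the paper: `atomicallyThen`, `takeInput` and `demo` are hypothetical names, the input buffer is assumed to be a plain TVar holding a list, and an IORef stands in for the console. Output is deferred until after the commit, and input consumed by a branch that is abandoned is put back by the ordinary STM rollback.

```haskell
import Control.Concurrent.STM
import Data.IORef

-- | Hypothetical helper: the transaction returns its result together with
-- the IO actions that should run only if it commits.
atomicallyThen :: STM (a, [IO ()]) -> IO a
atomicallyThen tx = do
  (result, onCommit) <- atomically tx
  sequence_ onCommit          -- runs once, after the commit
  pure result

-- | Take one buffered input item, or retry if the buffer is empty.
-- If the enclosing branch is abandoned, the write to the buffer is rolled
-- back, so the item is effectively "re-buffered".
takeInput :: TVar [String] -> STM String
takeInput buf = do
  items <- readTVar buf
  case items of
    []       -> retry
    (x : xs) -> writeTVar buf xs >> pure x

-- | Try to consume the command "go"; give up without side effects otherwise.
-- Returns (result, remaining input, output actually produced).
demo :: [String] -> IO (Maybe String, [String], [String])
demo input = do
  buf <- newTVarIO input
  out <- newIORef []          -- stands in for the console
  let emit s = modifyIORef out (s :)
      tryGo = do
        x <- takeInput buf
        check (x == "go")     -- abandons this branch unless x is "go"
        pure (Just x, [emit ("accepted " ++ x)])
  r <- atomicallyThen (tryGo `orElse` pure (Nothing, []))
  left <- readTVarIO buf
  printed <- readIORef out
  pure (r, left, reverse printed)

main :: IO ()
main = do
  demo ["stop", "go"] >>= print   -- (Nothing,["stop","go"],[])
  demo ["go", "x"]    >>= print   -- (Just "go",["x"],["accepted go"])
```

In the first call the branch consumes "stop" and is then abandoned, so the buffer is left untouched and nothing is output. In the second call the branch commits, so the input stays consumed and the deferred output runs exactly once.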

Of course, these solutions don't deal with the question of atomic blocks that want to perform output (e.g. to the console) and receive input in response to that.  My view at the moment is _that does not make sense in an atomic block_ -- the output and input can't be performed atomically because the intervening state must be visible for the user to respond to.

We also briefly experimented with extending the SXM system Maurice Herlihy worked on at MSR Cambridge to support transactions that include accesses to the file system and registry -- http://msdn2.microsoft.com/en-us/library/aa366295.aspx describes the TxF and TxR systems it was built over.

Some other interesting work in this area is Eliot Moss's papers on "open nested" transactions -- these provide another building block at the same level as the Java system I mentioned: library writers can use them with care to extend the range of things that can be done inside an atomic block.

Cheers,

Tim

-----Original Message-----
From: haskell-bounces at haskell.org [mailto:haskell-bounces at haskell.org] On Behalf Of Benjamin Franksen
Sent: 24 November 2006 03:16
To: haskell at haskell.org
Cc: haskell-cafe at haskell.org
Subject: [Haskell] Re: [Haskell-cafe] SimonPJ and Tim Harris explain STM - video

[sorry for quoting so much, kinda hard to decide here where to snip]

Cale Gibbard wrote:
> On 23/11/06, Jason Dagit <dagit at eecs.oregonstate.edu> wrote:
>> A comment on that video said:
>>
>> ----- BEGIN QUOTE ----
>> It seems to me that STM creates new problems with composability.
>> You create two classes of code: atomic methods and non atomic methods.
>>
>> Nonatomic methods can easily call atomic ones -- the compiler could
>> even automatically inject the atomic block if the programmer forgot.
>>
>> Atomic methods and blocks cannot be allowed to call nonatomic code.
>> The nonatomic code could do I/O or other irrevocable things that would
>> be duplicated when the block had to retry.
>> ---- END QUOTE ----
>>
>> I imagine an example like this (some pseudo code for a side effect
>> happy OO language):
>>
>> class Foo {
>>   protected int counter; // assume this gets initialized to 0
>>   public doSomething() {
>>     atomic{
>>       counter++;
>>       Console.Write("called doSomething execution# " + counter);
>>       // something which could cause the transaction to restart
>>     }
>>   }
>>   public doOtherThing() {
>>     atomic{
>>       doSomething();
>>       // something which could cause the transaction to restart
>>     }
>>   }
>> }
>>
>> Now imagine doSomething gets restarted, then we see the console output
>> once each time and counter gets incremented.  So one solution would be
>> to move the side effects (counter++ and the console write) to happen
>> before the atomic block.  This works for doSomething, but now what if
>> we called doOtherThing instead?  We're back to having the extra
>> side-effects from the failed attempts at doSomething, right?  We just
>> lost composability of doSomething?  I'm assuming counter is only meant
>> to be incremented once per successful run of doSomething and we only
>> want to see the output to the log file once per successful run, but it
>> needs to come before the log output inside doSomething so that the log
>> makes sense.
>>
>> I realize STM is not a silver bullet, but it does seem like
>> side-effects do not play nicely with STM.  What is the proposed
>> solution to this?  Am I just missing something simple?  Is the
>> solution to make it so that Console.Write can be rolled back too?
>
> The solution is to simply not allow side effecting computations in
> transactions. They talk a little about it in the video, but perhaps
> that's not clear. The only side effects an atomic STM transaction may
> have are changes to shared memory.
>
> Another example in pseudocode:
>
> atomic
>    x <- launchMissiles
>    if (x < 5) then retry
>
> This is obviously catastrophic. If launchMissiles has the side effect
> of launching a salvo of missiles, and then the retry occurs, it's
> unlikely that rolling back the transaction is going to be able to put
> them back on the launchpad. Worse yet, if some variable read in
> launchMissiles changes, the transaction would retry, possibly causing
> a second salvo of missiles to be launched.
>
> So you simply disallow this. The content of a transaction may only
> include reads and writes to shared memory, along with pure
> computations. This is especially easy in Haskell, because one simply
> uses a new monad STM, with no way to lift IO actions into that monad,
> but atomically :: (STM a -> IO a) goes in the other direction, turning
> a transaction into IO. In other languages, you'd want to add some
> static typechecking to ensure that this constraint was enforced.

This is of course the technically correct answer. However, I suspect that it
may not be completely satisfying to the practitioner. What if you want or
even need your output to be atomically tied to a pure software transaction?

One answer is in fact "to make it so that Console.Write can be rolled back
too". To achieve this one can factor the actual output to another task and
inside the transaction merely send the message to a transactional channel
(TChan):

atomic $ do
  increment counter
  counterval <- readvar counter
  sendMsg msgChan ("called doSomething execution# " ++ show counterval)
  -- something which could cause the transaction to restart

Another task regularly takes messages from the channel and actually outputs
them. Of course the output will be somewhat delayed, but the order of
messages will be preserved between tasks sending to the same channel. And
the message will be sent if and only if the transaction commits.
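
A runnable version of the pseudocode above might look like the following sketch. It assumes the `stm` package; `doSomething` and `doOtherThing` mirror the names in Jason's example, while `runDemo` and the fixed message count `n` are made up here so that the output task knows when to stop.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
import Control.Monad (replicateM, replicateM_)

-- | Bump the counter and queue the log message. Nothing is printed here;
-- the message only becomes visible in the channel if the transaction commits.
doSomething :: TVar Int -> TChan String -> STM ()
doSomething counter msgChan = do
  modifyTVar' counter (+ 1)
  n <- readTVar counter
  writeTChan msgChan ("called doSomething execution# " ++ show n)

-- | Composes with doSomething: both run inside one enclosing transaction.
doOtherThing :: TVar Int -> TChan String -> STM ()
doOtherThing counter msgChan = do
  doSomething counter msgChan
  -- further transactional work could go here

-- | Run n transactions and return what the output task printed, in order.
runDemo :: Int -> IO [String]
runDemo n = do
  counter <- newTVarIO 0
  msgChan <- newTChanIO
  done    <- newEmptyMVar
  -- Output task: the only thread that touches the console.
  _ <- forkIO $ do
    msgs <- replicateM n $ do
      m <- atomically (readTChan msgChan)
      putStrLn m
      pure m
    putMVar done msgs
  replicateM_ n (atomically (doOtherThing counter msgChan))
  takeMVar done

main :: IO ()
main = do
  _ <- runDemo 3
  pure ()
```

Because `doSomething` has type `STM ()`, calling it from `doOtherThing` does not duplicate the message when the outer transaction is re-run: the channel write is rolled back along with the counter update.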

Unfortunately I can't see how to generalize this to input as well...

Cheers
Ben
