[Haskell] Re: [Haskell-cafe] SimonPJ and Tim Harris explain STM - video

Thu Nov 23 16:45:55 EST 2006

[sorry for quoting so much, kinda hard to decide here where to snip]

Cale Gibbard wrote:
> On 23/11/06, Jason Dagit <dagit at eecs.oregonstate.edu> wrote:
>> A comment on that video said:
>>
>> ----- BEGIN QUOTE ----
>> It seems to me that  STM creates  new problems with composability.
>> You create two classes of code: atomic methods and non atomic methods.
>>
>> Nonatomic methods can easily call atomic ones ? the compiler could
>> even automatically inject the atomic block if the programmer forgot.
>>
>> Atomic methods and blocks cannot be allowed to call nonatomic code.
>> The nonatomic code could do I/O or other irrevocable things that would
>> be duplicated when the block had to retry.
>> ---- END QUOTE ----
>>
>> I imagine an example like this (some pseudo code for a side effect
>> happy OO language):
>>
>> class Foo {
>>   protected int counter; // assume this gets initialized to 0
>>   public doSomething() {
>>     atomic{
>>       counter++;
>>       Console.Write("called doSomething execution# " + counter);
>>       // something which could cause the transaction to restart
>>     }
>>   }
>>   public doOtherThing() {
>>     atomic{
>>       doSomething();
>>       // something which could cause the transaction to restart
>>     }
>>   }
>> }
>>
>> Now imagine doSomething gets restarted, then we see the console output
>> once each time and counter gets incremented.  So one solution would be
>> to move the side effects (counter++ and the console write) to happen
>> before the atomic block.  This works for doSomething, but now what if
>> we called doOtherThing instead?  We're back to having the extra
>> side-effects from the failed attempts at doSomething, right?  We just
>> lost composability of doSomething?  I'm assuming counter is only meant
>> to be incremented once per successful run of doSomething and we only
>> want to see the output to the log file once per successful run, but it
>> needs to come before the log output inside doSomething so that the log
>> makes sense.
>>
>> I realize STM is not a silver bullet, but it does seem like
>> side-effects do not play nicely with STM.  What is the proposed
>> solution to this?  Am I just missing something simple?  Is the
>> solution to make it so that Console.Write can be rolled back too?
> 
> The solution is to simply not allow side effecting computations in
> transactions. They talk a little about it in the video, but perhaps
> that's not clear. The only side effects an atomic STM transaction may
> have are changes to shared memory.
> 
> Another example in pseudocode:
> 
> atomic
>    x <- launchMissiles
>    if (x < 5) then retry
> 
> This is obviously catastrophic. If launchMissiles has the side effect
> of launching a salvo of missiles, and then the retry occurs, it's
> unlikely that rolling back the transaction is going to be able to put
> them back on the launchpad. Worse yet, if some variable read in
> launchMissiles changes, the transaction would retry, possibly causing
> a second salvo of missiles to be launched.
> 
> So you simply disallow this. The content of a transaction may only
> include reads and writes to shared memory, along with pure
> computations. This is especially easy in Haskell, because one simply
> uses a new monad STM, with no way to lift IO actions into that monad,
> but atomically :: (STM a -> IO a) goes in the other direction, turning
> a transaction into IO. In other languages, you'd want to add some
> static typechecking to ensure that this constraint was enforced.

This is of course the technically correct answer. However, I suspect that it
may not be completely satisfying to the practitioner. What if you want or
even need your output to be atomically tied to a pure software transaction?

One answer is in fact "to make it so that Console.Write can be rolled back
too". To achieve this one can factor the actual output to another task and
inside the transaction merely send the message to a transactional channel
(TChan):

atomic $ do
  increment counter
  counterval <- readvar counter
  sendMsg msgChan ("called doSomething execution# " ++ show counterval)
  -- something which could cause the transaction to restart

Another task regularly takes messages from the channel and actually outputs
them. Of course the output will be somewhat delayed, but the order of
messages will be preserved between tasks sending to the same channel. And
the message will only be sent if and only if the transaction commits.

Unfortunately I can't see how to generalize this to input as well...

Cheers
Ben