[Haskell] Re: [Haskell-cafe] SimonPJ and Tim Harris explain STM - video

Tue Nov 28 10:52:58 EST 2006

Here I restate what you obviously know several times, then take a shot at
answering your final question.

Tim Harris (RESEARCH) wrote:
> Hi,
> 
>> After seeing how close I could come to creating onRetry/retryWith I have a
>> question about the semantics of your idea for retryWith.
>>
>> Normally after a retry the STM block is rolled back and put to sleep and
>> will
>> only be awakened and re-executed if one of the STM variables it had read
>> from is
>> committed to by a different STM block.
>>
>> What about retryWith ?  Will the STM block be put to sleep under the same
>> conditions?  Can the IO action given to retryWith include commits to STM
>> variables?
> 
> Starting with this last question -- yes, the example of an STM GetLine using
retryWith to pull more input into a buffer relies on an atomic block in the IO
action to update the buffer.

That makes such (atomic X) actions which call (retryWith Y) useful, but will
require changing the runtime to do efficiently if the (Y) action does not allow
the (atomic X) to succeed/commit on the next attempt. Imaging that getLine does
not block and returns no input, so (Y) does not feed (X).  Then the next attempt
to do (X) will (retryWith Y) and (Y) still fails to get input, so you have a
"busy wait" where (Y) and (X) alternately execute.  It will require a runtime
change to prevent (X) from being retried until more input appears.

In pseudocode this would be:

X = if tchan is empty then retryWith Y else do work and commit
Y = if input is available then feed tchan else return ()

> 
> There are some interesting questions to think about the semantics of
"retryWith". The semantics of "retry" in the PPoPP paper say nothing about
putting threads to sleep -- it would be correct for an implementation to spin
repeatedly executing an "atomic" block until it completes without calling
"retry".

It would indeed be semantically correct, but if you have many spinning threads
then the program will grind to a halt and few people would be able to employ
STM.  So the spec need not mention the blocking, but it should be possible to
implement it that way.  And GHC does.  And onCommit can be implemented without
changing from blocking to spinning, while retryWith does change from blocking to
spinning (without a runtime change)

> What should happen with "retryWith" -- should we introduce blocking &
wake-up into the semantics, or say that the "retryWith" actions are collected
together, executed in place of the transaction (if it does ultimately retry) and
then the transaction re-attempted?

Since (retryWith y1) `orElse` (retryWith y2) must execute y1 and y2, I used
onRetry to collect such actions: "retryWith io = onRetry io >> retry".  After
running all the onRetry actions it immediately re-attempts the whole atomic
transation without regard for whether any STM variables have been updated, so
this causes spinning.

[ Aside: I probably will modify it so that a lack of onRetry actions will
prevent spinning. ]

> For simplicity (and to leave flexibility to
the implementation) I'd prefer to keep blocking & wake-up out of the semantics.

> Taking that option, suppose we have "atomic { X }" and X retries with IO
action Y. I think I'm saying that that would be equivalent to "Y ; atomic { X
}". What are the consequences of this?

Bad consequences.  In particular, Y must finish before X is re-attempted.  My
implementation does this at the moment, which is bad, but the right solution
requires a runtime change.

> In the GetLine example it means that an
implementation that does put threads to sleep must detect conflicts between the
first execution of X and any transactions performed in Y.

There may also be a distinction for whether
[Existing/retry] If atomic { X } fails with a normal retry then rollback and put
X to sleep until an appropriate STM update

[Existing/fail] If atomic { X } fails with from a conflicting update then
rollback and immediately re-attempt X

[New case/retry] If atomic { X } fails with (retryWith Y) then rollback and put
X to sleep until an appropriate STM update and then schedule (Y) (e.g. with forkIO).

[New case/fail a] If atomic { X } fails with a conflicting update and has a
pending (onRetry Y) then rollback and schedule both Y and a re-attempt of X.

This last case comes from imagining using orElse:
(code calls retryWith Y) `orElse` (code that causes conflicting update)
in which case running Y seems like the obvious thing to do.

> 
> Are there any interesting problems that come up if "Y" performs transactions
that use "retry" or "retryWith"?
> 
> Tim
> 

In many useful cases, such as the getLine example, the Y action will have its
own atomic {} block.  In which case the semantics of when it is allowed to
re-attempt X are what is important.  If you require (Y) to complete before
re-attempting (X) then you get an infinite regression where every (atomic block)
fails with (retryWith (next atomic block)), and nothing is ever re-attempted.
This is why "retryWith Y" meaning rollback X and do "Y >> atomic X" is the wrong
implementation.

If one takes "retryWith Y" to mean rollback X and do "forkIO Y >> atomic X" then
one might spawn many many copies of Y waiting for atomic X to do something novel.

Even if Y does not use retry/retryWith it can fail if it writes a conflicting
update to an STM variable.  Since we want Y to be able to commit updates to STM
variables then Y can implicitly fail and retry.

So the semantics must be what I said above, which implicitly allow X to be
re-attempted as soon as anything performs an STM update without requiring the
retryWith actions to finish first.

One could imagine (atomic X) fails via (retryWith Y).  X is put to sleep and Y
is scheduled to run.  Meanwhile another action Z executes and commits and
updates whatever X is waiting on. So X is awakened and runs and commits.  And
only later does (Y) execute.

-- 
Chris