[Haskell-cafe] Combining ST with STM

Tue Feb 9 12:43:04 UTC 2016

Jonas,

On 9 February 2016 at 14:43, Thomas Koster <tkoster at gmail.com> wrote:
> I have an STM transaction that needs some private, temporary state.
> The most obvious way is to simply pass pure state as arguments, but
> for efficiency, I would like this state to be some kind of mutable
> array, like STArray.

On 9 February 2016 at 19:36, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> This sounds like optimizing before you know what is really slow,
> complicating your code for no good reason...

From my wording it sounded like I was "only thinking about it". This is
not actually the case, sorry. Very early versions of my program did use
a plain, immutable array (Vector actually), but now it uses STArray. The
benefits to my program of ST are significant and proven by benchmarks.
Switching to ST did not significantly complicate the program.

By going back to immutable arrays and all that excessive copying and GC,
I could easily wipe out the small benefits of any extra parallelism I
might squeeze out of STM. If it turns out that what I want is
impossible, I will keep ST, drop STM, and say that my program is single-
threaded.

What *is* unknown is the cost of replacing the STArrays with TArrays as
proposed below.
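
For illustration only, here is a minimal sketch of what private scratch
state in a TArray could look like. The function histogramT and the
bucket layout are hypothetical and are not taken from the program under
discussion. A TArray is an array of TVars, so every element read and
write below goes through the ordinary TVar machinery even though the
array never leaves the transaction; that overhead is the unknown cost.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import Control.Concurrent.STM.TArray (TArray)
import Control.Monad (forM_)
import Data.Array.MArray (getElems, newArray, readArray, writeArray)

-- | Hypothetical example: count the shared list into buckets 0..9,
-- using a TArray as scratch space, then publish the counts.
histogramT :: TVar [Int] -> TVar [Int] -> STM ()
histogramT input output = do
  xs <- readTVar input
  -- The scratch array is created inside the transaction and never escapes.
  scratch <- newArray (0, 9) 0 :: STM (TArray Int Int)
  forM_ xs $ \x -> do
    let i = x `mod` 10
    n <- readArray scratch i      -- a TVar read
    writeArray scratch i (n + 1)  -- a TVar write
  counts <- getElems scratch
  writeTVar output counts

main :: IO ()
main = do
  input <- newTVarIO [1, 11, 2, 23, 3]
  output <- newTVarIO []
  atomically (histogramT input output)
  readTVarIO output >>= print  -- [0,2,1,2,0,0,0,0,0,0]
```

Whether this is measurably slower than an STArray can only be settled by
benchmarking the real workload.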

On 9 February 2016 at 14:43, Thomas Koster <tkoster at gmail.com> wrote:
> I know, STM has TVars and TArray, but since this state is private to
> the transaction, I am wondering if using TVars/TArrays for private
> state might be overkill that will unnecessarily slow down the STM
> commit process. The private state is, by definition, not shared, so
> including it in the STM log and commit process is, as far as I can
> tell, pointless.

On 9 February 2016 at 19:36, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> The STM log allows you to revert a failed transaction. If you do not
> record your writes to an array, you can not revert them and they can
> leak outside an aborted transaction.

The array is not shared between transactions or threads; it is local to
the transaction and is not referenced outside the transaction that
created it. Sorry if this was not clear. Like ST state, it is created
inside the transaction, invisible outside the transaction, and discarded
whenever the transaction commits or retries. There is no need to revert
it, ever. The closest thing to reverting it is to have the GC reclaim
it.
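
The simplest case of such transaction-local state can be sketched with
plain runST, assuming the mutable work fits between the STM reads and
the STM writes. The names histogram and histogramTx are hypothetical.
The array is allocated inside the ST region, so a retry just re-runs the
computation on a fresh array and the old one is left for the GC.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- | Pure function backed by a private mutable array. The type of
-- runSTUArray guarantees that the mutable array cannot leak.
histogram :: [Int] -> UArray Int Int
histogram xs = runSTUArray $ do
  scratch <- newArray (0, 9) 0
  forM_ xs $ \x -> do
    let i = x `mod` 10
    n <- readArray scratch i
    writeArray scratch i (n + 1)
  return scratch

-- | Read shared state, do the mutable work privately, write the result.
-- Nothing from the ST region is recorded in the STM log.
histogramTx :: TVar [Int] -> TVar [Int] -> STM ()
histogramTx input output = do
  xs <- readTVar input
  writeTVar output (elems (histogram xs))

main :: IO ()
main = do
  input <- newTVarIO [1, 11, 2, 23, 3]
  output <- newTVarIO []
  atomically (histogramTx input output)
  readTVarIO output >>= print  -- [0,2,1,2,0,0,0,0,0,0]
```

This does not cover the interleaved case, where TVar reads and writes
must happen in the middle of the ST computation; a single runST region
cannot perform STM actions.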

On 9 February 2016 at 14:43, Thomas Koster <tkoster at gmail.com> wrote:
> ST and STArray still appear to be the most appropriate tools for the
> private state, because STRefs and STArrays really, really are private.
>
> So this basically means I want to interleave ST and STM in a "safe"
> way. That is, if the STM transaction retries, I want the ST state to
> be vaporised as well.

On 9 February 2016 at 19:36, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> So how should this work? Some log recording what you did would be good,
> so the runtime knows which changes you did to the array... If you
> however create the array in the same transaction, this would work by
> just throwing away the whole array.

The array is indeed created in the same transaction, for that transaction.
Throwing away the whole array is exactly what *must* occur, just as it
does with ordinary ST. Otherwise, as you say, the invariants of STM are
violated.

On 9 February 2016 at 14:43, Thomas Koster <tkoster at gmail.com> wrote:
> Ideally, I would love to be able to say something like this:
>
> -- | Copy the value from the shared TVar into the private STRef.
> load :: TVar a -> STRef a -> STSTM s ()
> load shared private = do
>   value <- liftSTM (readTVar shared)
>   liftST (writeSTRef private value)
>
> Naturally, that STRef must originate from a call to newSTRef earlier
> in the same transaction and is private to it, just like the real ST
> monad. As far as I can tell, I am not trying to weaken either ST or
> STM in any way here.

Please forgive the typo in the type signature of "load", which should
have been:

load :: TVar a -> STRef s a -> STSTM s ()

I will elaborate on this imagined STSTM monad in a separate reply,
shortly.
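
To make the corrected signature concrete, here is one way such a monad
might be sketched. This is an assumption for illustration, not the
implementation promised above: STSTM, liftSTM, liftST and runSTSTM are
hypothetical names, and liftST leans on unsafeIOToSTM and unsafeSTToIO,
so its safety rests entirely on the rank-2 type of runSTSTM keeping
every STRef inside the transaction that created it.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar)
import Control.Monad.ST (ST)
import Control.Monad.ST.Unsafe (unsafeSTToIO)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import GHC.Conc (unsafeIOToSTM)

-- | STM tagged with a phantom ST region. A real module would not
-- export the constructor or the field.
newtype STSTM s a = STSTM { unSTSTM :: STM a }

instance Functor (STSTM s) where
  fmap f (STSTM m) = STSTM (fmap f m)

instance Applicative (STSTM s) where
  pure = STSTM . pure
  STSTM f <*> STSTM x = STSTM (f <*> x)

instance Monad (STSTM s) where
  STSTM m >>= k = STSTM (m >>= unSTSTM . k)

liftSTM :: STM a -> STSTM s a
liftSTM = STSTM

-- | Assumption: running private ST effects as unlogged IO is acceptable
-- because the state they touch is rebuilt from scratch on every retry.
liftST :: ST s a -> STSTM s a
liftST = STSTM . unsafeIOToSTM . unsafeSTToIO

-- | As with runST, the quantified 's' stops references from escaping.
runSTSTM :: (forall s. STSTM s a) -> STM a
runSTSTM m = unSTSTM m

-- | Copy the value from the shared TVar into the private STRef.
load :: TVar a -> STRef s a -> STSTM s ()
load shared private = do
  value <- liftSTM (readTVar shared)
  liftST (writeSTRef private value)

demo :: TVar Int -> STSTM s Int
demo shared = do
  private <- liftST (newSTRef 0)
  load shared private
  liftST (readSTRef private)

main :: IO ()
main = do
  shared <- newTVarIO 42
  result <- atomically (runSTSTM (demo shared))
  print result  -- 42
```

The documented caveats of unsafeIOToSTM still apply: the lifted action
may be abandoned partway through when the transaction is aborted, and
it will run again on every retry.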

> I found the STMonadTrans package on Hackage [1] that claims to
> implement ST as a monad transformer, STT, which sounds close to what I
> want. While its documentation does not mention STM, it does say that
> some monads are unsafe to use as a base monad for STT.
>
> Is STMonadTrans safe to use with STM?

On 9 February 2016 at 19:36, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> It is not even safe to use with Maybe (for now), as it can share
> different STRefs and STArrays. I filed a bug report. After the bug is
> fixed, I see no reason, why it should not work with STM, as the complete
> ST action should be repeated if the STM transaction aborts.

I see. Thank you for evaluating STMonadTrans. I will propose an
implementation of the STSTM monad I mentioned above in a separate reply,
and would be very grateful if you could evaluate the Core of that one
too.

On 9 February 2016 at 17:20, Thomas Koster <tkoster at gmail.com> wrote:
> It seems stm-containers itself uses unsafeFreezeArray from the
> "primitive" package. One difference though is that while my private
> array would be thawed, modified and refrozen regularly, the
> stm-containers WordArray stays immutable (not thawed) once frozen, as
> far as I can tell.
>
> Since I am using only a single array for the entire private state,
> sprinkling some runST calls with unsafeThawArray/unsafeFreezeArray in
> my STM transaction may be enough for my needs, as long as I am
> exceptionally careful not to leak one of these arrays into or out of
> any STM transaction demarcated by the "atomically" block. If anybody
> knows of any reason why I should abort this idea, please speak up.

On 9 February 2016 at 19:36, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> So what happens if you thaw an array, write to it and then abort the
> transaction? You have to revert the writes because they could be visible
> to the transaction you just aborted. When this transaction restarts, the
> array will still contain the values written prior to it. Even if nothing
> else contains a reference to it, your array is garbage after you aborted
> a transaction only once.
>
> Keep in mind that ST is only "safe IO" in a sense such that no side
> effects are visible to the outside. You lose this if you start to modify
> anything which you did not create yourself. I think this is not that
> different from using
> http://hackage.haskell.org/package/base-4.8.2.0/docs/GHC-Conc-Sync.html#v:unsafeIOToSTM.
> To be safe, you should at least copy the arrays instead of unsafely
> thawing them... but then it could be faster just to use TArrays from the
> start.

No argument there, but as before, no revert is necessary because there
are no references to the array outside of the STM transaction that
created it (by abstinence, I concede, not by a ruling from the type
checker as with ST). Having it collected immediately after a retry or
commit is exactly what I want. The sooner the better!
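
For comparison, the copying alternative Jonas suggests can be sketched
with thaw from the array package, which copies the immutable array
before mutating it. The function bump is a hypothetical example. Because
the input is never modified, an aborted transaction cannot leave stale
writes behind; the price is an O(n) copy on every call.

```haskell
import Data.Array.ST (readArray, runSTUArray, thaw, writeArray)
import Data.Array.Unboxed (UArray, elems, listArray)

-- | Increment one slot of an immutable array via a private mutable copy.
bump :: Int -> UArray Int Int -> UArray Int Int
bump i arr = runSTUArray $ do
  marr <- thaw arr              -- copies; 'arr' itself stays untouched
  n <- readArray marr i
  writeArray marr i (n + 1)
  return marr

main :: IO ()
main = do
  let before = listArray (0, 2) [10, 20, 30] :: UArray Int Int
      after = bump 1 before
  print (elems before)  -- [10,20,30]
  print (elems after)   -- [10,21,30]
```

Swapping thaw for unsafeThaw would avoid the copy but may mutate the
original array in place, which is only sound if nothing else can still
observe it.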

--
Thomas Koster