[Haskell-cafe] STM Finalizers

David Turner dct25-561bs at mythic-beasts.com
Fri Jul 31 08:14:33 UTC 2015


Hi Michael,

Good work - what a lovely topic for your master's thesis! There's something
a little bit odd about the real world being able to access the result of a
transaction that hasn't yet (and may still yet not be) committed, but your
arguments for why this is a good thing are compelling. It is also a bit odd
that a STM transaction nested running within the finalizer won't see the
results of the previous transaction as it's not yet committed. Not too
surprising once you read your justification, but surprising beforehand.

A finalizer which has non-atomic real-world effects needs to be quite
careful about undoing those effects when exceptions are thrown. For
instance, looking at your example code[1]:

------------------------------------------------------------------------------

durably :: DatabaseHandle d -> TX d a -> IO a
durably h (TX m) = atomicallyWithIO action finalizer
  where
    action                 = runStateT m (database h, [])
    finalizer (a, (_,ops)) = serialize ops h >> return a

serialize :: [Operation d] -> DatabaseHandle d -> IO ()
serialize ops (DatabaseHandle _ h) =
    withMVar h (\h -> forM_ ops $ B.hPut h . runPut . safePut)

------------------------------------------------------------------------------

[1] https://github.com/mcschroeder/social-example/blob/8925056c/tx/TX.hs#L98

If some of those B.hPut calls succeed but then one fails (e.g. the disk is
full) then the transaction will be rolled back, but the on-disk state will
be left partially written. You're in good company with this problem, by the
way - I've known large, expensive, commercial database products fail
horribly and irretrievably in exactly this fashion when encountering a full
disk at the wrong moment!

Even if the finalizer did include exception handling to deal with this
situation, what happens with asynchronous exceptions? Does the finalizer
run with async exceptions masked? I think it needs to: if not, then it
seems it could run past the point where cleanup could occur and then
receive an exception: essentially the finalizer runs to completion and then
the STM transaction rolls back. Async exceptions get a brief mention in
your thesis but I can't see anything about this point there - apologies if
I've missed it.

Cheers,

David


On 30 July 2015 at 12:11, Michael Schröder <mc.schroeder at gmail.com> wrote:

> Hi everyone,
>
> I want to finally announce some work I did as part of my master's thesis
> [1]. It's an extension to GHC's STM implementation. The main idea is to
> introduce a new primitive function
>
>     atomicallyWithIO :: STM a -> (a -> IO b) -> IO b
>
> Like the existing atomically operation, atomicallyWithIO performs an STM
> computation. Additionally, it takes a *finalizer*, which is an arbitrary
> I/O action that can depend on the result of the STM computation, and
> combines it with the transaction in such a way that:
>
>    1. The finalizer is only performed if the STM transaction is
>    guaranteed to commit.
>    2. The STM transaction only commits if the finalizer finishes without
>    raising an exception.
>
>
>
> A motivation of why this is useful:
>
> Say we want to save the results of some STM transaction, i.e. persist them
> to disk. Currently, this is not possible in a transactionally safe way. The
> naive approach would be to first atomically perform some STM computation
> and then independently serialise its result:
>
>     do
>         x <- atomically m
>         serialize x
>
> There are two issues with this. First, another thread could perform a
> second transaction and serialise it before we have finished serialising
> ours, which could result in an inconsistent state between the memory of our
> program and what is stored on disk. Secondly, the function serialize
> might not terminate at all; it could throw an exception or its thread could
> die. Again we would end up with an inconsistent state and possibly data
> loss.
>
> We might try to simply move serialisation into the atomic block, but to do
> this we have to circumvent the types, which should already be a huge red
> flag:
>
>     atomically $ do
>         x <- m
>         unsafeIOToSTM (serialize x)
>
> The problem here is of course that an STM transaction can retry many
> times, which will also result in serialize being executed many times,
> which is probably not something we want. Furthermore, if the thread
> receives an asynchronous exception, the transaction will abort in an
> orderly fashion, while serialize, with its irrevocable side effects,
> cannot be undone.
>
> But with finalizers, the solution is easy:
>
>     atomicallyWithIO m serialize
>
>
> I've written a small example application that uses finalizers and other
> stuff from my thesis to build a lightweight STM database framework: [2]
>
> There are more possible uses cases besides serialisation (such as
> interactive transactions) and some interesting issues surrounding
> finalizers (like nesting of transactions) which are discussed in greater
> detail in my thesis [1], which also includes a formal operational semantics
> of the whole thing.
>
> I have implemented finalizers in my fork of GHC [3], if any of you want to
> play around with them yourself. The atomicallyWithIO function is exported
> from the GHC.Conc.Sync module.
>
>
> Cheers,
> Michael
>
>
> [1] https://github.com/mcschroeder/thesis
> [2] https://github.com/mcschroeder/social-example
> [3] https://github.com/mcschroeder/ghc
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20150731/08bfeffe/attachment-0001.html>


More information about the Haskell-Cafe mailing list