[Haskell-cafe] ANNOUNCE: btree-concurrent

Johan Brinch brinchj at gmail.com
Wed Oct 31 11:45:15 CET 2012

On Tue, Oct 30, 2012 at 10:09 PM, Alberto G. Corona <agocorona at gmail.com> wrote:
> how the IO and STM is managed?

STM and IO works in cooperation in a safe manner using the error
monad. Something like ErrorT (IO ()) STM, which means that either the
STM action will result in an IO action that needs to run (like reading
a node and storing it in the cache), or it will succeed with the
requested result (after some btree operation).

When the result is an error (with an IO action), the IO action is run
and the STM operation is retried, thus retrying the btree operation
until it has everything it needs and can succeed.

> The serialization- deserialization is
> automatic  or programmer must write the cached blocks?  (i suppose that the
> block reads are automatic on data requests).

The cereal package is used to automatic serialize and deserialize keys
and values. This makes for an easy-to-use API (just dump data in
there), but it also takes up performance, and requires a lot of memory
(pointers everywhere!).

I've considered changing the API to require bytestring keys and values
for simplicity.

> The state in the disk is coherent at every moment?. I mean, if there is a
> program failure, the state in the btree in the file would be coherent?
> For example, if I write two TVars in two different blocks within the same
> STM transaction, do I have the guarantee that either both updated values
> will be in the file storage or none of them  will be?

Well, STM is used internally, but it's not exposed. Thus, simple
operations (lookup, insert, modify) will run atomically and in
isolation, but they are not composable. They cannot be composed with
other STM operations or with each other. Saving to persistent storage
is done buttom-up to ensure integrity in case of a crash (the parent
is stored /after/ the nodes it is referencing).

I don't know whether composing tree-operations is possible somehow by
rewriting the operations, but composing with arbitrary STM operations
likely isn't due to the need for IO actions (if the node isn't in the
cache, it needs to be fetched from the external persistent storage).

Composing non-modifying or idempotent operations are theoretically
composable (like ensuring two keys are both present at the same time),
but non-idempotent operations are more difficult if not impossible due
to retrying.

Johan Brinch

More information about the Haskell-Cafe mailing list