[Haskell-cafe] Persistent Concurrent Data Structures

Alberto G. Corona agocorona at gmail.com
Wed Nov 2 00:47:56 CET 2011


Hi Dmitri,

Take a look at TCache. It is a transactional cache with configurable
persistence.

http://hackage.haskell.org/package/TCache

It defines persistent TVars (DBRefs) with primitives similar to those
of TVars. Persistence can be defined by the user for each datatype
through an instance declaration. The default persistence is in files.
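
To illustrate the idea of a TVar whose persistence is chosen per datatype
by an instance declaration, here is a minimal sketch. It is not TCache's
actual API: the class Persist and the names PRef, openPRef and syncPRef
are hypothetical, and it only assumes the stm package that ships with GHC.
It writes a file named hits.pref in the current directory.

```haskell
module Main where

import Control.Concurrent.STM
import Control.Exception (IOException, evaluate, try)

-- Hypothetical class (not TCache's): how a type is keyed and serialized.
class Persist a where
  keyOf  :: a -> String
  encode :: a -> String
  decode :: String -> a

data Counter = Counter { counterName :: String, counterValue :: Int }
  deriving (Show, Eq)

-- Persistence is chosen per datatype by an instance declaration.
instance Persist Counter where
  keyOf    = counterName
  encode c = show (counterName c, counterValue c)
  decode s = let (n, v) = read s in Counter n v

-- A TVar paired with the file that backs it.
data PRef a = PRef { prefPath :: FilePath, prefVar :: TVar a }

-- Read a file strictly so its handle is closed before any later write.
readStrict :: FilePath -> IO String
readStrict path = do
  s <- readFile path
  _ <- evaluate (length s)
  return s

-- Load the stored value if the file can be read, else use the default.
openPRef :: Persist a => a -> IO (PRef a)
openPRef def = do
  let path = keyOf def ++ ".pref"
  r <- try (readStrict path) :: IO (Either IOException String)
  var <- newTVarIO $ case r of
    Right s | not (null s) -> decode s
    _                      -> def
  return (PRef path var)

-- Write the current in-memory value back to the file.
syncPRef :: Persist a => PRef a -> IO ()
syncPRef ref = readTVarIO (prefVar ref) >>= writeFile (prefPath ref) . encode

-- Increment the counter in STM, persist it, then reload it from disk.
-- Returns the in-memory value and the value read back from the file.
demo :: IO (Int, Int)
demo = do
  ref <- openPRef (Counter "hits" 0)
  atomically (modifyTVar' (prefVar ref) bump)
  inMemory <- counterValue <$> readTVarIO (prefVar ref)
  syncPRef ref
  ref' <- openPRef (Counter "hits" 0)
  reloaded <- counterValue <$> readTVarIO (prefVar ref')
  return (inMemory, reloaded)
  where
    bump c = c { counterValue = counterValue c + 1 }

main :: IO ()
main = demo >>= print
```

Each run increments the counter stored in hits.pref, so the two numbers
printed should match and grow by one per run. The sketch keeps the update
itself in STM and does the file write outside the transaction; a real
library also has to handle caching and crash safety, which are ignored here.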

Cache size, synchronization with the database and caching policies are
also defined by the user.

With this package it is easy to define persistent concurrent data
structures. One example is the persistent queues defined in the
Workflow package:

http://hackage.haskell.org/package/Workflow
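
As a rough picture of what a persistent concurrent queue can look like,
here is a small sketch built on a plain TVar and a file. It does not use
the Workflow or TCache APIs: PQueue, openPQueue, push, pop and syncPQueue
are hypothetical names, and the files inbox.q and done.q are written to
the current directory. It assumes only the stm package shipped with GHC.

```haskell
module Main where

import Control.Concurrent.STM
import Control.Exception (IOException, evaluate, try)
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Hypothetical persistent queue: items in a TVar, backed by one file.
data PQueue a = PQueue { pqPath :: FilePath, pqItems :: TVar [a] }

-- Read a file strictly so its handle is closed before any later write.
readStrict :: FilePath -> IO String
readStrict path = do
  s <- readFile path
  _ <- evaluate (length s)
  return s

-- Load the queue from its file; start empty if the file is missing
-- or cannot be parsed.
openPQueue :: Read a => FilePath -> IO (PQueue a)
openPQueue path = do
  r <- try (readStrict path) :: IO (Either IOException String)
  let items = case r of
        Right s -> fromMaybe [] (readMaybe s)
        Left _  -> []
  PQueue path <$> newTVarIO items

-- Append an item at the back (O(n); fine for a sketch).
push :: PQueue a -> a -> STM ()
push q x = modifyTVar' (pqItems q) (++ [x])

-- Take the front item, blocking (via retry) while the queue is empty.
pop :: PQueue a -> STM a
pop q = do
  xs <- readTVar (pqItems q)
  case xs of
    []       -> retry
    (y : ys) -> writeTVar (pqItems q) ys >> return y

-- Write the current contents back to the file.
syncPQueue :: Show a => PQueue a -> IO ()
syncPQueue q = readTVarIO (pqItems q) >>= writeFile (pqPath q) . show

-- Move one job from inbox to done in a single transaction, persist
-- both queues, then reload them from disk.
demo :: IO ([String], [String])
demo = do
  inbox <- openPQueue "inbox.q"
  done  <- openPQueue "done.q"
  -- Reset to a known state so the result does not depend on old files.
  atomically $ do
    writeTVar (pqItems inbox) ["job-1", "job-2"]
    writeTVar (pqItems done) []
  atomically (pop inbox >>= push done)
  syncPQueue inbox
  syncPQueue done
  inbox' <- openPQueue "inbox.q"
  done'  <- openPQueue "done.q"
  (,) <$> readTVarIO (pqItems inbox') <*> readTVarIO (pqItems done')

main :: IO ()
main = demo >>= print   -- prints (["job-2"],["job-1"])
```

Because pop and push are STM actions, moving an item between two queues
composes into one atomic step; no other thread can observe the job in
neither queue or in both. Writing to disk is done separately here, so a
crash between the two syncPQueue calls could still leave the files
inconsistent.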



2011/11/1 dokondr <dokondr at gmail.com>:
> Hi,
> Please comment on the idea and advise on steps to implement it.
> Real-world applications need persistent data that can be accessed and
> modified concurrently by several clients, in a way that preserves the
> "happens-before" relationship.
> Idea: Design and implement Persistent Concurrent Data Types in Haskell.
> These data types should mirror the existing Data.List, Data.Map and similar
> types but provide persistency and support consistent concurrent access and
> modification (or simply - "concurrency").
> Persistency and concurrency should be configurable through these type
> interfaces. Configuration should include:
> 1) Media to persist data, such as file, DBMS, external key-value store (for
> example Amazon SimpleDB, CouchDB, MongoDB, Redis, etc)
> 2) Caching policy - when (on what events) and how much data to read/write
> from/to persistent media. Media reads / writes can be done asynchronously in
> separate threads.
> 3) Concurrency configuration: optimistic or pessimistic data locking.
>
> One may ask why encapsulate persistency and concurrency in the data type
> instead of using a "native" storage API, such as the key-value /
> row-column APIs that NoSQL databases provide?
> The answer is simple: the APIs that your code uses greatly influence the
> code itself. Using a low-level storage API directly in your code results in
> bloated, obscure code, unless you encapsulate this low-level API in clear
> and powerful abstractions. So why not do this encapsulation once and for
> all for such powerful types as Data.Map, for example, and forget all the
> Cassandra and SimpleDB low-level access method details?
> When the time comes and you need to move your application to the
> next new "shiny_super_cloud", you will just write the implementation of
> NData.Map backed by Data.Map in terms of the low-level API of this super-cloud.
>
> (Side note: I really need such an NData.Map type. I was asked to move my
> code, which heavily uses Data.Map and simple text file persistence, into the
> Amazon AWS cloud. Looking at the SimpleDB API, I realized that I will have to
> rewrite 90% of the code. This rewrite will greatly bloat my code and make it
> very unreadable. If I had NData.Map, I would just switch the implementation
> from 'file' to SimpleDB persistency inside my NData.Map type.)
>
> Implementation:
> To start playing with this idea, NData.Map persisted in a regular file will
> do, with no concurrency yet. Next step: NData.Map persisted in SimpleDB or
> Cassandra or Redis, with concurrent access supported.
>
> So it looks like NData.Map should be a monad ...
> Any ideas on implementation and similar work?
>
> Thanks!
> Dmitri
> ---
> http://sites.google.com/site/dokondr/welcome


