[Haskell-cafe] Map with incremental serialization

Britt Mathis britt.mathis at gmail.com
Fri Oct 10 21:06:17 UTC 2014


"Acid state has two logs on disk, one for events and another for
checkpoints. " - does this mean that I choose which to do (for
instance, I want the state to be saved every 10 minutes or so but
obviously just the events), or does acid-state handle this internally?
Essentially the app should (in a perfect world) never go offline, but
I need something stored on disk to handle the event that it crashes,
etc. I want the functionality that a map provides (index by a
username, get a user, etc.). I know that a traditional database would
give me the fault tolerance I want, but I don't need all of the
features of a traditional database like postgre. All I need is some
data structure that stores key value pairs, but also writes the state
to disk at some interval of time (10 minutes, etc.). I could obviously
just dump the entire thing every time but this is going to be a ton of
overhead I would like to avoid.

Let's say the map is currently [("foo", "bar"), ("baz", "quux")] and
this is mirrored on disk. If I add another pair ("spam", "eggs") to
the map, at the moment this is only in memory. At some time in the
future, I need to replicate this addition in the structure on disk.
There are two main ways I could do this (that I know of) - I could
write all three pairs, or I could only add ("spam", "eggs") to the
structure on disk somehow. This is the behavior I want.

On Fri, Oct 10, 2014 at 4:52 PM, David Johnson <djohnson.m at gmail.com> wrote:
> Britt,
>
> If you use the remote module you will have the ability to create multiple
> states that can reside on multiple machines, so you won't be limited by the
> RAM capacity of just one machine, and you can then use RPC calls to query
> them from a web server or command line app.
>
> In order to minimize your memory footprint and keep things fast you could
> use an IntMap to store your users (i.e. IntMap User). You could also install
> ekg on the server that broadcasts your state to get a good idea of just how
> large your state is. Since acid-state transactions are pure they complement
> parallelism nicely as well.
>
> Alberto,
>
> Acid state has two logs on disk, one for events and another for checkpoints.
> Checkpoints represent the entire serialized state. Events are incremental
> changes on the state. Writing checkpoints is an expensive operation (since
> the entire state will be written to disk from the MVar), and shouldn't be
> done very often.
>
> Note on usability:
>
> If you have long event files, and you are constantly opening and closing
> your state you will most likely experience poor performance. Since
> acid-state will attempt to read the entire event log to recreate the state
> in memory. Someone can correct me if I'm wrong.
>
> On Fri, Oct 10, 2014 at 3:15 PM, Britt Mathis <britt.mathis at gmail.com>
> wrote:
>>
>> Someone on irc mentioned acid-state remote (it may have even been you), I
>> will definitely be using it. My only worry is that I only have 512MB of Ram
>> to work with atm, which should be fine until I have several thousand users.
>> I see on the acid state website that I would need to integrate with
>> something else to provide the ability to have a data structure larger than
>> memory (I may be missing or misunderstanding something), would it be trivial
>> to add this in later or should I be worried about it now?
>>
>> On Oct 10, 2014 4:10 PM, "David Johnson" <djohnson.m at gmail.com> wrote:
>>>
>>> I recommend looking into Data.Acid.Remote if you plan on using
>>> acid-state. Will allow you to inspect your state using ghci.
>>>
>>>
>>> http://hackage.haskell.org/package/acid-state-0.12.2/docs/Data-Acid-Remote.html
>>>
>>> On Fri, Oct 10, 2014 at 2:55 PM, Britt Mathis <britt.mathis at gmail.com>
>>> wrote:
>>>>
>>>> I was told about acid-state on irc,  but I wasn't sure if it did the
>>>> incremental part - it looks like it will be perfect, thank you. And I will
>>>> definitely check out that paper as well.
>>>>
>>>> On Oct 10, 2014 3:37 PM, "Nicolas Trangez" <nicolas at incubaid.com> wrote:
>>>>>
>>>>> On Fri, 2014-10-10 at 23:29 +0400, Nikita Volkov wrote:
>>>>> > There's "acid-state", which provides a log-based persistence model.
>>>>> > I.e.,
>>>>> > it persists the modifications you make to the data structure.
>>>>>
>>>>> You might also be interested in the paper 'Generic Storage in Haskell'
>>>>> by Sebastiaan Visser, and his thesis about the same subject.
>>>>>
>>>>> Nicolas
>>>>>
>>>>> >
>>>>> > 2014-10-10 23:09 GMT+04:00 Britt Mathis <britt.mathis at gmail.com>:
>>>>> >
>>>>> > > Hello everyone,  I'm not sure if something like this already exists
>>>>> > > so I
>>>>> > > figured I would ask here first before I reinvent the wheel. What I
>>>>> > > need is
>>>>> > > a data structure that behaves like Data.Map, but is serialized to
>>>>> > > disk
>>>>> > > incrementally so I only have the overhead of writing what has
>>>>> > > changed every
>>>>> > > time.
>>>>> > >
>>>>> > > _______________________________________________
>>>>> > > Haskell-Cafe mailing list
>>>>> > > Haskell-Cafe at haskell.org
>>>>> > > http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>>> > >
>>>>> > >
>>>>> > _______________________________________________
>>>>> > Haskell-Cafe mailing list
>>>>> > Haskell-Cafe at haskell.org
>>>>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Haskell-Cafe mailing list
>>>> Haskell-Cafe at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>>
>>>
>>>
>>>
>>> --
>>> Cell: 1.630.740.8204
>
>
>
>
> --
> Cell: 1.630.740.8204


More information about the Haskell-Cafe mailing list