[Haskell-cafe] Amazon AWS storage best to use with Haskell?

Steve Severance steve at medwizard.net
Wed Nov 16 19:01:25 CET 2011

We use AWS extensively. We use the aws package and have contributed to it,
specifically SQS functionality. I will give you the rundown of what we do.

We moved off SimpleDB and now use MongoDB. The reason is that SimpleDB
seemed to have problems under write pressure, and there are no good tools
for profiling your queries. My main application is extremely write-heavy,
with a single instance needing to do hundreds or thousands of writes a
second. MongoDB has worked well for us. I am scared of things like
Cassandra, having looked at the code; however, some people have made it
work.

We store data such as crawled web pages in S3. The files are LZMA
compressed and the data format is built on protocol buffers. We picked LZMA
both to reduce storage costs for cold data and because the pipe between S3
and EC2 is somewhat limited, and we want to use it as effectively as
possible.

In my experience AWS simulators are more trouble than they are worth since
they don't accurately model the way AWS will respond to you under load. The
free tier at AWS should allow you to experiment with building an app. The
first couple of months of development cost us less than $1.


On Tue, Nov 1, 2011 at 1:27 AM, dokondr <dokondr at gmail.com> wrote:

> On Tue, Nov 1, 2011 at 10:53 AM, Neil Davies <
> semanticphilosopher at gmail.com> wrote:
>> Word of caution
>> Understand the semantics (and cost profile) of the AWS services first -
>> you can't just open an HTTP connection and dribble data out over several
>> days and hope for things to work. It is not a system that has that sort of
>> laziness at its heart.
>> AWS doesn't supply traditional remote file store semantics - its
>> queuing, simple database and object store have all been designed for large
>> scale systems being offered as a service to a (potentially hostile) large
>> set of users - you can see that in the way that things are designed. There
>> are all sorts of (sensible from their point of view) performance related
>> limits and retries.
>> The challenge in designing nice clean layers on top of AWS is how/when to
>> hide the transient/load related failures.
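
One common way to hide the transient failures Neil mentions is to retry
with exponential backoff. The following is only a minimal sketch, not what
the aws package does: retryWithBackoff, its parameters and the demo action
are hypothetical names chosen for illustration, and it uses nothing beyond
base.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (SomeException, try)
import Data.IORef (atomicModifyIORef', newIORef)

-- | Run an action up to 'maxAttempts' times in total. After each failure
-- the thread sleeps, and the sleep doubles, starting at 'baseDelayMicros'.
-- Catching SomeException keeps the sketch short; a real layer should
-- retry only on errors it knows to be transient (throttling, timeouts).
retryWithBackoff :: Int -> Int -> IO a -> IO (Either SomeException a)
retryWithBackoff maxAttempts baseDelayMicros action = go 1 baseDelayMicros
  where
    go attempt delay = do
      result <- try action
      case result of
        Right value -> return (Right value)
        Left err
          | attempt >= maxAttempts -> return (Left err)
          | otherwise -> do
              threadDelay delay
              go (attempt + 1) (delay * 2)

-- | Hypothetical stand-in for an AWS call: fails twice, then succeeds
-- and returns the number of times it was called.
demo :: IO (Either SomeException Int)
demo = do
  calls <- newIORef (0 :: Int)
  let flaky = do
        n <- atomicModifyIORef' calls (\c -> (c + 1, c + 1))
        if n < 3
          then ioError (userError "simulated throttling")
          else return n
  retryWithBackoff 5 1000 flaky
```

The caller still gets the last exception back if every attempt fails, so
the decision about when to stop hiding failures stays visible in the type.
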
> As a straw-man approach I would go first to NData.Map backed by Data.Map
> with the addition of a "flush" function to write Data.Map to an external
> key-value store / NoSQL DB.
> Another requirement for NData.Map is concurrent consistency, so different
> clients could modify its state preserving the "happens-before"
> relationship. For this I would add to NData.Map a "refresh" function that
> updates the local copy from the external key-value store.
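
To make the straw man concrete, here is a rough sketch of a Data.Map held
locally with "flush" and "refresh" operations. Everything in it is an
assumption for illustration: NMap, Store, openNMap, insertLocal,
lookupLocal and inMemoryStore are made-up names, and the backend is an
in-memory stand-in rather than SimpleDB or any real service. It needs only
base and containers.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import qualified Data.Map as Map

-- | Hypothetical backend interface; a real one would wrap an external
-- key-value store and would have to deal with failures and retries.
data Store k v = Store
  { loadAll :: IO (Map.Map k v)
  , saveAll :: Map.Map k v -> IO ()
  }

-- | A local Data.Map copy plus the backend it is synchronised with.
data NMap k v = NMap
  { localCopy :: IORef (Map.Map k v)
  , backend   :: Store k v
  }

-- | Open a handle whose local copy starts as the store's current contents.
openNMap :: Store k v -> IO (NMap k v)
openNMap store = do
  initial <- loadAll store
  ref <- newIORef initial
  return (NMap ref store)

-- | Modify only the local copy; nothing is sent to the store.
insertLocal :: Ord k => k -> v -> NMap k v -> IO ()
insertLocal key value nmap =
  modifyIORef' (localCopy nmap) (Map.insert key value)

-- | Read from the local copy only.
lookupLocal :: Ord k => k -> NMap k v -> IO (Maybe v)
lookupLocal key nmap = fmap (Map.lookup key) (readIORef (localCopy nmap))

-- | Write the whole local copy to the external store.
flush :: NMap k v -> IO ()
flush nmap = readIORef (localCopy nmap) >>= saveAll (backend nmap)

-- | Replace the local copy with whatever the external store holds.
refresh :: NMap k v -> IO ()
refresh nmap = loadAll (backend nmap) >>= writeIORef (localCopy nmap)

-- | In-memory stand-in for the external store, for experimentation only.
inMemoryStore :: IO (Store k v)
inMemoryStore = do
  ref <- newIORef Map.empty
  return (Store (readIORef ref) (writeIORef ref))
```

Note that this does not give you the happens-before guarantee: flush is
last-writer-wins and refresh discards unflushed local changes. Getting real
consistency between clients would need something more, such as versioned
or conditional writes in the backend.
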
> As for hSimpleDB package, it looks like it doesn't build on ghc7:
> http://hackage.haskell.org/package/hSimpleDB
>> The hSimpleDB package
>> Interface to Amazon's SimpleDB service.
>> Versions: 0.1, 0.2, 0.3
>> Dependencies: base (≥3 & ≤4), bytestring, Crypto, dataenc, HTTP, hxt,
>> network, old-locale, old-time, utf8-string
>> License: BSD3
>> Author: David Himmelstrup 2009, Greg Heartsfield 2007
>> Maintainer: David Himmelstrup <lemmih at gmail.com>
>> Category: Database, Web, Network
>> Upload date: Thu Sep 17 17:09:26 UTC 2009
>> Uploaded by: DavidHimmelstrup
>> Built on: ghc-6.10, ghc-6.12
>> Build failure: ghc-7.0 (log
>> <http://hackage.haskell.org/packages/archive/hSimpleDB/0.3/logs/failure/ghc-7.0>)