[Haskell-cafe] Subject: A universal data store interface

Mon Feb 13 21:40:22 CET 2012

Paul, I appreciate your experience and you might find it interesting
that I have come to similar conclusions from my own experience!

As Tom commented though, just because something is limited does not
mean it is bad for every project. There are many small scale projects
that do just fine with about any database abstraction layer, and are
probably better off for it. And as Michael said, many find it a good
approach to use a database abstraction layer for some of their code
and then go raw for other parts of it.
like-wise, the acid-state project (which I actually think is very
similar to Persistent, but it only stores in the process memory) is a
nice approach until the point that you run out of RAM.

So lets assume our project has very demanding needs from our
persistence layer, and we don't want to ever have to spend time
re-writing an abstraction-layer query into a raw form. Does that mean
it is impossible to write anything more than a database driver? I
still believe there are several important areas that a database layer
can be awesome for.

The first is proper serialization - converting between Haskell values
and database value. We have re-factored Persistent to separate this
out from the querying. So you can use Persistent's serilization layer
without using its query layer. The new rawSql function lets you write
raw queries and get back real Haskell values.

The second aspect is declaring the schema and leveraging that to help
make things more type-safe. This is even more important in a
schema-less data store like MongoDB. Using raw strings means you are a
typo away from silent error. Declaring your schema in a database layer
creates the possibility of catching these error in your code. This is
less important overall in SQL, but Persistent still has your back by
checking the schema and telling you what you need to migrate. But I
want to take this a step further in Haskell: I want to know at compile
time that my query is valid, even if I want to write it in a raw form.
Something like Persistent is needed to make sure the columns
referenced are correct, but we also want to know that the entire query
is correct. There are some interesting but immature efforts in this
area, which is yet another way that Persistent could be improved.

On Mon, Feb 13, 2012 at 11:53 AM, Paul R <paul.r.ml at gmail.com> wrote:
> I have quiet a lot of experience in the business of web services
> strongly backed by data stores, in a company that allowed me to apply
> a technologies such as RubyOnRails, DataMapper, PostgreSQL, Redis, Riak,
> HappStack and Snap. Greg, with no offense intended, I will share with
> the café a conclusion that we took year to draw, but that made out job
> much better since : Abstraction over high level data stores is one of
> the worst idea in software engineering.
>
> The most proeminent example is probably PostgreSQL, which is an
> incredibly strong product with high SQL power. But as soon as you access
> it through the ActiveRecord or Persistent API, it gets turned into
> a very limited store, with the SQL power of SQLITE or MongoDB.
>
> So Sergiu, my POV is that universal data stores is at best a glue
> targeting small projects, so that they can be hacked quickly. They offer
> a set of features that, by design, is the greatest common divisor of the
> backends, which unfortunately isn't that great. This is certainly nice
> for a do-a-blog-in-5-minutes with "MyFramework init" video tutorial, but
> probably not for industrial projects in the long run.
>
> As a side note, the acid-state package that Greg kindly mentioned, takes
> a very different approach to data storage, resulting in
> a haskell-centric solution with an original features set.
>
> Regarding your other option, the value behind the LLVM backend seems
> huge for the whole Haskell community. It has the power to lighten GHC,
> improve runtime performance, bring binaries to more platforms and much
> more. In my opinion, that's quiet exciting :)
>
>
> Greg> Hi Sergiu,
> Greg> Thanks you for your interest in that proposal. I rushed it off a year
> Greg> ago. Since then we have made a lot of improvements to Persistent and
> Greg> the library forms a basic building block for most Yesod users and
> Greg> other Haskellers. Persistent offers a level of type-safety and
> Greg> convenience not available elsewhere (except perhaps for libraries like
> Greg> acid-state that are limited to in-memory storage). That being said,
> Greg> there are still a lot of improvements that could be made. With the
> Greg> effort of a GSoC volunteer we could probably get it to the point of
> Greg> being the go-to data storage library for Haskellers, at least those
> Greg> planning on using the subset of backends (likely SQL) with great
> Greg> support. This proposal is vague and we would need to work with you to
> Greg> narrow things down a bit.
>
> Greg> I am biased, but I believe the Yesod project is one of the most
> Greg> compelling in the Haskell ecosystem. There are a lot of different ways
> Greg> a GSoC project could help make things even better besides improving
> Greg> the associated Persistent library, and we would really like to mentor
> Greg> at least one GSoC student. I would open more tickets for this in the
> Greg> system, but I am not sure how helpful it will be. It seems that we
> Greg> need to reach out to more students like yourself, but I am not sure
> Greg> how to do that unless I see messages like these first.
>
> Greg> Greg Weber
>
> --
>  Paul