[Haskell-cafe] Subject: A universal data store interface

Tue Feb 14 00:13:45 CET 2012

On 02/13/2012 09:36 PM, Michael Snoyman wrote:
> You make it sound like your options are "use the crippled abstraction
> layer" or "use the full-powered database layer." You're leaving out
> two very important points:
>
> 1. There's a reason the abstraction layer exists: it can be clumsy to
> go directly to the full-powered database for simple stuff.

That's simply a reason to make access to the *full-powered database* 
easier, not a reason to make access to *every database* identical. Doing 
that is a mistake *unless* you're going to avoid SQL entirely but 
somehow still retain the full database power. For example, SQLite 
requires entirely different SQL contortions to get certain types of 
fields in query results from the way PostgreSQL does it. That means 
you'll have to change your program a lot even if you use e.g. HDBC for 
database access.

My experience is roughly similar to Paul R's. You often give up too much 
by going with "generic" ORM and such.

That's not to say you can't make working with each particular DB much 
more pleasant than it is currently -- postgresql-libpq, for example, is 
almost useless as an application-level API, and I'm working on (no 
guarantees!) a little postgresql-libpq-conduit thingy which will 
hopefully make issuing queries and iterating over results a much more 
pleasant experience without burdening you with all kinds of ridiculously 
low-level detail, and at the same time will NOT shield you from the 
low-level detail that actually *matters*.

The Database Supported Haskell stuff 
(http://hackage.haskell.org/package/DSH) also seems relevant to this 
discussion, since this does seem like it could actually leverage the 
immense power of (some) databases without having to bother too much with 
low-level DB access.

> 2. You can bypass the abstraction layer whenever you want.
>
> I like to describe Persistent's goal as doing 95% of what you need,
> and getting out of your way for the other 5%. You can write raw SQL
> queries with Persistent. I use this for implementing full-text search.
> I haven't needed to write deep analytical tools recently, but if I
> did, I would use raw SQL for that too.

Yes, but then you end up being fully tied to the database *anyway*, so 
why not just make *that* easier and safer from the start?

(I realize that this is a hard problem in practice. It's certainly NOT 
small enough for a GSoC, IMO.)

>
> Persistent's advantage over going directly to the database is concise,
> type-safe code. Are you really telling me that `runSql "SELECT * FROM
> foo where id=?" [someId]` plus a bunch of marshal code is better then
> `get someId`?
>

For starters you should probably never do a "SELECT *" (which is what 
one assumes Persistent would/will do) -- on an SQL database the 
performance characteristics and locking behavior may change dramatically 
over time... while on a $generic-NoSQL database there may not really be any 
other option, and the performance characteristics *won't* necessarily 
change too dramatically. This is an example of why introducing something 
like Persistent (or ORMs in general) may be a non-trivial decision.

Besides, you probably won't need a lot of marshalling code if you know 
what the query result field types are going to be (you should!). You 
just pattern-match, e.g.

       processQueryResult [SqlInteger i, SqlByte j, SqlText x] =
            ... -- do whatever to i,j and x
       processQueryResult _ = error "Invalid columns in query result"

Yes, this means you'll need to know exactly how the table was created 
(but not in the case of SQLite -- there you MAY have to add various 
casts to the SQL or to manually convert from SqlText to your intended 
Haskell datatype).
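
To make the pattern-matching idea above a bit more concrete, here is a minimal, self-contained sketch. It is an illustration under assumptions, not any real driver's API: the SqlValue type is a local stand-in whose constructor names simply follow the snippet above, and Row and asInteger are hypothetical helpers. It reports a column mismatch as a Left value instead of calling error, and shows the kind of manual text-to-integer conversion a loosely typed backend might require.

```haskell
import Text.Read (readMaybe)

-- Hypothetical stand-in for a driver's column value type. The constructor
-- names follow the snippet above and are not taken from a real library.
data SqlValue
  = SqlInteger Integer
  | SqlByte Int
  | SqlText String
  deriving (Show, Eq)

-- The row shape we expect the query to return (hypothetical).
data Row = Row Integer Int String
  deriving (Show, Eq)

-- Coerce a column to an Integer: accept a real integer, or text that
-- parses as one (the manual conversion a loosely typed backend may need).
asInteger :: SqlValue -> Maybe Integer
asInteger (SqlInteger i) = Just i
asInteger (SqlText s)    = readMaybe s
asInteger _              = Nothing

-- Match on the exact column layout; anything else is reported as a Left
-- rather than thrown as an exception.
processQueryResult :: [SqlValue] -> Either String Row
processQueryResult [c, SqlByte j, SqlText x] =
  case asInteger c of
    Just i  -> Right (Row i j x)
    Nothing -> Left ("Invalid first column: " ++ show c)
processQueryResult cols =
  Left ("Invalid columns in query result: " ++ show cols)

main :: IO ()
main = do
  print (processQueryResult [SqlInteger 1, SqlByte 2, SqlText "a"])
  print (processQueryResult [SqlText "42", SqlByte 2, SqlText "a"])
  print (processQueryResult [SqlText "oops"])
```

Returning Either keeps the "unexpected columns" case visible to the caller, but it is still only a run-time check; nothing here ties the pattern to the query text.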

I don't think anyone denies that having a compile-time guarantee of a 
successful match would be a good thing.

It's just that this area is far more complicated than people give it 
credit for.