[Haskell-cafe] HDBC, postgresql, bytestrings and embedded NULLs

Mon Jan 17 22:38:50 CET 2011

On 01/17/2011 03:16 PM, Michael Snoyman wrote:
> I've brought up before my problem with the convertible package: it
> encourages usage of partial functions. I would prefer two typeclasses,
> one for guaranteed conversions and one for conversions which may fail.
> In fact, that is precisely why convertible-text[1] exists.

I would be open to making that change in convertible.  The unfortunate 
reality with databases, however, is that many times we put things into 
strings for sending to the DB engine, and get things back from it in the 
form of strings, which must then be parsed into numeric types and the 
like.  We can't, as a matter of type system principles, guarantee that a 
String can be converted to an Integer.  How were you thinking the 
separation into these typeclasses would be applied in the context of 
databases?
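
To make the question concrete, here is a rough sketch of how I read the
two-class idea.  The class and method names (SafeConvert, TryConvert)
are made up for illustration; they are not the API of convertible or
convertible-text.  The point is that a value coming back from the DB as
a String can only ever get an instance of the second class:

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}

import Text.Read (readMaybe)

-- Hypothetical class: conversions that can never fail.
class SafeConvert a b where
  safeConvert :: a -> b

-- Hypothetical class: conversions that may fail, with the failure
-- reported in the result type instead of via a partial function.
class TryConvert a b where
  tryConvert :: a -> Either String b

-- Rendering an Integer as a String always succeeds.
instance SafeConvert Integer String where
  safeConvert = show

-- Parsing a String (e.g. a column value returned by the DB engine)
-- into an Integer may fail, so it only gets a TryConvert instance.
instance TryConvert String Integer where
  tryConvert s =
    maybe (Left ("not an Integer: " ++ show s)) Right (readMaybe s)

-- Small usage example: one value that parses, one that does not.
demoConversions :: (Either String Integer, Either String Integer)
demoConversions = (tryConvert "42", tryConvert "4x2")
```

With this split, code that reads from the database has to handle the
Left case explicitly, which is what I would need to see worked out for
every SqlValue conversion.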

> As a related issue, there are a large number of data constructors in
> HDBC for SqlValue. I would not argue with the presence of any of them:
> for your purposes, every one of them is necessary. But for someone
> writing a cross-backend package with a more limited set of datatypes,
> it gets to be a problem. I know I can use convertible for this, but
> see my previous paragraph ;).

How about using an import...hiding statement?  Perhaps even your own 
module that only re-exports the constructors you like?
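
A minimal sketch of such a wrapper module, assuming the subset below is
the one you care about.  The module name LimitedSql is hypothetical; the
constructors are the existing ones from Database.HDBC, and the HDBC
package must be installed for this to compile:

```haskell
-- Hypothetical wrapper module: re-exports SqlValue with only a
-- limited subset of its constructors visible to importers.
module LimitedSql
  ( SqlValue (SqlString, SqlByteString, SqlInteger, SqlDouble, SqlBool, SqlNull)
  ) where

import Database.HDBC (SqlValue (..))
```

Code that imports LimitedSql instead of Database.HDBC can then only
construct or pattern-match on those six constructors.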

> I also don't like using the lazy result functions. I'm sure for many
> people, they are precisely what is needed. However, in my
> applications, I try to avoid it whenever possible. I've had bugs crop
> up because I accidentally used the lazy instead of strict version of a
> function. I would prefer using an interface that uses enumerators[2].

It would be pretty simple to add an option to the API to force the use
of the strict versions of functions in all cases (or perhaps to raise
an exception if a lazy version is attempted).  Would that address the
concern?  Or perhaps splitting them into separate modules?
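
For illustration, here is a self-contained toy model of the lazy/strict
difference and of what such an option could look like.  It does not use
HDBC itself: Cursor, FetchMode, and fetchAll are hypothetical stand-ins,
and the lazy variant is built on unsafeInterleaveIO in the same spirit
as HDBC's lazy fetchAllRows (as opposed to the strict fetchAllRows').

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for a statement handle: the rows not yet read.
newtype Cursor = Cursor (IORef [Int])

newCursor :: [Int] -> IO Cursor
newCursor rows = Cursor <$> newIORef rows

-- Closing the cursor discards whatever has not been read yet.
closeCursor :: Cursor -> IO ()
closeCursor (Cursor ref) = writeIORef ref []

fetchRow :: Cursor -> IO (Maybe Int)
fetchRow (Cursor ref) = do
  rows <- readIORef ref
  case rows of
    []       -> return Nothing
    (r : rs) -> writeIORef ref rs >> return (Just r)

-- Strict: every row is read before the action returns.
fetchAllStrict :: Cursor -> IO [Int]
fetchAllStrict c = do
  m <- fetchRow c
  case m of
    Nothing -> return []
    Just r  -> (r :) <$> fetchAllStrict c

-- Lazy: each row is read only when the result list is forced.
fetchAllLazy :: Cursor -> IO [Int]
fetchAllLazy c = unsafeInterleaveIO $ do
  m <- fetchRow c
  case m of
    Nothing -> return []
    Just r  -> (r :) <$> fetchAllLazy c

-- Hypothetical API option selecting which behaviour callers get.
data FetchMode = AllowLazy | ForceStrict

fetchAll :: FetchMode -> Cursor -> IO [Int]
fetchAll AllowLazy   = fetchAllLazy
fetchAll ForceStrict = fetchAllStrict

-- Fetch, close the cursor, and only then hand the rows to the caller.
demo :: FetchMode -> IO [Int]
demo mode = do
  cur  <- newCursor [1, 2, 3]
  rows <- fetchAll mode cur
  closeCursor cur
  return rows
```

In this model, demo ForceStrict yields all three rows, while
demo AllowLazy yields an empty list, because nothing was read before
the cursor was closed.  That is the class of bug a force-strict option
would rule out.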

I took a quick look at the enumerators library, but it doesn't seem to 
have the necessary support for handling data that comes from arbitrary C 
API function calls rather than handles or sockets.

> For none of these do I actually think that HDBC should change. I think
> it is a great library with a well-thought-out API. All I'm saying is
> that I doubt there will ever be a single high-level API that will suit
> everyone's need, and I see a huge amount of value in splitting out the
> low-level code into a separate package. That way, *everyone* can share
> that code together, *everyone* can find the bugs in it, and *everyone*
> can benefit from improvements.

Splitting out the backend code is quite reasonable, and actually that 
was one of the goals with the HDBC v2 API.  I would have no objection if 
people take, say, HDBC-postgresql and add a bunch of non-HDBC stuff to 
it, or even break off the C bindings to a separate package and then make 
HDBC-postgresql an interface atop that.

I hope that we can, however, agree upon one low-level database API.  The 
Java, Python, and Perl communities, at least, have.  Failing to do so 
produces unnecessary incompatibility.

I would also hope that this database API would be good enough that there
is rarely any need to bypass it and use a database backend directly.

-- John