[Haskell-cafe] Haskell DB and XML libs: one user's experience
robdockins at fastmail.fm
Tue May 16 16:57:18 EDT 2006
I recently found myself needing to do some data manipulation; I
needed to take some data from a database and generate a series of XML
files from it. In the past I've done most of this sort of work in
Java, but this time I decided I'd take the opportunity to explore the
state of the art of Haskell DB and XML libraries.
As to DB, I tried using HDBC first. I was actually a little
surprised how straightforward it was. My database (PostgreSQL) is
directly supported, and the compile/install went smoothly. My first
test connection program that typechecked worked as expected (!) and I
was soon executing queries doing useful work with the results. I'd
just like to take a moment to congratulate John Goerzen for creating
a product with a low barrier of entry for using databases in
Haskell. As I didn't really do anything beyond simple queries, I can't
comment on more advanced features.
For XML, I wavered between HaXml and HXT (the Haskell XML Toolbox).
I initially decided to use HXT because it has support for XML
namespaces, which I was going to need, and because it just seems to
be the most complete and advanced package available. The HXT install
suffers a little bit from transitive-closureitis but overall wasn't
too bad. However, I had a really hard time using it! The API is
_really_ intimidating, and I couldn't find any basic tutorial-style
documentation. The API docs are a little hard to use because related
definitions are spread out over a bunch of modules, and the links
don't always work. Also, the theses are nice, but they read like
theses ;-) That's not what I want when I have a job to complete.
Long story short: I couldn't figure out how to create an XML
document and serialize it to disk. I was reasonably motivated and
I'm a pretty experienced Haskell programmer, but I had to call it
quits after about 3 hours of struggling with it. Most of my programs
would mysteriously fail to produce output OR errors! It was really
I ended up using HaXml instead and shoehorning in the namespaces by
using attributes named "xmlns:xyz" etc. on the document root element
(which is OK, but not ideal). The HaXml API was also tough to work
with but was less mystifying than HXT's, and I eventually got it to
work. I was a little disappointed by the results, because the pretty
printer does some fairly bizarre things to ensure that it doesn't
introduce extra whitespace into the DOM. I also had to do some
futzing to make HaXml correctly escape literal text. Finally,
using the HaXml API to generate XML results in verbose code that's
hard to read. I was hoping that I'd get results comparable to using
xmlenc (http://xmlenc.sourceforge.net/) in Java, but I was
disappointed by the fairly low signal-to-noise ratio (although in all
fairness, it's probably comparable to using the DOM or SAX Java
APIs). Overall, HaXml works, but feels a bit awkward, at least for
this use case.
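
To make the namespace workaround and the escaping issue concrete, here is a
minimal sketch that uses only the Prelude. It is not HaXml's API: the Node
type, escape, and render are hypothetical stand-ins, and the "xyz" prefix and
its URI are placeholders. It only illustrates declaring a namespace through a
plain "xmlns:xyz" attribute on the root element and escaping literal text
before serializing.

```haskell
module Main where

-- | A deliberately tiny XML tree (hypothetical, not HaXml's types):
-- an element with a name, attributes, and children, or a run of text.
data Node
  = Element String [(String, String)] [Node]
  | Text String

-- | Escape the characters that are special in XML text and attribute values.
escape :: String -> String
escape = concatMap esc
  where
    esc '&'  = "&amp;"
    esc '<'  = "&lt;"
    esc '>'  = "&gt;"
    esc '"'  = "&quot;"
    esc '\'' = "&apos;"
    esc c    = [c]

-- | Serialize without inserting any whitespace, so text content is unchanged.
render :: Node -> String
render (Text s) = escape s
render (Element name attrs kids) =
  "<" ++ name ++ concatMap attr attrs ++ body
  where
    attr (k, v) = " " ++ k ++ "=\"" ++ escape v ++ "\""
    body
      | null kids = "/>"
      | otherwise = ">" ++ concatMap render kids ++ "</" ++ name ++ ">"

-- | Example document. The namespace is declared as an ordinary attribute on
-- the root, and prefixed names are just strings; nothing checks that the
-- prefix is actually bound.
example :: Node
example =
  Element "xyz:report" [("xmlns:xyz", "http://example.org/xyz")]
    [ Element "xyz:title" [] [Text "Profit & loss <draft>"] ]

main :: IO ()
main = putStrLn (render example)
```

Because the prefixes are only strings, a typo in one goes unnoticed until
a namespace-aware consumer reads the file, which is why this approach is OK
but not ideal.
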
Now taking a slightly closer look at HXML, I see that it may be the
best choice for what I was attempting to do (although it also doesn't
support namespaces). The simplified representation looks
particularly nice for building XML from scratch. I may try rewriting
with HXML and see how that goes.
So that's it. I don't have any deep conclusions, but I thought I'd
share my experiences in the hopes that they will be helpful to somebody.
Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.