Seth Gordon sethg at ropine.com
Fri Sep 29 14:22:14 EDT 2006

I've finally gotten enough round tuits to learn Haskell, and now that
I've done some of the exercises from _The Haskell School of Expression_
and I finally (think I) understand what a monad is, the language is
making a lot more sense to me (although my code is not always making so
much sense to the compiler :-).

My employer (MetaCarta) makes a search engine that can recognize
geographic data.  My group within MetaCarta is responsible for building
the "Geographic Data Module" within our software.  To do this, we slurp
a heap of geographic and linguistic data from a variety of sources,
normalize it, and then use some algorithms (that I'm not allowed to
describe) to generate the module.

This seems like the sort of task that cries out for a
functional-programming approach, and that's what we use, sorta: a lot of
the code that I'm responsible for is SQL, with chains of "CREATE TEMP
TABLE X AS [insert very complicated query here]", plus some C++ for the
parts that would be very time-consuming or impossible to implement in
SQL, and shell scripts to tie everything together.

I told my tech lead that I want to try porting some of this code to
Haskell in the hope that it would run faster and/or be easier to read.
He said I should spend two work days on the project and then be prepared
to convince my co-workers that further research in this vein is (or is
not) worth doing.

So before I embark on day 1 of the project, I thought I should check
whether anyone on this list has used Haskell to munge a
ten-million-row database table, and whether there are any particular
gotchas I should watch out for.


