mutjida at gmail.com
Fri Sep 29 21:05:24 EDT 2006
> So before I embark on day 1 of the project, I thought I should check and
> see if anyone on this list has used Haskell to munge a ten-million-row
> database table, and if there are any particular gotchas I should watch
> out for.
One immediate thing to be careful about is how you do IO. In my
experience, the standard String-based Haskell IO functions are slow at
reading large files. You'll probably want to skip them and use the
lazy bytestring library (http://www.cse.unsw.edu.au/~dons/fps.html).
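
As a rough sketch of what that looks like (not a tuned implementation), the
following reads its input as a lazy ByteString and walks it row by row. It
assumes the module is available as Data.ByteString.Lazy.Char8, as in current
releases of the bytestring package, and that rows are newline-terminated and
tab-separated; the helper names countRows and fields are made up for
illustration.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Split one row into its fields (assumes tab-separated input).
fields :: L.ByteString -> [L.ByteString]
fields = L.split '\t'

-- Count the rows of the input. L.lines yields rows lazily, so the
-- input is consumed chunk by chunk rather than loaded all at once.
countRows :: L.ByteString -> Int
countRows = length . L.lines

main :: IO ()
main = do
  contents <- L.getContents   -- read stdin lazily, in chunks
  print (countRows contents)
```

Reading from stdin keeps the sketch independent of any particular file name;
L.readFile has the same lazy behaviour for a named file.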
Another thing to be careful about is laziness. I suspect it will be
very easy to write code that does what you want but exhausts your
heap space, because the computation on each row is deferred as an
unevaluated thunk until the entire file has been read and the result
of the complete computation is demanded. More information on this is
available at:
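
To make the laziness point concrete, here is a small sketch of my own (it is
not taken from the reference mentioned above). It sums the leading integer of
every row with the strict foldl' from Data.List, so the running total is
evaluated after each row instead of piling up as a chain of suspended
additions. The names rowValue and sumRows are hypothetical, and treating
unparsable rows as 0 is an arbitrary choice for the example.

```haskell
import Data.List (foldl')
import qualified Data.ByteString.Lazy.Char8 as L

-- Parse the integer at the start of a row; rows that do not start
-- with an integer count as 0 (an assumption made for this sketch).
rowValue :: L.ByteString -> Int
rowValue row = maybe 0 fst (L.readInt row)

-- foldl' forces the accumulator at every step, so only the running
-- total is kept while the rows stream past.
sumRows :: L.ByteString -> Int
sumRows = foldl' (\acc row -> acc + rowValue row) 0 . L.lines

main :: IO ()
main = L.getContents >>= print . sumRows
```

Swapping foldl' for the lazy foldl is the kind of change that can bring the
problem back: the additions may then be deferred until the final result is
printed, unless the compiler's optimiser happens to make the fold strict.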