Haskell performance

Thu Mar 18 10:18:25 EST 2004

On Thu, Mar 18, 2004 at 03:43:21PM +0100, Ketil Malde wrote:
> Okay.  What's really bothering me is that I can't find any good
> indication of what to do to get IO faster.  Do I need to FFI the whole
> thing and have a C library give me large chunks?  Or can I get by with
> hGet/PutArray?  If so, what sizes should they be?  Should I use memory
> mapped files?
>
> I'm willing to put in some work, accept some kluges, and so on, but I
> can't really blindly try all possible combinations with my fingers
> crossed.  Some people seem to manage to speed things up, but I can't seem
> to find anything *specific* anywhere.
> 
> E.g. when I posted a snippet to do readFile in somewhat larger chunks a
> while ago, I was hoping somebody would say, hey, that's just stupid, what
> you need to do instead is... or point me to TFM, but unfortunately only
> silence ensued, and left me a sadder but none the wiser man...

If your usage needs are similar to those of darcs, you could use my
FastPackedString module (which isn't packaged separately, but wouldn't be
too hard to separate out).  It provides reasonably fast IO on Word8-based
strings, including reading and writing gzipped files and reading files
with mmap (as long as they don't change underneath you), among other
things.  PackedStrings can be broken up without copying, so tailPS is an
efficient operation, as are the splitting and breaking operations.  So as
long as you stay in the world of PackedString, you're in good shape.
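
To illustrate why operations like tailPS can avoid copying, here is a
minimal sketch of the general idea: a packed string represented as a shared
byte buffer plus an offset and a length.  This is not the actual
FastPackedString code; the names Packed, packIO, unpackIO, tailP and
splitAtP are hypothetical and chosen only for this illustration, and it
uses nothing beyond the base library.

```haskell
import Control.Monad (forM, forM_)
import Data.Char (chr, ord)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical type: a shared buffer, an offset into it, and a length.
-- The ForeignPtr keeps the buffer alive while any slice still refers to it.
data Packed = Packed (ForeignPtr Word8) Int Int

-- Copy a String into a fresh buffer, one byte per character.
-- Characters above '\255' are truncated to their low byte.
packIO :: String -> IO Packed
packIO s = do
  let n = length s
  fp <- mallocForeignPtrBytes n
  withForeignPtr fp $ \p ->
    forM_ (zip [0 ..] s) $ \(i, c) ->
      pokeByteOff p i (fromIntegral (ord c) :: Word8)
  return (Packed fp 0 n)

-- Read the bytes of a slice back out as a String.
unpackIO :: Packed -> IO String
unpackIO (Packed fp off n) =
  withForeignPtr fp $ \p ->
    forM [0 .. n - 1] $ \i -> do
      w <- peekByteOff p (off + i) :: IO Word8
      return (chr (fromIntegral w))

lengthP :: Packed -> Int
lengthP (Packed _ _ n) = n

-- Dropping the first byte only adjusts the offset and length: O(1), no copy.
tailP :: Packed -> Packed
tailP (Packed fp off n)
  | n <= 0    = error "tailP: empty string"
  | otherwise = Packed fp (off + 1) (n - 1)

-- Splitting yields two slices that share the original buffer.
splitAtP :: Int -> Packed -> (Packed, Packed)
splitAtP i (Packed fp off n) =
  let k = max 0 (min i n)
  in (Packed fp off k, Packed fp (off + k) (n - k))

main :: IO ()
main = do
  p <- packIO "hello world"
  let (a, b) = splitAtP 5 p
  unpackIO a >>= putStrLn          -- prints "hello"
  unpackIO (tailP b) >>= putStrLn  -- prints "world"
```

Because every slice points into the same buffer, the only copy happens when
the data is first read in; the cost is that a small slice keeps the whole
buffer alive.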

If your data is in some binary format, you ought to be able to take the
FastPackedString code and replace Word8 with some other data type, and
still take advantage of the work I've done on getting fast IO (and also
some other fast data manipulations, such as linesPS).
-- 
David Roundy
http://civet.berkeley.edu/droundy/