[Haskell-cafe] Is Haskell capable of matching C in string processing performance?

John Millikin jmillikin at gmail.com
Fri Jan 22 01:09:52 EST 2010

Recently I've been working on a library for generating large JSON[1]
documents quickly. Originally I started writing it in Haskell, but
quickly encountered performance problems. After exhausting my (meager)
supply of optimization ideas, I rewrote some of it in C, with dramatic
results. Namely, the C solution is

* 7.5 times faster than the fastest Haskell I could write (both using
raw pointer arrays)
* 14 times faster than a somewhat functional version (uses monads, but
no explicit IO)
* more than 30 times faster than fancy functional solutions with
iteratees, streams, etc.

I'm wondering if string processing is simply a Haskell weak point,
performance-wise. The problem involves many millions of very small
strings (usually under 10 characters) -- the C solution can copy directly
from string literals into a fixed buffer and flush it occasionally,
while even the fastest Haskell version has a lot of overhead from
copying around arrays.
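
To make the C approach concrete, here is a minimal sketch of the
"copy literals into a fixed buffer and flush occasionally" idea. It is
not the actual library code: the names OutBuf, buf_append, buf_flush,
and BUF_LIT, and the 4096-byte capacity, are all hypothetical and
chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF_CAP 4096              /* hypothetical buffer size */

typedef struct {
    FILE  *out;                   /* destination stream */
    size_t len;                   /* bytes currently buffered */
    size_t flushed;               /* total bytes handed to fwrite so far */
    char   data[BUF_CAP];         /* fixed buffer, never reallocated */
} OutBuf;

static void buf_init(OutBuf *b, FILE *out)
{
    b->out = out;
    b->len = 0;
    b->flushed = 0;
}

/* Write out whatever is buffered and reset the buffer to empty. */
static void buf_flush(OutBuf *b)
{
    if (b->len > 0) {
        fwrite(b->data, 1, b->len, b->out);
        b->flushed += b->len;
        b->len = 0;
    }
}

/* Copy n bytes into the buffer, flushing first if they would not fit. */
static void buf_append(OutBuf *b, const char *s, size_t n)
{
    assert(n <= BUF_CAP);         /* sketch only handles small strings */
    if (b->len + n > BUF_CAP)
        buf_flush(b);
    memcpy(b->data + b->len, s, n);
    b->len += n;
}

/* For string literals the length is known at compile time. */
#define BUF_LIT(b, lit) buf_append((b), (lit), sizeof(lit) - 1)
```

With this, emitting a token is something like BUF_LIT(&b, "true"),
which compiles down to a bounds check and a small memcpy, with one
buf_flush(&b) call at the end of the document. There is no allocation
per string, which is the overhead I could not get rid of on the
Haskell side.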

Dons suggested I was "doing it wrong", so I'm posting on -cafe in the
hopes that somebody can tell me how to get better performance without
resorting to C.

Here's the fastest Haskell version I could come up with. It discards
all error handling, validation, and correctness in the name of
performance, but still can't get anywhere near C:

[1] http://json.org/
