[Haskell-cafe] Strings in Haskell

Spencer Janssen sjanssen at cse.unl.edu
Mon Jan 22 20:40:00 EST 2007


On Jan 22, 2007, at 7:18 PM, Alexy Khrabrov wrote:
> Greetings -- I'm looking at several FP languages for data mining, and
> was annoyed to learn that Erlang represents each character as 8 BYTES
> in a string which is just a list of characters.  Now I'm reading a
> Haskell book which states the same.

The standard string type in Haskell, String, is indeed a linked list of
characters ([Char]), with about 12 bytes of overhead per character.

> Is there a more efficient Haskell string-handling method?

Yes!  There is a library called Data.ByteString [1].  It is included
with the latest versions of GHC and Hugs, and is also available as a
standalone package.  Data.ByteString represents strings as packed
arrays of bytes, so each character takes about 1 byte.  This
library exhibits fantastic performance, rivaling C's speed while
maintaining the elegance of Haskell.
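
To give a feel for the API, here is a minimal sketch using the
Data.ByteString.Char8 interface, which exposes the packed bytes as
Chars.  The helper names wordCount and shout and the sample text are
made up for illustration, and note that Char8 truncates each Char to
8 bits, so this only suits ASCII or Latin-1 data.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (toUpper)

-- Count whitespace-separated words in a packed string.
-- (Hypothetical helper, not part of the library.)
wordCount :: B.ByteString -> Int
wordCount = length . B.words

-- Upper-case every byte, treating each one as an 8-bit Char.
-- (Hypothetical helper, not part of the library.)
shout :: B.ByteString -> B.ByteString
shout = B.map toUpper

main :: IO ()
main = do
  -- B.pack copies a String into a packed array, one byte per Char.
  let s = B.pack "strings in haskell can be packed"
  print (B.length s)    -- number of bytes in the packed string
  print (wordCount s)   -- number of words
  B.putStrLn (shout s)  -- write the upper-cased bytes to stdout
```

Most of the list functions you would use on a String (map, filter,
length, words, lines and so on) have counterparts here, which is why
the module is usually imported qualified.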


Cheers,
Spencer Janssen

[1] http://www.cse.unsw.edu.au/~dons/fps.html



