[Haskell-cafe] Strings in Haskell

Stefan O'Rear stefanor at cox.net
Mon Jan 22 20:34:40 EST 2007


On Mon, Jan 22, 2007 at 05:18:19PM -0800, Alexy Khrabrov wrote:
> Greetings -- I'm looking at several FP languages for data mining, and
> was annoyed to learn that Erlang represents each character as 8 BYTES
> in a string which is just a list of characters.  Now I'm reading a
> Haskell book which states the same.

The book is lying - the size of strings is unspecified and implementation
dependent.  In GHC, a String takes 12 or 20 bytes per character, depending
on construction details.

> Is there a more efficient Haskell string-handling method?

Yes!  Data.ByteString.* implements packed strings of bytes.  They are less
lazy, and don't support Unicode, but they are small (8 bits per character)
and fast (I have 100 MB/s disks and my ByteString-based throwaway filters
are IO-bound).
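
For a rough idea of what such a throwaway filter can look like, here is a
minimal sketch using the strict Data.ByteString.Char8 module.  The name
grepLines is made up for illustration and is not from any library.  Char8
treats every byte as one character, so this assumes ASCII or Latin-1 input.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: keep only the lines that contain the given
-- byte pattern.  Input and output are both packed byte strings, so
-- no per-character list cells are ever built.
grepLines :: B.ByteString -> B.ByteString -> B.ByteString
grepLines needle = B.unlines . filter (needle `B.isInfixOf`) . B.lines
```

To turn it into a stdin-to-stdout filter, something like
B.interact (grepLines (B.pack "error")) as the body of main should do.
The strict variant reads the whole input before producing output, so for
very large files the lazy Data.ByteString.Lazy.Char8 module may be a
better fit.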

> Which functional language is the most suitable for text processing?

If you expected any answer other than Haskell, you asked on the wrong list. :)

Stefan

