[Haskell-cafe] Re: String vs ByteString

John Millikin jmillikin at gmail.com
Sat Aug 14 20:11:48 EDT 2010


On Sat, Aug 14, 2010 at 16:38, Bryan O'Sullivan <bos at serpentine.com> wrote:
> In my opinion, this "performance penalty" hand-wringing is mostly silly.
> We're talking a pretty small factor of performance difference in most of these
> cases. Even the biggest difference, between ByteString and String, is usually
> much less than a factor of 100.

This attitude towards performance, that it doesn't really matter as
long as something happens *eventually*, is what pushed me away from
Python and towards more performant languages like Haskell in the first
place. Sure, you might not notice a few extra seconds when parsing
some file on your quad-core developer desktop, but those seconds turn
into 20 minutes of lost battery power when running on smaller systems.
Having to convert the internal data structure between [Char] (String),
Ptr Word16 (UTF-16 code units), and Ptr Word8 (raw bytes) can quickly
cause user-visible problems.
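
To make the conversion cost concrete, here is a minimal sketch of the
round trip between the three representations. It assumes the "text" and
"bytestring" packages are installed; the helper name roundTrip and the
sample input are hypothetical, chosen only for illustration. Each hop
walks the whole payload and allocates a new copy, though how much
re-encoding happens depends on which internal representation the
installed "text" version uses.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: push a String through Text and ByteString and back.
-- Every step traverses the full input and allocates a fresh structure.
roundTrip :: String -> String
roundTrip s =
  let t  = T.pack s          -- [Char] -> Text
      bs = TE.encodeUtf8 t   -- Text -> ByteString (UTF-8 encoded bytes)
      t' = TE.decodeUtf8 bs  -- ByteString -> Text (throws on invalid UTF-8)
  in  T.unpack t'            -- Text -> [Char]

main :: IO ()
main = do
  let input = "caf\233 au lait"  -- sample text with one non-ASCII character
  -- Size of the UTF-8 encoding in bytes, next to the character count.
  print (B.length (TE.encodeUtf8 (T.pack input)), length input)
  -- The round trip preserves the content but pays for four traversals.
  print (roundTrip input == input)
```

A single round trip like this is cheap on small inputs; the cost shows
up when it sits inside a loop or is applied to large files.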

Libraries that will, by their nature, see heavy use, such as
"bytestring" and "text", deserve close attention to their
performance characteristics. A factor of 2-3x might be the difference
between being able to use a library and having to rewrite its
functionality to be more efficient.

> In the unlikely event that you need to support non-Unicode encodings,
> they are readily available via text-icu.

Unfortunately, text-icu is hardcoded to use libicu 4.0, which was
released well over a year ago and is no longer available in many
distributions. I sent you a patch to support newer versions a few
months ago, but never received a response. Meanwhile, libicu is now up
to 4.4.

