String != [Char]

Sat Mar 24 21:16:22 CET 2012

Hi Johan,

On Sat, Mar 24, 2012 at 11:50:10AM -0700, Johan Tibell wrote:
> 
> On Sat, Mar 24, 2012 at 12:39 AM, Heinrich Apfelmus
> <apfelmus at quantentunnel.de> wrote:
> > Which brings me to the fundamental question behind this proposal: Why do we
> > need Text at all? What are its virtues and how do they compare? What is the
> > trade-off? (I'm not familiar enough with the Text library to answer these.)
> >
> > To put it very pointedly: is a 20% performance increase on the current
> > generation of computers worth the cost in terms of ease-of-use, when the
> > performance can equally be gained by buying a faster computer or more RAM?
> > I'm not sure whether I even agree with this statement, but this is the
> > trade-off we are deciding on.
> 
> Correctness
> ==========
> 
> Using list-based operations on Strings is almost always wrong

Data.Text seems to think that many of them are worth reimplementing for
Text. It looks like someone's systematically gone through Data.List.
And in fact, very few functions there /don't/ look like they are
directly equivalent to list functions.

> , as
> soon as you move away from English text. You almost always have to
> deal with Unicode strings as blobs, considering several code points at
> once. For example,
> 
>     upcase :: String -> String
>     upcase = map toUpper

This is no more incorrect than
    upcase = Data.Text.map toUpper

There's no reason that there couldn't be a Data.String.toUpper
corresponding to Data.Text.toUpper.
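
To make that concrete, here is a rough sketch (not a proposal for an
actual API). It assumes the text package is installed. The names
upcaseList, upcaseTextMap and upcaseString are made up for illustration,
and upcaseString simply round-trips through Text rather than being a
real Data.String.toUpper:

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Per-character upcasing on a list of Char (the quoted example).
upcaseList :: String -> String
upcaseList = map toUpper

-- The same per-character upcasing, this time on Text.
upcaseTextMap :: T.Text -> T.Text
upcaseTextMap = T.map toUpper

-- Hypothetical String counterpart of Data.Text.toUpper. Here it just
-- borrows the Text implementation; a real one could work on String directly.
upcaseString :: String -> String
upcaseString = T.unpack . T.toUpper . T.pack

main :: IO ()
main = do
  let s = "stra\223e"  -- "strasse" spelled with U+00DF (sharp s)
  -- Both per-character versions leave the sharp s untouched.
  print (upcaseList s == T.unpack (upcaseTextMap (T.pack s)))  -- True
  -- The string-level conversion may change the length.
  print (upcaseString s)  -- "STRASSE"
```

The per-character map gives the same answer on both representations, so
the problem lies in mapping over single code points, not in the choice
of String over Text.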

> Performance
> ===========
> 
> Depending on the benchmark, the difference can be much bigger than
> 20%. For example, here's a comparison of decoding UTF-8 byte data into
> a String vs a Text value:

I think Heinrich meant 20% performance in a useful program, not a
micro-benchmark.

Thanks
Ian