String != [Char]
johan.tibell at gmail.com
Mon Mar 26 19:14:03 CEST 2012
On Mon, Mar 26, 2012 at 9:59 AM, Henrik Nilsson <nhn at cs.nott.ac.uk> wrote:
> So, is the argument to deprecate Char, then? As long as Haskell
> allows Chars to be handled in isolation, it would seem impossible
> to prevent naive users from accidentally stumbling over the
> complexities of Unicode?
I haven't proposed anything at all. Someone asked why one should
prefer Text to String. I showed that the former is more correct (given
the currently available APIs) and much faster.
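
One common illustration of the correctness point (a minimal sketch, not necessarily the example from the earlier message): upper-casing a String one Char at a time can never change its length, whereas Data.Text's toUpper works on the whole string and can expand a character such as the German eszett (U+00DF) to "SS". The helper names upperString and upperText are made up for this sketch; it assumes the text package is installed.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Per-Char upper-casing: every Char maps to exactly one Char,
-- so U+00DF (eszett) is left unchanged.
upperString :: String -> String
upperString = map toUpper

-- Whole-string upper-casing: the result may be longer than the input.
upperText :: T.Text -> T.Text
upperText = T.toUpper

main :: IO ()
main = do
  let word = "stra\223e"  -- "strasse" spelled with an eszett (U+00DF)
  -- Compare lengths rather than printing the strings, so the output
  -- does not depend on the terminal's encoding.
  print (length (upperString word))
  print (T.length (upperText (T.pack word)))
```

The two lengths differ because only the Text version can map one character to two.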
> There are canonical equivalence and compatibility, and each
> has two normal forms (fully composed and fully decomposed),
> and "each of these four normal forms can be used in text processing".
> As an example of the difference between "equivalent" and "compatible",
> the ligature "ﬀ" is "compatible - but not canonically equivalent"
> to a sequence of two characters latin "f", meaning they "may be treated the
> same way in some applications (such as sorting and indexing), but not in
> others; and may be substituted for each other in some situations, but not in
> others".
> Is it realistic to think that if only Haskell used Text and not
> String = [Char], a naive user/beginner would be able to write
> correct code for all manner of text processing tasks without
> needing to understand a great deal about Unicode?
> I'm sorry, but I'm rather sceptical.
Why? We can hide most of these details behind the Text API. We can
pick which encoding and normal form are used internally and then have
the externally provided API for e.g. sorting do the right thing.
> So I reiterate that I see little if any gain, be it in terms of making
> life simpler for beginners, making Haskell more "multi cultural", or
> giving Haskell applications in general a performance boost, in
> deprecating String = [Char] and mandating the use of Text.
> But the costs would be massive.
I agree and thus I don't propose we do something like that.
The way this will go down is that part of the Prelude and other base
modules will eventually be replaced by more modern packages (e.g. see
system-fileio) and the use of String will decline. Unfortunately it's
a bit of a painful transition as today we need to convert back and
forth between the two string types quite a lot.
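
As a rough sketch of that back-and-forth conversion (assuming the text package; the function names shout and shoutString are hypothetical), code that works on Text internally still has to pack and unpack at every String-based boundary, such as getProgName from base:

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.Environment (getProgName)

-- Text-based logic: trim surrounding whitespace, then upper-case.
shout :: T.Text -> T.Text
shout = T.toUpper . T.strip

-- The same logic exposed to String-based callers: convert in, convert out.
shoutString :: String -> String
shoutString = T.unpack . shout . T.pack

main :: IO ()
main = do
  name <- getProgName              -- base returns a String here
  TIO.putStrLn (shout (T.pack name))  -- so we pack before using Text functions
  putStrLn (shoutString "  hello ")
```

Each T.pack or T.unpack is a full copy of the string, which is part of why the transition is painful.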