String != [Char]

Edward Kmett ekmett at gmail.com
Fri Mar 23 20:30:52 CET 2012


Like I said, my objection to including Text is a lot less strong than my feelings on any notion of deprecating String.

However, I still see a potentially huge downside from a pedagogical perspective to pushing Text, especially into a place where it will be front and center for new users. String lets the user learn about induction and encourages a "Haskelly" programming style, where you aren't mucking about with indices and Builders everywhere, which are frankly very difficult to use when building Text. If you cons or append to build up a Text fragment, frankly you're doing it wrong.

The pedagogical concern is quite real. Remember, many introductory language classes have time to present Haskell and the list data type and not much else. Showing parsing through pattern matching on strings makes for a very powerful tool; it's harder to show that with Text.
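To make the point concrete, here is a minimal sketch of the kind of classroom parser meant here, using only the Prelude and Data.Char. The names digits and parseSum are hypothetical, picked for illustration, and the input format (decimal numbers separated by '+') is an assumption.

```haskell
import Data.Char (isDigit)

-- Split a leading run of digits off a String by structural recursion.
-- If the guard fails, matching falls through to the second equation.
digits :: String -> (String, String)
digits (c:cs)
  | isDigit c = let (ds, rest) = digits cs in (c : ds, rest)
digits rest = ("", rest)

-- Parse sums such as "12+30+0" by pattern matching on the remaining input.
parseSum :: String -> Maybe Int
parseSum s = case digits s of
  ("", _)        -> Nothing                          -- no number here
  (ds, "")       -> Just (read ds)                   -- last number
  (ds, '+':rest) -> (read ds +) <$> parseSum rest    -- number, '+', more
  _              -> Nothing                          -- unexpected character
```

Every step is a pattern match on a list constructor, so the induction is visible in the code. A Text version would have to go through functions such as uncons or span instead of matching on (:) directly.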

But even when taking apart Text, the choice of UTF-16 internally makes it pretty much a worst case for many string manipulation purposes (e.g. slicing has to spend linear time scanning the string), due to the existence of codepoints outside of plane 0.
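The cost can be sketched without touching the text library at all. The following is a toy model, assuming a UTF-16 layout as described above; utf16Width and unitOffset are hypothetical names, and a plain String stands in for the packed array.

```haskell
import Data.Char (ord)

-- Number of 16-bit code units one character occupies in UTF-16.
utf16Width :: Char -> Int
utf16Width c
  | ord c >= 0x10000 = 2   -- outside plane 0: needs a surrogate pair
  | otherwise        = 1   -- plane 0: a single code unit

-- Code-unit offset at which the n-th character starts.
-- Every earlier character has to be inspected, so this is linear in n.
unitOffset :: Int -> String -> Int
unitOffset n = sum . map utf16Width . take n
```

For the four characters ['a', '\x1D11E', 'b', 'c'], unitOffset 2 is 3 rather than 2, because the second character takes two code units. Since widths vary, the array position of a character index cannot be computed by arithmetic alone.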

The major benefits of Text come from FFI opportunities, but even there, if you dig into its internals, it has to copy out of the array to talk to foreign functions because, unlike ByteString, it lives in unpinned memory.

The workarounds for these limitations all require access to the internals, so a Text proposed in an implementation-agnostic manner is less than useful, and one supplied with a rigid set of implementation choices seems to fossilize the current design.

All of these things make me lean towards a position that it is premature to push Text as the one true text representation.

That said, I am very sympathetic to the position that the standard should ensure that there are Text equivalents for all of the exposed string operations, like read, show, etc., and the various IO primitives, so that a user who is savvy to all of these concerns has everything he needs to make his code perform well.

Sent from my iPad

On Mar 23, 2012, at 1:32 PM, Brandon Allbery <allbery.b at gmail.com> wrote:

> On Fri, Mar 23, 2012 at 13:05, Edward Kmett <ekmett at gmail.com> wrote:
> Isn't it enough that it is part of the platform?
> 
> As long as the entire Prelude and large chunks of the bootlibs are based around String, String will be preferred.  String as a boxed singly-linked list type is therefore a major problem.
> 
> -- 
> brandon s allbery                                      allbery.b at gmail.com
> wandering unix systems administrator (available)     (412) 475-9364 vm/sms
> 