[Haskell-cafe] Abandoning String = [Char]?

Andrew Gibiansky andrew.gibiansky at gmail.com
Fri May 22 17:37:52 UTC 2015


Mario,

Thank you for that detailed write-up. That's exactly the sort of thing I
was looking for.

I imagine a path like the one you describe is possible, but very, very
difficult, and likely the effort could be better spent elsewhere.

I imagine an alternate route (one that would bring immediate gains in the
near future, rather than being a long-term transition plan) would be to have
a `text-base` package, which exports everything `base` does but with `Text`
in place of `String`. You would then build packages against that instead of
`base`, ensuring that you do not rely on []-manipulation for `String` (you
should still have full compatibility with normal `base`).
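
To make the `text-base` idea a bit more concrete, here is a rough sketch of
what client code might look like. No such package is assumed to exist; the
explicit `Prelude`/`Data.Text` imports below merely stand in for a single
hypothetical `import TextBase`, and `shout` and `greet` are made-up examples.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Stand-in for a hypothetical `import TextBase`: the usual Prelude minus
-- the []-based String API, plus the Text equivalents under the same names.
import Prelude hiding (String, words, unwords, putStrLn)
import Data.Text (Text, words, unwords)
import Data.Text.IO (putStrLn)

-- In the imagined package, String would simply be another name for Text.
type String = Text

-- Client code reads as usual but never touches [Char].
shout :: String -> String
shout = unwords . map (<> "!") . words

greet :: String -> IO ()
greet name = putStrLn ("Hello, " <> name)
```

Anything that pattern-matches on `(c:cs)` would fail to type-check against
such a prelude, which is the point: the compiler tells you where you still
depend on the list representation.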

Anyway, hard choices all around, for no 100% clear gain, so I personally do
not envision this happening any time soon. Oh well...

-- Andrew

On Fri, May 22, 2015 at 6:07 PM, Michal Antkiewicz <
mantkiew at gsd.uwaterloo.ca> wrote:

> Mario, thanks for that great writeup.
>
> The switch can only happen if there's a way to make the old code somehow
> transparently work the same or better in the new setup.
>
> Maybe some GHC magic could bring the string operations to Prim Ops and
> transparently switch the underlying representation to Text from [Char].
> Basically, Text would have to become a built in primitive, not a library.
>
> Michał
>
> On Fri, May 22, 2015 at 10:29 AM, Mario Blažević <mblazevic at stilo.com>
> wrote:
>
>> On 15-05-18 06:44 PM, Andrew Gibiansky wrote:
>>
>>> Hey all,
>>>
>>> In the earlier haskell-cafe discussion of IsString, someone mentioned
>>> that it would be nice to abandon [Char] as the blessed string type in
>>> Haskell. I've thought about this on and off for a while now, and think
>>> that the fact that [Char] is the default string type is a really big
>>> issue (for example, it gives beginners the idea that Haskell is
>>> incredibly slow, because everything that involves string processing is
>>> using linked lists).
>>>
>>> I am not proposing anything, but am curious as to what already has been
>>> discussed:
>>>
>>> 1. Has the possibility of migrating away from [Char] been investigated
>>> before?
>>>
>>
>>         No, not seriously as far as I'm aware. That ship sailed a long
>> time ago. Still, as I have actually thought about it, I'll give you an
>> outline of a possible process.
>>
>>
>>  2. What gains could we see in ease of use, performance, etc, if [Char]
>>> was deprecated?
>>>
>>
>>         They could be very significant for any code that took advantage
>> of the new type, but the existing code would not benefit that much. But
>> then, any new Haskell code can already use Text where performance matters.
>>
>>
>>  3. What could replace [Char], while retaining the same ease of use for
>>> writing string manipulation functions (pattern matching, etc)?
>>>
>>
>>         You would not have the same ease of use exactly. The options
>> would lie between two extremes. At one end, you can have a completely
>> opaque String type with fromChars/toChars operations and nothing else. At
>> the other end, you'd implement all operations useful on strings so there
>> would never be any need to convert between String and [Char].
>>
>>         The first extreme would be mostly useless from the performance
>> point of view, but with some GHC magic perhaps it could be made a viable
>> upgrade path. The compiler would have to automatically insert the implicit
>> fromChars/toChars conversion whenever necessary, and I expect that some of
>> the existing Haskell code would still be broken.
>>
>>         Once you have an opaque String type, you can think about
>> improving the performance. A more efficient instance of Monoid String would
>> be a good start, especially since it wouldn't break backward compatibility.
>> Unfortunately that is the only [Char] instance in wide use that can be
>> easily optimized. Perhaps Foldable could be made to work with even more
>> compiler magic, but I doubt it would be worth the effort.
>>
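
A minimal sketch of the "opaque type" extreme described above, under some
assumptions: the type is called `Str` here only to avoid clashing with the
Prelude's `String`, it is backed by `Data.Text`, and `capitalise` is a made-up
example of existing list-style code. None of this is an existing API.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Hypothetical opaque string type; only fromChars/toChars would be exported.
newtype Str = Str T.Text
  deriving (Eq)

fromChars :: [Char] -> Str
fromChars = Str . T.pack

toChars :: Str -> [Char]
toChars (Str t) = T.unpack t

-- With only the two conversions available, list-style code has to
-- round-trip through [Char] on every call.
capitalise :: Str -> Str
capitalise s = case toChars s of
  []       -> s
  (c : cs) -> fromChars (toUpper c : cs)

-- Appending is the one operation that can work on the packed
-- representation directly, and callers keep writing (<>) and mconcat.
instance Semigroup Str where
  Str a <> Str b = Str (a <> b)

instance Monoid Str where
  mempty = Str T.empty
```

The `capitalise` round trip is what the implicit conversions would have to
insert automatically, while the `Monoid` instance is the backward-compatible
improvement.
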
>>         If you add more operations on String that don't require
>>
>>
>>  4. Is there any sort of migration path that would make this change
>>> feasible in mainline Haskell in the medium term (2-5 years)?
>>>
>>
>>         Suppose GHC 7.12 were to bring Text into the core libraries,
>> change Prelude to declare type String = Text, and sprinkle some magic
>> compiler dust to make the explicit Text <-> [Char] conversions unnecessary.
>>
>>         The existing Haskell code would almost certainly perform worse
>> overall. The only improved operations would be mappend on String, and
>> possibly the string literal instantiation.
>>
>>         I don't think there's any chance to get this kind of change
>> proposal accepted today. You'd have to make the pain worth the gain.
>> The only viable path is to ensure beforehand that the change improves
>> more than just the mappend operation.
>>
>>         In other words, you'd have to get today's String to instantiate
>> more classes in common with tomorrow's String, and you'd have to get the
>> everyday Haskell code to use those classes instead of list manipulations.
>>
>>         The first tentative step towards the String type change would
>> then be either the mono-traversable or my own monoid-subclasses package.
>> They both define new type classes that are instantiated by both [Char] and
>> Text. The main difference is that the former builds upon the Foldable
>> foundation, the latter upon Monoid. They are both far from being a complete
>> replacement for list manipulations. But any new code that used their
>> operations would see a big improvement from the String type change.
>>
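
To illustrate what "type classes instantiated by both [Char] and Text" could
look like, here is a small sketch. The `Textual` class and its methods are
hypothetical and invented for this example; they are not the actual API of
mono-traversable or monoid-subclasses.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import qualified Data.Text as T

-- Hypothetical bridging class; names are illustrative only.
class Monoid s => Textual s where
  isEmpty     :: s -> Bool
  splitPrefix :: Int -> s -> (s, s)             -- counterpart of splitAt
  spanChars   :: (Char -> Bool) -> s -> (s, s)  -- counterpart of span

instance Textual [Char] where
  isEmpty     = null
  splitPrefix = splitAt
  spanChars   = span

instance Textual T.Text where
  isEmpty     = T.null
  splitPrefix = T.splitAt
  spanChars   = T.span

-- Written once against the class, so it works with today's String and
-- would keep working (on the packed representation) after a switch.
firstWord :: Textual s => s -> s
firstWord = fst . spanChars (/= ' ')
```

Code written like `firstWord` is the kind that would benefit from the String
type change without needing any edits.
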
>>         Here, then, is the five-year plan you're asking for:
>>
>> Year one: Agree on the ideal set of type classes to bridge the gap
>> between [Char] and Text.
>>
>> Year two: Bring the new type classes into the Prelude. Have all relevant
>> types instantiate them. Everybody's updating their code in delight to use
>> the new class methods.
>>
>> Year three: GHC issues warnings about using list-specific [], ++, null,
>> take, drop, span, etc., on String. Everybody's furiously updating
>> their code.
>>
>> Year four: Add Text to the core libraries. The GHC magic to make the Text
>> <-> [Char] conversions implicit is implemented and ready for testing but
>> requires a pragma.
>>
>> Year five: Update the Haskell language report. Flip the switch.
>>
>> So there. How feasible does that sound?
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>