[Haskell-cafe] Abandoning String = [Char]?

Mario Blažević mblazevic at stilo.com
Fri May 22 14:29:31 UTC 2015


On 15-05-18 06:44 PM, Andrew Gibiansky wrote:
> Hey all,
>
> In the earlier haskell-cafe discussion of IsString, someone mentioned
> that it would be nice to abandon [Char] as the blessed string type in
> Haskell. I've thought about this on and off for a while now, and think
> that the fact that [Char] is the default string type is a really big
> issue (for example, it gives beginners the idea that Haskell is
> incredibly slow, because everything that involves string processing is
> using linked lists).
>
> I am not proposing anything, but am curious as to what already has been
> discussed:
>
> 1. Has the possibility of migrating away from [Char] been investigated
> before?

	No, not seriously as far as I'm aware. That ship sailed a long time 
ago. Still, since I have actually thought about it, I'll give you an 
outline of a possible process.


> 2. What gains could we see in ease of use, performance, etc, if [Char]
> was deprecated?

	They could be very significant for any code that took advantage of the 
new type, but the existing code would not benefit that much. But then, 
any new Haskell code can already use Text where performance matters.


> 3. What could replace [Char], while retaining the same ease of use for
> writing string manipulation functions (pattern matching, etc)?

	You would not have the same ease of use exactly. The options would lie 
between two extremes. At one end, you can have a completely opaque 
String type with fromChars/toChars operations and nothing else. At the 
other end, you'd implement all operations useful on strings so there 
would never be any need to convert between String and [Char].
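
	To make the first extreme concrete, here is a minimal sketch. It assumes the opaque type wraps Data.Text from the text package; the name Str (chosen only to avoid clashing with the Prelude's String) and the function shout are made up for illustration.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Hypothetical opaque string type. A real module would export the type
-- and the two conversions below, but not the constructor.
newtype Str = Str T.Text
  deriving (Eq, Show)

fromChars :: [Char] -> Str
fromChars = Str . T.pack

toChars :: Str -> [Char]
toChars (Str t) = T.unpack t

-- Client code sees only fromChars/toChars, so every operation has to
-- round-trip through a list of characters.
shout :: Str -> Str
shout = fromChars . map toUpper . toChars
```

	Every function written this way pays for two O(n) conversions, which is why this end of the spectrum buys no performance by itself.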

	The first extreme would be mostly useless from the performance point of 
view, but with some GHC magic perhaps it could be made a viable upgrade 
path. The compiler would have to automatically insert the implicit 
fromChars/toChars conversion whenever necessary, and I expect that some 
of the existing Haskell code would still be broken.

	Once you have an opaque String type, you can think about improving the 
performance. A more efficient instance of Monoid String would be a good 
start, especially since it wouldn't break backward compatibility. 
Unfortunately that is the only [Char] instance in wide use that can be 
easily optimized. Perhaps Foldable could be made to work with even more 
compiler magic, but I doubt it would be worth the effort.
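
	As a rough sketch of that Monoid instance, again assuming a hypothetical Str newtype over Data.Text (none of these names exist in any library), appending can delegate to the packed representation instead of walking a linked list:

```haskell
import qualified Data.Text as T

-- Hypothetical opaque string type backed by Text.
newtype Str = Str T.Text
  deriving (Eq, Show)

fromChars :: [Char] -> Str
fromChars = Str . T.pack

toChars :: Str -> [Char]
toChars (Str t) = T.unpack t

-- Appending works directly on the packed representation.
instance Semigroup Str where
  Str a <> Str b = Str (T.append a b)

instance Monoid Str where
  mempty = Str T.empty
  -- Concatenate all the pieces in one call rather than pairwise.
  mconcat strs = Str (T.concat [t | Str t <- strs])
```

	Code that already uses mappend or mconcat on strings would keep compiling against this type, which is the backward-compatibility point made above.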

	If you add more operations on String that don't require


> 4. Is there any sort of migration path that would make this change
> feasible in mainline Haskell in the medium term (2-5 years)?

	Suppose GHC 7.12 were to bring Text into the core libraries, change 
Prelude to declare type String = Text, and sprinkle some magic compiler 
dust to make the explicit Text <-> [Char] conversions unnecessary.

	The existing Haskell code would almost certainly perform worse overall. 
The only improved operations would be mappend on String, and possibly 
the string literal instantiation.
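
	To illustrate why, here is roughly what the implicit conversion would amount to for an existing list-based function. This is only a sketch of the idea, not of any actual GHC mechanism; capitalize and capitalizeText are made-up names.

```haskell
import Data.Char (toUpper)
import qualified Data.Text as T

-- Existing code, written against [Char] with list pattern matching.
capitalize :: [Char] -> [Char]
capitalize []       = []
capitalize (c : cs) = toUpper c : cs

-- With String = Text, the compiler would in effect have to wrap the old
-- definition like this: unpack the whole Text, run the list code, repack.
capitalizeText :: T.Text -> T.Text
capitalizeText = T.pack . capitalize . T.unpack
```

	The list version touches only the first character, while the wrapped version converts the entire string twice on every call.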

	I don't think there's any chance of getting this kind of change proposal 
accepted today. You'd have to make the gain worth the pain. The only 
viable path is to ensure beforehand that the change improves more than 
just the mappend operation.

	In other words, you'd have to get today's String to instantiate more 
classes in common with tomorrow's String, and you'd have to get the 
everyday Haskell code to use those classes instead of list manipulations.

	The first tentative step towards the String type change would then be 
either the mono-traversable or my own monoid-subclasses package. They 
both define new type classes that are instantiated by both [Char] and 
Text. The main difference is that the former builds upon the Foldable 
foundation, the latter upon Monoid. They are both far from being a 
complete replacement for list manipulations. But any new code that used 
their operations would see a big improvement from the String type change.
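
	To show the general shape of such a class, here is a self-contained sketch. StringLike and its methods are hypothetical and are not the API of mono-traversable or monoid-subclasses; they only illustrate one class instantiated by both [Char] and Text.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Char (isSpace)
import qualified Data.Text as T

-- Hypothetical bridging class; the method names are invented here.
class Monoid s => StringLike s where
  isEmpty   :: s -> Bool
  takeChars :: Int -> s -> s
  dropChars :: Int -> s -> s
  spanChars :: (Char -> Bool) -> s -> (s, s)

instance StringLike [Char] where
  isEmpty   = null
  takeChars = take
  dropChars = drop
  spanChars = span

instance StringLike T.Text where
  isEmpty   = T.null
  takeChars = T.take
  dropChars = T.drop
  spanChars = T.span

-- Written once against the class: skip leading whitespace, then take
-- everything up to the next whitespace character.
firstWord :: StringLike s => s -> s
firstWord = fst . spanChars (not . isSpace) . snd . spanChars isSpace
```

	A function like firstWord runs on today's String unchanged and would pick up the Text implementation for free after the switch.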

	Here, then, is the five-year plan you're asking for:

Year one: Agree on the ideal set of type classes to bridge the gap 
between [Char] and Text.

Year two: Bring the new type classes into the Prelude. Have all relevant 
types instantiate them. Everybody's updating their code in delight to 
use the new class methods.

Year three: GHC issues warnings about using List-specific [], ++, null, 
take, drop, span, etc., on String. Everybody's furiously updating 
their code.

Year four: Add Text to the core libraries. The GHC magic to make the 
Text <-> [Char] conversions implicit is implemented and ready for 
testing but requires a pragma.

Year five: Update Haskell language report. Flip the switch.
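
	As an illustration of the kind of rewrite the year-three warnings would push people towards, here is a small before/after sketch that needs nothing beyond the existing Monoid class. joinList and joinWith are made-up names.

```haskell
import qualified Data.Text as T

-- Before: tied to lists through [] and ++ on the string itself.
joinList :: String -> [String] -> String
joinList _   []       = []
joinList _   [x]      = x
joinList sep (x : xs) = x ++ sep ++ joinList sep xs

-- After: only Monoid methods are used on the string, so the same
-- definition works for [Char] today and for Text after the switch.
joinWith :: Monoid s => s -> [s] -> s
joinWith _   []       = mempty
joinWith _   [x]      = x
joinWith sep (x : xs) = x <> sep <> joinWith sep xs
```

	The outer list of pieces is still an ordinary list; only the operations on the string type itself have been made generic.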

So there. How feasible does that sound?


