[RFC] Support Unicode characters in instance Show String

Oleg Grenrus oleg.grenrus at iki.fi
Sun Jul 11 12:11:02 UTC 2021


This is a tricky design. Any instance in the composition that doesn't
define showsPrecUnicode will ruin the formatting of inner Strings.
Maybe it's not as bad as in aeson*, as most Show instances are derived.
(But not Show1 etc.)
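
To make the failure mode concrete, here is a minimal sketch, not the
proposed implementation. Show can't be extended from user code, so it
models the two new methods with a stand-in class. The names ShowU,
showsPrecU, showListU, showU, escapeU, Good and Legacy are all
hypothetical. The escaping is simplified: it keeps printable characters
and does not insert the `\&` separators that the real show uses.

```haskell
import Data.Char (isPrint, showLitChar)

-- Hypothetical stand-in for Show extended with the two new methods.
-- The defaults fall back to the ordinary (escaping) Show methods.
class Show a => ShowU a where
  showsPrecU :: Int -> a -> ShowS
  showsPrecU = showsPrec

  showListU :: [a] -> ShowS
  showListU = showList

showU :: ShowU a => a -> String
showU x = showsPrecU 0 x ""

-- Simplified escaping (an assumption): the quote character and the
-- backslash are escaped, printable characters are kept as they are,
-- everything else goes through showLitChar.
escapeU :: Char -> Char -> String
escapeU q c
  | c == q || c == '\\' = ['\\', c]
  | isPrint c           = [c]
  | otherwise           = showLitChar c ""

instance ShowU Char where
  showsPrecU _ c = showChar '\'' . showString (escapeU '\'' c) . showChar '\''
  showListU cs   = showChar '"' . showString (concatMap (escapeU '"') cs) . showChar '"'

instance ShowU a => ShowU [a] where
  showsPrecU _ = showListU

-- An instance that defines the new method, as a derived one would.
newtype Good = Good String deriving Show

instance ShowU Good where
  showsPrecU d (Good s) =
    showParen (d > 10) $ showString "Good " . showsPrecU 11 s

-- A hand-written instance that predates the new method.
newtype Legacy = Legacy String

instance Show Legacy where
  showsPrec d (Legacy s) =
    showParen (d > 10) $ showString "Legacy " . showsPrec 11 s

-- Only the defaults are used, so the inner String is escaped again.
instance ShowU Legacy
```

Here showU (Good "世界") keeps the characters readable, while
showU (Legacy "世界") gives the same escaped output as show does, and so
does any structure that contains a Legacy value.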

* The problem with aeson is that the ToJSON class has toJSON (the old
method) and toEncoding, which is fast but whose default implementation
uses the slower toJSON.
   Thus a derived `instance ToJSON MyType` generally ruins the
performance silently.

It might be better not to define a default implementation for
showsPrecUnicode (because most instances are derived). That will
break explicitly written instances, but fixing them is straightforward.
(GHC developers who use head.hackage will hate this option, though.)

I'm not sure this is any better than a separate class, or whether two
type classes (either actually two or two-in-one) are a good idea.

- Oleg

On 11.7.2021 7.01, Viktor Dukhovni wrote:
> On Thu, Jul 08, 2021 at 06:11:28PM +0800, Kai Ma wrote:
>
>> It's proposed here to change the Show instance of String, to achieve the following output:
>>
>>     ghci> print "Hello, 世界"
>>     "Hello, 世界"
>>
>>     ghci> print "Hello, мир"
>>     "Hello, мир"
>>
>>     ghci> print "Hello, κόσμος"
>>     "Hello, κόσμος"
>>
>>     ghci> "Hello, 世界"
>>     "Hello, 世界"
>>
>>     ghci> "😀"
>>     "😀"
> Another possibility is to extend the `Show` class with two new methods
> and their default implementations:
>
>     class Show a where
>         ...
>         showsPrecUnicode :: Int -> a -> ShowS
>         showsPrecUnicode = showsPrec
>
>         showListUnicode :: [a] -> ShowS
>         showListUnicode = showList
>
>     showUnicode :: Show a => a -> String
>     showUnicode x = showsPrecUnicode 0 x ""
>
> at which point a small number of instances can override
> `showsPrecUnicode` and `showListUnicode`:
>
>     instance Show a => Show [a] where
>         showsPrec _      = showList
>         showsPrecUnicode _ = showListUnicode
>
>     instance Show Char where
>         showsPrecUnicode = ... -- Unicode char
>         showListUnicode  = ... -- Unicode string 
>
>     instance Show Text where
>         showsPrecUnicode = ... -- Unicode text
>
> Once these are implemented, "ghci" can be modified to instead use
> `showUnicode`, rather than `show`, with no new incompatibilities
> elsewhere.
>
> We can also introduce `uprint = putStrLn . showUnicode`, ...
>
> This would still require explicit opt-in to use the Unicode show,
> but it would be available for all `Show` instances, and used by
> default in "ghci".
>
> It would still be a good idea to implement `Render`, which is a related
> but separate concern.
>

