[Haskell-cafe] Can we improve Show instance for non-ascii characters?
Takayuki Muranushi
muranushi at gmail.com
Wed Feb 3 02:20:42 UTC 2016
Dear David Kraeutmann, thank you for your advice. I now see that the
best thing to do before submitting something to GHC Trac was simply
to submit it :)
https://ghc.haskell.org/trac/ghc/ticket/11529#ticket
Dear David Feuer, thank you for your suggestion. I also think `ushow`
is a good idea. We would then like a ghci flag that switches the
`show` used in the REPL to `ushow`.
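
For concreteness, here is a minimal sketch of what such a `ushow` might
look like, following the description quoted below: call `show`, then
turn decimal escapes back into characters. The name `ushow` and the
helper `unescape` are hypothetical and not part of base. The sketch
assumes the Show instance emits Haskell-style escapes, so it can be
wrong for hand-written instances that print a backslash followed by
digits for other reasons.

```haskell
import Data.Char (chr, isDigit, isPrint)

-- Hypothetical helper, not part of base: like 'show', but printable
-- non-ASCII characters are emitted as themselves instead of as
-- decimal escapes such as \28450.
ushow :: Show a => a -> String
ushow = unescape . show

-- Walk over the output of 'show' and decode decimal escapes.
unescape :: String -> String
-- An escaped backslash is copied as a pair, so the digits that may
-- follow it are not mistaken for a numeric escape.
unescape ('\\' : '\\' : rest) = '\\' : '\\' : unescape rest
unescape ('\\' : rest@(d : _))
  | isDigit d =
      let (digits, rest') = span isDigit rest
          n = read digits :: Integer
      in if n > 127 && n <= 0x10FFFF && isPrint (chr (fromInteger n))
           -- Printable non-ASCII code point: emit the character itself.
           then chr (fromInteger n) : unescape (dropGap rest')
           -- Anything else (control characters, out-of-range numbers)
           -- keeps its escaped form.
           else '\\' : digits ++ unescape rest'
  where
    -- 'show' inserts \& after a numeric escape when the next character
    -- is a digit; once the escape is decoded the separator is not needed.
    dropGap ('\\' : '&' : xs) = xs
    dropGap xs                = xs
unescape (c : rest) = c : unescape rest
unescape []         = []
```

In ghci, `putStrLn (ushow ["田中の父"])` should then print `["田中の父"]`,
while `ushow "\n"` still yields the escaped form, as `show` does.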
Shall we continue the discussion on ticket #11529? Could you also
please help me list the cases where the change would break the
assumptions of existing code?
Takayuki MURANUSHI
RIKEN Advanced Institute for Computational Science
http://nushio3.github.io/
http://www.geocities.jp/takascience/
2016-02-03 11:05 GMT+09:00 David Feuer <david.feuer at gmail.com>:
> Unfortunately, I don't think there is any way to do exactly this
> without breaking the assumptions of a lot of existing code. But I
> suspect you can work around the problem in a few different ways, one
> of which strikes me as reasonable, if not quite perfectly accurate in
> all cases:
>
> Write a function ushow that applies show to the given element, then
> digs through the resulting string converting escape sequences
> corresponding to valid Unicode codepoints into those codepoints.
>
> On Tue, Feb 2, 2016 at 8:37 PM, Takayuki Muranushi <muranushi at gmail.com> wrote:
>> The Show instance for non-ASCII characters prints their character codes.
>> This is sad for Haskell users who speak languages other than English.
>>
>>> 'A'
>> 'A'
>>> 'Ä'
>> '\196'
>>> '漢'
>> '\28450'
>>> print $ [(++"'s dad"), (++"'s mom")] <*> ["Simon", "John"]
>> ["Simon's dad","John's dad","Simon's mom","John's mom"]
>>> print $ [(++"の父"), (++"の母")] <*> ["田中", "山田"]
>> ["\30000\20013\12398\29238","\23665\30000\12398\29238","\30000\20013\12398\27597","\23665\30000\12398\27597"]
>>
>> The function that needs improvement is showLitChar in GHC.Show, which
>> currently prints any character above ASCII code 127 as its numeric
>> character code:
>>
>> http://haddock.stackage.org/lts-5.1/base-4.8.2.0/src/GHC-Show.html
>>
>> showLitChar :: Char -> ShowS
>> showLitChar c s | c > '\DEL' = showChar '\\' (protectEsc isDec (shows (ord c)) s)
>>
>> On the other hand, there is GHC.Unicode.isPrint, the predicate for
>> printable Unicode characters, which calls the foreign function
>> u_iswprint to make that decision.
>>
>> https://hackage.haskell.org/package/base-4.8.2.0/docs/src/GHC.Unicode.html#isPrint
>>
>> I think one solution is to import and call u_iswprint from GHC.Show
>> as well, but I don't know whether that goes against any design choices.
>>
>>
>>
>> Yesterday, I had a chance to teach Haskell (in Japanese), and I had to
>> use English in some of the most exciting examples, like the
>> Applicative List example above. I would wholeheartedly like to see GHC
>> improve in this direction, so that we can make happier learning
>> materials for Haskell.
>>
>> Let me ask your opinions on the best way to do this (or whether it is
>> better not to do it) before I submit something to GHC Trac.
>>
>>
>> Best,
>>
>> --------------------------------
>> -- Takayuki MURANUSHI
>> -- RIKEN Advanced Institute for Computational Science
>> -- http://nushio3.github.io/
>> -- http://www.geocities.jp/takascience/
>> --------------------------------