Unicode, strings, and Show
allbery.b at gmail.com
Thu Mar 31 01:58:34 UTC 2016
On Wed, Mar 30, 2016 at 9:50 PM, Manuel M T Chakravarty <
chak at justtesting.org> wrote:
> Firstly, we have
> isPrint :: Char -> Bool
> Are you saying that this type is wrong?
> Secondly, how often do you feed the output of ’show’ to ’read’ in another
> locale versus how often is everybody whose whole life is outside of ASCII
> (i.e., not anglo-centric people) bothered by this shortcoming? (*)
> Moreover, the argument on the ticket was that changing the current
> implementation would go against the standard. Now that I am saying, the
> current implementation is not conforming to the standard, the standard
> suddenly doesn’t seem to matter. Personally, I would say, when we wrote
> that standard, we knew what we were doing.
The standard I am aware of is the Report, which deliberately limited the
output to the subset which is guaranteed to be usable in all locales. show
conforms to this; apparently people want it to *not* conform, and in a way
which requires some locale to become the One True Locale.
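As a minimal sketch of the behaviour being discussed (the sample string and the helper names showIsAscii and roundTrips are chosen here for illustration and do not come from the ticket): show escapes characters outside ASCII as numeric escapes, so its output is pure ASCII and read can recover the original string.

```haskell
import Data.Char (isAscii)

-- | True when 'show' produced only ASCII characters for this string.
--   (Hypothetical helper, for illustration only.)
showIsAscii :: String -> Bool
showIsAscii = all isAscii . show

-- | True when 'read' recovers the original string from 'show' output.
--   (Hypothetical helper, for illustration only.)
roundTrips :: String -> Bool
roundTrips s = read (show s) == s

main :: IO ()
main = do
  let s = "na\239ve"        -- U+00EF (i with diaeresis) written as an escape
  putStrLn (show s)         -- prints "na\239ve", quotes and backslash included
  print (showIsAscii s)     -- True
  print (roundTrips s)      -- True
```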
isPrint is, as per the language Report, based on what Char is --- which is
Unicode codepoints. Using it for output --- or for input, for that matter
--- gets you into locale issues because nobody anywhere guarantees that
Unicode codepoints that pass isPrint are representable in every locale.
isPrint is not the place to verify that a character can actually be
displayed in the current locale.
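To illustrate the distinction, here is a rough sketch, assuming a hypothetical helper asciiSafe that stands in for "representable in a strictly ASCII locale". isPrint answers a question about the Unicode code point only; what the running process can actually encode is a separate, environment-dependent question, visible here through localeEncoding from System.IO.

```haskell
import Data.Char (isAscii, isPrint)
import System.IO (localeEncoding)

-- | Hypothetical stand-in for "can be shown in a strictly ASCII locale".
--   Not part of base; defined here only for illustration.
asciiSafe :: Char -> Bool
asciiSafe c = isPrint c && isAscii c

main :: IO ()
main = do
  let lambda = '\955'         -- U+03BB, GREEK SMALL LETTER LAMDA
  print (isPrint lambda)      -- True: printable according to Unicode
  print (asciiSafe lambda)    -- False: not representable in plain ASCII
  print localeEncoding        -- result depends on the environment
```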
Or have you decided that ghc should require Unicode locales and nothing but
Unicode locales from now on? If so, what do you do when the next issue
comes up, where Unix is UTF-8 and Windows is UTF-16?
brandon s allbery kf8nh sine nomine associates
allbery.b at gmail.com ballbery at sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net