[Haskell-cafe] Has character changed in GHC 6.8?

Ketil Malde ketil+haskell at ii.uib.no
Wed Jan 23 09:15:56 EST 2008


"Johan Tibell" <johan.tibell at gmail.com> writes:

>>> The benefit would be that if the input is not in latin-1 an exception
>>> could be thrown rather than returning a Char representing the wrong
>>> Unicode code point.

>> I'm not sure what you mean here. All 256 possible values have a meaning.

OTOH, going the other way could be more troublesome: I'm not sure that
outputting a truncated value is what you want.
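
As a minimal sketch of that asymmetry (the names decodeLatin1 and
encodeTruncating are invented for illustration, they are not taken from
GHC's libraries): decoding a byte is total, while a truncating encoder
keeps only the low 8 bits of the code point.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: every byte 0x00-0xFF is the Latin-1 character
-- with the same code point, so decoding can never fail.
decodeLatin1 :: Word8 -> Char
decodeLatin1 = chr . fromIntegral

-- Hypothetical helper: fromIntegral from Int to Word8 wraps modulo 256,
-- so code points above 0xFF silently collide with unrelated bytes.
encodeTruncating :: Char -> Word8
encodeTruncating = fromIntegral . ord
```

With these definitions, encodeTruncating '\x20AC' (EURO SIGN) yields
0xAC, the same byte as for '\xAC' (NOT SIGN).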

> You're of course right. So we don't have a problem here. Maybe I was
> thinking of an encoding (7-bit ASCII?) where some of the 256 values
> are invalid.

Well - each byte can be converted to the equivalent code point, but
0x80-0x9F are control characters, and some of those are left
undefined.  Perhaps instead of truncating on output, we should map
code points > 0xFF to such a value?  E.g. 0x81 has no assigned
function as a C1 control in Unicode, and is undefined in Windows 1252.
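
A minimal sketch of that mapping, assuming 0x81 as the replacement
byte (encodeReplacing and replacementByte are hypothetical names, not
an existing library API):

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Assumption: 0x81 is used as the "cannot be represented" marker,
-- following the suggestion in the text.
replacementByte :: Word8
replacementByte = 0x81

-- Hypothetical encoder: code points up to 0xFF map to the byte with
-- the same value; anything larger becomes the replacement byte
-- instead of being truncated.
encodeReplacing :: Char -> Word8
encodeReplacing c
  | n <= 0xFF = fromIntegral n
  | otherwise = replacementByte
  where
    n = ord c
```

Note that a genuine U+0081 in the input also encodes to 0x81, so a
reader of the output cannot tell it apart from a replaced character.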

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants