[Haskell-cafe] Has character changed in GHC 6.8?
jules at jellybean.co.uk
Wed Jan 23 07:23:37 EST 2008
Johan Tibell wrote:
>>>>> What *does* matter to the programmer is what encodings putStr and
>>>>> getLine use. AFAIK, they use "lower 8 bits of unicode code point" which
>>>>> is almost functionally equivalent to latin-1.
>>>> Which is terrible! You should have to be explicit about what encoding
>>>> you expect. Python 3000 does it right.
>>> Presumably there wasn't a sufficiently good answer available in time for
>> Will there be one for Haskell Prime?
> The I/O library needs an overhaul but I'm not sure how to do this in a
> backwards compatible manner which probably would be required for
> inclusion in Haskell'. One could, like Python 3000, break backwards
> compatibility. I'm not sure about the implications of doing this.
> Maybe introducing a new System.IO.Unicode module would be an option.
> If one wants to keep the interface but change the semantics slightly
> one could define e.g. getChar as:
> getChar :: IO Char
> getChar = getWord8 >>= decodeChar latin1
> Assuming latin-1 is what's used now.
> The benefit would be that if the input is not in latin-1 an exception
> could be thrown rather than returning a Char representing the wrong
> Unicode code point.
I'm not sure what you mean here. All 256 possible values have a meaning.
I did say 'lower 8 bits of unicode code point which is almost
functionally equivalent to latin-1.'
IIUC, it's latin-1 plus the two control-character ranges.
There are no decoding errors for haskell98's getChar.
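To make the "lower 8 bits of unicode code point" behaviour concrete, here is a minimal sketch. The names decodeLow8 and encodeLow8 are made up for illustration and are not library functions. Decoding is total because every byte value is a valid code point; only encoding loses information.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical name. Decoding: each byte 0..255 is taken directly as the
-- code point U+0000..U+00FF, so no input can fail to decode.
decodeLow8 :: Word8 -> Char
decodeLow8 = chr . fromIntegral

-- Hypothetical name. Encoding: keep only the lower 8 bits of the code
-- point. Anything above U+00FF is silently truncated, not rejected.
encodeLow8 :: Char -> Word8
encodeLow8 = fromIntegral . ord
```

Every byte survives a decode/encode round trip, whereas a Char such as U+20AC (the euro sign) comes out as the single byte 0xAC.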
> My proposal is for I/O functions to specify the encoding they use if
> they accept or return Chars (and Strings). If they deal in terms of
> bytes (e.g. socket functions) they should accept and return Word8s.
I would be more inclined to suggest they default to a particular
well-understood encoding, almost certainly UTF-8. Another interface could
give access to other encodings.
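A rough sketch of what such a pair of interfaces might look like. It assumes a GHC whose System.IO exports hSetEncoding, utf8 and latin1, which is newer than the 6.8 under discussion. The names readFileWith and readFileUtf8 are hypothetical.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: the "other interface", where the caller names the
-- encoding explicitly.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = withFile path ReadMode $ \h -> do
  hSetEncoding h enc
  s <- hGetContents h
  -- Force the whole (lazy) string before withFile closes the handle.
  _ <- evaluate (length s)
  return s

-- Hypothetical helper: the default interface, fixed to UTF-8 regardless
-- of the system locale.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 = readFileWith utf8
```

With this shape, readFileUtf8 behaves the same on every machine, and a caller who really does have latin-1 data says so with readFileWith latin1.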
> Optionally, text I/O functions could default to the system locale
That is a disastrous idea.
Please read the other flamewars^Wdiscussions on this list about this
subject :) One was started by a certain Johan Tibell :)