[Haskell-cafe] Has character changed in GHC 6.8?

Jules Bean jules at jellybean.co.uk
Wed Jan 23 07:23:37 EST 2008


Johan Tibell wrote:
>>>>> What *does* matter to the programmer is what encodings putStr and
>>>>> getLine use. AFAIK, they use "lower 8 bits of unicode code point" which
>>>>> is almost functionally equivalent to latin-1.
>>>> Which is terrible! You should have to be explicit about what encoding
>>>> you expect. Python 3000 does it right.
>>> Presumably there wasn't a sufficiently good answer available in time for
>>> haskell98.
>> Will there be one for haskell prime ?
> 
> The I/O library needs an overhaul but I'm not sure how to do this in a
> backwards compatible manner which probably would be required for
> inclusion in Haskell'. One could, like Python 3000, break backwards
> compatibility. I'm not sure about the implications of doing this.
> Maybe introducing a new System.IO.Unicode module would be an option.
> 
> If one wants to keep the interface but change the semantics slightly
> one could define e.g. getChar as:
> 
> getChar :: IO Char
> getChar = getWord8 >>= decodeChar latin1
> 
> Assuming latin-1 is what's used now.
> 
> The benefit would be that if the input is not in latin-1 an exception
> could be thrown rather than returning a Char representing the wrong
> Unicode code point.

I'm not sure what you mean here. All 256 possible values have a meaning.

I did say 'lower 8 bits of unicode code point which is almost 
functionally equivalent to latin-1.'

IIUC, it's latin-1 plus the two control-character ranges.

There are no decoding errors for haskell98's getChar.
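
A minimal sketch of why that holds, assuming the "lower 8 bits of unicode code point" behaviour described above. The names decodeByte, encodeLow8 and roundTripsAllBytes are hypothetical helpers for illustration, not part of any library:

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: read a byte as the code point with the same number.
-- Every value 0..255 is a valid code point, so this can never fail.
decodeByte :: Word8 -> Char
decodeByte = chr . fromIntegral

-- Hypothetical helper: keep only the lower 8 bits of the code point.
-- fromIntegral wraps modulo 256, so anything above U+00FF is silently mangled.
encodeLow8 :: Char -> Word8
encodeLow8 = fromIntegral . ord

-- Decoding then encoding is the identity on all 256 byte values.
roundTripsAllBytes :: Bool
roundTripsAllBytes =
  all (\b -> encodeLow8 (decodeByte b) == b) [minBound .. maxBound]
```

Loaded into GHCi, roundTripsAllBytes evaluates to True, while encodeLow8 '\955' (a Greek lambda) gives 187, the byte for U+00BB. So the loss is on the output side; input decoding has no error case to report.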

> My proposal is for I/O functions to specify the encoding they use if
> they accept or return Chars (and Strings). If they deal in terms of
> bytes (e.g. socket functions) they should accept and return Word8s.

I would be more inclined to suggest they default to a particular well-
understood encoding, almost certainly UTF-8. Another interface could give 
access to other encodings.
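
One way that split could look, as a sketch only. It assumes the hSetEncoding, utf8 and latin1 names that System.IO provides in later versions of GHC's base library (they are not in Haskell 98 or GHC 6.8), and openTextFile / openTextFileWith are hypothetical names invented here:

```haskell
import System.IO

-- Hypothetical default interface: text is always decoded as UTF-8,
-- whatever the system locale says.
openTextFile :: FilePath -> IO Handle
openTextFile = openTextFileWith utf8

-- Hypothetical second interface: the caller names the encoding explicitly.
openTextFileWith :: TextEncoding -> FilePath -> IO Handle
openTextFileWith enc path = do
  h <- openFile path ReadMode
  hSetEncoding h enc  -- replace the handle's default encoding with the requested one
  return h
```

With this shape, openTextFile "notes.txt" reads UTF-8, and openTextFileWith latin1 "legacy.txt" opts into another encoding at the call site. The file names are placeholders.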

> Optionally, text I/O functions could default to the system locale
> setting.

That is a disastrous idea.

Please read the other flamewars^Wdiscussions on this list about this 
subject :) One was started by a certain Johan Tibell :)

http://haskell.org/pipermail/haskell-cafe/2007-September/031724.html

http://haskell.org/pipermail/haskell-cafe/2007-September/032195.html

Jules


