[Haskell-cafe] Strings and utf-8

Reinier Lamers reinier.lamers at phil.uu.nl
Thu Nov 29 05:13:24 EST 2007


Bulat Ziganshin wrote:

>Hello Andrew,
>
>Thursday, November 29, 2007, 1:11:38 AM, you wrote:
>
>>>IMHO, someone should make a full proposal by implementing an alternative
>>>System.IO library that deals with all these encoding issues and
>>>implements H98 IO in terms of that.
>
>>We need two separate interfaces. One for text-mode I/O, one for raw
>>binary I/O.
>
>>When doing text-mode I/O, the programmer needs to be able to explicitly
>>specify exactly which character encoding is required. (Presumably 
>>default to the current 8-bit truncation encoding?)
>
>http://haskell.org/haskellwiki/Library/Streams already exists
>
That would mean we have Streams for character I/O, ByteString for 
binary I/O, and System.IO for, eh, something in between.

That seems rather unfortunate to me. While the "truncate to 8 bits" 
semantics may be nice to keep old code working, it really isn't all that 
intuitive. When I do 'putStr "u\776"' (\776 is U+0308, the combining 
diaeresis), I want a u with an umlaut to appear, not to have it printed 
as if it were "u\8" because 776 mod 256 = 8.

The strange thing is that Hugs at the moment _does_ print a u-umlaut, 
while ghci prints "u\8", which is a u followed by a backspace, so I see 
nothing.
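
To make the arithmetic concrete, here is a minimal sketch of the two 
behaviours. The names truncate8 and utf8 are made up for illustration, 
and truncate8 is only my assumption about what the 8-bit truncation 
amounts to (keeping the low byte of each code point); it is not taken 
from the GHC sources. The utf8 function is a hand-written encoder, not a 
library call.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Assumed model of "truncate to 8 bits": keep only the low byte of the
-- code point (fromIntegral to Word8 wraps modulo 256).
truncate8 :: Char -> Word8
truncate8 = fromIntegral . ord

-- Hand-written UTF-8 encoder for a single character (illustrative only).
utf8 :: Char -> [Word8]
utf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- leading bits that go into the first byte
    hi s   = fromIntegral (n `shiftR` s)
    -- continuation byte: 10xxxxxx with six payload bits
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

main :: IO ()
main = do
  print (map truncate8 "u\776")   -- [117,8]        'u', then backspace
  print (concatMap utf8 "u\776")  -- [117,204,136]  'u', then 0xCC 0x88
```

The first list is the byte sequence that produces the "u followed by a 
backspace" effect; the second is what a UTF-8 terminal would need to 
receive in order to show the u with an umlaut.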

Reinier


