[Haskell-cafe] getting crazy with character encoding

Andrea Rossato mailing_list at istitutocolli.org
Wed Sep 12 11:26:49 EDT 2007


On Wed, Sep 12, 2007 at 11:16:25AM -0400, Seth Gordon wrote:
>  It appears that in spite of the locale definition, hGetContents is treating 
>  each byte as a separate character without translating the multi-byte 
>  sequences *from* UTF-8, and then putStrLn sends each of those bytes to 
>  standard output without translating the non-ASCII characters *to* UTF-8.  So 
>  the second line of your program's output is correct...but only by accident.

That's it indeed. As I said in the message I've just sent, I've read
that the String/CString conversion is automatically done in
ISO-8859-1, so "èèè", which is 6 bytes in UTF-8, is translated
into 6 ISO-8859-1 characters.
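
To make the byte counting concrete, here is a minimal sketch of that
round trip. It assumes the behavior described in this thread (each input
byte becomes one Char, and each Char is written back as one byte). The
names utf8Bytes, latin1Decode and latin1Encode are hypothetical helpers
written for illustration; they are not library functions.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- UTF-8 encoding of "èèè": U+00E8 is encoded as the two bytes 0xC3 0xA8.
utf8Bytes :: [Word8]
utf8Bytes = [0xC3, 0xA8, 0xC3, 0xA8, 0xC3, 0xA8]

-- ISO-8859-1 style decoding: every byte becomes the Char whose code
-- point has the same numeric value, so no multi-byte sequence is combined.
latin1Decode :: [Word8] -> String
latin1Decode = map (chr . fromIntegral)

-- The reverse direction: every Char is written as a single byte
-- (code points above 255 would be truncated to their low 8 bits).
latin1Encode :: String -> [Word8]
latin1Encode = map (fromIntegral . ord)

main :: IO ()
main = do
  let s = latin1Decode utf8Bytes
  print (length s)                     -- 6 characters, not 3
  print (latin1Encode s == utf8Bytes)  -- True: the bytes survive unchanged
```

If output works the same way in reverse, the original six bytes reach
the terminal untouched, and a UTF-8 terminal then renders them as "èèè"
even though the program never held those three characters.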

What puzzles me is the behavior of putStrLn.

Thanks for your time.

Andrea


