[Haskell-cafe] How to use Unicode strings?

Alexey Khudyakov alexey.skladnoy at gmail.com
Sun Nov 23 02:20:47 EST 2008


>
> This upsets me. We need to get on with doing this properly. The
> System.IO.UTF8 module is a useful interim workaround but we're not using
> it properly most of the time.
>
> ... skipped ...
>
> The right thing to do is to make Prelude.putStrLn do the right thing. We
> had a long discussion on how to fix the H98 IO functions to do this
> better. We just need to get on with it, or we'll end up with too many
> cases of people using System.IO.UTF8 inappropriately.
>
But this raises the question of what "the right thing" is. If the locale is
UTF-8, or the system supports Unicode in some other way, there is no problem:
just encode the string properly. The problem is how to deal with
untranslatable characters. Skip them? Replace them with question marks?
Something else? We should probably look at how this is solved in other
languages (or not solved).
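
To make the two options concrete, here is a minimal sketch using only base.
Latin-1 stands in for a one-byte locale, and the names OnUnmappable,
toLatin1 and encodeLatin1 are hypothetical, not part of any existing library:

```haskell
import Data.Char (ord)
import Data.Maybe (fromMaybe, mapMaybe)
import Data.Word (Word8)

-- Hypothetical policy for characters the target encoding cannot represent.
data OnUnmappable
  = Skip           -- drop the character silently
  | Replace Word8  -- substitute a fixed byte, e.g. 63 for '?'

-- Latin-1 covers exactly the code points 0..255.
toLatin1 :: Char -> Maybe Word8
toLatin1 c
  | ord c < 256 = Just (fromIntegral (ord c))
  | otherwise   = Nothing

-- Encode a String to Latin-1 bytes under the chosen policy.
encodeLatin1 :: OnUnmappable -> String -> [Word8]
encodeLatin1 Skip        = mapMaybe toLatin1
encodeLatin1 (Replace r) = map (fromMaybe r . toLatin1)
```

With Skip the output silently loses characters; with Replace 63 the reader at
least sees that something was lost. Neither is obviously "the right thing".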

And this problem is not limited to IO. It arises whenever strings cross the
border between the Haskell world and the outside world: opening files with
Unicode names, exec'ing programs, etc.

For example:
Prelude> readFile "файл"
*** Exception: D09;: openFile: does not exist (No such file or directory)
Prelude> executeFile "echo" True ["Сейчас сломается"] Nothing
!59G0A A;><05BAO
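
The garbage above is consistent with every Char being truncated to its low
8 bits before it is handed to the OS. A small sketch that reproduces it
(truncateTo8Bits is a hypothetical name used only for illustration):

```haskell
import Data.Char (chr, ord)

-- Keep only the low 8 bits of every code point, which appears to be
-- what happens to the file name and the argument in the session above.
truncateTo8Bits :: String -> String
truncateTo8Bits = map (chr . (`mod` 256) . ord)

-- truncateTo8Bits "файл"             == "D09;"
-- truncateTo8Bits "Сейчас сломается" == "!59G0A A;><05BAO"
```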

Although it's possible to work around this using encodeString/decodeString
from Codec.Binary.UTF8.String, that won't work on non-UTF-8 systems. And it's
not only Neanderthal systems with one-byte locales: Windows, AFAIK, uses a
different Unicode encoding.
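
For reference, the idea behind the workaround is to turn the String into one
whose Chars are all below 256, one per UTF-8 byte, so that 8-bit truncation
does no harm. A hand-written sketch of that step, using only base (utf8Bytes
and encodeAsByteChars are hypothetical names; this is not the library's
implementation):

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- UTF-8 byte values for a single character.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                  , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- A String in which every Char stands for one UTF-8 byte.
encodeAsByteChars :: String -> String
encodeAsByteChars = map chr . concatMap utf8Bytes
```

This only helps when the outside world actually expects UTF-8, which is
exactly the limitation described above.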

