[Haskell-cafe] Re: Strings and utf-8

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Thu Nov 29 08:30:02 EST 2007


On Thu, 2007-11-29 at 13:05 +0000, Jules Bean wrote:

> Language of messages is quite different from language of a file you read.
> 
> Suppose I am English, and I have a russian friend, Vlad.
> 
> My default locale is, say, latin-1, and his is something cyrillic.
> 
> I might well open files including my own files, and his files. The 
> locale of the current user is simply no guide to the correct encoding to 
> read a file in, and not a particularly reliable guide to writing a file out.
> 
> Locale makes perfect sense for messages (you are communicating with the 
> user, his locale tells you what language he speaks). It makes much less 
> sense for file IO.

Yes, it's a fundamental limitation of the Unix locale system and
multi-user systems. However, using the locale is no more wrong than just
picking UTF-8 all the time. Obviously one needs a text file API that
allows one to specify the encoding for the cases where you happen to know
it. But for the H98 file API, where there is no way of specifying an
encoding, what's better than using the Unix default method? (At least on
Unix.)
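
To make the distinction concrete, here is a minimal sketch of the two
cases, assuming a System.IO that provides hSetEncoding, utf8, latin1 and
localeEncoding (as later versions of GHC's base library do; this is not
part of H98). The helper names readFileWith, writeFileWith and
readFileLocale are hypothetical, chosen only for illustration.

```haskell
import System.IO

-- Hypothetical helper: read a whole file, decoding it with an
-- encoding the caller happens to know.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = do
  h <- openFile path ReadMode
  hSetEncoding h enc
  s <- hGetContents h
  -- hGetContents is lazy, so force the whole string before closing.
  length s `seq` hClose h
  return s

-- Hypothetical helper: write a file, encoding it with a known encoding.
writeFileWith :: TextEncoding -> FilePath -> String -> IO ()
writeFileWith enc path s =
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStr h s

-- Fallback for an API with no encoding argument: use whatever the
-- current user's locale says. This is only a guess about the file.
readFileLocale :: FilePath -> IO String
readFileLocale = readFileWith localeEncoding
```

With something like this, Jules would call readFileWith with an explicit
encoding for Vlad's files when he knows it, and readFileLocale would only
be the default for the case where nothing else is known.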

Duncan


