[Haskell-cafe] Re: getting crazy with character encoding

Stephane Bortzmeyer bortzmeyer at nic.fr
Thu Sep 13 05:07:03 EDT 2007


On Thu, Sep 13, 2007 at 12:23:33AM +0000,
 Aaron Denney <wnoise at ofb.net> wrote 
 a message of 76 lines which said:

> the characters read and written should correspond to the native
> environment notions and encodings.  These are, under Unix,
> determined by the locale system.

Locales, while fine for things like the language of the error messages
or the format to use to display the time, are *not* a good solution
for things like file names and file contents.

Even a single Unix machine (without networking) can have *several*
users. The file system stores a file name as plain bytes, with no
record of its charset, so using the locale to guess that charset
fails as soon as these users work in different locales.

The same goes for file contents. The charset used must be recorded in
the file itself (as XML does, for example) or in the metadata,
somehow. Otherwise, there is no way to exchange files, or even to
change the locale (if I switch from Latin-1 to UTF-8, what do my
files become?).
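
To make the Latin-1/UTF-8 question concrete, here is a minimal Haskell
sketch using only base. The names storedBytes, decodeLatin1 and
decodeUtf8Toy are hypothetical, and the UTF-8 decoder is a toy that
handles only one- and two-byte sequences without any validation. It
decodes the same two bytes under each assumption and prints code
points, so the output does not itself depend on the terminal's locale:

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Bytes as stored on disk: U+00E9 (e with acute accent) written by a
-- program that assumed UTF-8.
storedBytes :: [Word8]
storedBytes = [0xC3, 0xA9]

-- Latin-1: every byte is the code point with the same numeric value.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- Toy UTF-8 decoder (assumption: input contains only one- and two-byte
-- sequences; continuation bytes are not checked).
decodeUtf8Toy :: [Word8] -> String
decodeUtf8Toy (a : b : rest)
  | a .&. 0xE0 == 0xC0 =
      chr (((fromIntegral a .&. 0x1F) `shiftL` 6) .|. (fromIntegral b .&. 0x3F))
        : decodeUtf8Toy rest
decodeUtf8Toy (a : rest) = chr (fromIntegral a) : decodeUtf8Toy rest
decodeUtf8Toy [] = []

main :: IO ()
main = do
  -- Read as Latin-1: two characters, U+00C3 and U+00A9.
  print (map ord (decodeLatin1 storedBytes))
  -- Read as UTF-8: one character, U+00E9.
  print (map ord (decodeUtf8Toy storedBytes))
```

Nothing in the bytes themselves says which reading is the right one,
so a reader who only knows their own locale has no way to choose.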
