[Haskell-cafe] invalid character encoding
Glynn Clements
glynn at gclements.plus.com
Thu Mar 17 21:20:23 EST 2005
Marcin 'Qrczak' Kowalczyk wrote:
> > E.g. Gtk-2.x uses UTF-8 almost exclusively, although you can force the
> > use of the locale's encoding for filenames (if you have filenames in
> > multiple encodings, you lose; filenames using the "wrong" encoding
> > simply don't appear in file selectors).
>
> Actually they do appear, even though you can't type their names
> from the keyboard. The name shown in the GUI used to be escaped in
> different ways by different programs or even different places in one
> program (question marks, %hex escapes, \oct escapes), but recently
> they added some functions to glib to make the behavior uniform.
In the last version of Gtk-2.x that I tried, "invalid" filenames are
just omitted from the list. Gtk-1.x displayed them (I think with
question marks, but it may have been boxes).
I've just tried with a more recent version (2.6.2); the default
behaviour is similar, although you can now get around the issue by
setting the environment variable G_FILENAME_ENCODING=ISO-8859-1. Of
course, if your locale's encoding is a long way from ISO-8859-1, that
isn't a particularly good solution.
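To illustrate why the ISO-8859-1 setting makes every name show up, here
is a rough Python sketch. It is not Gtk or glib code; it only assumes
that a file selector hides any name which fails to decode in the
encoding it expects. The sample names and the helper `decodable` are
made up for the illustration.

```python
# Hypothetical raw filenames, as the bytes a directory listing would return.
raw_names = [
    "caf\u00e9".encode("utf-8"),                 # "cafe" with e-acute, UTF-8
    "caf\u00e9".encode("iso-8859-1"),            # same name, ISO-8859-1
    "\u65e5\u672c\u8a9e".encode("euc_jp"),       # a Japanese name, EUC-JP
    "\u65e5\u672c\u8a9e".encode("shift_jis"),    # same name, Shift-JIS
]


def decodable(raw, encoding):
    """Return True if the byte string is valid in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Names that survive if the selector insists on UTF-8.
utf8_visible = [n for n in raw_names if decodable(n, "utf-8")]

# Names that survive if the selector assumes ISO-8859-1: every byte value
# maps to some character, so nothing is ever rejected.
latin1_visible = [n for n in raw_names if decodable(n, "iso-8859-1")]

print(len(utf8_visible), "of", len(raw_names), "decode as UTF-8")
print(len(latin1_visible), "of", len(raw_names), "decode as ISO-8859-1")
```

Nothing is rejected under ISO-8859-1, but the non-Latin names decode to
the wrong characters, which is why the workaround is of little use
outside Latin-1 locales.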
The best test case would be a system used predominantly by Japanese
speakers, where (apparently) it's common to have a mixture of both
EUC-JP and Shift-JIS filenames (occasionally wrapped in ISO-2022, but
usually raw).
--
Glynn Clements <glynn at gclements.plus.com>