duncan.coutts at worc.ox.ac.uk
Tue Mar 20 22:32:49 EDT 2007
On Fri, 2007-03-16 at 15:42 +0100, Sven Panne wrote:
> The main point here is that UTF-8
> => Unicode => UTF-8 is lossless, and the same holds for my proposed change
> for *nices, too, as long as the local encoding is invertible in this sense. I
> am not sure if there are encodings in use out there which do not have this
> property, but even if there are: All e.g. Qt-based programs would share the
> same problems.
Gtk+/GNOME programs are fairly careful in this regard. When loading a
file they keep *both* the original sequence of bytes that is the file
name and a Unicode version for display in the GUI, obtained by
interpreting those bytes in a particular locale encoding. If that
conversion fails they do a best-effort conversion using replacement
characters, or just display "unknown file name" or something. However,
when saving the file again they always use the original file name, which
is just the raw sequence of bytes.
When saving a new file and taking a Unicode string from the user, they
try to convert it to the locale encoding, and if that conversion fails
they ask the user to use a different name.
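
The two conversion directions described above behave differently on
failure, which a minimal sketch can make concrete. It assumes a
plain-ASCII locale purely to stay short; decodeForDisplay and
encodeNewName are hypothetical names, not part of any existing library.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Assumption: the locale encoding is plain ASCII. A real program would
-- use whatever encoding the locale specifies.

-- Best-effort decoding for display only: any byte that cannot be
-- decoded becomes the Unicode replacement character U+FFFD.
decodeForDisplay :: [Word8] -> String
decodeForDisplay = map toChar
  where
    toChar b
      | b < 0x80  = chr (fromIntegral b)
      | otherwise = '\xFFFD'

-- Strict encoding for a new name typed by the user: Nothing means the
-- name cannot be represented and the user should be asked for another.
encodeNewName :: String -> Maybe [Word8]
encodeNewName = mapM toByte
  where
    toByte c
      | ord c < 0x80 = Just (fromIntegral (ord c))
      | otherwise    = Nothing
```

Decoding is allowed to be lossy because its result is only ever shown
to the user, while encoding must be exact because its result becomes
the name on disk.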
the section near the top on "File Name Encodings"
So a hypothetical FilePath ADT might keep both the raw and displayable
Unicode versions of a file name.
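
As a rough sketch of what such a type could look like (all names here
are hypothetical, and the type is called FileName only to avoid
clashing with the Prelude's FilePath synonym):

```haskell
import Data.Word (Word8)

-- A file name that remembers exactly what the file system gave us.
data FileName = FileName
  { rawName     :: [Word8]  -- exact bytes; used for every file system call
  , displayName :: String   -- Unicode rendering; used only in the UI
  } deriving (Eq, Show)

-- Build a FileName from raw bytes. The decoder is passed in because
-- the right choice depends on the locale; it may be lossy, since the
-- raw bytes are kept unchanged alongside its result.
fromRawBytes :: ([Word8] -> String) -> [Word8] -> FileName
fromRawBytes decode bytes =
  FileName { rawName = bytes, displayName = decode bytes }
```

Operations that touch the disk would read only rawName, so a file
whose name fails to decode can still be opened and saved again under
its original name.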