Proposal #3456: Add FilePath -> String decoder

Judah Jacobson judah.jacobson at gmail.com
Tue Sep 1 02:19:18 EDT 2009


On Fri, Aug 28, 2009 at 3:50 PM, Duncan Coutts <duncan.coutts at worc.ox.ac.uk> wrote:
> On Wed, 2009-08-26 at 16:14 +0300, Yitzchak Gale wrote:
>
>> I am now beginning to lean towards Ketil's suggestion
>> that on POSIX platforms we should always use
>> UTF-8. We then need a prominent warning in the
>> documentation that if you need something else,
>> like the current locale, decode it yourself.
>
> That's nice in that it makes the function pure, or equivalently so that
> it does not need a locale parameter.
>
>> Note that it is becoming increasingly rare for people
>> to use non-UTF-8 locales anywhere in the world,
>> and even then it's likely ignored by many UIs.
>> So I'm inclined against cluttering the API with
>> convenience functions for other encodings, as Johan
>> is suggesting.

I agree that this would make the API much simpler; but I'm wary of
broad statements like the above.  My (very vague) impression was that
many Japanese users, for example, still use non-Unicode encodings.

I think that glib is an interesting example.  Its developers advocate
pretty strongly for everyone to use UTF-8 filenames; but even they
provide a simple way for the user of any glib program to override that
behavior by setting G_FILENAME_ENCODING=@locale.
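
To make that concrete, here is a rough sketch of what a glib-style
override could look like on the Haskell side.  It is only an
illustration: the names Choice, parseOverride, resolve and
chooseFilenameEncoding are made up for this example, it assumes a
modern base (lookupEnv, mkTextEncoding), and it handles a single
encoding name rather than the comma-separated list that glib accepts.
Treating an empty value the same as an unset one is also my own
assumption.

```haskell
module Main where

import GHC.IO.Encoding (TextEncoding, getLocaleEncoding, mkTextEncoding, utf8)
import System.Environment (lookupEnv)

-- | Hypothetical type: what the user asked for via the environment.
data Choice
  = UseUtf8          -- ^ no override: default to UTF-8
  | UseLocale        -- ^ "@locale": follow the current locale
  | UseNamed String  -- ^ any other value: treat it as an encoding name
  deriving (Eq, Show)

-- | Pure interpretation of the variable's value (Nothing = unset).
parseOverride :: Maybe String -> Choice
parseOverride Nothing          = UseUtf8
parseOverride (Just "")        = UseUtf8   -- assumption: empty means unset
parseOverride (Just "@locale") = UseLocale
parseOverride (Just name)      = UseNamed name

-- | Turn the choice into an actual encoding.  mkTextEncoding throws
-- an IOException if the name is not recognised.
resolve :: Choice -> IO TextEncoding
resolve UseUtf8         = return utf8
resolve UseLocale       = getLocaleEncoding
resolve (UseNamed name) = mkTextEncoding name

-- | Read the override once and pick the filename encoding.
chooseFilenameEncoding :: IO TextEncoding
chooseFilenameEncoding =
  lookupEnv "G_FILENAME_ENCODING" >>= resolve . parseOverride

main :: IO ()
main = do
  enc <- chooseFilenameEncoding
  putStrLn ("Decoding filenames as: " ++ show enc)
```

The point is just that the default stays UTF-8, while anyone who needs
the locale (or some other encoding) can opt in without any change to
the API.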

As another example, Python 3, which recently redesigned its Unicode
interface, also still uses the locale for filenames rather than solely
UTF-8.  The following interview with Guido from January has a good
take on why they did that (about halfway through the article):

http://broadcast.oreilly.com/2009/01/the-evolution-of-python-3.html

If we really want a pure FilePath -> String conversion, then perhaps we
could make the RTS check the locale once at the start of the program,
and have every subsequent conversion use that locale.  This would be
safe from order-of-evaluation problems, since a later change to the
locale could not affect the result; though it would still be possible
for the same pure code to behave differently in two different program
runs, so I'm unsure about that solution.
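
For illustration only, the "check the locale once" idea can be
approximated in library code with a top-level constant.  This is not
an RTS change, just a sketch of the same behaviour: startupEncoding
and decodeFilePath are hypothetical names, raw file names are modelled
as [Word8], and it assumes a modern base that provides
getLocaleEncoding and GHC.Foreign.peekCStringLen.

```haskell
module Main where

import Data.Word (Word8)
import Foreign.Marshal.Array (withArrayLen)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (TextEncoding, getLocaleEncoding)
import System.IO.Unsafe (unsafePerformIO)

-- | The locale encoding, read the first time this value is demanded
-- and then shared.  NOINLINE keeps GHC from duplicating the read.
startupEncoding :: TextEncoding
startupEncoding = unsafePerformIO getLocaleEncoding
{-# NOINLINE startupEncoding #-}

-- | Hypothetical "pure" decoder: always uses the captured encoding,
-- so its result cannot change during a run.  Decoding may raise an
-- exception on bytes that are invalid in that encoding.
decodeFilePath :: [Word8] -> String
decodeFilePath bytes = unsafePerformIO $
  withArrayLen bytes $ \len ptr ->
    GHC.peekCStringLen startupEncoding (castPtr ptr, len)

main :: IO ()
main = do
  -- Force the capture at program start, before anything else can
  -- change the locale encoding.
  startupEncoding `seq` return ()
  putStrLn (decodeFilePath [0x66, 0x6f, 0x6f])  -- the ASCII bytes of "foo"
```

Within one run every call sees the same encoding, but starting the
program under a different locale can change what decodeFilePath
returns for the same bytes, which is exactly the caveat above.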

Best,
-Judah

