[Haskell] System.FilePath survey
Ben Rudiak-Gould
Benjamin.Rudiak-Gould at cl.cam.ac.uk
Wed Feb 8 16:10:37 EST 2006
John Meacham wrote:
> On Tue, Feb 07, 2006 at 04:25:35PM +0000, Ben Rudiak-Gould wrote:
>>                  Posix     NT        Win9x
>>
>> pathnames        bytes     UTF-16    locale
>> command line     bytes     UTF-16    locale
>> file contents    bytes     bytes     bytes
>> pipes/sockets    bytes     bytes     bytes
>
> actually, Posix systems should be the following
>
>> pathnames        locale    UTF-16    locale
>> command line     locale    UTF-16    locale
>> file contents    *         bytes     bytes
>> pipes/sockets    *         bytes     bytes
>
> Although the Posix interface is in terms of bytes, the strings should
> always be interpreted via the locale specified in $LANG or $LC_CTYPE.
> Also, for file contents and pipes/sockets, if you are passing text, and
> in the absence of some overriding standard or protocol, you should be
> using the encoding specified in the locale too.
But that's an application-level convention; the kernel only knows about
bytes. On Windows the encoding of pathnames and the command line is a
requirement imposed by the kernel. I think assuming the locale encoding for
the command line on Posix is a bad idea. Users are unlikely to pass a
misencoded command line explicitly, but I want my-haskell-util `find .` to
work even on a mounted volume that uses the wrong encoding. (And I also want
your-haskell-util to work, even if you didn't write it with this situation
in mind.)
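
To make the failure mode concrete, here is a minimal sketch, assuming a
UTF-8 locale and a file named "café" on a volume that stores names in
Latin-1. The names isUtf8ish, latin1Name and utf8Name are hypothetical,
made up for this illustration, and the check is simplified (it looks only
at lead/continuation byte structure, not overlong forms or surrogates).
It uses only base and bytestring.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical helper: does this byte sequence have the lead-byte /
-- continuation-byte structure of UTF-8? (Simplified check, see above.)
isUtf8ish :: [Word8] -> Bool
isUtf8ish [] = True
isUtf8ish (b:bs)
  | b < 0x80              = isUtf8ish bs   -- ASCII
  | b >= 0xC2 && b < 0xE0 = cont 1 bs      -- 2-byte sequence
  | b >= 0xE0 && b < 0xF0 = cont 2 bs      -- 3-byte sequence
  | b >= 0xF0 && b < 0xF5 = cont 3 bs      -- 4-byte sequence
  | otherwise             = False
  where
    -- Consume n continuation bytes (0x80..0xBF), then carry on.
    cont :: Int -> [Word8] -> Bool
    cont 0 rest = isUtf8ish rest
    cont n (c:rest) | c >= 0x80 && c < 0xC0 = cont (n - 1) rest
    cont _ _ = False

-- "café" as stored by a Latin-1 volume (assumed example data).
latin1Name :: B.ByteString
latin1Name = B.pack [0x63, 0x61, 0x66, 0xE9]

-- The same name as a UTF-8 locale would encode it.
utf8Name :: B.ByteString
utf8Name = B.pack [0x63, 0x61, 0x66, 0xC3, 0xA9]

main :: IO ()
main = do
  print (isUtf8ish (B.unpack utf8Name))    -- True
  print (isUtf8ish (B.unpack latin1Name))  -- False
  -- Kept as bytes, the "wrong" name survives unchanged.
  print (B.unpack latin1Name)              -- [99,97,102,233]
```

A utility that decodes its arguments with the locale encoding has to
reject or mangle latin1Name; one that keeps them as bytes can hand them
straight back to the kernel.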
-- Ben