[Haskell] System.FilePath survey
John Meacham
john at repetae.net
Wed Feb 8 18:50:09 EST 2006
On Wed, Feb 08, 2006 at 09:10:37PM +0000, Ben Rudiak-Gould wrote:
> John Meacham wrote:
> >On Tue, Feb 07, 2006 at 04:25:35PM +0000, Ben Rudiak-Gould wrote:
> >>                 Posix   NT       Win9x
> >>
> >>pathnames        bytes   UTF-16   locale
> >>command line     bytes   UTF-16   locale
> >>file contents    bytes   bytes    bytes
> >>pipes/sockets    bytes   bytes    bytes
> >
> >actually, Posix systems should be the following
> >
> >>pathnames        locale  UTF-16   locale
> >>command line     locale  UTF-16   locale
> >>file contents    *       bytes    bytes
> >>pipes/sockets    *       bytes    bytes
> >
> >Although the Posix interface is in terms of bytes, the strings should
> >always be interpreted via the locale specified in $LANG or $LC_CTYPE
> >also, for file contents and pipes/sockets, if you are passing text, and
> >in the absence of some overriding standard or protocol, you should be
> >using the encoding specified in the locale too.
>
> But that's an application-level convention; the kernel only knows about
> bytes. On Windows the encoding of pathnames and the command line is a
> requirement imposed by the kernel. I think assuming the locale encoding for
> the command line on Posix is a bad idea. Users are unlikely to pass a
> misencoded command line explicitly, but I want my-haskell-util `find .` to
> work even on a mounted volume that uses the wrong encoding. (And I also
> want your-haskell-util to work, even if you didn't write it with this
> situation in mind.)
When the command line is to be interpreted as a string, interpreting it
in the current locale is definitely the right thing to do. This is why
we need two varieties of getArgs: one that returns [String] and one that
returns [[Word8]]. I doubt the second form will be needed much, since
you usually think of command line arguments as strings, but it should be
provided because its absence can't really be worked around.
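
A minimal sketch of the two views of the command line, assuming the
"C" locale (where only 7-bit ASCII counts as text). The names RawArg,
decodeCLocale and argsAsStrings are hypothetical, chosen only for
illustration; a real library would select the decoder from $LANG or
$LC_CTYPE rather than hard-coding one.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical alias: one argument exactly as the kernel hands it over.
type RawArg = [Word8]

-- Hypothetical decoder for the "C" locale (an assumption of this sketch):
-- bytes below 0x80 are ASCII characters, anything else is not valid text.
decodeCLocale :: RawArg -> Maybe String
decodeCLocale = mapM decodeByte
  where
    decodeByte :: Word8 -> Maybe Char
    decodeByte b
      | b < 0x80  = Just (chr (fromIntegral b))
      | otherwise = Nothing

-- The String view is derived from the byte view. Decoding can fail for
-- a misencoded argument, whereas the byte view is always available.
argsAsStrings :: [RawArg] -> [Maybe String]
argsAsStrings = map decodeCLocale
```

With this sketch, an argument such as the bytes 0x66 0xE9 (a Latin-1
filename seen under the "C" locale) has no String form, yet it can still
be handed back to the system unchanged through the [[Word8]] view.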
John
--
John Meacham - ⑆repetae.net⑆john⑈