Abstract FilePath Proposal
Yitzchak Gale
gale at sefer.org
Sun Jun 28 01:02:14 UTC 2015
OK, based on what David and Brandon wrote, I guess
that representing paths as bytestrings does make
some low-level sense on all platforms. For Windows,
though, we would still need some way to enforce the
requirement that the bytestring have an even length,
since paths there are sequences of 16-bit WCHARs.
We will need platform-dependent coercions of
paths to and from String/Text. Those might sometimes
be partial functions. We need a notion of the coercions
for the current platform, and we also need it to be
possible to access the coercions for all platforms.
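
To make that concrete, here is a rough sketch of the shape I have
in mind. All of the names (Platform, RawPath, mkRawPath, toText,
fromText, currentPlatform) are hypothetical and not an existing API.
It also assumes UTF-8 on the POSIX side and UTF-16LE on Windows,
which is a simplification; a real design would have to decide how
to handle other encodings.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import           Data.Text.Encoding.Error (lenientDecode)
import           System.Info (os)

-- Hypothetical: the platforms whose coercions we want to expose.
data Platform = Posix | Windows
  deriving (Eq, Show)

-- Hypothetical: a path is just the raw bytes the OS layer sees.
newtype RawPath = RawPath B.ByteString
  deriving (Eq, Show)

-- Smart constructor. On Windows the bytes are 16-bit code units,
-- so an odd-length bytestring is rejected.
mkRawPath :: Platform -> B.ByteString -> Maybe RawPath
mkRawPath Windows bs
  | odd (B.length bs) = Nothing
mkRawPath _ bs = Just (RawPath bs)

-- Partial coercion to Text: Nothing when the bytes are not valid
-- in the encoding assumed for that platform.
toText :: Platform -> RawPath -> Maybe Text
toText Posix (RawPath bs) =
  either (const Nothing) Just (TE.decodeUtf8' bs)
toText Windows (RawPath bs)
  -- Decode leniently, then accept only if re-encoding gives back
  -- the original bytes (so unpaired surrogates are rejected).
  | TE.encodeUtf16LE t == bs = Just t
  | otherwise                = Nothing
  where
    t = TE.decodeUtf16LEWith lenientDecode bs

-- Coercion from Text, under the same encoding assumptions.
fromText :: Platform -> Text -> RawPath
fromText Posix   = RawPath . TE.encodeUtf8
fromText Windows = RawPath . TE.encodeUtf16LE

-- The platform we are running on ("mingw32" is what GHC reports
-- for System.Info.os on Windows).
currentPlatform :: Platform
currentPlatform = if os == "mingw32" then Windows else Posix

main :: IO ()
main = do
  print (mkRawPath Windows (B.pack [0x66]))
  print (toText Posix (RawPath (B.pack [0xff])))
  print (toText currentPlatform (fromText currentPlatform (T.pack "foo.txt")))
```

Taking the Platform as an explicit argument is what lets you reach
the coercions for every platform, while currentPlatform gives you
the ones for the platform you are on.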
On Sun, Jun 28, 2015 at 12:28 AM David Turner <dct25-561bs at mythic-beasts.com>
wrote:
> Hi,
>
> I'm +1 on the general idea of this proposal. Using String for filenames
> has caused me all sorts of trouble, particularly when I've had to deal with
> a bunch of files whose names don't all use the same encoding.
>
> However, be careful about the exact semantics of filenames on Windows.
> Quoting MSDN:
>
>
> There is no need to perform any Unicode normalization on path and file
> name strings for use by the Windows file I/O API functions because *the
> file system treats path and file names as an opaque sequence of WCHARs*.
> Any normalization that your application requires should be performed with
> this in mind, external of any calls to related Windows file I/O API
> functions.
>
>
> (from
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx,
> emphasis mine)
>
> Thus FilePath = String (or Text) doesn't really seem correct on Windows
> either (although it'll be pretty close as long as you stay within the BMP).
>
> By my reckoning, when you get down to brass tacks, all filesystems on all
> platforms name files with sequences of bytes. There are various interesting
> ways to represent these bytes to human beings as sequences of characters,
> but aiming for FilePath = ByteString everywhere and dealing with the
> conversion to characters elsewhere seems more correct.
>
> Cheers,
>
> David
>
>
>
> On 27 June 2015 at 22:02, Brandon Allbery <allbery.b at gmail.com> wrote:
> > On Sat, Jun 27, 2015 at 4:50 PM, Yitzchak Gale <gale at sefer.org> wrote:
> >>
> >> On Mac OS X, it's normalized Unicode. The important
> >> point is *normalized* - if you create a FilePath from two
> >> different Unicode strings that have the same normalized
> >> form, the result FilePaths must be equal on Mac OS X.
> >
> >
> > This is only true for higher level OS X APIs. ghc normally operates in the
> > BSD layer, which mostly follows POSIX semantics; in particular, filesystem
> > paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs.
> > (Which, among other things, means you can make a GUI application dump core
> > by trying to use a file dialog in a directory containing a filename created
> > using the BSD API which does not use a UTF8 encoding.)
> >
> > --
> > brandon s allbery kf8nh                          sine nomine associates
> > allbery.b at gmail.com                           ballbery at sinenomine.net
> > unix, openafs, kerberos, infrastructure, xmonad  http://sinenomine.net