Abstract FilePath Proposal

David Turner dct25-561bs at mythic-beasts.com
Sat Jun 27 21:28:24 UTC 2015


Hi,

I'm +1 on the general idea of this proposal. Using String for filenames has
caused me all sorts of trouble, particularly when I've had to deal with a
bunch of files whose names don't all use the same encoding.

However, be careful about the exact semantics of filenames on Windows.
Quoting MSDN:


There is no need to perform any Unicode normalization on path and file name
strings for use by the Windows file I/O API functions because *the file
system treats path and file names as an opaque sequence of WCHARs*. Any
normalization that your application requires should be performed with this
in mind, external of any calls to related Windows file I/O API functions.


(from
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx,
emphasis mine)

Thus FilePath = String (or Text) doesn't really seem correct on Windows
either, since an arbitrary sequence of WCHARs need not be well-formed UTF-16
(although it'll be pretty close as long as you stay within the BMP).
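
To make the WCHAR point concrete, here is a minimal sketch of a strict UTF-16
decoder using only base. It is my own illustration, not anything from MSDN or
an existing library, and the name decodeUtf16 is hypothetical. It shows a
sequence of 16-bit units that, going by the MSDN text above, the file system
would treat as just another opaque name, but which has no decoding as
well-formed Unicode text:

```haskell
import Data.Bits (shiftL)
import Data.Char (chr)
import Data.Word (Word16)

-- Hypothetical helper: strictly decode UTF-16 code units.
-- Returns Nothing as soon as an unpaired surrogate is found.
decodeUtf16 :: [Word16] -> Maybe String
decodeUtf16 [] = Just []
decodeUtf16 (w : ws)
  | isHigh w = case ws of
      (l : rest) | isLow l -> (combine w l :) <$> decodeUtf16 rest
      _ -> Nothing                     -- high surrogate with no partner
  | isLow w = Nothing                  -- low surrogate with no preceding high
  | otherwise = (chr (fromIntegral w) :) <$> decodeUtf16 ws
  where
    isHigh x = x >= 0xD800 && x <= 0xDBFF
    isLow x = x >= 0xDC00 && x <= 0xDFFF
    -- Standard surrogate-pair arithmetic for code points above U+FFFF.
    combine h l =
      chr (0x10000 + ((fromIntegral h - 0xD800) `shiftL` 10)
                   + (fromIntegral l - 0xDC00))

main :: IO ()
main = do
  -- 'a' followed by a correctly paired surrogate: decodes fine.
  print (decodeUtf16 [0x0061, 0xD83D, 0xDE00])
  -- 'a' followed by a lone high surrogate: no valid decoding exists.
  print (decodeUtf16 [0x0061, 0xD83D])
```

Any FilePath type that insists on well-formed text would have to either
reject the second name or silently change it.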

By my reckoning, when you get down to brass tacks, all filesystems on all
platforms name files with sequences of bytes. There are various interesting
ways to represent these bytes to human beings as sequences of characters,
but aiming for FilePath = ByteString everywhere and dealing with the
conversion to characters elsewhere seems more correct.
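
As a rough sketch of what I mean (not a proposed API: RawPath and displayPath
are names I made up for illustration, and it assumes the bytestring package
that ships with ghc), the path itself is just the bytes, equality is byte
equality, and turning it into characters for a human is a separate step that
is allowed to be lossy:

```haskell
import qualified Data.ByteString as B
import Data.Char (chr)
import Data.Word (Word8)
import Numeric (showHex)

-- Hypothetical type: a path is exactly the bytes the OS handed us.
newtype RawPath = RawPath B.ByteString
  deriving (Eq, Ord)

-- Hypothetical rendering for humans, kept apart from the path itself.
-- Printable ASCII (except backslash) is shown as is; every other byte
-- is escaped as \xNN, so no encoding has to be guessed.
displayPath :: RawPath -> String
displayPath (RawPath bs) = concatMap render (B.unpack bs)
  where
    render :: Word8 -> String
    render w
      | w >= 0x20 && w < 0x7F && w /= 0x5C = [chr (fromIntegral w)]
      | otherwise = "\\x" ++ pad (showHex w "")
    pad s = replicate (2 - length s) '0' ++ s

main :: IO ()
main = do
  -- The same four-letter name ("cafe" with an e-acute) in two encodings.
  let utf8Name   = RawPath (B.pack [0x63, 0x61, 0x66, 0xC3, 0xA9])
      latin1Name = RawPath (B.pack [0x63, 0x61, 0x66, 0xE9])
  print (utf8Name == latin1Name)   -- different bytes, so different paths
  putStrLn (displayPath utf8Name)
  putStrLn (displayPath latin1Name)
```

This is the situation I described at the top: two files whose names look the
same to a person but use different encodings stay distinct, and neither one
is mangled by a decode step it was never meant to go through.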

Cheers,

David



On 27 June 2015 at 22:02, Brandon Allbery <allbery.b at gmail.com> wrote:
> On Sat, Jun 27, 2015 at 4:50 PM, Yitzchak Gale <gale at sefer.org> wrote:
>>
>> On Mac OS X, it's normalized Unicode. The important
>> point is *normalized* - if you create a FilePath from two
>> different Unicode strings that have the same normalized
>> form, the result FilePaths must be equal on Mac OS X.
>
>
> This is only true for higher level OS X APIs. ghc normally operates in the
> BSD layer, which mostly follows POSIX semantics; in particular, filesystem
> paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs.
> (Which, among other things, means you can make a GUI application dump core
> by trying to use a file dialog in a directory containing a filename created
> using the BSD API which does not use a UTF8 encoding.)
>
> --
> brandon s allbery kf8nh                               sine nomine associates
> allbery.b at gmail.com                                  ballbery at sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net