<div dir="ltr"><div>On 28 June 2015 at 02:02, Yitzchak Gale <span dir="ltr">&lt;<a href="mailto:gale@sefer.org" target="_blank">gale@sefer.org</a>&gt;</span> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">OK, based on what David and Brandon wrote, I guess<div>that representing paths as bytestrings does make</div><div>some low-level sense on all platforms. Although</div><div>for Windows we would still need some way to deal</div><div>with the requirement that the bytestring have an even</div><div>length.</div></div></blockquote><div><br></div><div>I would guess this could just be done by making the type abstract so you can't easily get to the underlying bytes. Windows will only ever give you even-length bytestrings (in directory listings or similar) and all the other ways of synthesizing paths from strings could be set up to preserve the evenness.</div><div><br></div><div>If you end up passing an odd-length bytestring to Windows as a path then Bad Things could certainly happen, but no worse than mucking around with other unsafe APIs like Data.ByteString.Internal.</div><div><br></div><div> </div><div class="gmail_extra"><br><div class="gmail_quote">On 28 June 2015 at 02:02, Yitzchak Gale <span dir="ltr">&lt;<a href="mailto:gale@sefer.org" target="_blank">gale@sefer.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">OK, based on what David and Brandon wrote, I guess<div>that representing paths as bytestrings does make</div><div>some low-level sense on all platforms. 
Although</div><div>for Windows we would still need some way to deal</div><div>with the requirement that the bytestring have an even</div><div>length.</div><div><br></div><div>We will need platform-dependent coercions of</div><div>paths to and from String/Text. Those might sometimes</div><div>be partial functions. We need a notion of the coercions</div><div>for the current platform, and we also need it to be</div><div>possible to access the coercions for all platforms.<div><div class="h5"><br><br><div class="gmail_quote"><div dir="ltr">On Sun, Jun 28, 2015 at 12:28 AM David Turner &lt;<a href="mailto:dct25-561bs@mythic-beasts.com" target="_blank">dct25-561bs@mythic-beasts.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>I'm +1 on the general idea of this proposal. Using String for filenames has caused me all sorts of trouble, particularly when I've had to deal with a bunch of files whose names don't all use the same encoding.</div><div><br></div><div>However, be careful about the exact semantics of filenames on Windows. Quoting MSDN:</div><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><br></div><div>There is no need to perform any Unicode normalization on path and file name strings for use by the Windows file I/O API functions because<i> <b>the file system treats path and file names as an opaque sequence of WCHARs</b></i>. 
Any normalization that your application requires should be performed with this in mind, external of any calls to related Windows file I/O API functions.</div></blockquote></div><div><br></div><div>(from <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx" target="_blank">https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx</a>, emphasis mine)</div><div><br></div><div>Thus FilePath = String (or Text) doesn't really seem correct on Windows either (although it'll be pretty close as long as you stay within the BMP).</div><div><br></div><div>By my reckoning, when you get down to brass tacks, all filesystems on all platforms name files with sequences of bytes. There are various interesting ways to represent these bytes to human beings as sequences of characters, but aiming for FilePath = ByteString everywhere and dealing with the conversion to characters elsewhere seems more correct.</div><div><br></div><div>Cheers,</div><div><br></div><div>David</div><div><br></div><div></div></div><div dir="ltr"><div><br><br>On 27 June 2015 at 22:02, Brandon Allbery &lt;<a href="mailto:allbery.b@gmail.com" target="_blank">allbery.b@gmail.com</a>&gt; wrote:<br>> On Sat, Jun 27, 2015 at 4:50 PM, Yitzchak Gale &lt;<a href="mailto:gale@sefer.org" target="_blank">gale@sefer.org</a>&gt; wrote:<br>>><br>>> On Mac OS X, it's normalized Unicode. The important<br>>> point is *normalized* - if you create a FilePath from two<br>>> different Unicode strings that have the same normalized<br>>> form, the result FilePaths must be equal on Mac OS X.<br>><br>><br>> This is only true for higher level OS X APIs. 
ghc normally operates in the<br>> BSD layer, which mostly follows POSIX semantics; in particular, filesystem<br>> paths are bytestrings in the BSD layer, and only normalized in Cocoa APIs.<br>> (Which, among other things, means you can make a GUI application dump core<br>> by trying to use a file dialog in a directory containing a filename created<br>> using the BSD API which does not use a UTF8 encoding.)<br>><br>> --<br>> brandon s allbery kf8nh sine nomine associates<br>> <a href="mailto:allbery.b@gmail.com" target="_blank">allbery.b@gmail.com</a> <a href="mailto:ballbery@sinenomine.net" target="_blank">ballbery@sinenomine.net</a><br>> unix, openafs, kerberos, infrastructure, xmonad <a href="http://sinenomine.net" target="_blank">http://sinenomine.net</a><br>><br></div></div><div dir="ltr"><div>> _______________________________________________<br>> Libraries mailing list<br>> <a href="mailto:Libraries@haskell.org" target="_blank">Libraries@haskell.org</a><br>> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries</a><br>><br><br></div></div></blockquote></div></div></div></div></div>
</blockquote></div><br></div></div>
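<div dir="ltr"><div>To illustrate the abstract-type suggestion at the top of this message, here is a minimal Haskell sketch. Everything in it is an assumption made for illustration: the module name WinPath, the function names, and the little-endian byte order are hypothetical and not taken from any existing library. The constructor is deliberately left out of the export list, so the only ways to build a value are the ones that keep the byte length even.</div>
<pre>
-- Sketch only: WinPath and every function below are hypothetical names,
-- not part of any existing library.
module WinPath
  ( WinPath          -- type is exported, its constructor is not
  , fromRawBytes
  , fromCodeUnits
  , append
  , toRawBytes
  ) where

import qualified Data.ByteString as B
import Data.Bits (shiftR)
import Data.Word (Word8, Word16)

-- Invariant: the wrapped ByteString always has even length,
-- i.e. it holds a whole number of 16-bit WCHARs.
newtype WinPath = WinPath B.ByteString
  deriving (Eq, Ord, Show)

-- Checked entry point for raw bytes: odd lengths are rejected.
fromRawBytes :: B.ByteString -> Maybe WinPath
fromRawBytes bs
  | even (B.length bs) = Just (WinPath bs)
  | otherwise          = Nothing

-- Build a path from 16-bit code units, stored low byte first
-- (assumed byte order). Each unit contributes exactly two bytes,
-- so the result is always even-length.
fromCodeUnits :: [Word16] -> WinPath
fromCodeUnits = WinPath . B.pack . concatMap split
  where
    split :: Word16 -> [Word8]
    -- fromIntegral to Word8 keeps only the low 8 bits.
    split w = [fromIntegral w, fromIntegral (w `shiftR` 8)]

-- Even + even is even, so concatenation preserves the invariant.
append :: WinPath -> WinPath -> WinPath
append (WinPath a) (WinPath b) = WinPath (B.append a b)

-- Read-only view of the underlying bytes.
toRawBytes :: WinPath -> B.ByteString
toRawBytes (WinPath bs) = bs
</pre>
<div>With this shape, fromRawBytes is the only place an odd-length bytestring can be offered, and it answers with Nothing rather than handing the bytes on to the OS. Anything that bypasses the module, in the manner of Data.ByteString.Internal, would still be able to break the invariant.</div></div>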