behaviour change in getDirectoryContents in GHC 7.2?

John Millikin jmillikin at gmail.com
Tue Nov 8 16:42:43 CET 2011


On Tue, Nov 8, 2011 at 03:04, Simon Marlow <marlowsd at gmail.com> wrote:
>> As mentioned earlier in the thread, this behavior is breaking things.
>> Due to an implementation error, programs compiled with GHC 7.2 on
>> POSIX systems cannot open files unless their paths also happen to be
>> valid text according to their locale. It is very difficult to work
>> around this error, because the paths-are-text logic was placed at a
>> very low level in the library stack.
>
> So your objection is that there is a bug?  What if we fixed the bug?

My objection is that the current implementation provides no way to
work around potential bugs.

GHC is software. Like all software, it contains errors, and new
features are likely to contain more errors. When adding behavior like
automatic path encoding, there should always be a way to avoid or work
around it, in case a severe bug is discovered.

>>> It would probably be better to have an abstract FilePath type and to keep
>>> the original bytes, decoding on demand.  But that is a big change to the
>>> API and would break much more code.  One day we'll do this properly; for
>>> now we have this, which I think is a pretty reasonable compromise.
>>
>> Please understand, I am not arguing against the existence of this
>> encoding layer in general. It's a fine idea for a simplistic
>> high-level filesystem interaction library. But it should be
>> *optional*, not part of the compiler or "base".
>
> Ok, so I was about to reply and say that the low-level API is available via
> the unix and Win32 packages, and then I thought I should check first, and I
> discovered that even using System.Posix you get the magic encoding
> behaviour.
>
> I really think we should provide the native APIs.  The problem is that the
> System.Posix.Directory API is all in terms of FilePath (=String), and if we
> gave that a different meaning from the System.Directory FilePaths then
> confusion would ensue.  So perhaps we need to add another API to
> System.Posix with filesystem operations in terms of ByteString, and
> similarly for Win32.

+1

I think most users would be OK with having System.Posix treat FilePath
differently, as long as this is clearly documented, but if you feel a
separate API is better then I have no objection. As long as there's
some way to say "I know what I'm doing, here's the bytes" to the
library.
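
To make the "here's the bytes" idea concrete, here is a minimal sketch of
listing a directory with no locale decoding at all. It assumes the
System.Posix.ByteString modules (with RawFilePath = ByteString) that later
releases of the unix package provide; they were not available when this
thread was written. The name listRaw is hypothetical, chosen only for
illustration.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString
  (RawFilePath, openDirStream, readDirStream, closeDirStream)

-- Hypothetical helper: return directory entries as raw bytes, exactly as
-- the kernel reports them, without any locale-based decoding.
listRaw :: RawFilePath -> IO [RawFilePath]
listRaw dir = bracket (openDirStream dir) closeDirStream loop
  where
    loop ds = do
      entry <- readDirStream ds
      if BC.null entry            -- an empty name marks the end of the stream
        then return []
        else (entry :) <$> loop ds

main :: IO ()
main = listRaw (BC.pack ".") >>= mapM_ BC.putStrLn
```

Because the entries never pass through String, a name that is not valid
text in the current locale can be handed back to other byte-based calls
unchanged.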

The Win32 package uses wide-character functions, so I'm not sure
whether bytes would be appropriate there. My instinct says to stick
with chars, via withCWString or equivalent. The package maintainer
will have a better idea of what fits with the OS's idioms.
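
As a rough illustration of the wide-character route, the sketch below
marshals a String to a wide C string with withCWString and reads it back
with peekCWString, both from Foreign.C.String in base. It calls no Win32
function; roundTrip is a hypothetical name, and the path is an arbitrary
example.

```haskell
import Foreign.C.String (peekCWString, withCWString)

-- Hypothetical helper: marshal a path to a NUL-terminated wide string,
-- as a wide-character OS call would receive it, then decode it again.
roundTrip :: String -> IO String
roundTrip path = withCWString path peekCWString

main :: IO ()
main = roundTrip "C:\\Users\\example\\file.txt" >>= print
```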



More information about the Glasgow-haskell-users mailing list