behaviour change in getDirectoryContents in GHC 7.2?

Tue Nov 1 18:14:02 CET 2011

You're right -- many parts of system-fileio (the parts based on
"directory") are broken due to this. I'll need to update it to call
the posix/win32 functions directly.

IMO, the GHC behavior in <=7.0 is ugly, but the behavior in 7.2 is
fundamentally wrong.

Different OSes have different definitions of a "file path". A Windows
path is a sequence of Unicode characters. A Linux/BSD path is a
sequence of bytes. I'm not certain what OSX does, but I believe it
uses bytes also.

In GHC <= 7.0, the String type was used for both sorts of paths, with
interpretation of the contents being OS-dependent. This sort of works,
because it's possible to represent both byte- and text-based paths in
String.

GHC 7.2 assumes Linux/BSD paths are text, which 1) silently breaks all
existing code and 2) makes it impossible to fix within the given API.
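
To make the <= 7.0 convention concrete, here is a minimal sketch of the
byte-per-Char representation described above. It uses only base; the names
bytesToPath and pathToBytes are hypothetical helpers for illustration, not
functions from directory, unix, or system-fileio.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: store each byte of a POSIX path in one Char
-- (code points 0..255), which is how a byte path can live in a String.
bytesToPath :: [Word8] -> String
bytesToPath = map (chr . fromIntegral)

-- Hypothetical inverse: recover the original bytes. Returns Nothing if
-- the String holds a Char above 255, i.e. it was not built from bytes.
pathToBytes :: String -> Maybe [Word8]
pathToBytes = mapM toByte
  where
    toByte c
      | ord c <= 0xFF = Just (fromIntegral (ord c))
      | otherwise     = Nothing
```

Under this scheme any byte sequence round-trips, including names that are
not valid UTF-8 (the byte 0xFF never occurs in UTF-8, yet it is legal in a
Linux filename). The cost is that the String is not real text: the UTF-8
bytes 0xC3 0xA9 come back as two Chars rather than a single 'é'. Decoding
the bytes as text avoids that, but a decoder has no faithful answer for a
name that is not valid in the assumed encoding.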

On Tue, Nov 1, 2011 at 08:48, Felipe Almeida Lessa
<felipe.lessa at> wrote:
> On Tue, Nov 1, 2011 at 5:16 AM, Ganesh Sittampalam <ganesh at> wrote:
>> I'm just investigating what we can do about a problem with darcs'
>> handling of non-ASCII filenames on GHC 7.2.
>> The issue is apparently that as of GHC 7.2, getDirectoryContents now
>> tries to decode filenames in the current locale, rather than converting
>> a stream of bytes into characters:
>> I found an old thread on the subject:
>> and
>> some GHC tickets (e.g.
>> Can anyone point me at the rationale and details of the change and/or
>> suggest workarounds?
> You could try using system-fileio [1], but by reading its source code
> I guess that it may have the same bug (since it tries to decode what
> the directory package gives).  I'm CCing John Millikin, its
> maintainer.
