[Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

Yitzchak Gale gale at sefer.org
Wed Jun 17 10:03:14 EDT 2009

Simon Marlow wrote:
>>> The following cases are currently broken...
>>> I propose to fix these (on Windows).  It will mean that your second case
>>> above will be broken, until someone fixes getDirectoryContents...
> ...it's a lot easier on Windows...
> on Unix I don't have a clear idea of how to proceed...
> If someone else has a good understanding of what
> needs done, please wade in.
>>> I don't know how getArgs fits in here...
> I agree it's broken and needs to be fixed.

OK, would you like me to reflect this discussion in tickets?
So far we have #3300; I don't see anything else.

Do you want two tickets, one each for Windows/Unix? Or
four, separating the FilePath and getArgs issues?

> On Unix, all file APIs take [Word8]...
> So we should probably be converting from FilePath to
> [Word8] by encoding using the current locale...
> what about encoding errors,

Where relevant, we should emulate what the common
shells do. In general, I don't see why encoding errors
should be treated differently from any other file operation
error.

> and what if encode.decode is not the identity due to normalisation

Well, is it common for people using typical input methods
and common shells to create file paths containing
text that decodes to non-normalized Unicode?

I'm guessing not. If that's the case, then we don't really have
to worry about it. People who went out of their way to create
a weird file name will have the same troubles they have
always had with that in Unix.

But perhaps a better solution would be to make the underlying
type of FilePath platform-dependent - e.g., String on Windows
and [Word8] on Unix - and let it support platform-
independent methods such as to/from String, to/from Bytes,
setEncoding (defaulting to the current locale). That way,
pass-through file paths will always work flawlessly on any
platform, and applications have complete flexibility
to deal with any other scenario however they choose. It's a
breaking change though.
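
To make that concrete, here is a rough sketch of what the Unix
side might look like. Everything in it is hypothetical: the names
RawPath, Encoding, fromBytes, toBytes, fromString and toString
are mine and are not part of any existing library. Instead of a
setEncoding that defaults to the current locale, the sketch
passes the encoding explicitly, and it hand-rolls UTF-8 only to
stay self-contained. The point is that bytes are stored
untouched, so pass-through never fails, while conversion to or
from String can fail and says so with Maybe.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical: an encoding is a pair of conversions that may fail.
data Encoding = Encoding
  { encName :: String
  , encode  :: String -> Maybe [Word8]
  , decode  :: [Word8] -> Maybe String
  }

-- Hypothetical Unix-flavoured path: raw bytes, never altered in transit.
newtype RawPath = RawPath [Word8] deriving (Eq, Show)

fromBytes :: [Word8] -> RawPath
fromBytes = RawPath

toBytes :: RawPath -> [Word8]
toBytes (RawPath bs) = bs

-- Conversions through String are explicit and partial.
fromString :: Encoding -> String -> Maybe RawPath
fromString enc = fmap RawPath . encode enc

toString :: Encoding -> RawPath -> Maybe String
toString enc (RawPath bs) = decode enc bs

-- A hand-written UTF-8 codec, for illustration only.
utf8 :: Encoding
utf8 = Encoding "UTF-8" utf8Encode utf8Decode

utf8Encode :: String -> Maybe [Word8]
utf8Encode = fmap concat . mapM enc
  where
    enc c
      | n < 0x80    = Just [b n]
      | n < 0x800   = Just [b (0xC0 .|. shiftR n 6), cont n]
      | n >= 0xD800 && n <= 0xDFFF = Nothing  -- surrogates are not encodable
      | n < 0x10000 = Just [b (0xE0 .|. shiftR n 12), cont (shiftR n 6), cont n]
      | otherwise   = Just [ b (0xF0 .|. shiftR n 18), cont (shiftR n 12)
                           , cont (shiftR n 6), cont n ]
      where
        n = ord c
    b = fromIntegral :: Int -> Word8
    cont x = b (0x80 .|. (x .&. 0x3F))

utf8Decode :: [Word8] -> Maybe String
utf8Decode [] = Just []
utf8Decode (w : ws)
  | w < 0x80           = (chr (fromIntegral w) :) <$> utf8Decode ws
  | w .&. 0xE0 == 0xC0 = multi 1 (w .&. 0x1F) 0x80
  | w .&. 0xF0 == 0xE0 = multi 2 (w .&. 0x0F) 0x800
  | w .&. 0xF8 == 0xF0 = multi 3 (w .&. 0x07) 0x10000
  | otherwise          = Nothing
  where
    -- k continuation bytes follow; minCp rejects overlong forms.
    multi k lead minCp =
      let (cs, rest) = splitAt k ws
          step acc c = shiftL acc 6 .|. fromIntegral (c .&. 0x3F)
          cp = foldl step (fromIntegral lead) cs :: Int
          ok = length cs == k
                 && all (\c -> c .&. 0xC0 == 0x80) cs
                 && cp >= minCp && cp <= 0x10FFFF
                 && not (cp >= 0xD800 && cp <= 0xDFFF)
      in if ok then (chr cp :) <$> utf8Decode rest else Nothing

main :: IO ()
main = do
  let weird = fromBytes [0xFF, 0x41]  -- not valid UTF-8
  print (toBytes weird == [0xFF, 0x41])             -- pass-through keeps the bytes
  print (toString utf8 weird)                       -- decoding is allowed to fail
  print (fmap toBytes (fromString utf8 "caf\233"))  -- encoding on request
```

On Windows the same interface would presumably wrap a String (or
UTF-16 code units) instead, and toBytes/fromBytes would become
the partial operations.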
