[Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

Simon Marlow marlowsd at gmail.com
Wed Jun 17 10:18:01 EDT 2009


On 17/06/2009 15:03, Yitzchak Gale wrote:
> Simon Marlow wrote:
>>>> The following cases are currently broken...
>>>> I propose to fix these (on Windows).  It will mean that your second case
>>>> above will be broken, until someone fixes getDirectoryContents...
>> ...it's a lot easier on Windows...
>> on Unix I don't have a clear idea of how to proceed...
>> If someone else has a good understanding of what
>> needs done, please wade in.
>>>> I don't know how getArgs fits in here...
>> I agree it's broken and needs to be fixed.
>
> OK, would you like me to reflect this discussion in tickets?
> Let's see, so far we have #3300, I don't see anything else.
>
> Do you want two tickets, one each for Windows/Unix? Or
> four, separating the FilePath and getArgs issues?

One for each issue is usually better, so four.  Thanks!

>> On Unix, all file APIs take [Word8]...
>> So we should probably be converting from FilePath to
>> [Word8] by encoding using the current locale...
>> what about encoding errors,
>
> Where relevant, we should emulate what the common
> shells do. In general, I don't see why they should be different
> than any other file operation error.
>
>> and what if encode.decode is not the identity due to normalisation
>
> Well, is it common for people using typical input methods
> and common shells to create file paths containing
> text that decodes to non-normalized Unicode?
>
> I'm guessing not. If that's the case, then we don't really have
> to worry about it. People who went out of their way to create
> a weird file name will have the same troubles they have
> always had with that in Unix.
>
> But perhaps a better solution would be to make the underlying
> type of FilePath platform-dependent - e.g., String on Windows
> and [Word8] on Unix - and let it support platform-
> independent methods such as to/from String, to/from Bytes,
> setEncoding (defaulting to the current locale). That way,
> pass-through file paths will always work flawlessly on any
> platform, and applications have complete flexibility
> to deal with any other scenario however they choose. It's a
> breaking change though.

Yes, we could do a lot better if FilePath were an abstract type, but sadly
it is not, and we can't change that without breaking Haskell 98
compatibility, not to mention tons of existing code.
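
For illustration only, here is a minimal sketch of what such an abstract
path type might look like on Unix. Nothing here is an existing library
API: RawPath, fromBytes, toBytes, fromString and encodeChar are
hypothetical names, and UTF-8 is assumed as a stand-in for "the current
locale" encoding. The point it shows is that a pass-through path is never
decoded, so even bytes that are not valid text survive unchanged.

```haskell
module Main where

import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical abstract path type: the raw bytes, as Unix file APIs see them.
newtype RawPath = RawPath [Word8]
  deriving (Eq, Show)

-- Pass-through conversions: no decoding happens, so nothing can be lost.
fromBytes :: [Word8] -> RawPath
fromBytes = RawPath

toBytes :: RawPath -> [Word8]
toBytes (RawPath bs) = bs

-- Build a path from a String. ASSUMPTION: UTF-8 is used here in place of
-- the current locale's encoding, which a real design would have to consult.
fromString :: String -> RawPath
fromString = RawPath . concatMap encodeChar

-- Plain UTF-8 encoding of one code point. Surrogate code points are not
-- rejected, which a real implementation would need to decide about.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

main :: IO ()
main = do
  let weird = [0x66, 0xFF, 0x6F]              -- "f", an invalid byte, "o"
  print (toBytes (fromBytes weird) == weird)  -- prints True
  print (toBytes (fromString "\233"))         -- U+00E9, prints [195,169]
```

The reverse direction (bytes to String) is left out on purpose: it is
exactly where the encoding-error and normalisation questions raised above
would have to be answered.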

Cheers,
	Simon

