[Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Simon Marlow
marlowsd at gmail.com
Tue Jun 16 09:02:43 EDT 2009
On 16/06/2009 13:46, Yitzchak Gale wrote:
> Simon Marlow wrote:
>>>> Care to submit a patch to put this in System.Directory, or better still
>>>> put the relevant functionality in System.Win32 and use it in
>>>> System.Directory?
>
> Bulat Ziganshin wrote:
>>> now getDirectoryContents return ACP (ansi code page) names so openFile
>>> works for files 1) and 2).
>>> With such change getDirectoryContents will return correct unicode
>>> names, so openFile will work only with names in first group.
>>> The right way is to fix ALL string-related calls in System.IO,
>>> System.Posix.Internals, System.Environment.
>
>> You're right in that we really ought to fix everything. However, I'm happy
>> to just fix some of these things, even if it introduces some inconsistencies
>> in the meantime. We already have much of System.Directory working with
>> Unicode FilePaths, so there are already inconsistencies here.
>
> +1 for integrating Unicode file paths. Thanks, Bulat!
Excuse my ignorance, but... what Unicode file paths?
> I think the most important use cases that should not break are:
>
> o open/read/write a FilePath from getArgs
> o open/read/write a FilePath from getDirectoryContents
>
> There's not much we can do about non-Latin-1 ACP file paths
> hard coded in Strings. I hope there aren't too many
> of those in the wild.
The following cases are currently broken:
* Calling openFile on a literal Unicode FilePath (note, not
ACP-encoded, just Unicode).
* Reading a Unicode FilePath from a text file and then calling
openFile on it.
I propose to fix these (on Windows). That means your second case above
(a FilePath from getDirectoryContents) will break until someone fixes
getDirectoryContents.
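For anyone who wants to check the first broken case against their own GHC, here is a minimal sketch. The file name and the helper openLiteral are made up for illustration, and forcing the file system encoding to UTF-8 is my assumption so that the result does not depend on the POSIX locale; it says nothing about which GHC version behaves which way.

```haskell
import GHC.IO.Encoding (setFileSystemEncoding, utf8)
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))
import System.IO (IOMode (ReadMode), hGetLine, withFile)

-- Hypothetical file name made of code points above 0xFF
-- (U+03BB and U+0436), written as a plain Unicode String literal,
-- i.e. not ACP-encoded.
unicodeName :: FilePath
unicodeName = "\955-\1078.txt"

-- Create the file in the given directory, reopen it through the same
-- literal FilePath, read one line back, and clean up.
openLiteral :: FilePath -> IO String
openLiteral dir = do
  -- Assumption: pin path marshalling to UTF-8 so the sketch does not
  -- depend on the current locale.
  setFileSystemEncoding utf8
  let path = dir </> unicodeName
  writeFile path "hello\n"
  line <- withFile path ReadMode hGetLine
  removeFile path
  pure line
```

Running `getTemporaryDirectory >>= openLiteral >>= putStrLn` should print the line back if literal Unicode FilePaths work; an exception from writeFile or withFile indicates the breakage described above.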
Also currently broken:
* Calling removeFile on a FilePath you get from getDirectoryContents,
amongst other System.Directory operations.
Fixing getDirectoryContents will fix these.
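The round trip in question can be sketched as below. This is only a test harness under stated assumptions: roundTrip and the scratch directory are hypothetical names, the directory must not already exist, and the UTF-8 file system encoding is forced so the outcome does not hinge on the POSIX locale.

```haskell
import Control.Monad (forM_)
import Data.List (sort)
import GHC.IO.Encoding (setFileSystemEncoding, utf8)
import System.Directory
  ( createDirectory, getDirectoryContents, getTemporaryDirectory
  , removeDirectory, removeFile )
import System.FilePath ((</>))

-- Create empty files with the given names in a fresh directory, list
-- them with getDirectoryContents, and pass each listed name straight
-- to removeFile. Returns the listed names, sorted.
roundTrip :: FilePath -> [FilePath] -> IO [FilePath]
roundTrip dir names = do
  -- Assumption: pin path marshalling to UTF-8 (locale-independent).
  setFileSystemEncoding utf8
  createDirectory dir
  forM_ names $ \n -> writeFile (dir </> n) ""
  listed <- filter (`notElem` [".", ".."]) <$> getDirectoryContents dir
  -- This is the step reported as broken: the names come from
  -- getDirectoryContents and go unchanged into removeFile.
  forM_ listed $ \n -> removeFile (dir </> n)
  -- Fails if any file could not be removed above.
  removeDirectory dir
  pure (sort listed)
```

For example, calling it with a scratch directory under getTemporaryDirectory and the names ["plain.txt", "\955.txt", "\1078.txt"] should return the same three names if listing and removal agree on the encoding.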
I don't know how getArgs fits in here - should we be decoding argv using
the ACP?
Cheers,
Simon