[Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

Simon Marlow marlowsd at gmail.com
Tue Jun 16 07:30:31 EDT 2009


On 14/06/2009 05:56, Judah Jacobson wrote:
> On Sat, Jun 13, 2009 at 8:41 PM, Shu-yu Guo <shu at rfrn.org> wrote:
>> Hello all,
>>
>> It seems like getDirectoryContents applies codepage conversion based
>> on the default program locale under Windows. What this means is that
>> if my default codepage is some kind of Latin, Asian glyphs get
>> returned as '?' in the filename. By '?' I don't mean that the font is
>> lacking the glyph and rendering it as '?', but I mean 'show (head
>> (getDirectoryContents "C:\\Music"))' returns something that looks like
>> "?? ????".
>>
>> This is a problem as I can't get the filenames of my music directory,
>> some of which are in Japanese and Chinese, some of which have accents.
>> If I change the default codepage to Japanese, say, then I get the
>> Japanese filenames in Shift-JIS and I lose all the accented letters.
>>
>> I have filed this as a bug already, but is there a workaround in the
>> meantime (I don't know the Win32 API, but didn't see anything that
>> looked like it would help under System.Win32 anyways) that lets me
>> get the list of files in a directory that's encoded in some kind of
>> Unicode?
>
> Try taking a look at the code in the following module, which uses FFI
> to access the Unicode-aware Win32 APIs:
>
> http://code.haskell.org/haskeline/System/Console/Haskeline/Directory.hsc

Care to submit a patch to put this in System.Directory, or better still 
put the relevant functionality in System.Win32 and use it in 
System.Directory?

Cheers,
	Simon
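
The '?' substitution described in the original report can be illustrated
with a small, portable model. This is only a sketch: lossyEncode and
isLatin1 are hypothetical names, not part of System.Directory or
System.Win32, and treating a "Latin" code page as "code points up to
0xFF" is a simplifying assumption rather than the actual Windows
conversion tables.

```haskell
import Data.Char (ord)

-- Hypothetical model of a lossy code page conversion: every character
-- the target code page cannot represent is replaced by '?'.
lossyEncode :: (Char -> Bool) -> String -> String
lossyEncode representable = map keepOrReplace
  where
    keepOrReplace c
      | representable c = c
      | otherwise       = '?'

-- Assumption: approximate a Latin code page as "code point <= 0xFF".
isLatin1 :: Char -> Bool
isLatin1 c = ord c <= 0xFF

main :: IO ()
main = do
  -- Accented letters survive the Latin conversion ...
  print (lossyEncode isLatin1 "café")
  -- ... but Japanese characters collapse into question marks.
  print (lossyEncode isLatin1 "音楽 フォルダ")
```

Once the original characters have been replaced this way they cannot be
recovered from the returned String, which is why the suggested
workaround calls the Unicode-aware Win32 APIs directly instead of trying
to re-decode the result.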

