[Haskell-cafe] Core packages and locale support

Brandon S Allbery KF8NH allbery at ece.cmu.edu
Fri Jun 25 05:00:08 EDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 6/25/10 02:42 , Roman Cheplyaka wrote:
> * Jason Dagit <dagit at codersbase.com> [2010-06-24 20:52:03-0700]
>> On Sat, Jun 19, 2010 at 1:06 AM, Roman Cheplyaka <roma at ro-che.info> wrote:
>>> While GHC 6.12 finally has proper locale support, core packages (such as
>>> unix) still use withCString and therefore work incorrectly when an
>>> argument (e.g. a file path) is not ASCII.
>>
>> Pardon me if I'm misunderstanding withCString, but my understanding of unix
>> paths is that they are to be treated as strings of bytes.  That is, unlike
>> windows, they do not have an encoding predefined.  Furthermore, you could
>> have two filepaths in the same directory with different encodings due to
>> this.
> 
> You got everything right here. So, as you said, there is a mismatch
> between the representation in Haskell (a list of code points) and the
> representation in the operating system (a list of bytes), so we need to
> know the encoding. The encoding is supplied by the user via the locale
> (https://secure.wikimedia.org/wikipedia/en/wiki/Locale), in particular
> the LC_CTYPE variable.

You might want to look at how Python is dealing with this (including the
pain involved; best to learn from example).
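
To make the code-point/byte mismatch quoted above concrete, here is a
minimal sketch. It assumes a newer base than the one shipped with ghc 6.12:
GHC.Foreign.withCStringLen and the utf8/char8 encodings from
GHC.IO.Encoding are not available there as far as I know. encodeWith is a
hypothetical helper written for this illustration, not something the unix
package provides.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (TextEncoding, char8, utf8)

-- | Hypothetical helper: marshal a String with an explicit encoding
-- and read back the raw bytes the operating system would receive.
encodeWith :: TextEncoding -> String -> IO [Word8]
encodeWith enc s =
  GHC.withCStringLen enc s $ \(ptr, len) -> peekArray len (castPtr ptr)

main :: IO ()
main = do
  let path = "caf\233"              -- four code points; the last is U+00E9
  encodeWith utf8  path >>= print   -- [99,97,102,195,169]
  encodeWith char8 path >>= print   -- [99,97,102,233] (code point mod 256)
```

The same Haskell String yields two different byte sequences, and both are
valid, distinct Unix file names that can coexist in one directory. Which
one the user means depends on the encoding, which is why the locale comes
into it.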

- -- 
brandon s. allbery     [linux,solaris,freebsd,perl]      allbery at kf8nh.com
system administrator  [openafs,heimdal,too many hats]  allbery at ece.cmu.edu
electrical and computer engineering, carnegie mellon university      KF8NH
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwkcAYACgkQIn7hlCsL25W4BgCfVEyndklgo2TOyyemqdTKGkvS
dBMAoKq3t9vMOkZZHiEHkIN5IDjgVbRt
=69C5
-----END PGP SIGNATURE-----

