Adding System.FilePath

Sven Panne sven.panne at
Fri Mar 16 09:24:47 EDT 2007

On Friday 16 March 2007 00:15, Wolfgang Thaller wrote:
> Indeed, paths and command-line arguments are becoming very string-
> like on Unix systems, too.
> On Mac OS X, the locale for file names is pretty much hardcoded to
> UTF-8. Mac OS X's native file system stores file names in UTF-16, but
> the POSIX layer sees it as UTF-8. [...]

I had a look at how other languages and toolkits handle this issue. For those 
which do not completely ignore it, the consensus seems to be:

   * On Mac OS X, the POSIX layer indeed seems to use UTF-8, but in a 
*decomposed* form. This could be a little bit surprising, so some 
normalization is probably needed.

   * On Windows, the current ANSI code page is assumed, which could vary from 
installation to installation and can be changed by the user AFAIK.

   * For *nixes the story is a bit tricky, but often a combination of

         * nl_langinfo(CODESET)
         * setlocale(LC_CTYPE, 0)
         * the environment variables LC_ALL, LC_CTYPE and LANG
         * iconv

     is used to figure out the current local encoding and use that. Depending 
on the distribution, it can be UTF-8 (e.g. recent SuSE distros), but it 
doesn't have to be.
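
A minimal sketch of the environment-variable part of that lookup, assuming the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) and locale names of the form language[_territory][.codeset][@modifier]. The names codesetOf, ctypeLocale and guessCodeset are hypothetical helpers made up for illustration. This only approximates what setlocale plus nl_langinfo(CODESET) report, since the C library may resolve aliases or pick a default codeset when the name carries none.

```haskell
import Data.Maybe (listToMaybe)
import System.Environment (getEnvironment)

-- | Codeset part of a POSIX locale name of the form
-- language[_territory][.codeset][@modifier], e.g. "de_DE.UTF-8@euro".
-- (Hypothetical helper, not part of any library.)
codesetOf :: String -> Maybe String
codesetOf name =
  case break (== '.') name of
    (_, '.' : rest)
      | cs <- takeWhile (/= '@') rest, not (null cs) -> Just cs
    _ -> Nothing

-- | The locale name that governs character handling: the first non-empty
-- value among LC_ALL, LC_CTYPE and LANG, in that order.
ctypeLocale :: [(String, String)] -> Maybe String
ctypeLocale env =
  listToMaybe
    [ v | k <- ["LC_ALL", "LC_CTYPE", "LANG"]
        , Just v <- [lookup k env]
        , not (null v) ]

-- | Guess the codeset from the process environment. Nothing means the
-- locale name carries no explicit codeset (e.g. "C" or "de_DE").
guessCodeset :: IO (Maybe String)
guessCodeset = do
  env <- getEnvironment
  return (ctypeLocale env >>= codesetOf)
```

Loaded into GHCi, guessCodeset can be run directly. A result of Nothing is the case where one would have to fall back to asking the C library.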

So I propose a compromise (we don't really have to be better than most 
languages/toolkits out there): let's keep FilePath = String, but improve the 
real culprit, i.e. CString and friends. Currently, peekCString{,Len}, 
newCString{,Len} and withCString{,Len} simply use their 8-bit "CA" 
counterparts (peekCAString and friends). If we put the above common logic 
into Foreign.C.String, we could already achieve a lot.
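
To illustrate the difference, here is a hedged sketch that contrasts the 8-bit "CA" marshalling with marshalling through an explicit encoding. It assumes a current GHC whose base package ships GHC.Foreign, which takes a TextEncoding argument; that module is not part of the proposal above, and roundTripNarrow, roundTripWith and encodedLength are hypothetical names chosen for this example.

```haskell
import Foreign.C.String (peekCAString, withCAString)
import qualified GHC.Foreign as Enc
import System.IO (TextEncoding, utf8)

-- | Marshal a String to a C string and back with the "CA" functions,
-- which truncate every Char to 8 bits.
roundTripNarrow :: String -> IO String
roundTripNarrow s = withCAString s peekCAString

-- | The same round trip, but through an explicit text encoding.
roundTripWith :: TextEncoding -> String -> IO String
roundTripWith enc s = Enc.withCString enc s (Enc.peekCString enc)

-- | Number of bytes the string occupies in the given encoding
-- (without a terminating NUL).
encodedLength :: TextEncoding -> String -> IO Int
encodedLength enc s = Enc.withCStringLen enc s (return . snd)
```

With these definitions, roundTripWith utf8 returns a string such as "\8364" (the euro sign) unchanged, while roundTripNarrow does not, because the code point does not fit into a single C char.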

In addition, we might consider adding some API entries (e.g. ByteString-based 
ones) to the POSIX package for the real low-level stuff, but I think this is 
not a top priority.


