Raw filenames vs locales

Udo Stenzel u.stenzel at web.de
Sat Jul 30 12:13:21 EDT 2005


Ian Lynagh wrote:
> ===========
> The problem
> ===========
> 
> With its closer adherence to the Haskell 98 report, it is no longer
> possible with hugs to manipulate files using the standard IO functions
> if the filenames are not representable in your locale.

Note that this basically means your filesystem is broken.  This
situation can only occur if filenames are written under one locale and
then read under another.  This "problem" cannot really be fixed, only
worked around.


> UTF-8:       65533 = U+FFFD = "replacement character"
> 
> =================
> Proposed solution
> =================

I have a simpler proposal: allocate 128 "replacement characters" in the
"Vendor Zone" of Unicode.  Their purpose is to act as placeholders for
incorrect UTF-8.  Use these replacement characters when decoding UTF-8,
and reproduce the original, broken byte sequence when re-encoding.
Under ordinary circumstances these codes should never occur in strings.
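
To make this concrete, here is a rough sketch of how such a decoder and
encoder might look.  It is only an illustration under assumptions of my
own: the names (escapeBase, decode, encode) are hypothetical, and the
range U+F780..U+F7FF is an arbitrary pick from the private use area, not
an allocated one.  Only bytes 0x80..0xFF can ever be invalid, which is
why 128 placeholders suffice.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Assumption: byte b in 0x80..0xFF is represented by U+F700 + b,
-- i.e. the 128 private-use code points U+F780..U+F7FF.
escapeBase :: Int
escapeBase = 0xF700

isEscapeCode :: Int -> Bool
isEscapeCode n = n >= escapeBase + 0x80 && n <= escapeBase + 0xFF

escape :: Word8 -> Char
escape b = chr (escapeBase + fromIntegral b)

-- Decode UTF-8; every byte that is not part of a well-formed
-- sequence becomes its own placeholder character.
decode :: [Word8] -> String
decode [] = []
decode (b:bs)
  | b < 0x80  = chr (fromIntegral b) : decode bs
  | otherwise = case sequenceAt b bs of
      Just (c, rest) -> c : decode rest
      Nothing        -> escape b : decode bs

-- Try to read one multi-byte sequence starting with lead byte b.
sequenceAt :: Word8 -> [Word8] -> Maybe (Char, [Word8])
sequenceAt b bs
  | b >= 0xC0 && b < 0xE0 = go 1 0x80    (fromIntegral b .&. 0x1F)
  | b >= 0xE0 && b < 0xF0 = go 2 0x800   (fromIntegral b .&. 0x0F)
  | b >= 0xF0 && b < 0xF8 = go 3 0x10000 (fromIntegral b .&. 0x07)
  | otherwise             = Nothing
  where
    go :: Int -> Int -> Int -> Maybe (Char, [Word8])
    go n lo lead
      | length conts == n && all isCont conts && ok = Just (chr cp, rest)
      | otherwise                                   = Nothing
      where
        (conts, rest) = splitAt n bs
        cp = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F))
                   lead conts
        -- Reject overlong forms, surrogates, out-of-range values, and
        -- encoded placeholders (so that re-encoding stays lossless).
        ok = cp >= lo && cp <= 0x10FFFF
             && not (cp >= 0xD800 && cp <= 0xDFFF)
             && not (isEscapeCode cp)
    isCont c = c .&. 0xC0 == 0x80

-- Encode to UTF-8; placeholders turn back into the original byte.
encode :: String -> [Word8]
encode = concatMap encodeChar

encodeChar :: Char -> [Word8]
encodeChar c
  | isEscapeCode n = [fromIntegral (n - escapeBase)]
  | n < 0x80       = [fromIntegral n]
  | n < 0x800      = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000    = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise      = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)
```

With this sketch, a Latin-1 filename such as the bytes 66 6F 6F E9 62 61
72 decodes to "foo", the placeholder U+F7E9, then "bar", and encodes
back to the same seven bytes.  Note one design choice: a placeholder
that arrives correctly encoded in the input is itself treated as broken
and escaped byte by byte.  Otherwise encode . decode would not be the
identity on arbitrary byte strings.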


> =======================
> Backwards compatibility
> =======================

comes at no additional cost ;-)

 

Udo.
-- 
It's not that perl programmers are idiots, it's that the language
rewards idiotic behavior in a way that no other language or tool has
ever done. -- Erik Naggum