UTF-8 library
George Russell
ger@tzi.de
Tue, 06 Aug 2002 18:11:04 +0200
Axel wrote
[snip]
> I guess that is a good point, but due to backwards compatibility this is
> probably not acceptable: The C interface of the FFI has the string
> functions:
> peekCString :: CString -> IO String
> newCString :: String -> IO CString
>
> which should really be
>
> peekCString :: CString -> IO [Word8]
> newCString :: [Word8] -> IO CString
>
> Unless that changes, there is really no point to give the encode and
> decode functions that type.
[snip]
Such a change would be annoying, since I have already used peekCString and
newCString quite a lot. (They are a great improvement on what we had before!)
Converting CStrings to [Word8] is probably a bad idea anyway, since there is
absolutely no reason to assume a C character will be only 8 bits long, and
under some implementations it isn't.
A better suggestion would be to provide ALTERNATIVE functions which
go from CString/CStringLen and friends to [CChar], and make your UTF8
converters go between [CChar] and String. However, we should not be forced
to do this every time we want to construct a CString from a String (a very
common need when calling C functions), so the existing functions should remain
with their existing semantics.
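
To make the suggestion concrete, here is a rough sketch of what such
alternative functions might look like. All the names below (peekCChars,
newCChars, encodeUtf8, decodeUtf8, newUtf8CString, peekUtf8CString) are
made up for illustration and are not part of the FFI. The sketch assumes
that each CChar carries exactly one UTF-8 octet, that the input to the
decoder is well-formed UTF-8 (no validation is done), and that the String
contains no '\0' characters. peekCString and newCString are left untouched.

```haskell
module Utf8CString where

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.C.Types (CChar)
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (newArray0, peekArray0)

-- Hypothetical alternatives to peekCString/newCString that expose the
-- raw C characters instead of converting them to Haskell Chars.
peekCChars :: CString -> IO [CChar]
peekCChars = peekArray0 0      -- read up to (not including) the NUL

newCChars :: [CChar] -> IO CString
newCChars = newArray0 0        -- malloc'd, NUL-terminated; caller frees

-- UTF-8 octets for a single Unicode code point.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Assumption: one octet per CChar (fromIntegral wraps if CChar is signed).
encodeUtf8 :: String -> [CChar]
encodeUtf8 = map fromIntegral . concatMap encodeChar

-- Assumption: the input is well-formed UTF-8; malformed input is not detected.
decodeUtf8 :: [CChar] -> String
decodeUtf8 = go . map fromIntegral
  where
    go :: [Word8] -> String
    go [] = []
    go (b : bs)
      | b < 0x80  = chr (fromIntegral b) : go bs
      | b < 0xE0  = multi 1 (b .&. 0x1F)
      | b < 0xF0  = multi 2 (b .&. 0x0F)
      | otherwise = multi 3 (b .&. 0x07)
      where
        -- Fold k continuation octets into the bits taken from the lead octet.
        multi k lead =
          let (cs, rest) = splitAt k bs
              step acc x = (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F)
              n          = foldl step (fromIntegral lead) cs :: Int
          in chr n : go rest

-- Convenience wrappers built from the pieces above.
newUtf8CString :: String -> IO CString
newUtf8CString = newCChars . encodeUtf8

peekUtf8CString :: CString -> IO String
peekUtf8CString = fmap decodeUtf8 . peekCChars

-- Usage: marshal a String out to C as UTF-8 and read it back.
roundTrip :: String -> IO String
roundTrip s = do
  p <- newUtf8CString s
  r <- peekUtf8CString p
  free p
  return r
```

Code that only needs the existing behaviour keeps calling newCString and
peekCString; only code that cares about UTF-8 has to opt in to the extra
functions.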