UTF-8 library
Axel Simon
A.Simon@ukc.ac.uk
Wed, 7 Aug 2002 17:18:06 +0100
On Wed, Aug 07, 2002 at 03:29:33PM +0200, George Russell wrote:
> Ashley Yakeley wrote
> [snip]
> > Text encoded with ISO 8859-1 or UTF-8 is octets. If you want to use
> > CChars, you should then subsequently convert the Word8s into CChars.
> We were talking about converting CStrings, which are necessarily sequences of CChars.
> I have to say I do not relish the prospect of replacing the current peekCString
> interface by three functions which
> (1) translate a CString/CStringLen into [CChar]
> (2) translate a [CChar] into a [Word8]
> (3) translate a [Word8] into a String
> (and of course the inverse functions to go in the other direction.)
I guess you can avoid (2) or (3) in practice.
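
For illustration, here is a minimal sketch of the three steps listed in the
quoted text, assuming the C string holds UTF-8. The names peekCChars,
ccharsToOctets, decodeUtf8 and peekCStringViaOctets are hypothetical and not
part of any existing library; the decoder is deliberately lenient (it does not
reject overlong forms or surrogates) and only shows the shape of the pipeline.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.C.Types (CChar)
import Foreign.Marshal.Array (peekArray0)

-- (1) CString -> [CChar]: read up to (not including) the NUL terminator.
peekCChars :: CString -> IO [CChar]
peekCChars = peekArray0 0

-- (2) [CChar] -> [Word8]: reinterpret each (possibly signed) char as an octet.
ccharsToOctets :: [CChar] -> [Word8]
ccharsToOctets = map fromIntegral

-- (3) [Word8] -> String: lenient UTF-8 decoder (hypothetical helper).
-- Malformed sequences yield U+FFFD and decoding resumes at the next octet.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b:bs)
  | b < 0x80              = chr (fromIntegral b) : decodeUtf8 bs
  | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F)
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F)
  | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07)
  | otherwise             = '\xFFFD' : decodeUtf8 bs
  where
    isCont c = c .&. 0xC0 == 0x80
    multi n lead =
      let (cont, rest) = splitAt n bs
          step acc c   = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
          code         = foldl step (fromIntegral lead) cont :: Int
      in if length cont == n && all isCont cont && code <= 0x10FFFF
           then chr code : decodeUtf8 rest
           else '\xFFFD' : decodeUtf8 bs

-- All three steps composed into a single peek function.
peekCStringViaOctets :: CString -> IO String
peekCStringViaOctets cstr =
  fmap (decodeUtf8 . ccharsToOctets) (peekCChars cstr)
```

Since step (2) is just map fromIntegral, a library could fold it into (1) or
(3), so a caller never has to see all three steps.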
> If we want to do this, we certainly need to keep the existing functions since I
> certainly don't want to have to pass a String through three separate transformations
> just to make it suitable for C.
I don't see a problem with supplying a backward-compatible withCString
function, even if it might use the current locale to do the conversion.
Let's just wait till someone actually has a ready-to-criticise library for
ghc at hand.
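
As a rough sketch of what such a compatible interface can look like,
present-day versions of GHC's base package (not the libraries available when
this was written) provide encoding-parameterised marshalling functions in the
GHC.Foreign module. The wrapper names withUtf8CString and peekUtf8CString
below are hypothetical; they keep the shape of the existing withCString and
peekCString while fixing the encoding to UTF-8 rather than the current locale.

```haskell
import Foreign.C.String (CString)
import qualified GHC.Foreign as GHC
import System.IO (utf8)

-- Same shape as withCString, but the String is always encoded as UTF-8.
withUtf8CString :: String -> (CString -> IO a) -> IO a
withUtf8CString = GHC.withCString utf8

-- Same shape as peekCString, but the octets are always decoded as UTF-8.
peekUtf8CString :: CString -> IO String
peekUtf8CString = GHC.peekCString utf8
```

A caller passes the String through a single function in each direction, which
is the property the existing interface has.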
Axel.