UTF-8 library
Manuel M T Chakravarty
chak@cse.unsw.edu.au
Thu, 08 Aug 2002 19:28:18 +1000 (EST)
Axel Simon <A.Simon@ukc.ac.uk> wrote,
> On Wed, Aug 07, 2002 at 02:54:47AM -0700, Ashley Yakeley wrote:
> > At 2002-08-07 02:43, Axel Simon wrote:
> >
> > >But the point was that C might have different sized characters and that
> > >these functions would still be portable even if the size of CChar changes.
> >
> > Text encoded with ISO 8859-1 or UTF-8 is octets. If you want to use
> > CChars, you should then subsequently convert the Word8s into CChars.
> Then I hope there is no C implementation where char is less than 8 bits
> long.
ANSI C guarantees that char is exactly 1 byte (more precisely, that
"sizeof (char)" == 1) and that a byte has at least 8 bits (CHAR_BIT
in <limits.h> is required to be >= 8).
Manuel
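
A minimal sketch in C of how the guarantees discussed in this message
could be checked on a given platform. The helper names (bits_per_char,
octet_fits_in_char, check_char_guarantees) are hypothetical and exist
only for illustration; CHAR_BIT and UCHAR_MAX are the standard macros
from <limits.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bits in one C char on this platform. */
static int bits_per_char(void)
{
    return CHAR_BIT; /* standard macro; ANSI C requires it to be >= 8 */
}

/* Hypothetical helper: returns 1 if the value fits in an unsigned char,
 * which is always true for an octet (0..255) on a conforming compiler. */
static int octet_fits_in_char(unsigned int octet)
{
    return octet <= UCHAR_MAX;
}

/* Hypothetical helper: aborts via assert if any guarantee is violated. */
static void check_char_guarantees(void)
{
    assert(sizeof(char) == 1);        /* true by definition in ANSI C */
    assert(bits_per_char() >= 8);     /* a char is never narrower than an octet */
    assert(octet_fits_in_char(0xFF)); /* so every octet value fits in one char */
}
```

Calling check_char_guarantees() from main should never fail on a
conforming compiler. Note that a char may be wider than 8 bits, so the
check only shows that an octet always fits in a char, not the reverse.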