UTF-8 library

Manuel M T Chakravarty chak@cse.unsw.edu.au
Thu, 08 Aug 2002 19:28:18 +1000 (EST)


Axel Simon <A.Simon@ukc.ac.uk> wrote,

> On Wed, Aug 07, 2002 at 02:54:47AM -0700, Ashley Yakeley wrote:
> > At 2002-08-07 02:43, Axel Simon wrote:
> > 
> > >But the point was that C might have different sized characters and that 
> > >these functions would still be portable even if the size of CChar changes. 
> > 
> > Text encoded with ISO 8859-1 or UTF-8 is octets. If you want to use 
> > CChars, you should then subsequently convert the Word8s into CChars.
> Then I hope there is no C implementation where char is less than 8 bits 
> long.

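ANSI C guarantees that char is exactly 1 byte (more precisely, that
"sizeof (char)" == 1) and that a byte has at least 8 bits
(CHAR_BIT >= 8 in <limits.h>), so char is never narrower than 8 bits.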
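
The size guarantee, together with the standard's CHAR_BIT >= 8 minimum
from <limits.h>, can be checked at compile time. What follows is only a
minimal sketch, assuming a C11 compiler for static_assert; the helper
octets_to_chars is a hypothetical name used for illustration and is not
part of any library discussed in this thread.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy */

/* Both conditions hold on every conforming C implementation,
   so these assertions never fire; they only document the guarantee. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a C byte has at least 8 bits");

/* Hypothetical helper: copy n octets (e.g. UTF-8 encoded text) into a
   char buffer. memcpy copies the bytes unchanged, so the result does
   not depend on whether plain char is signed or unsigned. */
void octets_to_chars(const unsigned char *src, char *dst, size_t n)
{
    memcpy(dst, src, n);
}
```

Note that char may be wider than 8 bits on some platforms; the standard
only rules out the narrower case that was worried about above.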

Manuel