[Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a
zednenem at psualum.com
Mon Feb 5 23:05:39 EST 2007
Alistair Bayley writes:
> On 05/02/07, Chris Kuklewicz <haskell at list.mightyreason.com> wrote:
> > UTF-8 is a 4 byte encoding. There is no valid UTF-8 5 or 6 byte
> > encoding.
> Chris is right here, in that Takusen's decoder is incorrect w.r.t. the
> standard, in allowing up to 6 bytes to encode a single char.
> There's nothing stopping the Unicode consortium from expanding the
> range of codepoints, is there? Or have they said that'll never happen?
I believe they have. In particular, UTF-16 only supports code points up
to 10FFFF.
> the UCS stops at 10FFFF and ISO/IEC 10646 has stated that all future
> assignments of characters will also take place in that range
> ISO 10646 was limited to contain as many characters as could be
> encoded by UTF-16 and no more, that is, a little over a million
> characters instead of over 2,000 million
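
To make the arithmetic behind those limits concrete, here is a minimal Haskell sketch. It is only an illustration: the names maxUtf16CodePoint, codePointCount and utf8Length are made up for this example and are not taken from Data.CompactString or Takusen. It shows that the largest UTF-16 surrogate pair lands exactly on 10FFFF, and that four UTF-8 bytes are enough for anything up to that limit.

```haskell
import Data.Bits (shiftL, (.|.))

-- Largest code point reachable with a UTF-16 surrogate pair:
-- high surrogate 0xDBFF followed by low surrogate 0xDFFF.
-- Each surrogate contributes 10 bits on top of the 0x10000 offset.
maxUtf16CodePoint :: Int
maxUtf16CodePoint =
  0x10000 + (((0xDBFF - 0xD800) `shiftL` 10) .|. (0xDFFF - 0xDC00))

-- Size of the code space 0 .. 10FFFF ("a little over a million").
-- Note this counts code points, including the surrogate range.
codePointCount :: Int
codePointCount = maxUtf16CodePoint + 1

-- Number of UTF-8 bytes needed for a code point.
-- Returns Nothing for surrogates and for values outside 0 .. 10FFFF,
-- so no 5- or 6-byte form is ever produced.
utf8Length :: Int -> Maybe Int
utf8Length cp
  | cp < 0                       = Nothing
  | cp >= 0xD800 && cp <= 0xDFFF = Nothing
  | cp <= 0x7F                   = Just 1
  | cp <= 0x7FF                  = Just 2
  | cp <= 0xFFFF                 = Just 3
  | cp <= 0x10FFFF               = Just 4
  | otherwise                    = Nothing
```

Loaded into GHCi, maxUtf16CodePoint should evaluate to 1114111 (that is, 10FFFF) and utf8Length 0x110000 to Nothing. A decoder that accepts 5- or 6-byte sequences would be accepting values that UTF-16 cannot represent at all.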
David Menendez <zednenem at psualum.com> | "In this house, we obey the laws
<http://www.eyrie.org/~zednenem> | of thermodynamics!"