Unicode support
Jens Petersen
petersen@redhat.com
01 Oct 2001 10:28:38 +0900
Colin Paul Adams <colin@colina.demon.co.uk> writes:
> >>>>> "Jens" == Jens Petersen <petersen@redhat.com> writes:
>
> Jens> 16 bits is enough to describe the Basic Multilingual Plane
> Jens> and I think 24 bits all the currently defined extended
> Jens> planes. So I guess the report just refers to the BMP.
>
> I guess it does, and I think back in 1998 that may still have been
> identical to Unicode.
> But the revision of the report that SPJ is preparing is unchanged in
> this respect, and so is factually inaccurate.
> I think it should either be amended to mention the BMP subset of
> Unicode, or, better, change the reference from 16-bit to 24-bit.
Actually, ISO 10646 formally defines a 31-bit character set. [1]
Is it still possible to update the latest revision of the
Haskell 98 report to this effect? (There was some Unicode
discussion earlier, about upper and lower case if I remember
correctly, but I am surprised no one has raised this point
again.)
Jens
PS: We need better Unicode support in the implementations too.
At least ghc-5 has 31-bit Chars now! Hurrah!
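
To make the bit widths mentioned in this thread concrete, here is a
minimal Haskell sketch. The names bitsNeeded, bmpMax, ucs4Max and inBMP
are made up for illustration and come from neither the report nor any
library; only ord from Data.Char is a real library function.

```haskell
import Data.Char (ord)

-- Smallest number of bits needed to write a non-negative integer in binary.
-- (Hypothetical helper, for illustration only.)
bitsNeeded :: Int -> Int
bitsNeeded n
  | n <= 0    = 0
  | otherwise = 1 + bitsNeeded (n `div` 2)

-- Last code point of the Basic Multilingual Plane.
bmpMax :: Int
bmpMax = 0xFFFF

-- Last value of the 31-bit code space that ISO 10646 formally defines.
ucs4Max :: Int
ucs4Max = 0x7FFFFFFF

-- True when a character lies inside the BMP, i.e. fits in 16 bits.
inBMP :: Char -> Bool
inBMP c = ord c <= bmpMax
```

Loaded into an interpreter, bitsNeeded bmpMax gives 16 and
bitsNeeded ucs4Max gives 31. Evaluating
bitsNeeded (ord (maxBound :: Char)) should show how wide Char actually
is in whichever implementation is at hand.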
Footnotes:
[1] http://www.cl.cam.ac.uk/~mgk25/unicode.html