Unicode support

Jens Petersen petersen@redhat.com
01 Oct 2001 10:28:38 +0900


Colin Paul Adams <colin@colina.demon.co.uk> writes:

> >>>>> "Jens" == Jens Petersen <petersen@redhat.com> writes:
> 
>     Jens> 16 bits is enough to describe the Basic Multilingual Plane
>     Jens> and I think 24 bits all the currently defined extended
>     Jens> planes.  So I guess the report just refers to the BMP.
> 
> I guess it does, and I think back in 1998 that may still have been
> identical to Unicode. 
> But the revision of the report that SPJ is preparing is unchanged in
> this respect, and so is factually inaccurate.
> I think it should either be amended to mention the BMP subset of
> Unicode, or, better, change the reference from 16-bit to 24-bit.

Actually, ISO 10646 formally defines a 31-bit character set. [1]
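
To make the bit widths concrete, here is a rough sketch in Haskell.
The names bitsNeeded, bmpMax, ucs4Max and beyondBMP are made up for
illustration; they come from neither the report nor any library.
It only does the arithmetic and assumes Int is at least 32 bits wide.

```haskell
import Data.Char (ord)

-- Smallest number of bits that can represent the non-negative value n.
bitsNeeded :: Int -> Int
bitsNeeded n = length (takeWhile (> 0) (iterate (`div` 2) n))

-- Highest code point of the Basic Multilingual Plane.
bmpMax :: Int
bmpMax = 0xFFFF

-- Highest value of a 31-bit code space (illustrative constant).
ucs4Max :: Int
ucs4Max = 0x7FFFFFFF

-- True when a character's code point lies outside the BMP.
beyondBMP :: Char -> Bool
beyondBMP c = ord c > bmpMax
```

Loaded into an interpreter, bitsNeeded bmpMax should give 16 and
bitsNeeded ucs4Max should give 31, while bitsNeeded 0x10000 is
already 17, so anything past the BMP does not fit in 16 bits.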

Is it still possible to update the latest revision of the
Haskell 98 report to this effect?  (There was some Unicode
discussion earlier, about upper and lower case if I remember
correctly, but I am surprised no one has raised this point
again.)

Jens

PS: We need better Unicode support in the implementations too.
At least ghc-5 has 31-bit Chars now!  Hurrah!

Footnotes: 
[1]  http://www.cl.cam.ac.uk/~mgk25/unicode.html