[Haskell-i18n] Surrogate pairs?

Sven Moritz Hallberg pesco@gmx.de
20 Aug 2002 02:06:17 +0200


Hi,

I just implemented a UTF-8 coder and decoder in Haskell. While reading
the Unicode standard I realized what someone had pointed out earlier
with respect to code values versus code points: Unicode, while "usually"
using 16-bit words, supports "surrogate pairs" to reach code points
beyond the 16-bit range (up to U+10FFFF).
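
To make the surrogate mechanism concrete, here is a minimal sketch of the
pair arithmetic as I understand it from the standard. The function names
are hypothetical (not from any existing library), and I use plain Int for
code units and code points so the sketch does not depend on how wide Char
is in a given implementation.

```haskell
-- Hypothetical helpers; code units and code points are plain Ints here.

-- High (leading) surrogates occupy 0xD800..0xDBFF.
isHighSurrogate :: Int -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF

-- Low (trailing) surrogates occupy 0xDC00..0xDFFF.
isLowSurrogate :: Int -> Bool
isLowSurrogate w = w >= 0xDC00 && w <= 0xDFFF

-- Combine a high/low pair into a single code point.
-- Each surrogate carries 10 bits; the result is offset by 0x10000.
combineSurrogates :: Int -> Int -> Maybe Int
combineSurrogates hi lo
  | isHighSurrogate hi && isLowSurrogate lo =
      Just (0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00))
  | otherwise = Nothing

-- Split a code point above 0xFFFF into its surrogate pair.
splitToSurrogates :: Int -> Maybe (Int, Int)
splitToSurrogates cp
  | cp >= 0x10000 && cp <= 0x10FFFF =
      let v = cp - 0x10000
      in  Just (0xD800 + v `div` 0x400, 0xDC00 + v `mod` 0x400)
  | otherwise = Nothing
```

For example, splitToSurrogates 0x1D11E gives Just (0xD834, 0xDD1E), and
combineSurrogates 0xD834 0xDD1E gives back Just 0x1D11E.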

The report says that Char is a 16-bit Unicode value. What's the stance on
surrogate pairs? How are we going to support those? My code currently
just fails with an "unsupported" error when it encounters a surrogate.


Regards,
Sven Moritz