Unicode
Ashley Yakeley
ashley@semantic.org
Thu, 24 May 2001 14:41:21 -0700
At 2001-05-24 05:57, Julian Seward (Intl Vendor) wrote:
> - Initial Unicode support - the Char type is now 31 bits.
It might be appropriate to have two types for Unicode: a UCS-2 type (16
bits) and a UCS-4 type (31 bits). For instance, something like:
--
newtype UCS2CodePoint = MkUCS2CodePoint Word16
newtype UCS4CodePoint = MkUCS4CodePoint Word31
type Char = UCS4CodePoint
toUCS4 :: UCS2CodePoint -> UCS4CodePoint
fromUCS4 :: UCS4CodePoint -> Maybe UCS2CodePoint
encodeUTF16 :: [UCS4CodePoint] -> Maybe [UCS2CodePoint]
decodeUTF16 :: [UCS2CodePoint] -> Maybe [UCS4CodePoint]
--
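To make the signatures above concrete, here is a minimal sketch of how they might be filled in. It is only one possible reading of the proposal, under some assumptions: Word31 is not a standard type, so Word32 stands in for it; code points above 0x10FFFF and unpaired surrogates are treated as errors (Nothing); and the helper names isHighSurrogate and isLowSurrogate are made up for illustration. The "type Char" synonym is left out so the module does not clash with the Prelude.

```haskell
import Data.Bits (shiftL, shiftR, (.&.))
import Data.Word (Word16, Word32)

-- Assumption: Word32 stands in for the hypothetical Word31.
newtype UCS2CodePoint = MkUCS2CodePoint Word16 deriving (Eq, Show)
newtype UCS4CodePoint = MkUCS4CodePoint Word32 deriving (Eq, Show)

-- Hypothetical helpers: the UTF-16 surrogate ranges.
isHighSurrogate, isLowSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF
isLowSurrogate  w = w >= 0xDC00 && w <= 0xDFFF

-- Widening always succeeds.
toUCS4 :: UCS2CodePoint -> UCS4CodePoint
toUCS4 (MkUCS2CodePoint w) = MkUCS4CodePoint (fromIntegral w)

-- Narrowing fails for anything that does not fit in 16 bits.
fromUCS4 :: UCS4CodePoint -> Maybe UCS2CodePoint
fromUCS4 (MkUCS4CodePoint c)
  | c <= 0xFFFF = Just (MkUCS2CodePoint (fromIntegral c))
  | otherwise   = Nothing

-- Encode each code point as one unit or as a surrogate pair.
encodeUTF16 :: [UCS4CodePoint] -> Maybe [UCS2CodePoint]
encodeUTF16 = fmap concat . mapM encodeOne
  where
    encodeOne (MkUCS4CodePoint c)
      | c >= 0xD800 && c <= 0xDFFF = Nothing  -- surrogate values are rejected
      | c <= 0xFFFF = Just [MkUCS2CodePoint (fromIntegral c)]
      | c <= 0x10FFFF =
          let c' = c - 0x10000
              hi = 0xD800 + fromIntegral (c' `shiftR` 10)
              lo = 0xDC00 + fromIntegral (c' .&. 0x3FF)
          in Just [MkUCS2CodePoint hi, MkUCS2CodePoint lo]
      | otherwise = Nothing                   -- not representable in UTF-16

-- Decode units, combining surrogate pairs; unpaired surrogates fail.
decodeUTF16 :: [UCS2CodePoint] -> Maybe [UCS4CodePoint]
decodeUTF16 [] = Just []
decodeUTF16 (MkUCS2CodePoint w : rest)
  | isHighSurrogate w = case rest of
      (MkUCS2CodePoint l : rest') | isLowSurrogate l ->
        let c = 0x10000
                + ((fromIntegral w - 0xD800) `shiftL` 10)
                + (fromIntegral l - 0xDC00)
        in (MkUCS4CodePoint c :) <$> decodeUTF16 rest'
      _ -> Nothing
  | isLowSurrogate w = Nothing
  | otherwise = (MkUCS4CodePoint (fromIntegral w) :) <$> decodeUTF16 rest
```

With these definitions, encodeUTF16 [MkUCS4CodePoint 0x1F600] gives Just [MkUCS2CodePoint 0xD83D, MkUCS2CodePoint 0xDE00], and decodeUTF16 turns that pair back into the original code point. Whether malformed input should yield Nothing or be passed through unchanged is a design choice the signatures leave open.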
--
Ashley Yakeley, Seattle WA