Unicode support
Ashley Yakeley
ashley@semantic.org
Tue, 9 Oct 2001 03:27:31 -0700
At 2001-10-09 02:58, Kent Karlsson wrote:
>In summary:
>
> code position (=code point): a value between 0000 and 10FFFF.
Would this be a reasonable basis for Haskell's 'Char' type? At some point
perhaps there should be a 'Unicode' standard library for Haskell. For
instance:
    encodeUTF8 :: String -> [Word8];
    decodeUTF8 :: [Word8] -> Maybe String;
    encodeUTF16 :: String -> [Word16];
    decodeUTF16 :: [Word16] -> Maybe String;

    data GeneralCategory = Letter_Uppercase | Letter_Lowercase | ...
    getGeneralCategory :: Char -> Maybe GeneralCategory;

    ...sorting & searching...
    ...canonicalisation...
etc. Lots of work for someone.
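
To make the first pair of signatures concrete, here is a minimal sketch of how encodeUTF8 and decodeUTF8 might look, assuming 'Char' holds a code point in the range 0..10FFFF and that Data.Bits, Data.Char and Data.Word are available. The helper names (go, cont, multi, isCont) are hypothetical, and this is only one possible shape, not a proposed library. The decoder rejects truncated, overlong, surrogate and out-of-range sequences; the encoder does not check for surrogate code points.

```haskell
import Control.Monad (guard)
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode each Char as 1 to 4 bytes, depending on its code point.
encodeUTF8 :: String -> [Word8]
encodeUTF8 = concatMap (map fromIntegral . go . ord)
  where
    go :: Int -> [Int]
    go c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Decode a byte list, returning Nothing on any ill-formed sequence.
decodeUTF8 :: [Word8] -> Maybe String
decodeUTF8 [] = Just []
decodeUTF8 (b : bs)
  | b < 0x80           = fmap (chr (fromIntegral b) :) (decodeUTF8 bs)
  | b .&. 0xE0 == 0xC0 = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b .&. 0xF0 == 0xE0 = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b .&. 0xF8 == 0xF0 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise          = Nothing
  where
    -- n continuation bytes expected; lo is the smallest code point that
    -- legitimately needs this length (anything smaller is overlong).
    multi :: Int -> Int -> Int -> Maybe String
    multi n lead lo = do
      let (cs, rest) = splitAt n bs
      guard (length cs == n && all isCont cs)
      let c = foldl (\acc x -> (acc `shiftL` 6) .|. (fromIntegral x .&. 0x3F)) lead cs
      guard (c >= lo && c <= 0x10FFFF && not (c >= 0xD800 && c <= 0xDFFF))
      fmap (chr c :) (decodeUTF8 rest)
    isCont x = x .&. 0xC0 == 0x80
```

Returning Maybe String from the decoder, as in the signatures above, lets the caller decide what to do with bad input rather than silently substituting a replacement character.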
--
Ashley Yakeley, Seattle WA