[Haskell-cafe] UTF-16
Bulat Ziganshin
bulat.ziganshin at gmail.com
Thu Jul 26 05:00:21 EDT 2007
Hello Donald,
Thursday, July 26, 2007, 8:13:37 AM, you wrote:
>> I don't know if anybody cares, but... Today a wrote some trivial code to
>> decode (not encode) UTF-16.
These functions already exist in win-specific part of base:
cWcharsToChars :: [CWchar] -> [Char]
charsToCWchars :: [Char] -> [CWchar]
#ifdef mingw32_HOST_OS
-- On Windows, wchar_t is 16 bits wide and CWString uses the UTF-16 encoding.
-- coding errors generate Chars in the surrogate range
cWcharsToChars = map chr . fromUTF16 . map fromIntegral
where
fromUTF16 (c1:c2:wcs)
| 0xd800 <= c1 && c1 <= 0xdbff && 0xdc00 <= c2 && c2 <= 0xdfff =
((c1 - 0xd800)*0x400 + (c2 - 0xdc00) + 0x10000) : fromUTF16 wcs
fromUTF16 (c:wcs) = c : fromUTF16 wcs
fromUTF16 [] = []
charsToCWchars = foldr utf16Char [] . map ord
where
utf16Char c wcs
| c < 0x10000 = fromIntegral c : wcs
| otherwise = let c' = c - 0x10000 in
fromIntegral (c' `div` 0x400 + 0xd800) :
fromIntegral (c' `mod` 0x400 + 0xdc00) : wcs
--
Best regards,
Bulat mailto:Bulat.Ziganshin at gmail.com
More information about the Haskell-Cafe
mailing list