UTF-8 encode/decode libraries.
ger at informatik.uni-bremen.de
Tue May 4 12:16:22 EDT 2004
Sven Panne wrote:
> Hmmm, "String -> [Word8]" would be nicer...
My UTF8 encoder is
toUTF8 :: String -> String
but an obvious alternative would be
toUTF8 :: Enum codedChar => String -> [codedChar]
and I could implement this quite easily by globally replacing
chr with toEnum. It would then be appropriate to SPECIALIZE
it at the types String -> String and String -> [Word8], satisfying
both the purists and those who actually want to write the
output to a file.
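To make the idea concrete, here is a minimal sketch of what such an Enum-polymorphic encoder might look like. It is not the code referred to in this message: the helper encodeChar is a hypothetical name, and the sketch does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Sketch only: encode a String as UTF-8, producing any Enum type
-- whose toEnum accepts the values 0..255 (for example Char or Word8).
toUTF8 :: Enum codedChar => String -> [codedChar]
toUTF8 = concatMap (map toEnum . encodeChar . ord)
{-# SPECIALIZE toUTF8 :: String -> String #-}
{-# SPECIALIZE toUTF8 :: String -> [Word8] #-}

-- Hypothetical helper: the UTF-8 byte values for one code point.
encodeChar :: Int -> [Int]
encodeChar c
  | c < 0x80    = [c]
  | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
  | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
  | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                  , cont (c `shiftR` 6), cont c ]
  where
    -- Continuation byte: 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)
```

With this sketch, toUTF8 "\xE9" :: [Word8] evaluates to [195,169], while the same call at type String gives the two characters with those code points.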
> ... and here: "[Word8] -> String" or "[Word8] -> Maybe String"
and my UTF8 decoder has type
fromUTF8WE :: Monad m => String -> m String
Errors are reported by "fail". If, for example, you import
Control.Monad.Error, you get a function returning
either an error message or the converted string:
fromUTF8WE :: String -> Either String String
Of course for Word8, you would change the type of the decoder to
fromUTF8WE :: (Monad m,Enum codedChar) => [codedChar] -> m String
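As an illustration of that last signature, here is a minimal sketch of a decoder that reports errors through "fail". It is not the code referred to in this message. It assumes a recent GHC, where fail is a method of MonadFail rather than Monad, so the constraint is written MonadFail m; the local names go and multi are hypothetical.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Sketch only: decode UTF-8 from any Enum type whose fromEnum
-- yields byte values, reporting malformed input via fail.
fromUTF8WE :: (MonadFail m, Enum codedChar) => [codedChar] -> m String
fromUTF8WE xs
  | any (\b -> b < 0 || b > 0xFF) bytes = fail "input element is not a byte"
  | otherwise = go bytes
  where
    bytes = map fromEnum xs

    go [] = return []
    go (b : bs)
      | b < 0x80           = (chr b :) <$> go bs
      | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F) 0x80 bs
      | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F) 0x800 bs
      | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07) 0x10000 bs
      | otherwise          = fail ("invalid lead byte " ++ show b)

    -- Read n continuation bytes, then reject overlong encodings,
    -- surrogates, and values beyond U+10FFFF.
    multi n acc minVal bs =
      case splitAt n bs of
        (cs, rest)
          | length cs /= n -> fail "truncated sequence"
          | any (\c -> c .&. 0xC0 /= 0x80) cs -> fail "bad continuation byte"
          | otherwise ->
              let v = foldl (\a c -> (a `shiftL` 6) .|. (c .&. 0x3F)) acc cs
              in if v < minVal || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF)
                   then fail "overlong or out-of-range code point"
                   else (chr v :) <$> go rest
```

Used at type Maybe String, this gives the "[Word8] -> Maybe String" behaviour: fromUTF8WE ([0xC3, 0xA9] :: [Word8]) is Just "\233", and a truncated input such as [0xC3] is Nothing. Because the input type is only required to be in Enum, the same function also accepts a String of byte-valued characters.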
Incidentally I am *hoping* I shall be able to say that my UTF8 code
is LGPL but you know what University administrators are like ...