UTF-8 library

John Meacham john@repetae.net
Tue, 6 Aug 2002 05:38:13 -0700


One major nit I have with this is the type signatures of
decodeUTF8 and encodeUTF8. A String should always represent a string of
characters, not a byte stream, so the signatures should be

encodeUTF8 :: String -> [Word8]
decodeUTF8 :: [Word8] -> String
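
To make the point concrete, here is a minimal sketch of what Word8-based
signatures could look like. It takes "encode" to mean characters to bytes and
"decode" to mean bytes to characters. It is not the code from either library
mentioned in this thread. The choice to emit U+FFFD for malformed input is an
assumption made for the example (Martin's version passes illegal sequences
through unchanged instead), and the decoder does not reject overlong forms or
surrogates.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Encode a string of Unicode characters as UTF-8 bytes.
encodeUTF8 :: String -> [Word8]
encodeUTF8 = concatMap (map fromIntegral . encodeChar . ord)
  where
    -- Work on Int code points; each result fits in one byte.
    encodeChar :: Int -> [Int]
    encodeChar c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont :: Int -> Int
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Decode UTF-8 bytes into characters.
-- Assumption for this sketch: a malformed byte becomes U+FFFD and
-- decoding resumes at the next byte.
decodeUTF8 :: [Word8] -> String
decodeUTF8 [] = []
decodeUTF8 (b:bs)
  | b < 0x80           = chr (fromIntegral b) : decodeUTF8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F)
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F)
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07)
  | otherwise          = '\xFFFD' : decodeUTF8 bs
  where
    -- Consume n continuation bytes after a lead byte with payload 'lead'.
    multi :: Int -> Word8 -> String
    multi n lead =
      let (cs, rest) = splitAt n bs
          isCont x = x .&. 0xC0 == 0x80
          code = foldl (\acc x -> (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F))
                       (fromIntegral lead) cs :: Int
      in if length cs == n && all isCont cs && code <= 0x10FFFF
           then chr code : decodeUTF8 rest
           else '\xFFFD' : decodeUTF8 bs
```

With these types a String never holds raw bytes: encodeUTF8 "\233" (the
character U+00E9) gives [0xC3, 0xA9], and decodeUTF8 turns those two bytes
back into the one-character string.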

This problem occurs all over the place in the Haskell libraries. Now
that the FFI spec gives us Word8, we should make use of it. Just a pet
peeve of mine. Good work otherwise, I like it. You might want to check
out my Format.hs (similar to your printf module but somewhat more
powerful) and my modified UTF-8 code, which uses byte streams properly:

http://repetae.net/john/computer/haskell/Format.hs
http://repetae.net/john/computer/haskell/UTF8.hs


	John

On Mon, Aug 05, 2002 at 12:12:02PM +0200, Martin Norbäck wrote:
> In a previous thread on this mailing list I proposed a way to use
> gettext. Since Haskell uses Unicode to represent characters, and gettext
> has support for converting strings into UTF-8, I made that the default
> mode of operation in my I18N module.
> 
> I found a UTF8 library, which I modified to pass illegal UTF-8 sequences
> through unchanged.
> 
> Since Haskell uses Unicode, and UTF-8 is one of the most common
> encodings for Unicode, it would be good to have a UTF-8 library like
> this.
> You can find the files at http://www.dtek.chalmers.se/~d95mback/gettext/
> where you also can find a printf-like implementation.
> 
> I would very much like to hear some comments, also on the use of
> unsafePerformIO with locale-dependent functions (gettext is of course
> locale-dependent).


-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - john@foo.net
---------------------------------------------------------------------------