UTF8 libraries
shelarcy
shelarcy at gmail.com
Fri Feb 2 10:40:14 EST 2007
Hi Alistair,
On Fri, 02 Feb 2007 21:01:04 +0900, Alistair Bayley <alistair at abayley.org> wrote:
> What is the state of UTF8 support in Haskell libraries (base or
> user-contributed)? I had a need for a UTF8 en & de-coder for Takusen,
> and after looking around couldn't find anything particularly
> satisfactory, so ended up writing (yet another) one.
regex-posix doesn't support UTF8, because it uses the POSIX regex
library. So this problem can't be fixed just by having a correct UTF8
en & de-coder.
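To illustrate why an encoder alone is not enough, here is a minimal
sketch. It is not code from regex-posix or Takusen, and encodeChar and
encodeUtf8 are names made up for this example. It shows only that one
character can become several bytes, so a matcher that treats each byte
as a character will count and split text differently from a
Unicode-aware one.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one code point as 1 to 4 UTF-8 bytes.
-- Assumes a valid scalar value; surrogate code points are not rejected.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi s = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Hypothetical helper: encode a whole String.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar
```

For example, length "\26085\26412\35486" (three CJK characters) is 3,
but length (encodeUtf8 "\26085\26412\35486") is 9. A pattern such as
"." applied byte by byte would therefore match one third of a
character, not a whole one.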
If someone is interested in supporting UTF8, I recommend using
Oniguruma.
http://www.geocities.jp/kosako3/oniguruma/
Oniguruma also supports UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
etc. It is also portable: it's available on both Unix and
Windows.
So I think it is the best C regex library to choose as a backend.
--
shelarcy <shelarcy capella.freemail.ne.jp>
http://page.freett.com/shelarcy/