UTF8 libraries

shelarcy shelarcy at gmail.com
Fri Feb 2 10:40:14 EST 2007

Hi Alistair,

On Fri, 02 Feb 2007 21:01:04 +0900, Alistair Bayley <alistair at abayley.org> wrote:
> What is the state of UTF8 support in Haskell libraries (base or
> user-contributed)? I had a need for a UTF8 en & de-coder for Takusen,
> and after looking around couldn't find anything particularly
> satisfactory, so ended up writing (yet another) one.

regex-posix doesn't support UTF8, because it uses POSIX regex.
So this problem can't be fixed by a correct UTF8 en & de-coder alone.
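
To illustrate the point, here is a minimal sketch, assuming the regex backend treats its input as a plain sequence of bytes. The helper encodeChar is hypothetical (written for this example, not taken from any library). It shows that a single character such as U+00E9 becomes two bytes once encoded, so a byte-oriented "." would match only half of it, however correct the encoder is.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one Char as its UTF-8 byte sequence.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi s   = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

main :: IO ()
main = do
  let s = "\233"                           -- one character, U+00E9
  print (length s)                         -- number of characters
  print (length (concatMap encodeChar s))  -- number of bytes a byte-oriented matcher sees
```

The two counts differ (one character, two bytes), which is why the regex engine itself has to understand the encoding.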

If someone is interested in supporting UTF8, I recommend
using Oniguruma.


Oniguruma also supports UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
etc. It is also portable: it's available both on Unix and

So I think it is the best C regex library to choose as a backend.

shelarcy <shelarcy    capella.freemail.ne.jp>
