[Haskell-cafe] PROPOSAL: New efficient Unicode string library.

Johan Tibell johan.tibell at gmail.com
Wed Sep 26 03:05:40 EDT 2007


> I'll look over the proposal more carefully when I get time, but the
> most important issue is to not let the storage type leak into the
> interface.

Agreed.

> From an implementation point of view, UTF-16 is the most efficient
> representation for processing Unicode. It's the native Unicode
> representation for Windows, Mac OS X, and the ICU open source i18n
> library. UTF-8 is not very efficient for anything except English. Its
> most valuable property is compatibility with software that thinks of
> character strings as byte arrays, and in fact that's why it was
> invented.

If UTF-16 is what everyone else uses (what about Java? Python?), I
think that's a strong reason to use it. I don't know Unicode well
enough to argue otherwise.
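
For anyone who wants to check the size side of the UTF-8 vs. UTF-16
comparison, here is a rough sketch. It is in Python only because the
codecs ship with the standard library; the sample strings and the
helper name encoded_sizes are my own assumptions, not anything from
the proposal. It compares encoded sizes only and says nothing about
processing speed.

```python
# Compare how many bytes a few sample strings need in UTF-8 and UTF-16.
# The samples are illustrative assumptions, not data from the thread.
samples = {
    "English": "hello",
    "Greek": "αβγ",
    "Japanese": "日本語",
    "Non-BMP": "\U0001D11E",  # MUSICAL SYMBOL G CLEF, outside the BMP
}


def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes) for text."""
    # "utf-16-le" is used so that no byte-order mark is added,
    # which would inflate the UTF-16 count by two bytes.
    utf8_bytes = len(text.encode("utf-8"))
    utf16_bytes = len(text.encode("utf-16-le"))
    return (len(text), utf8_bytes, utf16_bytes)


for name, text in samples.items():
    points, utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{name:9} code points={points} "
          f"UTF-8={utf8_bytes} bytes UTF-16={utf16_bytes} bytes")
```

ASCII text takes one byte per character in UTF-8 and two in UTF-16,
while most CJK characters take three bytes in UTF-8 and two in UTF-16.
Characters outside the Basic Multilingual Plane take four bytes in
both, so UTF-16 is a variable-width encoding as well, which is one
more reason to keep the storage type out of the interface.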

