Marcin 'Qrczak' Kowalczyk
qrczak at knm.org.pl
Thu Aug 28 12:40:37 EDT 2003
On Thu, 28 August 2003 12:34, Simon Marlow wrote:
> The only right way to do this, it seems, is to generate the tables from
> UnicodeData.txt. However, I'm prepared to live with the current
> solution as long as we document its shortcomings. After all, it does
> the right thing on the majority of our installed base.
I have some UnicodeData.txt-generated predicate functions & toUpper/toLower
in QForeign. There doesn't seem to be an official mapping from Unicode
character categories to the various predicates (I once asked on the Unicode
groups and found none - I was told that character categories are not
precise enough for good definitions of some predicates), nor even an
agreed set of useful predicates, and it's not clear which predicates
should recognize only ASCII characters (most probably isDigit).
It's all to be designed, perhaps even by changing Haskell 98 a bit.
QForeign at least has some machinery to generate tables and functions from
UnicodeData and some proposals of predicate definitions.
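As a rough illustration of what generating predicates and case mappings
from UnicodeData.txt could look like (this is not QForeign's actual code),
here is a minimal Haskell sketch. It assumes the standard 15-field,
semicolon-separated record layout of UnicodeData.txt. The names Entry,
parseEntry, isUpperCat, isAlphaCat, isAsciiDigit and toUpperWith are
hypothetical, and the choice of categories behind each predicate is only
an assumption, since no official mapping seems to exist.

```haskell
import Data.Char (chr)
import Numeric (readHex)

-- One record of UnicodeData.txt, keeping only the fields used here.
data Entry = Entry
  { entryChar     :: Char
  , entryCategory :: String      -- two-letter general category, e.g. "Lu"
  , entryUpper    :: Maybe Char  -- simple uppercase mapping, if present
  , entryLower    :: Maybe Char  -- simple lowercase mapping, if present
  } deriving (Eq, Show)

-- Split a line on ';', keeping empty fields.
splitSemi :: String -> [String]
splitSemi s = case break (== ';') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitSemi rest

-- Parse a hexadecimal code point; an empty field yields Nothing.
parseHexChar :: String -> Maybe Char
parseHexChar str = case readHex str of
  [(n, "")] | n <= 0x10FFFF -> Just (chr n)
  _                         -> Nothing

-- Fields 0, 2, 12 and 13 are the code point, general category,
-- simple uppercase mapping and simple lowercase mapping.
parseEntry :: String -> Maybe Entry
parseEntry line = case splitSemi line of
  fields
    | length fields == 15
    , Just c <- parseHexChar (fields !! 0) ->
        Just (Entry c (fields !! 2)
                      (parseHexChar (fields !! 12))
                      (parseHexChar (fields !! 13)))
  _ -> Nothing

-- Hypothetical predicate definitions; the category sets are assumptions.
isUpperCat :: Entry -> Bool
isUpperCat e = entryCategory e `elem` ["Lu", "Lt"]

isAlphaCat :: Entry -> Bool
isAlphaCat e = entryCategory e `elem` ["Lu", "Ll", "Lt", "Lm", "Lo"]

-- A predicate deliberately restricted to ASCII, independent of the table.
isAsciiDigit :: Char -> Bool
isAsciiDigit c = c >= '0' && c <= '9'

-- Uppercase a character using the parsed table; unmapped characters
-- are returned unchanged. A real table would use an array or a map.
toUpperWith :: [Entry] -> Char -> Char
toUpperWith table c =
  case lookup c [(entryChar e, entryUpper e) | e <- table] of
    Just (Just u) -> u
    _             -> c
```

The definitions can be loaded into an interpreter and fed the lines of
UnicodeData.txt one at a time through parseEntry. The list-based lookup
is only there to keep the sketch short.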
I think these functions should all be locale-independent, and ideally
Haskell should have a portable definition - perhaps in terms of character
categories.
For wcwidth to be usable, it has to give the same results on both sides of
a remote terminal; <http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c> suggests
a definition of wcwidth. wcwidth was removed from the ANSI C addendum which
defined the other wide character functions, but it's in some POSIX standards.
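To make the idea of a portable, locale-independent wcwidth concrete, here
is a small Haskell sketch of a table-driven column width function. It is
not a transcription of the wcwidth.c linked above: the range tables are a
deliberately abbreviated, illustrative subset, and the names charWidth,
stringWidth, zeroWidthRanges and wideRanges are hypothetical. Nothing is
used for the cases where the C function returns -1.

```haskell
import Data.Char (ord)

-- Inclusive code point range.
type Range = (Int, Int)

inRanges :: [Range] -> Int -> Bool
inRanges ranges n = any (\(lo, hi) -> lo <= n && n <= hi) ranges

-- Abbreviated example tables (assumption: a complete version would be
-- generated from Unicode data rather than written by hand).
zeroWidthRanges :: [Range]
zeroWidthRanges =
  [ (0x0300, 0x036F)  -- Combining Diacritical Marks
  ]

wideRanges :: [Range]
wideRanges =
  [ (0x1100, 0x115F)  -- Hangul Jamo initial consonants
  , (0x3041, 0x30FF)  -- Hiragana and Katakana
  , (0x4E00, 0x9FFF)  -- CJK Unified Ideographs
  , (0xAC00, 0xD7A3)  -- Hangul Syllables
  , (0xFF01, 0xFF60)  -- Fullwidth Forms
  ]

-- Number of terminal columns for one character, or Nothing for
-- control characters, which have no meaningful width.
charWidth :: Char -> Maybe Int
charWidth c
  | n == 0                              = Just 0
  | n < 0x20 || (n >= 0x7F && n < 0xA0) = Nothing
  | inRanges zeroWidthRanges n          = Just 0
  | inRanges wideRanges n               = Just 2
  | otherwise                           = Just 1
  where
    n = ord c

-- Width of a whole string; Nothing if it contains a control character.
stringWidth :: String -> Maybe Int
stringWidth = fmap sum . mapM charWidth
```

Because the result depends only on the code point and the fixed tables,
both ends of a remote terminal connection compute the same width, whatever
locale each side happens to run in.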
__("< Marcin Kowalczyk
\__/ qrczak at knm.org.pl