CWString

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Thu Aug 28 12:40:37 EDT 2003


On Thu, 28 August 2003 12:34, Simon Marlow wrote:

> The only right way to do this, it seems, is to generate the tables from
> UnicodeData.txt.  However, I'm prepared to live with the current
> solution as long as we document its shortcomings.  After all, it does
> the right thing on the majority of our installed base.

I have some predicate functions and toUpper/toLower generated from 
UnicodeData.txt in QForeign. There doesn't seem to be an official mapping 
from Unicode character categories to the various predicates (I once asked 
on Unicode groups for one, with no success - I was told that character 
categories are not precise enough for good definitions of some 
predicates), or even an agreed set of useful predicates, and it's not 
clear which predicates should recognize only ASCII characters (most 
probably isDigit). It all remains to be designed, perhaps even by changing 
Haskell 98 a bit. QForeign at least has some machinery to generate tables 
and functions from UnicodeData.txt, and some proposed predicate definitions.

I think these functions should all be locale-independent, and ideally 
Haskell should have a portable definition - perhaps in terms of character 
categories.
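
To illustrate what a definition "in terms of character categories" could 
look like, here is a minimal sketch. It assumes the generalCategory 
function and GeneralCategory type found in Data.Char of present-day base, 
not QForeign's own tables. The names isAlphaCat, isDigitAscii and 
isDecimalCat are hypothetical, and the choice of categories for each 
predicate is only one possible proposal, not an official mapping.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical predicate: a character is alphabetic if it falls in one of
-- the five Unicode letter categories. The result depends only on the
-- code point, never on the process locale.
isAlphaCat :: Char -> Bool
isAlphaCat c = case generalCategory c of
  UppercaseLetter -> True
  LowercaseLetter -> True
  TitlecaseLetter -> True
  ModifierLetter  -> True
  OtherLetter     -> True
  _               -> False

-- Hypothetical ASCII-only digit predicate, for parsing numeric literals.
isDigitAscii :: Char -> Bool
isDigitAscii c = c >= '0' && c <= '9'

-- Hypothetical Unicode-wide counterpart: any decimal digit (category Nd).
isDecimalCat :: Char -> Bool
isDecimalCat c = generalCategory c == DecimalNumber
```

Keeping the ASCII-only and the category-based digit predicates separate 
lets code that parses numbers stay strict while text-processing code can 
still accept digits from other scripts.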

For wcwidth to be usable, it must give the same results on both sides of 
a remote terminal connection; 
<http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c> suggests a definition of 
wcwidth. wcwidth was removed from the ANSI C addendum which defined the 
other wide character functions, but it's in some POSIX standards.
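
A width function that behaves identically on both ends has to be a pure 
function of the code point, independent of locale. The sketch below shows 
the general shape only. It is a deliberate simplification, not the table 
from the wcwidth.c linked above: the name wcwidthSketch is hypothetical, 
only two well-known wide ranges are listed (CJK Unified Ideographs and 
Hangul Syllables), and it assumes generalCategory from Data.Char of 
present-day base.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord)

-- Hypothetical, simplified column width of a character.
-- Nothing plays the role of wcwidth's -1 (no defined width).
wcwidthSketch :: Char -> Maybe Int
wcwidthSketch c
  | c == '\0'                    = Just 0   -- NUL occupies no column
  | cat == Control               = Nothing  -- other control characters
  | cat `elem` [NonSpacingMark, EnclosingMark]
                                 = Just 0   -- combining marks
  | any inRange wideRanges       = Just 2   -- double-width characters
  | otherwise                    = Just 1
  where
    cat = generalCategory c
    n   = ord c
    inRange (lo, hi) = n >= lo && n <= hi
    -- Assumption: an illustrative, incomplete list of wide ranges.
    wideRanges = [(0x4E00, 0x9FFF), (0xAC00, 0xD7A3)]
```

Because the result depends only on fixed tables, two programs using the 
same tables agree on column positions regardless of their locale settings.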

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/



