[Haskell-cafe] Valid Haskell characters

Richard A. O'Keefe ok at cs.otago.ac.nz
Mon Aug 25 23:27:58 EDT 2008


On 26 Aug 2008, at 1:31 pm, Deborah Goldsmith wrote:

> You can't determine Unicode character properties by analyzing the  
> names of the characters.

However, the OP *does* have a copy of the UnicodeData...txt file,
and you *can* determine the relevant Unicode character properties from  
that.

For example, consider the entry for space:
0020;SPACE;Zs;0;WS;;;;;N;;;;;
            ^^
The Zs in the third field (the general category) says it's a
white space character (Zs: separator/space, Zl: separator/line,
Zp: separator/paragraph).
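
As a rough sketch of how one might pull that field out of a line of the
file: the names splitFields and generalCategoryField below are made up for
illustration, and the code assumes nothing beyond the semicolon-separated
layout shown in the entry above.

```haskell
-- Hypothetical helpers, not part of any library.

-- Split a UnicodeData-style line on ';', keeping empty fields.
splitFields :: String -> [String]
splitFields s = case break (== ';') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitFields rest

-- The general category is the third field (index 2), if the line has one.
generalCategoryField :: String -> Maybe String
generalCategoryField line = case splitFields line of
  (_ : _ : cat : _) -> Just cat
  _                 -> Nothing
```

Applied to the SPACE entry above, generalCategoryField gives Just "Zs".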

Or look at capital A:
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
                             ^^
The Lu bit says it's a L(etter) that is u(pper case).

Upper case: Lu, lower case: Ll, title case: Lt,
modifier letter: Lm, other letter: Lo, digit: Nd,
...
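
For what it's worth, the two-letter codes line up with the GeneralCategory
type in Data.Char. A minimal sketch, covering only the codes named in this
message; categoryFromCode is a made-up name, and how closely GHC's tables
track any particular version of the UnicodeData file is not something this
code checks.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical helper: translate a two-letter category code from the
-- data file into base's GeneralCategory. Codes not mentioned in this
-- message are deliberately left as Nothing.
categoryFromCode :: String -> Maybe GeneralCategory
categoryFromCode code = case code of
  "Lu" -> Just UppercaseLetter
  "Ll" -> Just LowercaseLetter
  "Lt" -> Just TitlecaseLetter
  "Lm" -> Just ModifierLetter
  "Lo" -> Just OtherLetter
  "Nd" -> Just DecimalNumber
  "Zs" -> Just Space
  "Zl" -> Just LineSeparator
  "Zp" -> Just ParagraphSeparator
  _    -> Nothing

-- Does the code from the data file agree with what Data.Char reports
-- for the character itself?
agreesWithBase :: Char -> String -> Bool
agreesWithBase c code = categoryFromCode code == Just (generalCategory c)
```

So agreesWithBase 'A' "Lu" compares the Lu from the entry for capital A
against generalCategory 'A'.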

If memory serves me correctly, this is explained in the
UnicodeData.html file, under a heading something like
Normative Categories.



