[Haskell-cafe] Re: Valid Haskell characters

Maurício briqueabraque at yahoo.com
Mon Aug 25 22:18:31 EDT 2008


In chapter 4 I see the following
nice table on page 139. Do you think
I can use it together with UnicodeData.txt
to choose valid characters for Haskell?
It is the only place I found where the names
match the Haskell syntax reference
(uppercase, lowercase, punctuation, symbol).

Thanks,
Maurício

                        Table 4-7. General Category

Lu = Letter, uppercase
Ll = Letter, lowercase
Lt = Letter, titlecase
Lm = Letter, modifier
Lo = Letter, other
Mn = Mark, nonspacing
Mc = Mark, spacing combining
Me = Mark, enclosing
Nd = Number, decimal digit
Nl = Number, letter
No = Number, other
Pc = Punctuation, connector
Pd = Punctuation, dash
Ps = Punctuation, open
Pe = Punctuation, close
Pi = Punctuation, initial quote (may behave like Ps or Pe depending on usage)
Pf = Punctuation, final quote (may behave like Ps or Pe depending on usage)
Po = Punctuation, other
Sm = Symbol, math
Sc = Symbol, currency
Sk = Symbol, modifier
So = Symbol, other
Zs = Separator, space
Zl = Separator, line
Zp = Separator, paragraph
Cc = Other, control
Cf = Other, format
Cs = Other, surrogate
Co = Other, private use
Cn = Other, not assigned (including noncharacters)
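
The two-letter codes in this table appear as the third
semicolon-separated field of each line in UnicodeData.txt, so the
table and the file can be combined mechanically. A minimal sketch,
assuming my own reading of the uniSmall/uniLarge/uniSymbol/uniWhite
definitions quoted below (the names LexClass, classOf and parseLine
are hypothetical, and the mapping itself is an assumption, not
something the Haskell report spells out):

```haskell
import Numeric (readHex)

-- Lexical classes from the quoted Haskell definitions, plus a catch-all.
data LexClass = UniSmall | UniLarge | UniSymbol | UniWhite | Other
  deriving (Eq, Show)

-- Split a line on semicolons, keeping empty fields.
splitSemi :: String -> [String]
splitSemi s = case break (== ';') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitSemi rest

-- Assumed mapping from General Category codes to lexical classes.
classOf :: String -> LexClass
classOf gc
  | gc == "Ll"                  = UniSmall   -- Letter, lowercase
  | gc `elem` ["Lu", "Lt"]      = UniLarge   -- uppercase or titlecase
  | gc == "Zs"                  = UniWhite   -- Separator, space (assumption)
  | take 1 gc `elem` ["P", "S"] = UniSymbol  -- any punctuation or symbol
  | otherwise                   = Other

-- Parse one UnicodeData.txt line into (code point, class).
-- Range entries (the "<..., First>" / "<..., Last>" pairs) are not expanded.
parseLine :: String -> Maybe (Int, LexClass)
parseLine line = case splitSemi line of
  (hex : _name : gc : _) -> case readHex hex of
    [(cp, "")] -> Just (cp, classOf gc)
    _          -> Nothing
  _ -> Nothing
```

This is only a starting point: the Haskell lexer treats some
characters specially (for example the underscore, which is in Pc but
counts as a lowercase letter, and brackets, which are punctuation but
not operator symbols), so the result would still need filtering.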




Deborah Goldsmith wrote:
> You can't determine Unicode character properties by analyzing the names 
> of the characters.
> 
> Read chapter 4 of the standard:
> http://www.unicode.org/versions/Unicode5.0.0/ch04.pdf
> 
> and get the property values here:
> http://www.unicode.org/Public/UNIDATA/DerivedCoreProperties.txt
> 
> It sounds like the properties you want are "Case" and "General 
> Category". Maybe the spec should be more explicit on exactly how the 
> definitions map onto Unicode properties, so there is no ambiguity.
> 
> Deborah
> 
> On Aug 25, 2008, at 6:15 PM, Maurício wrote:
> 
>> Hi,
>>
>> In Haskell reference, I see the
>> following definitions:
>>
>> uniWhite -> any Unicode character defined
>> as whitespace;
>>
>> uniSmall -> any Unicode lowercase letter;
>>
>> uniLarge -> any uppercase or titlecase
>> Unicode letter;
>>
>> uniSymbol -> any Unicode symbol or
>> punctuation.
>>
>> Where do I get lists for those
>> characters? My first attempt was to
>> check:
>>
>> http://unicode.org/Public/UNIDATA/UnicodeData.txt
>>
>> and consider large anything marked as
>> CAPITAL and small anything marked as SMALL. I
>> didn't know what to guess about the symbols.
>> Am I using the right reference? How can I
>> recognize (or get a list of) valid uppercase and
>> lowercase unicode letters, as well as symbols
>> and punctuation?
>>
>> Thanks for your help,
>> Maurício
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
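
For the original question quoted above (how to recognize uppercase
and lowercase letters, symbols and punctuation), GHC's base library
already exposes the General Category through Data.Char, so the
classes can be tested without reading UnicodeData.txt at all. A
minimal sketch, assuming the same reading of the four definitions;
the predicate names are hypothetical, and the answers depend on the
Unicode version your compiler ships with:

```haskell
import Data.Char
  ( GeneralCategory (..), generalCategory, isPunctuation, isSymbol )

-- uniSmall: any Unicode lowercase letter (Ll).
isUniSmall :: Char -> Bool
isUniSmall c = generalCategory c == LowercaseLetter

-- uniLarge: any uppercase or titlecase Unicode letter (Lu, Lt).
isUniLarge :: Char -> Bool
isUniLarge c = generalCategory c `elem` [UppercaseLetter, TitlecaseLetter]

-- uniSymbol: any Unicode symbol or punctuation (P* and S* categories).
isUniSymbol :: Char -> Bool
isUniSymbol c = isPunctuation c || isSymbol c

-- uniWhite: taken here to be Zs only, which is an assumption.
isUniWhite :: Char -> Bool
isUniWhite c = generalCategory c == Space
```

In GHCi, `generalCategory 'A'` shows the category of a single
character, which makes it easy to compare against the table.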


