Re: Better casing functions (German ß, etc.)

Mario Blažević blamario at ciktel.net
Wed Jul 11 12:33:31 UTC 2018


On 2018-07-11 02:59 AM, 박신환 wrote:
>
> Haskell currently provides only the 'simple' `Char`-to-`Char` casing 
> functions (as specified by Unicode), namely `toUpper`, `toLower` and 
> `toTitle`.
>
> So the usual way to convert the case of a `String` is `fmap toUpper`, 
> etc. But this gives incorrect results in several cases.
>

I've never tested the cases you list, but I believe the text-icu library 
covers them. See 
http://hackage.haskell.org/package/text-icu-0.7.0.1/docs/Data-Text-ICU.html#g:4

> Case 1. German ß (Eszett)
>
> 'ß' (U+00DF), Latin Small Letter Sharp S, is a lowercase letter 
> itself, but Unicode doesn't specify its 'simple' uppercase counterpart.
>
> It's because its uppercase counterpart is not a single character, but 
> two characters, "SS".
>
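
To make the ß case concrete, here is a minimal sketch that uses only 
`Data.Char` from base. The names `upperChar` and `upperFull` are 
hypothetical (they do not exist in base or in text-icu), and the table 
covers only this one character, so treat it as an illustration of the 
one-to-many shape rather than a complete mapping.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: uppercase one Char into a String, so that
-- one-to-many mappings such as ß -> "SS" can be expressed at all.
upperChar :: Char -> String
upperChar '\x00DF' = "SS"         -- ß, LATIN SMALL LETTER SHARP S
upperChar c        = [toUpper c]  -- otherwise use the simple mapping

-- Hypothetical String-to-String uppercasing built on upperChar.
upperFull :: String -> String
upperFull = concatMap upperChar

main :: IO ()
main = do
  -- The simple Char-to-Char mapping leaves ß unchanged.
  print (map toUpper "straße" == "STRAßE")
  -- The String-to-String version can grow the string.
  print (upperFull "straße" == "STRASSE")
```

The result type has to be `String` (or similar) per input character, 
which is why `concatMap` replaces `fmap` here.
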
> Case 2. Turkish İ and ı
>
> Rather than the common 'I' and 'i' case pair, Turkish has two pairs: 
> 'İ' (U+0130) and 'i', and 'I' and 'ı' (U+0131). These are the dotted 
> I pair and the dotless I pair, respectively.
>
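
The Turkish case shows that the mapping depends on the language, so a 
`String`-to-`String` function would need some language or locale 
argument. A rough sketch, assuming a hypothetical `Lang` type and 
hypothetical `upperIn`/`lowerIn` functions (none of these are existing 
library names), handling only the two letters mentioned above:

```haskell
import Data.Char (toLower, toUpper)

-- Hypothetical language tag; a real API would more likely take a locale name.
data Lang = Turkish | OtherLang

-- Uppercase with the Turkish dotted/dotless distinction when requested.
upperIn :: Lang -> String -> String
upperIn Turkish   = map tr
  where
    tr 'i' = '\x0130'   -- i -> İ (dotted capital I)
    tr c   = toUpper c  -- ı (U+0131) already maps to I via the simple mapping
upperIn OtherLang = map toUpper

-- Lowercase with the Turkish dotted/dotless distinction when requested.
lowerIn :: Lang -> String -> String
lowerIn Turkish   = map tr
  where
    tr 'I' = '\x0131'   -- I -> ı (dotless small i)
    tr c   = toLower c  -- İ (U+0130) already maps to i via the simple mapping
lowerIn OtherLang = map toLower

main :: IO ()
main = do
  print (upperIn Turkish "iı" == "İI")
  print (lowerIn Turkish "Iİ" == "ıi")
  print (upperIn OtherLang "i" == "I")
```

This ignores sequences with a combining dot above, which a complete 
implementation would also have to consider.
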
> Case 3. Greek Σ (Sigma)
>
> Greek 'Σ' (U+03A3) must be lowercased to 'ς' (U+03C2) at the end of 
> a word (for example, when followed by whitespace), rather than to the 
> normal 'σ' (U+03C3).
>
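
The sigma rule is context-sensitive, so it cannot be written as a 
per-character function at all. The sketch below uses a simplified 
condition (Σ preceded by a letter and not followed by a letter), which is 
an assumption made for brevity; the Unicode rule is stated in terms of 
cased and case-ignorable characters. `lowerSigma` is a hypothetical name.

```haskell
import Data.Char (isAlpha, toLower)

-- Hypothetical lowercasing that picks final sigma at the end of a word.
lowerSigma :: String -> String
lowerSigma = go False
  where
    -- The Bool records whether the previous character was a letter.
    go _ [] = []
    go prevLetter (c:cs)
      | c == '\x03A3' && prevLetter && not (nextIsLetter cs)
                  = '\x03C2' : go True cs          -- Σ -> ς (final form)
      | otherwise = toLower c : go (isAlpha c) cs  -- simple mapping, Σ -> σ

    nextIsLetter (d:_) = isAlpha d
    nextIsLetter []    = False

main :: IO ()
main = do
  -- Greek letters below are written as escapes to avoid look-alike typos.
  let odos = "\x039F\x0394\x039F\x03A3"          -- ΟΔΟΣ
  print (lowerSigma odos == "\x03BF\x03B4\x03BF\x03C2")  -- οδος
  print (lowerSigma "\x03A3" == "\x03C3")        -- a lone Σ stays non-final
```
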
> Case 4. Greek iota subscript (Ypogegrammeni)
>
> Greek 'capital' letters with iota subscripts (for example, 'ᾈ' 
> (U+1F88)) are the 'simple' uppercase counterparts of their lowercase 
> forms, but they are actually treated as titlecase characters. For 
> example, the full uppercase counterpart of 'ᾀ' (U+1F80) is "ἈΙ" 
> (U+1F08 U+0399), that is, with an actual capital iota instead of the 
> iota subscript.
>
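
For the iota subscript case, the point is that full uppercase and full 
titlecase differ for the same character. A small sketch with the 
hypothetical names `fullUpper` and `fullTitle`, special-casing only 
U+1F80 from the example above and falling back to the simple `Data.Char` 
mappings otherwise:

```haskell
import Data.Char (toTitle, toUpper)

-- Hypothetical full uppercase: expands the iota subscript to a capital iota.
fullUpper :: Char -> String
fullUpper '\x1F80' = "\x1F08\x0399"  -- ᾀ -> ἈΙ
fullUpper c        = [toUpper c]

-- Hypothetical full titlecase: keeps the single precomposed character.
fullTitle :: Char -> String
fullTitle '\x1F80' = "\x1F88"        -- ᾀ -> ᾈ
fullTitle c        = [toTitle c]

main :: IO ()
main = do
  print (fullUpper '\x1F80' == "\x1F08\x0399")
  print (fullTitle '\x1F80' == "\x1F88")
  -- The simple uppercase mapping yields the titlecase form instead.
  print (toUpper '\x1F80' == '\x1F88')
```
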
> Case 5. Precomposed letters without upper/lowercase counterpart
>
> For example, ΐ (U+0390) has no precomposed uppercase counterpart. It 
> must effectively be mapped to "Ϊ́" (U+03AA U+0301).
>
>
> In summary, we need more elaborate casing functions that are 
> `String`-to-`String`.
>
>
> Bibliography:
>
> /The Unicode Standard Version 11.0 – Core Specification/, Section 5.18.
>