[Haskell-i18n] Unicode in source

Ashley Yakeley ashley@semantic.org
Thu, 22 Aug 2002 16:41:15 -0700


At 2002-08-22 08:49, Sven Moritz Hallberg wrote:

>Ashley, do your property tools include something that can handle
>composition?

I have decomposition functions:

  decomposeCanonical :: String -> String;
  decomposeCompatibility :: String -> String;

See
<http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hbase/Source/HBase/Text/Unicode.hs?rev=HEAD&content-type=text/plain>.

Both fully decompose text, one for each of the two kinds of
decomposition: "canonical" and "compatibility". Neither applies the
"canonical ordering" step, which sorts runs of combining marks (such as
multiple accent modifiers) by combining class, so their output doesn't
count as normalisation form D or KD. I might be able to write the code
to do this, however.
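
For illustration only, here is a minimal sketch of the missing canonical
ordering step as UAX #15 describes it: each maximal run of characters
with a non-zero combining class is stably sorted by that class. It is
not code from HBase. The name canonicalOrder is hypothetical, and
combiningClass is a hypothetical stand-in that covers only five marks; a
real version would need the full table from the Unicode Character
Database.

```haskell
import Data.List (sortOn)

-- Hypothetical stand-in for a real combining-class lookup.
-- Only a handful of marks are listed; everything else is treated
-- as a starter (class 0). This is an assumption for the sketch.
combiningClass :: Char -> Int
combiningClass c = case c of
  '\x0300' -> 230  -- COMBINING GRAVE ACCENT
  '\x0301' -> 230  -- COMBINING ACUTE ACCENT
  '\x0308' -> 230  -- COMBINING DIAERESIS
  '\x0323' -> 220  -- COMBINING DOT BELOW
  '\x0327' -> 202  -- COMBINING CEDILLA
  _        -> 0

-- Reorder each maximal run of non-starters by combining class.
-- sortOn is stable, so marks of equal class keep their relative order.
canonicalOrder :: String -> String
canonicalOrder [] = []
canonicalOrder s@(c:cs)
  | combiningClass c == 0 = c : canonicalOrder cs
  | otherwise =
      let (marks, rest) = span ((/= 0) . combiningClass) s
      in  sortOn combiningClass marks ++ canonicalOrder rest
```

Under these assumptions, canonicalOrder "a\x0301\x0323" moves the dot
below (class 220) ahead of the acute accent (class 230), giving
"a\x0323\x0301". Applying such a function to the output of
decomposeCanonical or decomposeCompatibility is what would be needed to
reach forms D and KD respectively.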

I don't have composition functions. Composition wouldn't help you with
layout anyway, since not every sequence of a base character plus
combining marks can be composed into a single codepoint.

See UAX #15 for normalisation.
<http://www.unicode.org/unicode/reports/tr15/>

-- 
Ashley Yakeley, Seattle WA