ANN: unicode-properties, unicode-names

Ross Paterson ross at
Tue Sep 2 04:26:30 EDT 2008

On Mon, Sep 01, 2008 at 09:54:38PM -0700, Ashley Yakeley wrote:
> These two packages are representations in Haskell of various data in the  
> Unicode 3.2.0 Character Database. Unicode 3.2.0 was the latest version  
> of the Unicode standard at the time I wrote most of the code; later I  
> may move the packages to the latest version (currently 5.1.0).
> The unicode-properties package contains functions to determine general  
> category, case, and a wide range of other properties, as well as to do  
> decomposition and case-folding.
> The unicode-names package contains just one function, getCharacterName,  
> for getting the name of a character. It's separated out because it's a  
> sufficiently large proportion of the total data.

On a minor point, it would probably be better to avoid prefixing names
of constants (e.g. DCVertical).  Also, the prefix "get" is usually
reserved for functions that have a monadic effect, so names like

	decomposition :: Char -> Decomposition

would be more usual than getDecomposition.
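
To make the naming point concrete, here is a minimal sketch. The Decomposition type below is a hypothetical stand-in, not the definition in unicode-properties, and getShoutedLine is invented only to show where a "get" prefix does fit.

```haskell
module Main where

import Data.Char (toUpper)

-- Hypothetical stand-in for the package's type; the real definition
-- in unicode-properties may look quite different.
data Decomposition
  = NoDecomposition
  | Canonical String
  deriving (Show, Eq)

-- A pure lookup is named after the value it returns, with no "get" prefix.
-- Only one character is covered here, purely for illustration.
decomposition :: Char -> Decomposition
decomposition '\x00C5' = Canonical "A\x030A" -- A with ring above = A + combining ring
decomposition _        = NoDecomposition

-- By contrast, "get" suits an action that has a monadic effect,
-- such as reading a line from standard input.
getShoutedLine :: IO String
getShoutedLine = fmap (map toUpper) getLine

main :: IO ()
main = do
  print (decomposition 'x')                        -- NoDecomposition
  print (decomposition '\x00C5' == NoDecomposition) -- False
```

The same reading applies at call sites: "decomposition c" is an expression, whereas "line <- getShoutedLine" is a step in a do block.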

Note that Data.Char already has the functions generalCategory, toUpper,
toLower and toTitle, which should work on the full Unicode range.  It
should probably have majorClass as well.
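
As a rough illustration of the Data.Char functions mentioned above, and of what a majorClass could look like if built on top of generalCategory: the MajorClass type and the majorClass function in this sketch are assumptions of mine, not part of Data.Char or of either package.

```haskell
module Main where

import Data.Char (GeneralCategory (..), generalCategory, toUpper)

-- Hypothetical type: one constructor per first letter of the
-- two-letter Unicode general category abbreviations (L, M, N, P, S, Z, C).
data MajorClass
  = Letter | Mark | Number | Punctuation | Symbol | Separator | Other
  deriving (Show, Eq)

-- Hypothetical function: collapse Data.Char's GeneralCategory
-- into its major class.
majorClass :: Char -> MajorClass
majorClass c = case generalCategory c of
  UppercaseLetter      -> Letter
  LowercaseLetter      -> Letter
  TitlecaseLetter      -> Letter
  ModifierLetter       -> Letter
  OtherLetter          -> Letter
  NonSpacingMark       -> Mark
  SpacingCombiningMark -> Mark
  EnclosingMark        -> Mark
  DecimalNumber        -> Number
  LetterNumber         -> Number
  OtherNumber          -> Number
  ConnectorPunctuation -> Punctuation
  DashPunctuation      -> Punctuation
  OpenPunctuation      -> Punctuation
  ClosePunctuation     -> Punctuation
  InitialQuote         -> Punctuation
  FinalQuote           -> Punctuation
  OtherPunctuation     -> Punctuation
  MathSymbol           -> Symbol
  CurrencySymbol       -> Symbol
  ModifierSymbol       -> Symbol
  OtherSymbol          -> Symbol
  Space                -> Separator
  LineSeparator        -> Separator
  ParagraphSeparator   -> Separator
  Control              -> Other
  Format               -> Other
  Surrogate            -> Other
  PrivateUse           -> Other
  NotAssigned          -> Other

main :: IO ()
main = do
  print (generalCategory 'a')           -- LowercaseLetter
  print (generalCategory '\x03A9')      -- Greek capital omega: UppercaseLetter
  print (toUpper '\x03C9' == '\x03A9')  -- small omega to capital omega: True
  print (majorClass '7')                -- Number
  print (majorClass ' ')                -- Separator
```

Spelling out every constructor, rather than using a wildcard, means the compiler can warn if GeneralCategory ever gains a case that this mapping does not handle.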
