converting capital letters into small letters

Andrew J Bromage ajb@spamcop.net
Fri, 26 Jul 2002 12:07:12 +1000


G'day all.

On Fri, Jul 26, 2002 at 01:27:48AM +0000, Karen Y wrote:

> 1. How would I convert capital letters into small letters?
> 2. How would I remove vowels from a string?

As you've probably found out, these are very hard problems.

Haskell gets it a little wrong here, since the result of some of the
UnicodePrims functions (see chapter 9, Haskell 98 library report)
should really be locale-dependent and therefore _impure_ if you allow
changes of locale.  Of course, Haskell currently only supports time and
date locale information, so this wouldn't help you anyway.
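
To make the purity point concrete, here is a minimal sketch using
toLower from Data.Char (the hierarchical name for the Haskell 98 Char
library; "lowercase" is just a name made up for this example).  The
function takes no locale argument, so it cannot give a Turkish reader
the dotless i they would expect as the lowercase of 'I'.

```haskell
import Data.Char (toLower)

-- Locale-independent lowercasing: the same answer in every locale.
-- "lowercase" is a hypothetical helper name, not a library function.
lowercase :: String -> String
lowercase = map toLower
```

For plain ASCII text this does what you would hope: lowercase "Hello,
World" gives "hello, world".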

Glossing over that concern, current implementations don't support the
relevant UnicodePrims fully, so to do it properly you'll probably need
to parse the case folding files yourself.  See:

	http://www.unicode.org/unicode/reports/tr21/
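
For what it's worth, parsing the folding data is not much code.  The
sketch below assumes the "code; status; mapping; # name" line layout of
the Unicode Character Database's CaseFolding.txt and keeps only the
one-to-one entries (status C and S), so full foldings such as sharp s
to "ss" are skipped.  The names parseLine, foldTable and foldCase are
made up for illustration, and the association list would want swapping
for a proper finite map before feeding it the full file.

```haskell
import Data.Char (chr, isSpace)
import Data.Maybe (fromMaybe, mapMaybe)
import Numeric (readHex)

-- Split a string on a delimiter character.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn d rest

-- Parse one line of the (assumed) CaseFolding.txt layout.
-- Comments, blank lines and non-simple foldings yield Nothing.
parseLine :: String -> Maybe (Char, Char)
parseLine line =
  case map (filter (not . isSpace)) (splitOn ';' (takeWhile (/= '#') line)) of
    (code : status : mapping : _)
      | status `elem` ["C", "S"]
      , [(from, "")] <- readHex code
      , [(to, "")] <- readHex mapping
      -> Just (chr from, chr to)
    _ -> Nothing

-- Build a lookup table from the whole file's contents.
foldTable :: String -> [(Char, Char)]
foldTable = mapMaybe parseLine . lines

-- Fold each character through the table; unknown characters pass through.
foldCase :: [(Char, Char)] -> String -> String
foldCase table = map (\c -> fromMaybe c (lookup c table))
```

With the file contents read into a string txt (say via readFile),
foldCase (foldTable txt) is then an ordinary pure String -> String
function.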

Vowels are even harder because I don't think the Unicode standard even
defines what a "vowel" is.  Removing vowel _marks_ should be
straightforward once you expand combining characters, but that doesn't
help with the general case.  Frankly, I don't like your chances.
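
If you are willing to cheat and simply declare that "vowel" means one
of the ASCII letters a, e, i, o and u (an assumption of mine, not
something the Unicode standard gives you), then the problem collapses
to a filter.  A minimal sketch, with made-up names:

```haskell
-- Assumption: a "vowel" is one of these ten ASCII letters and nothing else.
isAsciiVowel :: Char -> Bool
isAsciiVowel c = c `elem` "aeiouAEIOU"

-- Drop every character that the assumption above calls a vowel.
removeVowels :: String -> String
removeVowels = filter (not . isAsciiVowel)
```

So removeVowels "Haskell" gives "Hskll", while "rhythm" comes back
untouched, as does any accented or non-Latin vowel, which is exactly
the general case this does not solve.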

Can anyone else think up a good solution?

Cheers,
Andrew Bromage