Improving Data.Char.isSpace performance
wren ng thornton
wren at freegeek.org
Thu Nov 8 19:18:57 CET 2012
On 10/31/12 11:49 PM, Patrick Palka wrote:
> On Wed, Oct 31, 2012 at 10:39 PM, wren ng thornton <wren at freegeek.org> wrote:
>> The one thing I worry about using \x1680 as the threshold is that I'm
>> not sure whether every character below \x1680 has been allocated or whether
>> some are still free. If any of them are free, then this will become
>> incorrect in subsequent versions of Unicode so it's a maintenance timebomb.
>> (Whereas if they're all specified then it should be fine.) Can someone
>> verify that using \x1680 is sound in this manner?
> According to GHCi:
> Prelude Data.Char> length $ filter ((== NotAssigned) . generalCategory)
Guess I never looked closely at what Unicode queries Data.Char offers...
Looks like the first unassigned character is '\888' (that is, U+0378).
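
For anyone wanting to repeat the check, here is a minimal sketch of that kind of query. It assumes only generalCategory and the NotAssigned constructor from Data.Char; the helper names (isUnassigned, firstUnassigned, unassignedBelow) are made up for illustration and are not part of any library. The answers depend on the Unicode tables the installed base was built against, so they can change between GHC releases.

```haskell
import Data.Char (GeneralCategory (NotAssigned), generalCategory)
import Data.List (find)

-- True when the Unicode tables shipped with base list this code
-- point as unassigned. (Hypothetical helper name.)
isUnassigned :: Char -> Bool
isUnassigned c = generalCategory c == NotAssigned

-- The lowest unassigned code point, if there is one.
firstUnassigned :: Maybe Char
firstUnassigned = find isUnassigned [minBound .. maxBound]

-- All unassigned code points strictly below the given threshold.
unassignedBelow :: Char -> [Char]
unassignedBelow limit =
  filter isUnassigned (takeWhile (< limit) [minBound ..])

main :: IO ()
main = do
  -- Lowest unassigned code point known to this build of base.
  print firstUnassigned
  -- How many unassigned code points lie below the proposed threshold.
  print (length (unassignedBelow '\x1680'))
```

If the second number is nonzero, a later Unicode version could assign a space character below \x1680, which is exactly the maintenance concern raised above.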