[Haskell-cafe] Clarification on uniWhite lexical definition

Mario blamario at rogers.com
Tue Oct 20 11:59:03 UTC 2020

On 2020-10-20 6:43 a.m., Immanuel Litzroth wrote:
> The Haskell Report says:
> uniWhite → any Unicode character defined as whitespace
> it's not clear to me whether this means that the Unicode character should
> have "Zs" as its general category
> ;; Zs Space_Separator a space character (of various non-zero widths)
> or whether it should be defined as whitespace as in
> https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt

     Recall that this production dates from 1998, which was the early
days of Unicode. You should be looking approximately at the Unicode
2.1.8 standard, not the latest one. And once you look there, you'll find
that its list of white-space characters was much simpler:

> Property dump for: 0x10000004 (White space)
> 0009..000D  (5 chars)
> 0020
> 00A0
> 2000..200B  (12 chars)
> 2028..2029  (2 chars)
> 3000
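
     As a rough sketch of my own (nothing here comes from the Report or
from GHC), the list above can be written down as a Haskell predicate and
compared against Data.Char.isSpace. The names isUniWhite218 and
disagreements are made up for illustration.

```haskell
import Data.Char (isSpace)

-- | True for exactly the code points in the Unicode 2.1.8 "White space"
-- property dump quoted above. Hypothetical helper, not a library function.
isUniWhite218 :: Char -> Bool
isUniWhite218 c =
     (c >= '\x0009' && c <= '\x000D')   -- 0009..000D (5 chars)
  || c == '\x0020'                      -- 0020
  || c == '\x00A0'                      -- 00A0
  || (c >= '\x2000' && c <= '\x200B')   -- 2000..200B (12 chars)
  || (c >= '\x2028' && c <= '\x2029')   -- 2028..2029 (2 chars)
  || c == '\x3000'                      -- 3000

-- | Code points on which the 2.1.8 list and the local isSpace disagree.
disagreements :: [Char]
disagreements =
  [c | c <- [minBound .. maxBound], isUniWhite218 c /= isSpace c]
```

Evaluating disagreements in GHCi shows where your compiler's isSpace
departs from the 1998-era list; the result depends on the Unicode tables
that your GHC ships with.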

     So there was no ambiguity at the time. Now if you're trying to
extrapolate the intent to the present standard... well, I have no more
authority than you in the matter, but I'd go with the more inclusive
