[Haskell-cafe] Unicode symbols in operators
Mikhail Glushenkov
the.dead.shall.rise at gmail.com
Thu Oct 16 06:22:16 UTC 2014
Hi,
On 15 October 2014 21:23, Niklas Hambüchen <mail at nh2.me> wrote:
> (I'm trying to improve Sublime Text's Haskell lexer.)
>
> https://www.haskell.org/onlinereport/haskell2010/haskellch10.html says
> uniSymbol → any Unicode symbol or punctuation
>
> What is meant here, is "Unicode symbol" literally \p{Symbol} in regex,
> or more?
>
> So uniSymbol = \p{Symbol} | \p{Punctuation}
Looking at the source of GHC's lexer [1], the relevant part seems to be:
    case generalCategory c of
      [...]
      ConnectorPunctuation -> symbol
      DashPunctuation      -> symbol
      [...]
      OtherPunctuation     -> symbol
      MathSymbol           -> symbol
      CurrencySymbol       -> symbol
      ModifierSymbol       -> symbol
      OtherSymbol          -> symbol
      [...]
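
To compare this against a regex like \p{Symbol} | \p{Punctuation}, here is a minimal sketch of a predicate built only from the seven categories visible in the excerpt above. The name isExcerptSymbol is made up for illustration and is not part of GHC. The sketch says nothing about the categories hidden behind the [...] elisions, nor about how the lexer treats ASCII characters.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | Hypothetical helper (not GHC's code): True when the character's
-- Unicode general category is one of the seven that the quoted
-- excerpt maps to `symbol`. Every other category yields False,
-- including the ones elided by [...] in the excerpt.
isExcerptSymbol :: Char -> Bool
isExcerptSymbol c = case generalCategory c of
  ConnectorPunctuation -> True
  DashPunctuation      -> True
  OtherPunctuation     -> True
  MathSymbol           -> True
  CurrencySymbol       -> True
  ModifierSymbol       -> True
  OtherSymbol          -> True
  _                    -> False
```

In GHCi, isExcerptSymbol '→' gives True (MathSymbol), while isExcerptSymbol '«' gives False (InitialQuote), even though '«' is matched by \p{Punctuation}. So, going by the excerpt alone, the set is not obviously the same as the whole of \p{Symbol} | \p{Punctuation}.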
[1] https://github.com/ghc/ghc/blob/master/compiler/parser/Lexer.x