Lexing character literals in H98
Simon Peyton-Jones
simonpj@microsoft.com
Fri, 26 Apr 2002 09:54:10 -0700
| Yet another H98 question, this time regarding the module
| Char: Is it a deliberate design decision that readLitChar
| handles decimal, octal, and hex escapes, but lexLitChar
| handles only decimal ones? It looks more like an oversight to me...
Me too. It's a messy part of the language that both readLitChar
and lexLitChar exist, with pretty much duplicate code. Still, I'd
rather not rewrite it all (easy to introduce more bugs). But it is
unfortunate that the two actually parse different escape sequences.
Would anyone like to volunteer to write the missing code for lexLitChar?
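
As a possible starting point for that missing code, here is a minimal
sketch of lexing the three numeric escape forms (decimal, octal, hex).
The names lexNumericEscape and prefixed are hypothetical and do not
appear in the Report. The sketch covers numeric escapes only, and it
does not check that the number fits in the range of Char.

```haskell
import Data.Char (isDigit, isHexDigit, isOctDigit)

-- Hypothetical helper, not part of the Report: lex a numeric escape
-- (decimal, octal, or hex) at the front of the input, returning the
-- text of the escape and the remaining input.
lexNumericEscape :: ReadS String
lexNumericEscape ('\\' : 'o' : s) = prefixed "\\o" isOctDigit s
lexNumericEscape ('\\' : 'x' : s) = prefixed "\\x" isHexDigit s
lexNumericEscape ('\\' : s)       = prefixed "\\"  isDigit    s
lexNumericEscape _                = []

-- Take the longest non-empty run of digits accepted by the predicate
-- and glue the escape prefix back on; fail if there are no digits.
prefixed :: String -> (Char -> Bool) -> ReadS String
prefixed pre isDig s =
  case span isDig s of
    ([], _)        -> []
    (digits, rest) -> [(pre ++ digits, rest)]
```

Only the lowercase markers 'o' and 'x' are matched, so an input such as
"\\X41" falls through to the decimal case and is rejected there for
lack of digits.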
Below is the stuff from the Report that describes the spec.
| Another question is if the uppercase variants '\O...' and
| '\X...' should be forbidden explicitly. Current
| implementations seem to differ in this aspect.
Implementations shouldn't accept \O, \X, etc. But I don't think it's
appropriate to forbid them explicitly... should we forbid \A (read
number base 35) too?
Simon
The function showLitChar converts a character to a string using only
printable characters, using Haskell source-language escape conventions.
The function lexLitChar does the reverse, returning the sequence of
characters that encode the character. The function readLitChar does the
same, but in addition converts the sequence to the character that it
encodes. For example:
showLitChar '\n' s = "\\n" ++ s
lexLitChar "\\nHello" = [("\\n", "Hello")]
readLitChar "\\nHello" = [('\n', "Hello")]
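
The three examples can be checked against an implementation with a
small module like the following. This is a sketch that assumes the
functions are imported from Data.Char, as in GHC's base library (the
H98 module is called Char). The names shown, lexed, parsed, and
roundTrips are hypothetical, chosen here for illustration.

```haskell
import Data.Char (lexLitChar, readLitChar, showLitChar)

-- showLitChar prepends the escaped form of the character to a string.
shown :: String
shown = showLitChar '\n' "Hello"

-- lexLitChar returns the escape text itself, still unconverted.
lexed :: [(String, String)]
lexed = lexLitChar "\\nHello"

-- readLitChar returns the decoded character, hence a Char in the pair.
parsed :: [(Char, String)]
parsed = readLitChar "\\nHello"

-- Hypothetical check: showing a character and reading it back should
-- give the same character with no input left over.
roundTrips :: Char -> Bool
roundTrips c = readLitChar (showLitChar c "") == [(c, "")]
```

Loading this in an interpreter and evaluating lexed and parsed makes the
difference in result types visible: a String for lexLitChar, a Char for
readLitChar.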