lexer puzzle

Thomas Hallgren hallgren at cse.ogi.edu
Wed Sep 24 17:42:34 EDT 2003


Marcin 'Qrczak' Kowalczyk wrote:

>> A... should be split into "A.." and "."
>
> I found a compromise: let's make it a lexing error! :-)

At least that agrees with what some Haskell compilers implement. No
current Haskell compiler/interpreter agrees with what the report seems
to say, namely that "A..." should be lexed as the two tokens "A.." and
".", and similarly, that "A.where" should be lexed as "A.wher" followed
by "e".

It seems that the source of the problem is the use of the (nonstandard) 
difference operator r1<r2> in the specification of the lexical syntax in 
the Haskell report [1]. It was presumably fairly innocent and easy to 
understand originally, but I guess that nobody understood its 
consequences when qualified names were introduced.
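
To make the interaction concrete, here is a minimal sketch of the
longest-match rule combined with the difference operator, as I read the
report. It is not taken from any implementation: the names
longestQualName, isQualName, isVarId and isVarSym are made up for
illustration, only ASCII characters are handled, module names are
assumed to be a single conid, and the search is brute force over
prefixes rather than a compiled regular expression.

```haskell
import Data.Char (isAlphaNum, isLower, isUpper)

-- Reserved identifiers and operators of Haskell 98.
reservedIds, reservedOps :: [String]
reservedIds =
  [ "case", "class", "data", "default", "deriving", "do", "else", "if"
  , "import", "in", "infix", "infixl", "infixr", "instance", "let"
  , "module", "newtype", "of", "then", "type", "where", "_" ]
reservedOps = ["..", ":", "::", "=", "\\", "|", "<-", "->", "@", "~", "=>"]

-- Characters allowed inside identifiers (ASCII only in this sketch).
isIdChar :: Char -> Bool
isIdChar c = isAlphaNum c || c == '\'' || c == '_'

-- varid: an identifier starting with a small letter, minus reservedid.
isVarId :: String -> Bool
isVarId s@(c:cs) =
  (isLower c || c == '_') && all isIdChar cs && s `notElem` reservedIds
isVarId [] = False

-- varsym: a symbol sequence not starting with ':', minus reservedop
-- and minus dashes (two or more '-').
isVarSym :: String -> Bool
isVarSym s@(c:_) =
  c /= ':' && all (`elem` "!#$%&*+./<=>?@\\^|-~:") s
    && s `notElem` reservedOps && not (isDashes s)
  where isDashes t = length t >= 2 && all (== '-') t
isVarSym [] = False

-- A whole string is a qualified name if it is "conid . (varid | varsym)".
isQualName :: String -> Bool
isQualName s = case span isIdChar s of
  (c:_, '.':name) | isUpper c -> isVarId name || isVarSym name
  _                           -> False

-- Longest prefix of the input that is a qualified name, plus the rest.
-- Excluded strings (reserved words etc.) are simply skipped, so a
-- shorter prefix wins: this is the effect of the difference operator.
longestQualName :: String -> Maybe (String, String)
longestQualName input =
  case [ splitAt n input
       | n <- [length input, length input - 1 .. 1]
       , isQualName (take n input) ] of
    (hit:_) -> Just hit
    []      -> Nothing

main :: IO ()
main = mapM_ (print . longestQualName) ["A.where", "A...", "A.--"]
```

Under these assumptions the three inputs split as ("A.wher","e"),
("A..",".") and ("A.-","-"): the full string is never a member of the
token language, so maximal munch settles for the next shorter prefix.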

Anyway, this should teach us not to use homemade pseudo-formal notation
when defining a new language, but to stick to well-established,
well-understood, tool-supported formalisms...

For the Programatica Haskell front-end, I have now switched to a regular 
expression compiler that has direct support for the difference operator, 
so hopefully, our implementation agrees with what the report specifies. 
(This is not necessarily a good thing, though, since it makes our 
front-end different from all other Haskell implementations :-)

For what it is worth, I tested "A...", "A.where" and "A.--" in the main 
Haskell implementations and in the Programatica Haskell front-end. The 
input was two modules A and B:

1 module A where
2
3 wher = id
4 e = id

1 module B where
2 import A
3
4 x = (A.where)
5 y = x

Here is the result:

GHC: B.hs:4: parse error on input `where'
HBC: "B.hs", line 4, Bad qualified name on input:<eof>
Hugs: ERROR "B.hs":4 - Undefined qualified variable "A.where"
NHC98: Identifier A.where used at 4:6 is not defined.
PFE: ok

If line 4 in module B is replaced with x = (A...):

GHC: B.hs:4: Variable not in scope: `A...'
HBC: "B.hs", line 4, Bad qualified name on input:<eof>
Hugs: ERROR "B.hs":4 - Undefined qualified variable "A..."
NHC98: Identifier A... used at 4:6 is not defined.
PFE: B.hs:4,13, before ): syntax error
(A... is lexed as A.. .)

If line 4 in module B is replaced with x = (A.--):

GHC: B.hs:4: Variable not in scope: `A.--'
HBC: "B.hs", line 5, syntax error on input:=
(treats -- as the start of a comment)
Hugs: ERROR "B.hs":4 - Undefined qualified variable "A.--"
NHC98: Identifier A.-- used at 4:6 is not defined.
PFE: B.hs:5,1, before : syntax error
(A.-- is lexed as A.- -)

I used the following versions:

GHC 6.0.1
HBC 0.9999.5b
Hugs 98 November 2002
NHC98 1.16
PFE 030912

-- 
Thomas H

[1] http://www.haskell.org/onlinereport/syntax-iso.html 




