H98 Report: expression syntax glitch

Simon Peyton-Jones simonpj@microsoft.com
Tue, 26 Feb 2002 07:30:44 -0800

| Consider the following Haskell 98 expressions:
| 	(let x = 10 in x `div`)
| 	(let x = 10 in x `div` 3)
| To parse the first, a bottom-up parser should reduce the
| let-expression before the operator, while to parse the second
| it should shift. But it needs 4 tokens of lookahead to decide
| which to do.  That seems unreasonable, and is well beyond the
| LALR(1) parsers used by Hugs and GHC. Replacing `div` by +
| needs 2 tokens of lookahead, which is still too much. I think
| the first should be made illegal, but can't think of a clean
| rule. (There are similar expressions using lambda and if.)

Thanks to Ross for identifying this glitch, and to the others who followed
up. The real problem seems to be the notion of "as far to the right as
possible" because that implies "as far to the right as can be done
without a parse error", and that in turn depends on associativity etc.

Simon and I came up with the following alternative formulation for
the meta-rule that disambiguates let, lambda, and if:

Replace "The ambiguity is resolved by the meta rule that each of these
constructs extends as far to the right as possible" by

	"The ambiguity is resolved by the meta rule that each
	of these constructs extends to the nearest occurrence of
	the following punctuation symbols that does not form part of
	a nested expression:

		)  ]  }  |  ;  ,  ..  where  of  then  else

	For example:

		(\x -> x + (x*x) )	Lambda extends to the second ')'
					because the first ')' forms part
					of a nested expression"

This formulation means that

 	(let x = 10 in x `div`)

would be a syntax error, because the `div` cannot terminate the
reach of the 'let'.   Similarly,

	let x = 10 in x == x == True

is a syntax error because the let cannot be terminated by the second '=='.
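
To make the proposed rule concrete, here is a minimal sketch of how the
reach of a let, lambda, or if body could be found by scanning tokens.
It is an illustration only, not how Hugs or GHC actually parse. The
names Token, toks, and reach are hypothetical, the tokeniser just
splits on spaces, and the sketch assumes that only brackets introduce
nesting (a nested if/then/else or case/of would need extra
bookkeeping).

```haskell
-- Hypothetical token type for this sketch; not taken from any real parser.
data Token
  = Open Char      -- one of ( [ {
  | Close Char     -- one of ) ] }
  | Punct String   -- | ; , .. where of then else
  | Other String   -- identifiers, literals, operators, ...
  deriving (Eq, Show)

-- Toy tokeniser: tokens must be separated by spaces.
toks :: String -> [Token]
toks = map classify . words
  where
    classify [c] | c `elem` "([{" = Open c
                 | c `elem` ")]}" = Close c
    classify w   | w `elem` puncts = Punct w
                 | otherwise       = Other w
    puncts = ["|", ";", ",", "..", "where", "of", "then", "else"]

-- Given the tokens that follow 'in', '->' or 'else', split them into
-- the body of the construct and the rest of the input, which starts
-- at the nearest terminating symbol outside any nested brackets.
reach :: [Token] -> ([Token], [Token])
reach = go (0 :: Int)
  where
    go _ [] = ([], [])
    go depth (t : ts)
      | depth == 0 && terminates t = ([], t : ts)
      | otherwise                  = (t : body, rest)
      where
        (body, rest) = go (step depth t) ts

    -- Track bracket nesting only (an assumption of this sketch).
    step d (Open _)  = d + 1
    step d (Close _) = d - 1
    step d _         = d

    terminates (Close _) = True
    terminates (Punct _) = True
    terminates _         = False
```

With this sketch, reach (toks "x + ( x * x ) )") stops at the second
')', matching the lambda example in the rule. For the tokens after
'in' in the first expression, reach (toks "x `div` )") yields the body
x `div`, which is not a complete expression, so the whole thing is
rejected without any lookahead past the operator. An operator such as
'==' never ends the body, so x == x == True is taken whole and then
fails because '==' is non-associative.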

Can anyone think of a way to improve on this?