[Haskell-cafe] Re: Editors for Haskell
Brian Hulley
brianh at metamilk.com
Fri Jun 2 09:57:10 EDT 2006
Simon Marlow wrote:
> Malcolm Wallace wrote:
>> "Brian Hulley" <brianh at metamilk.com> wrote:
>>
>>
>>> Thanks for pointing this out. Although there is still a problem with
>>> the fact that var, qvar, qcon etc is in the context free syntax
>>> instead of the lexical syntax so you could write:
>>>
>>> 2 ` plus ` 4
>>> ( Prelude.+
>>> {- a comment -} ) 5 6
>>
>>
>> You appear to be right. However, I don't think I have ever seen a
>> piece of code that actually used the first form. People seem to
>> naturally place the backticks right next to the variable name. Should we
>> consider the fact that whitespace and comments are
>> permitted between backticks to be a bug in the Report? It certainly
>> feels like it should be a lexical issue.
>
> I tend in the other direction: I'd rather see as much as possible
> pushed into the context-free syntax. The only reason that qualified
> identifiers are in the lexical syntax currently is because of the
> clash with the '.' operator.
>
> I'm not sure I can concisely explain why I think it is better to use
> the context-free syntax than the lexical syntax, but I'll try. I
> believe the lexical syntax should adhere, as far as possible, to the
> following rule:
> juxtaposition of lexemes of different classes should not affect
> the lexical interpretation.
>
> in other words, whitespace between different lexemes is irrelevant.
A question here is: what is a lexeme?

For example, floating point numbers are written without spaces, yet they
could be considered to consist of primitive whole-number lexemes
interspersed with '.', 'e' and '-':

    34.678e-98

I don't see what the difference is between them and

    Prelude.+

especially since we *really* need the dot for other purposes in the CFG such
as composition and (hopefully at some point) field selection.

Since Prelude.+ is by the above argument a single lexeme, it seems
consistent to say that

    `Mod.Id`
    (Mod.+)

are also single lexemes. The brackets in (Mod.+) have a lexical purpose, to
turn a symbol into an id, which is very different imho from the use of
brackets to parenthesise expressions or form sections.

For example, should a parser consider ( + ) to be an incomplete
parenthesised expression with 2 gaps, or an id formed from the symbol + ? At
the moment of course it would be an id, but this causes problems when you're
trying to parse Haskell and highlight incomplete expressions, because you'd
expect that if the user intended just to make an id there wouldn't be any
reason to leave spaces between the symbol and the brackets.

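For reference, here is a small sketch of the forms discussed so far, written
against the current rules as I understand them from the Report. The function
plus and the three binding names are made up for illustration; the point is
only that whitespace and comments are accepted between the backticks or
brackets and the name they enclose.

```haskell
module Main where

-- Hypothetical helper, only so there is an id to put between backticks.
plus :: Int -> Int -> Int
plus = (+)

-- Backticks separated from the id by spaces.
viaSpacedBackticks :: Int
viaSpacedBackticks = 2 ` plus ` 4

-- Brackets separated from a qualified symbol by a newline and a comment.
viaSpacedParens :: Int
viaSpacedParens = ( Prelude.+
    {- a comment -} ) 5 6

-- The case from the text: an id formed from +, not an incomplete expression.
viaBareParens :: Int
viaBareParens = ( + ) 1 2

main :: IO ()
main = mapM_ print [viaSpacedBackticks, viaSpacedParens, viaBareParens]
```

An editor that wants to flag ( + ) as an expression with two gaps has to
special-case this, since the compiler reads it as an ordinary id.
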
In many ways it would be a lot easier if the (lexical) grammar was changed
so that "turning a symbol into an id" was indicated by parentheses round
the (unqualified part of the) symbol alone, not round the whole thing, thus:

    Prelude.(+)

so that the first lexical rule would be

   1) Parentheses around an unqualified symbol turn it into an id

Then the ` could be used to turn a (possibly qualified) id into a symbol:

    `Prelude.plus
    `Prelude.(+)

and there would be no need for a closing `, so the second rule would be:

   2) A grave before an id turns it into a symbol (that can't subsequently
      be turned back into an id!)

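The two rules above could be sketched as a tiny classifier over single
whitespace-free lexemes. This is only an illustration of the proposal, not
how any existing Haskell lexer works: the Token type, the function names and
the simplified character classes are all assumptions of the sketch.

```haskell
module Main where

import Data.Char (isAlpha, isAlphaNum, isUpper)

-- Hypothetical token type: module qualifiers plus the unqualified name.
data Token
  = Ident  [String] String   -- usable as an id
  | Symbol [String] String   -- usable as an infix symbol
  deriving (Eq, Show)

-- Simplified set of symbol characters (an assumption of this sketch).
symChars :: String
symChars = "!#$%&*+./<=>?@\\^|-~:"

isIdChar :: Char -> Bool
isIdChar c = isAlphaNum c || c == '_' || c == '\''

-- Split "Mod.Sub.rest" into (["Mod","Sub"], "rest").
qualifiers :: String -> ([String], String)
qualifiers s =
  case span isAlphaNum s of
    (m@(c : _), '.' : rest)
      | isUpper c, not (null rest) ->
          let (ms, r) = qualifiers rest in (m : ms, r)
    _ -> ([], s)

-- Classify one lexeme under the two proposed rules.
lexeme :: String -> Maybe Token
lexeme ('`' : s) =
  -- Rule 2: a grave before an id turns it into a symbol.
  case lexeme s of
    Just (Ident qs n) -> Just (Symbol qs n)
    _                 -> Nothing   -- a grave applies only to an id
lexeme s =
  case qualifiers s of
    -- Rule 1: parentheses around the unqualified symbol turn it into an id.
    (qs, '(' : rest)
      | (op, ")") <- span (`elem` symChars) rest, not (null op)
      -> Just (Ident qs op)
    (qs, n@(c : _))
      | isAlpha c, all isIdChar n -> Just (Ident qs n)
      | all (`elem` symChars) n   -> Just (Symbol qs n)
    _ -> Nothing

main :: IO ()
main = mapM_ (print . lexeme)
  ["Prelude.(+)", "`Prelude.plus", "`Prelude.(+)", "Prelude.+", "``plus"]
```

In this sketch Prelude.(+) comes out as an id, both graved forms come out as
symbols, and a doubled grave is rejected, which matches the remark that a
symbol can't be turned back into an id.
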
There are at least five motivations for suggesting the above changes:

   1) It allows operator expressions to be parsed by LL(1) recursive
      descent :-)
   2) The low-level detail of whether a symbol or an id is used is
      kept at the lexical level
   3) You can use a qualified function as an operator without knowing in
      advance whether it has been declared as a symbol or an id in the
      module. For example, you could type
          x `Mod.
      and expect to get a pop-up list of the functions in Mod, such as (+),
      add etc, whereas with the current rules you'd have to go back and
      add graves around the qualified function if it was declared as an
      id, and remove the grave if it was declared as an operator.
   4) Only one grave is needed :-)
   5) An editor can give more feedback, by distinguishing between
      incomplete expressions and the turning of symbols into ids

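To illustrate motivation 1, here is a rough sketch of a recursive descent
parser for flat operator expressions, assuming the lexer has already decided
whether each token is an id or a symbol. The Tok, Atom and OpExpr types are
hypothetical, function application is left out, and fixity resolution is
assumed to happen in a later pass.

```haskell
module Main where

-- Hypothetical pre-classified tokens, as the proposed lexer might emit them.
data Tok = TId String | TSym String | TNum Int
  deriving (Eq, Show)

data Atom = Var String | Lit Int
  deriving (Eq, Show)

-- First operand followed by (symbol, operand) pairs; no fixity applied yet.
data OpExpr = OpExpr Atom [(String, Atom)]
  deriving (Eq, Show)

-- An operand is a single id or number token.
atom :: [Tok] -> Maybe (Atom, [Tok])
atom (TId x : ts)  = Just (Var x, ts)
atom (TNum n : ts) = Just (Lit n, ts)
atom _             = Nothing

-- Looking at one token is enough to decide whether the expression continues:
-- a symbol token means another operand must follow, anything else ends it.
opExpr :: [Tok] -> Maybe (OpExpr, [Tok])
opExpr ts0 = do
  (a, ts1)    <- atom ts0
  (rest, ts2) <- tailOps ts1
  Just (OpExpr a rest, ts2)
  where
    tailOps (TSym s : ts) = do
      (b, ts')     <- atom ts
      (more, ts'') <- tailOps ts'
      Just ((s, b) : more, ts'')
    tailOps ts = Just ([], ts)

main :: IO ()
main = print (opExpr [TNum 2, TSym "plus", TNum 4, TSym "+", TId "x"])
```

Because `plus and + both arrive as symbol tokens, the parser never has to
look past a bracket or a grave to find out what kind of thing it is reading.
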
Regards, Brian.
--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.
http://www.metamilk.com