How to use Lexer.lexer to produce closing braces as well?

Kwanghoon Choi lazyswamp at gmail.com
Tue Oct 26 22:45:41 UTC 2021


Thank you Edward for your reply.

It took some time for me to understand a formal description (translation
function L) of
the Haskell layout rule in Section 10.3 of Haskell 2010 language report. :)

I myself tried to write an example with the Haskell layout rule for me to
understand it.
I am sharing this for those who are curious about the Haskell layout rules.

 - http://lazyswamp.blogspot.com/2021/10/on-haskell-layout-rule.html

Kwanghoon

PS.
To be honest, however, I still don't know how Haskell parser (Parser.y) and
lexer (Lexer.x) interact with each other
to implement the following rule on the insertion of a closing brace on a
parse error.

   [Sec. 10.3]
   -    L (t : ts) (m : ms) = } : (L (t : ts) ms)  if m /= 0 and
parse-error(t)

It seems that Happy's error recover is relevant to the interaction via a
production rule:

    [ Parser.y ]
   - close :: { () }
        : vccurly             { () }    -- context popped in lexer.
        | error                 {% popContext }

Is it enough for a Haskell lexer just to pop the top context? It is not
clear to me how vccurly is inserted just
by having such a production rule and an error recovery.



On Wed, 18 Aug 2021 at 17:05, Edward Kmett <ekmett at gmail.com> wrote:

> Unfortunately, the current parsing rules for Haskell aren't fully
> phase-separable like this.
>
> If you look at the rules for Layout token insertion in the Haskell report
> the 9th rule requires that in the event the parser encounters a parse
> error it should insert a virtual close brace and continue on!
>
> Otherwise you couldn't parse things like *let **{** foo = bar **}** in
> baz *where the {}'s are virtual without reframing *let* and *in* as a
> different kind of paired opening and closing brace or using other hacks in
> the grammar. It is quite difficult to hack around all the ways parses can
> go wrong.
>
> The main downside this has from a language standpoint is you simply can't
> properly lex Haskell without more or less fully parsing Haskell.
>
> -Edward
>
> On Wed, Aug 18, 2021 at 7:22 AM Kwanghoon Choi <lazyswamp at gmail.com>
> wrote:
>
>>
>> Hi,
>>
>> I have recently been playing with GHC's Lexer.lexer in the ghc-parser-lib
>> package.
>>
>> Given
>>
>>    module HelloWorld where
>>
>>    main = putStrLn "Hello World!\n"
>>
>> it produces
>>
>>    stack exec -- lexer-exe ./examples/HelloWorld.hs
>>    Lexing&Parsing: ./examples/HelloWorld.hs
>>    module at (1, 1): module
>>    CONID at (1, 8): CONID
>>    where at (1, 19): where
>>    vocurly at (3, 1): vocurly    <==== { is inserted automatically!!
>>    VARID at (3, 1): VARID
>>    = at (3, 6): =
>>    VARID at (3, 8): VARID
>>    STRING at (3, 17): STRING
>>    ; at (4, 1): ;
>>
>> By the example above, the lexer automatically inserts an opening brace
>> (i.e. vocurly) right after 'where'. But it does not insert a matching
>> closing brace (i.e., vccurly), which would lead to a failure in parsing a
>> list of tokens produced by the lexer.
>>
>> My question is how to use the GHC lexer to produce closing braces as
>> well.
>>
>> All my code is available
>>  - https://github.com/kwanghoon/hslexer
>>
>> To save your time, the relevant part of the code is as follows:
>>
>> In app/HaskellLexer.hs,
>>
>>     singleHaskellToken :: P (Located Token)
>>     singleHaskellToken =
>>       Lexer.lexer False
>>         (\locatedToken -> P (\pstate -> POk pstate locatedToken))
>>
>>     tokInfos :: [Terminal Token] -> P (Line, Column, [Terminal Token])
>>     tokInfos s = do
>>       locatedToken <- singleHaskellToken
>>       case locatedToken of
>>         L srcspan ITeof ->
>>           let (start_line, start_col, end_line, end_col) =
>> srcSpanToLineCol srcspan in
>>           return (end_line, end_col, s)
>>
>>         L srcspan tok ->
>>           let (start_line, start_col, end_line, end_col) =
>> srcSpanToLineCol srcspan in
>>           tokInfos (Terminal (fromToken tok) start_line start_col (Just
>> tok) : s)
>>
>> Thanks in advance
>>
>> Best regards,
>>
>> Kwanghoon
>>
>>
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20211026/d132529c/attachment.html>


More information about the ghc-devs mailing list