Parsing funny arrows

Csongor Kiss kiss.csongor.kiss at gmail.com
Sat Aug 29 08:13:18 UTC 2020


Thanks once again for your detailed analysis, it's really helpful.
Especially the forall example - I did try constructing some string that would parse, but I was shooting in the dark.

Best,
Csongor

> On 29 Aug 2020, at 09:05, Vladislav Zavialov <vladislav at serokell.io> wrote:
> 
> The lexer produces only as many tokens as the parser requires. In the ‘lexerDbg’ dump that you included in the message, there were these lines:
> 
>  token: ITvarid "m"
>  token: ITvarid "b"
>  token: ITsemi
> 
> So I knew that the parser consumed the entire string, up to the virtual semicolon. I also recognized the parse error as the one produced by ‘happy’ rather than by a later validation pass. So even though the parser consumed the entire string, it failed. In other words, it didn’t expect the string to end so abruptly; it expected it to continue.
> 
> But what did it expect to find? To figure it out, we need to know which grammar production is involved in the failure. The only grammar production that could’ve consumed the ‘->’ PREFIX_AT sequence successfully and proceeded to process the rest of the string is this one:
> 
>   btype '->' PREFIX_AT btype ctype
> 
> By inspecting the definitions of ‘btype’ and ‘ctype’, one can see that neither of them accepts the empty string, and both accept type-level function application. Thus it’s possible that ‘btype’ consumed “m b” as an application, and ‘ctype’ failed because it didn’t accept the remaining empty string:
> 
>  btype = “m b”
>  ctype = parse error (nothing to consume)
> 
> But what you wanted instead was:
> 
>  btype = “m”
>  ctype = “b”
> 
> The solution is to use ‘atype’ instead of ‘btype’, as ‘atype’ does not accept type-level application.
> 
> By the way, there’s also an input string on which the original grammar would’ve succeeded (or at least I think so):
> 
>  test :: a -> @m forall b. b
>  test = undefined
> 
> That’s because ‘btype’ wouldn’t have consumed the ‘forall’, it would’ve stopped at this point. And then ‘ctype’ could’ve consumed “forall b. b”.
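
The behaviour described above can be reproduced with a toy model. This is only a sketch under heavy assumptions: the names atype, btype and ctype mirror the nonterminals mentioned in the thread, but the definitions below are hypothetical simplifications over whitespace-separated tokens, not GHC's actual grammar, and the helper names (afterArrowAt, original, fixed) are made up for illustration. Each parser commits to a single result, which loosely imitates a parser that does not backtrack.

```haskell
-- Toy model only: NOT the real productions from Parser.y.
data Ty = Var String | App Ty Ty | Forall [String] Ty
  deriving (Eq, Show)

-- A deterministic parser: at most one result, so no backtracking.
type Parser a = [String] -> Maybe (a, [String])

-- Anything that is not one of these reserved tokens counts as an identifier.
isIdent :: String -> Bool
isIdent t = t `notElem` ["forall", ".", "->", "@"]

-- atype: exactly one identifier, never an application.
atype :: Parser Ty
atype (t : ts) | isIdent t = Just (Var t, ts)
atype _ = Nothing

-- btype: one or more atypes, consumed greedily as an application.
btype :: Parser Ty
btype ts = do
  (f, rest) <- atype ts
  Just (go f rest)
  where
    go acc rest = case atype rest of
      Just (arg, rest') -> go (App acc arg) rest'
      Nothing           -> (acc, rest)

-- ctype: either "forall <vars> . <ctype>" or a btype.
ctype :: Parser Ty
ctype ("forall" : ts) =
  case span isIdent ts of
    (vs@(_ : _), "." : rest) -> do
      (body, rest') <- ctype rest
      Just (Forall vs body, rest')
    _ -> Nothing
ctype ts = btype ts

-- What follows the '->' PREFIX_AT sequence: a modifier, then a ctype,
-- and nothing left over.
afterArrowAt :: Parser Ty -> [String] -> Maybe (Ty, Ty)
afterArrowAt modifier ts = do
  (m, rest)    <- modifier ts
  (res, rest') <- ctype rest
  if null rest' then Just (m, res) else Nothing

-- The original production uses btype for the modifier; the fix uses atype.
original, fixed :: [String] -> Maybe (Ty, Ty)
original = afterArrowAt btype
fixed    = afterArrowAt atype
```

In this model, original (words "m b") is Nothing because btype swallows both tokens and leaves ctype nothing to consume, while fixed (words "m b") is Just (Var "m", Var "b"). The forall input (with the dot written as a separate token) goes through even with the greedy modifier: original (words "m forall b . b") is Just (Var "m", Forall ["b"] (Var "b")), since btype stops at the forall keyword.
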
> 
> I don’t think there’s a parser equivalent of -ddump-tc-trace. You’ll need to figure this stuff out by reading the grammar and keeping in mind that ‘happy’ generates a shift-reduce parser that does not backtrack. The ‘lexerDbg’ output is useful to see how far the parser got. And there’s also this command in case you want to go low level and inspect the state machine generated by ‘happy’:
> 
>    happy -agc --strict compiler/GHC/Parser.y -idetailed-info
> 
> Hope this helps,
> - Vlad
> 
>> On 29 Aug 2020, at 10:16, Csongor Kiss <kiss.csongor.kiss at gmail.com> wrote:
>> 
>> Thanks a lot Vlad and Shayne, that indeed did the trick!
>> 
>> Out of curiosity, how could I have figured out that this was the culprit? The parse
>> error I got was a bit puzzling, and I couldn't find any flags that would give more information
>> (I think I was looking for the parser equivalent of -ddump-tc-trace).
>> 
>> Best,
>> Csongor
> 



More information about the ghc-devs mailing list