[Haskell-cafe] Parsing Haskell with Layout

Benjamin Redelings benjamin.redelings at gmail.com
Tue Aug 28 19:53:10 UTC 2018


Section 10.3 of the Haskell 2010 report describes the layout rule for 
Haskell in terms of rules for inserting curly braces and semicolons into 
the token stream.

I'm not sure of the theory behind implementing Note 5 in section 10.3.  
This says roughly that we insert a "}" token after token t[n] if the 
tokens t[1]...t[n+1] have no valid parses, and the sequence of tokens 
t[1] ... t[n] followed by "}" DO have a valid parse.

I think this allows you to write

f z = let x=2;y=3 in x+y+z

However, except for Note 5, it seems like you could do a simple 
preprocessing step on the token stream to insert "{", ";", and "}" 
before parsing.  Note 5 seems more complicated.  (I feel like I read a 
note somewhere that said "read this and weep" and then failed to 
implement Note 5.  But now I can't remember where I read this.)

In simple explanations like Write you a haskell 
(http://dev.stephendiehl.com/fun/008_extended_parser.html) it seems that 
the approach is simply to ignore Note 5.

Can anyone fill me in on

(a) what concepts are assumed by the report that would make it "obvious" 
how Note 5 is supposed to be implemented?

(b) a simple (hopefully) implementation strategy for implementing it? 
I'm looking for a language-independent approach for parsing.

I did see some mention of error-correcting parsers in the Parsec paper, 
which might allow adding the "}" at a parse error.  But I see there is a 
paper called " An error correcting parser for context free grammars that 
takes less than cubic time".  The "less than cubic time" seems rather 
worrisome, so I don't know if this is the right approach.

Also, when using parsec, it seems that you might get an idea how many 
tokens are a valid prefix of the grammer when the token stream fails to 
parse.  Hence, you could try inserting a "}" into the token stream at 
the point of a parse error.  However, a naive approach would then 
require re-parsing the whole stream from the beginning until each parse 
error, which seems very expensive.

Thanks for any help!


More information about the Haskell-Cafe mailing list