Parser.y rewrite with parser combinators

Vladislav Zavialov vlad.z.4096 at gmail.com
Tue Oct 9 07:18:21 UTC 2018


> For example, if we see `do K x y z ...`, we don't know whether we're parsing an expression or a pattern before we can see what's in the ..., which is arbitrarily later than the ambiguity starts. Of course, while we can write a backtracking parser with combinators, doing so doesn't seem like a particularly swell idea.

Backtracking is exactly what I wanted to do here. Perhaps it is lack
of theoretical background on my behalf showing, but I do not see
downsides to it. It supposedly robs us of linear time guarantee, but
consider this.

With 'happy' and post-processing we

1. Parse into an expression (linear in the amount of tokens)
2. If it turns out we needed a pattern, rejig (linear in the size of expression)

With parser combinators

1. Parse into an expression (linear in the amount of tokens)
2. If it turns out we needed a pattern, backtrack and parse into a
pattern (linear in the amount of tokens)

Doesn't post-processing that we do today mean that we don't actually
take advantage of the linearity guarantee?

On Tue, Oct 9, 2018 at 3:31 AM Richard Eisenberg <rae at cs.brynmawr.edu> wrote:
>
> I, too, have wondered about this.
>
> A pair of students this summer were working on merging the type-level and term-level parsers, in preparation for, e.g., visible dependent quantification in terms (not to mention dependent types). If successful, this would have been an entirely internal refactor. In any case, it seemed impossible to do in an LALR parser, so the students instead parsed into a new datatype Term, which then got converted either to an HsExpr, an HsPat, or an HsType. The students never finished. But the experience suggests that moving away from LALR might be a good move.
>
> All that said, I'm not sure how going to parser combinators stops us from needing an intermediate datatype to parse expressions/patterns into before we can tell whether they are expressions or patterns. For example, if we see `do K x y z ...`, we don't know whether we're parsing an expression or a pattern before we can see what's in the ..., which is arbitrarily later than the ambiguity starts. Of course, while we can write a backtracking parser with combinators, doing so doesn't seem like a particularly swell idea. This isn't an argument against using parser combinators, but fixing the pattern/expression ambiguity was a "pro" listed for them -- except I don't think this is correct.
>
> Come to think of it, the problem with parsing expressions vs. types would persist just as much in the combinator style as it does in the LALR style, so perhaps I've talked myself into a corner. Nevertheless, it seems awkward to do half the parsing in one language (happy) and half in another.
>
> Richard
>
> > On Oct 8, 2018, at 6:38 PM, Vladislav Zavialov <vlad.z.4096 at gmail.com> wrote:
> >
> > That is a very good point, thank you! I have not thought about
> > incremental parsing. That's something I need to research before I
> > start the rewrite.
> > On Tue, Oct 9, 2018 at 1:06 AM Alan & Kim Zimmerman <alan.zimm at gmail.com> wrote:
> >>
> >> I am not against this proposal,  but want to raise a possible future concern.
> >>
> >> As part of improving the haskell tooling environment I am keen on making GHC incremental, and have started a proof of concept based in the same techniques as used in the tree-sitter library.
> >>
> >> This is achieved by modifying happy, and requires minimal changes to the existing Parser.y.
> >>
> >> It would be unfortunate if this possibility was prevented by this rewrite.
> >>
> >> Alan
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>


More information about the ghc-devs mailing list