[Haskell-beginners] Consuming Rule Based Parsing
ktvoelker at gmail.com
Fri Nov 16 10:55:12 CET 2012
On Fri, Nov 16, 2012 at 12:55 AM, Christopher Howard <
christopher.howard at frigidcode.com> wrote:
> That is, I'm not parsing a single type of document with a set structure.
I think you are. See below.
> Rather, the idea is that I have a list of symbols (characters, numbers,
> whatever) and pattern match the front part of the list according to some
> (grammer) rule (which itself could be a conjunction or disjunction of
> rules) in a set of rules. If the match succeeds, then I consume that
> part of the list, and then analyze the remaining list, starting again
> with the first rule in my rule set. If the match fails, then I try the
> next rule in my rule set.
If the things which can appear on the stream have grammars A, B, and C,
then you can describe the grammar of the stream as a whole by:
S -> (A | B | C)*
So you are still just parsing an S document.
> Backtracking would also be cool (to see what
> other results I can get).
Unregulated backtracking tends to produce slow parsers. And for a
reasonably thought-out grammar, backtracking is not usually necessary. But
it can be done.
> Is there a particular Haskell abstraction or system suitable for what I
> have described?
There is an approach called "parser combinators". The basic idea is this:
type Parser a = String -> [(a, String)]
In other words, a parser of values of type "a" is a function which takes a
string and returns all possible parses of some prefix of that string as an
"a"-value, each paired with whatever input was left after parsing.
You can then start writing functions to combine parsers. Thus, you end up
with "parser combinators".
There is a library called Parsec which is the industrial-strength
incarnation of the parser combinator concept. But I would recommend that
you start by getting familiar with a home-grown parser combinator library
of your own.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Beginners