User-defined operators and compound expressions using Happy
Doaitse Swierstra
doaitse at cs.uu.nl
Mon Nov 22 13:26:40 EST 2004
On 2004 nov 22, at 17:48, Frank-Andre Riess wrote:
> Hi there folks,
>
> once again, I've got a question related to Happy (I've got version
> 1.13 at
> the moment).
> Maybe, it's even more a question on formal languages, but well...
> How can I write a grammar that can cope with user-defined operators (of
> different precedences/associativities) and compound expression like
> function calls, if-then-else- and case-statements and the like. I
> tried to
> write it down straight forwardly, but failed terribly (alas, I didn't
> keep
> it, so I can't show you - if someone of you is versed in this issue, I
> can
> try to explain the language's constructs).
One way of doing this using combinator based parsing (where you can
generate parsers dynamically) is to read the fixity declarations, and
to use the result of this to build the precedence parser. This idea has
been sketched in:
S. D. Swierstra and P. R. Azero Alcocer. Fast, Error Correcting Parser
Combinators: a Short Tutorial. In J. Pavelka, G. Tel, and M. Bartosek,
editors, SOFSEM'99 Theory and Practice of Informatics, 26th Seminar on
Current Trends in Theory and Practice of Informatics, volume 1725 of
LNCS, pages 111--129, November 1999.
If you do not have access to this I will be happy to send it to you,
Doaitse Swierstra
== some text (created from the pdf) from this paper ================
As an example of what can be done we will now show how to construct
parsers dynamically by writing a parser for an expression language with
infix
operators. An example input is:
(L+R*)a+b*(c+d)
and the code we want to generate is:
abcd+*+
which is the reversed Polish notation of the input expressions.
The text (L+R*) indicates that + is left (L) associative and has lower
priority
than *, which is right (R) associative. In this way an unlimited number
of
operators may be specified, with relative priorities depending on their
position
in this list.
We start by defining a function that parses a single character
identifier and
returns as its result that identifier in the form of a string:
pVar = (\c -> [c]) <$> pAnyOf ['a'..'z'] .
The next step is to define a function that, given the name of an
operator,
recognizes that operator and as a result returns a function that will
concatenate
the two arguments of that operator and postfix it with the name of the
operator,
thus building the reversed Polish notation:
pOp name = (\ left right -> left++right++[name]) <$ pSym name
Note that, by using the operator <$ we indicate that we are not
interested
in the recognized operator; we already know what this is since it was
passed as
a parameter.
Next we de ne the function compile. For this we introduce a new
combinator
<@>, that takes as its left hand side operand a parser constructor f
and as its
right hand side operand a parser g. The results v of parsing a pre x of
the input
with g, are used in calling f; these calls, in their turn, result in
new parsers which
are applied to the rest of the input:
(f <@> g) input = [ f v rest | (v, rest) <- g input ]
Since our input consists of two parts, the priority declarations and
the expression
itself , we postulate that the function compile reads:
compile = pRoot <@> pPrios
First we focus on the function pRoot, that should take as argument the
result
of recognizing the priorities. Here we will assume that this result is
a function
that, given how to parse an operand, parses an expression constructed
out of
operands and the de ned operators:
pRoot prios = let pExpr = prios (pVar <|> pParens pExpr) in pExpr
There is a difference between an operator that occurs in the
declaration part
of the input and one in the expression part: the former may be any
operator,
whereas the latter can only be an operator that has been declared
before. For
the priority declaration part we thus introduce a new parser that
recognizes any
operator, and returns a parser that compiles the just recognized
operator using
the function pOp defined before:
pAnyOp = pOp <$> pAnyOf "+*/-^" just some possible operators
Now suppose we have recognized a left and a right associative operator
resulting
in operator compilers pLeft and pRight. Out of these we can construct
a function that, given the operand parser, parses infix expressions
containing
pLeft and pRight occurrences:
pLR factor = (pChainl pLeft . pChainr pRight) factor.
Generalizing this pattern to an unlimited number of operators we now
deduce
the definition:
pPrios = pParens $
pFoldr ((.), id) (( pChainl <$ pSym 'L'
<|> pChainr <$ pSym 'R'
)
<*> pAnyOp
)
Let us now compare once more this approach with the situation where we
would have used a special parser generator. In the combinator approach
we can
freely introduce all kinds of abbreviations by defining new combinators
in terms
of existing ones; furthermore we may de ne higher order combinators
that take
arguments and return values that may be parsers. This is a property we
get for
free here, and is absent in most tools, where the syntax of the input
is xed and
at most some form of macro processing is available as an abstraction
mechanism.
Another important consequence from embedding our parser construction in
an existing language is that type checking and error reporting can
directly be
done at the program level, and not at the level of some generated
program.
=======================================================
Fast, Error Correcting Parser Combinators: A Short Tutorial 5
module ExtendedCombinators where
import BasicCombinators
infixl 4 <$>, <$, <*, *>, <**>, <??>
infixl 2 `opt`
5
pAnyOf :: Eq s => [s] -> Parser s s
opt :: Eq s => Parser s a -> a -> Parser s a
(<$>) :: Eq s => (b -> a) -> Parser s b -> Parser s a
(<$ ) :: Eq s => a -> Parser s b -> Parser s a
10 (<* ) :: Eq s => Parser s a -> Parser s b -> Parser s a
( *>) :: Eq s => Parser s a -> Parser s b -> Parser s b
(<**>) :: Eq s => Parser s b -> Parser s (b->a) -> Parser s a
(<??>) :: Eq s => Parser s b -> Parser s (b->b) -> Parser s b
15 pAnyOf = foldr (<|>) pFail . map pSym
p `opt` v = p <|> pSucceed v
f <$> p = pSucceed f <*> p
f <$ p = const f <$> p
p <* q = (\ x _ -> x) <$> p <*> q
20 p *> q = (\ _ x -> x) <$> p <*> q
p <**> q = (\ x f -> f x) <$> p <*> q
p <??> q = p <**> (q `opt` id)
pFoldr alg@(op,e) p
25 = pfm where pfm = (op <$> p <*> pfm) `opt` e
pFoldrSep alg@(op,e) sep p
= (op <$> p <*> pFoldr alg (sep *> p)) `opt` e
pFoldrPrefixed alg@(op,e) c p = pFoldr alg (c *> p)
30 pList p = pFoldr ((:), []) p
pListSep s p = pFoldrSep ((:), []) s p
pListPrefixed c p = pFoldrPrefixed ((:), []) c p
pSome p = (:) <$> p <*> pList p
35 pChainr op x = r where r = x <**> (flip <$> op <*> r `opt` id)
pChainl op x = f <$> x <*> pList (flip <$> op <*> x)
where
f x [] = x
f x (func:rest) = f (func x) rest
40
pPacked l r x = l *> x <* r
some ad hoc extensions
pOParen = pSym '('
45 pCParen = pSym ')'
pParens = pPacked pOParen pCParen
Listing 3: ExtendedCombinators
>
> Thank you so much,
> Frank-Andre Riess
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
More information about the Glasgow-haskell-users
mailing list