User-defined operators and compound expressions using Happy

Mon Nov 22 13:26:40 EST 2004

On 2004 nov 22, at 17:48, Frank-Andre Riess wrote:

> Hi there folks,
>
> once again, I've got a question related to Happy (I've got version 
> 1.13 at
> the moment).
> Maybe, it's even more a question on formal languages, but well...
> How can I write a grammar that can cope with user-defined operators (of
> different precedences/associativities) and compound expression like
> function calls, if-then-else- and case-statements and the like. I 
> tried to
> write it down straight forwardly, but failed terribly (alas, I didn't 
> keep
> it, so I can't show you - if someone of you is versed in this issue, I 
> can
> try to explain the language's constructs).

One way of doing this using combinator based parsing (where you can 
generate parsers dynamically) is to read the fixity declarations, and 
to use the result of this to build the precedence parser. This idea has 
been sketched in:

  S. D. Swierstra and P. R. Azero Alcocer. Fast, Error Correcting Parser 
Combinators: a Short Tutorial. In J. Pavelka, G. Tel, and M. Bartosek, 
editors, SOFSEM'99 Theory and Practice of Informatics, 26th Seminar on 
Current Trends in Theory and Practice of Informatics, volume 1725 of 
LNCS, pages 111--129, November 1999.

If you do not have access to this I will be happy to send it to you,

  Doaitse Swierstra

== some text (created from the pdf) from this paper ================

As an example of what can be done we will now show how to construct
parsers dynamically by writing a parser for an expression language with 
infix
operators. An example input is:
(L+R*)a+b*(c+d)
and the code we want to generate is:
abcd+*+
which is the reversed Polish notation of the input expressions.

The text (L+R*) indicates that + is left (L) associative and has lower 
priority
than *, which is right (R) associative. In this way an unlimited number 
of
operators may be specified, with relative priorities depending on their 
position
in this list.

We start by defining a function that parses a single character 
identifier and
returns as its result that identifier in the form of a string:
pVar = (\c -> [c]) <$> pAnyOf ['a'..'z'] .
The next step is to define a function that, given the name of an 
operator,
recognizes that operator and as a result returns a function that will 
concatenate
the two arguments of that operator and postfix it with the name of the 
operator,
thus building the reversed Polish notation:
pOp name = (\ left right -> left++right++[name]) <$ pSym name
Note that, by using the operator <$ we indicate that we are not 
interested
in the recognized operator; we already know what this is since it was 
passed as
a parameter.
Next we de ne the function compile. For this we introduce a new 
combinator
<@>, that takes as its left hand side operand a parser constructor f 
and as its
right hand side operand a parser g. The results v of parsing a pre x of 
the input
with g, are used in calling f; these calls, in their turn, result in 
new parsers which
are applied to the rest of the input:

(f <@> g) input = [ f v rest | (v, rest) <- g input ]

Since our input consists of two parts, the priority declarations and 
the expression
itself , we postulate that the function compile reads:
compile = pRoot <@> pPrios
First we focus on the function pRoot, that should take as argument the 
result
of recognizing the priorities. Here we will assume that this result is 
a function
that, given how to parse an operand, parses an expression constructed 
out of
operands and the de ned operators:
pRoot prios = let pExpr = prios (pVar <|> pParens pExpr) in pExpr
There is a difference between an operator that occurs in the 
declaration part
of the input and one in the expression part: the former may be any 
operator,
whereas the latter can only be an operator that has been declared 
before. For
the priority declaration part we thus introduce a new parser that 
recognizes any
operator, and returns a parser that compiles the just recognized 
operator using
the function pOp defined before:

pAnyOp = pOp <$> pAnyOf "+*/-^"   just some possible operators

Now suppose we have recognized a left and a right associative operator 
resulting
in operator compilers pLeft and pRight. Out of these we can construct
a function that, given the operand parser, parses infix expressions 
containing
pLeft and pRight occurrences:

pLR factor = (pChainl pLeft . pChainr pRight) factor.

Generalizing this pattern to an unlimited number of operators we now 
deduce
the definition:

pPrios = pParens $
pFoldr ((.), id) ((     pChainl <$ pSym 'L'
                    <|>  pChainr <$ pSym 'R'
                    )
                   <*> pAnyOp
                   )

Let us now compare once more this approach with the situation where we
would have used a special parser generator. In the combinator approach 
we can
freely introduce all kinds of abbreviations by defining new combinators 
in terms
of existing ones; furthermore we may de ne higher order combinators 
that take
arguments and return values that may be parsers. This is a property we 
get for
free here, and is absent in most tools, where the syntax of the input 
is  xed and
at most some form of macro processing is available as an abstraction 
mechanism.
Another important consequence from embedding our parser construction in
an existing language is that type checking and error reporting can 
directly be
done at the program level, and not at the level of some generated 
program.
=======================================================
Fast, Error Correcting Parser Combinators: A Short Tutorial 5
module ExtendedCombinators where
import BasicCombinators
infixl 4 <$>, <$, <*, *>, <**>, <??>
infixl 2 `opt`
5
pAnyOf :: Eq s => [s] -> Parser s s
opt :: Eq s => Parser s a -> a -> Parser s a
(<$>) :: Eq s => (b -> a) -> Parser s b -> Parser s a
(<$ ) :: Eq s => a -> Parser s b -> Parser s a
10 (<* ) :: Eq s => Parser s a -> Parser s b -> Parser s a
( *>) :: Eq s => Parser s a -> Parser s b -> Parser s b
(<**>) :: Eq s => Parser s b -> Parser s (b->a) -> Parser s a
(<??>) :: Eq s => Parser s b -> Parser s (b->b) -> Parser s b
15 pAnyOf = foldr (<|>) pFail . map pSym
p `opt` v = p <|> pSucceed v
f <$> p = pSucceed f <*> p
f <$ p = const f <$> p
p <* q = (\ x _ -> x) <$> p <*> q
20 p *> q = (\ _ x -> x) <$> p <*> q
p <**> q = (\ x f -> f x) <$> p <*> q
p <??> q = p <**> (q `opt` id)
pFoldr alg@(op,e) p
25 = pfm where pfm = (op <$> p <*> pfm) `opt` e
pFoldrSep alg@(op,e) sep p
= (op <$> p <*> pFoldr alg (sep *> p)) `opt` e
pFoldrPrefixed alg@(op,e) c p = pFoldr alg (c *> p)
30 pList p = pFoldr ((:), []) p
pListSep s p = pFoldrSep ((:), []) s p
pListPrefixed c p = pFoldrPrefixed ((:), []) c p
pSome p = (:) <$> p <*> pList p
35 pChainr op x = r where r = x <**> (flip <$> op <*> r `opt` id)
pChainl op x = f <$> x <*> pList (flip <$> op <*> x)
where
f x [] = x
f x (func:rest) = f (func x) rest
40
pPacked l r x = l *> x <* r
   some ad hoc extensions
pOParen = pSym '('
45 pCParen = pSym ')'
pParens = pPacked pOParen pCParen
Listing 3: ExtendedCombinators

>
> Thank you so much,
> Frank-Andre Riess
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users