Parsing library

James Ealing jamesealing2000 at yahoo.co.uk
Thu Jan 1 23:19:26 EST 2004


I'm trying to make use of the combinatorial parsing
library to process strings. However, I can't figure
out the correct syntax for the (|||) (^^^) (>>>) (<^^)
and (^^>) functions. Can anyone see how to do it? If
so it'd be really useful if you could put down a
couple of examples of how each is used.
 
Thanks
 
Jim
 
The library:
 
Generic parsing functions
>module Parse (
>  Parse, succeed, token, spot, 
>  (|||), (^^^), (<^^), (^^>), (>>>), 
>  many, listOf, topLevel, 
>  white, ws, parseVar, word, parseNum)
>where

---------------------------------------------------
-- Combinatory parsing library using Maybe types --
--     suitable for non-ambiguous grammars.      --
--         sja4 at mcs.le.ac.uk (21/10/96)          --
---------------------------------------------------

>infixr 5 `into`, ^^^, <^^, ^^>
>infixl 4 >>>
>infixr 3 |||
Maybe defined in prelude:
data Maybe a = Just a | Nothing
---------------------
-- Type of parsers --
---------------------
>type Parse a = String -> Maybe (a, String)
-------------------
-- Basic parsers --
-------------------
Succeed with the value given.
>succeed :: a -> Parse a
>succeed val inp = Just (val, inp)
Recognize a specified token at the head of the input.
>token :: Char -> Parse Char
>token t (u:x) = if t == u then Just (t,x) else
Nothing
>token t [] = Nothing
Recognize a token with a certain property.
>spot :: (Char -> Bool) -> Parse Char
>spot p (t:x) = if p t then Just (t,x) else Nothing
>spot p [] = Nothing

-------------------------
-- Parsing combinators --
-------------------------
A choice between two parsers. The function p1 ||| p2
returns the result of p1
whenever it succeeds and the result of p2 otherwise.

>(|||) :: Parse a -> Parse a -> Parse a
>(p1 ||| p2) inp = case p1 inp of
>                    Nothing -> p2 inp
>                    Just (v,x) -> Just (v,x)

Sequencing of parsers. The function p1 ^^^ p2 returns
the result, if any, of
applying p1 to the input and then p2 to the remainder.

>(^^^) :: Parse b -> Parse c -> Parse (b,c)
>(p1 ^^^ p2) inp = 
>  case p1 inp of
>    Nothing -> Nothing
>    Just (v,x) -> case p2 x of
>                    Nothing -> Nothing
>                    Just (u,y) -> Just ((v,u),y)

Semantic action. The results from a parser p are
transformed by applying a
function f.

>(>>>) :: Parse b -> (b -> c) -> Parse c
>(p >>> f) inp = case p inp of
>               Nothing -> Nothing
>               Just (v,x) -> Just (f v, x)

Sequencing of parsers, choosing one component or the
other
>(<^^) :: Parse b -> Parse c -> Parse b
>p <^^ q = (p ^^^ q) >>> fst
>(^^>) :: Parse b -> Parse c -> Parse c
>p ^^> q = (p ^^^ q) >>> snd




Repetition. The parser p is used as many times as
possible and the results are
returned as a list.
>many :: Parse b -> Parse [b]
>many p = ((p ^^^ many p) >>> cons) ||| (succeed [])
>cons (x,xs) = x:xs
ListOf p c applies parser p as many times as possible,
with the instances
separated by instances of c, and returns the result as
a list.
>listOf :: Parse b -> Char -> Parse [b]
>listOf p sep = p ^^^ many (token sep ^^> p) >>> cons
The top level parser is a function which maps a list
of tokens to a value. A
value p :: Parse a b can be converted to such a
function by applying topLevel:
>topLevel :: Parse b -> String -> b
>topLevel p inp
>  = case p inp of
>      Just (result,[]) -> result
>      Just (result,rest) -> 
>        error ("parse unsuccessful; input
unconsumed:"++show rest)
>      other -> error "parse unsuccessful"
Note there is an error if the input is not fully
consumed.
It is sometimes useful to test whether a given parser
will accept the input
without actually returning a result.
>acceptedBy :: Parse b -> String -> Bool
>acceptedBy parser inp
>   = case parser inp of
>       Just (result,[]) -> True
>       other -> False
A more sophisticated form of sequencing. The into
combinator allows the second
parser to be chosen according the result of the first.
 
>into :: Parse b -> (b -> Parse c) -> Parse c
>into p f inp = case p inp of
>               Nothing -> Nothing
>               Just (v,x) -> f v x
==========================================================================
Absorb white space
>white = many (token ' ' ||| token '\t')
>ws p = white ^^> p <^^ white
==========================================================================
Consume and return variable: must start with small
letter, and continue with
alphanumberic characters
>parseVar :: Parse String
>parseVar = spot isLower ^^^ many (spot isAlphaNum)
>>> cons
Consume and return particular word
>word :: String -> Parse String
>word [c] = token c >>> (\ c -> [c])
>word (c:cs) = (token c ^^^ word cs) >>> cons
Consume and return number
>parseNum :: Parse Int
>parseNum = spot isDigit ^^^ many (spot isDigit) >>>
mknum
> where mknum (d, ds) = foldl f (digitToInt d) ds
>       f n d = 10*n + digitToInt d

________________________________________________________________________
Yahoo! Messenger - Communicate instantly..."Ping" 
your friends today! Download Messenger Now 
http://uk.messenger.yahoo.com/download/index.html


More information about the Haskell-Cafe mailing list