[Haskell-cafe] Overlaping Parsec rules
Hauschild, Klaus (EXT)
klaus.hauschild.ext at siemens.com
Mon Mar 7 13:48:31 CET 2011
Thanks Christian,
I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Any ideas?
-----Ursprüngliche Nachricht-----
Von: Christian Maeder [mailto:Christian.Maeder at dfki.de]
Gesendet: Montag, 7. März 2011 12:23
An: Hauschild, Klaus (EXT)
Cc: haskell-cafe at haskell.org
Betreff: Re: Overlaping Parsec rules
You should parse keywords using:
keyword s = try (string s) >> notFollowedBy (letter <|> digit)
C.
Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
> Hi,
>
> to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm
> currnetly working on the parser. With the hint from Thu (reading Phillip
> Wadlers monadic parser paper) and consulting
> _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_
> I produce a first working version of the parser.
> After this great moment I completed the token definition and near all
> parsing rules. For the complete code have a look at
> _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
>
> data GmlToken =
>
> -- structures
>
> FunctionToken TokenSequence |
>
> ArrayToken TokenSequence |
>
> -- control operators
>
> ApplyToken |
>
> IfToken |
>
> -- number operators
>
> AddiToken |
>
> AddfToken |
>
> ACosToken |
>
> ASinToken |
>
> ClampfToken |
>
> CosToken |
>
> DiviToken |
>
> DivfToken |
>
> EqiToken |
>
> EqfToken |
>
> FloorToken |
>
> FracToken |
>
> LessiToken |
>
> LessfToken |
>
> ModiToken |
>
> MuliToken |
>
> MulfToken |
>
> NegiToken |
>
> NegfToken |
>
> ToRealToken |
>
> SinToken |
>
> SqrtToken |
>
> SubiToken |
>
> SubfToken |
>
> -- points
>
> GetXToken |
>
> GetYToken |
>
> GetZToken |
>
> PointToken |
>
> -- arrays
>
> GetToken |
>
> LengthToken |
>
> -- environment
>
> IdentifierToken String |
>
> BinderToken String |
>
> -- types
>
> BoolToken Bool |
>
> IntToken Int |
>
> RealToken Double |
>
> StringToken String deriving Show
>
> And
>
> parseGml :: String -> [GmlToken]
>
> parseGml input = case parse parseList "gml" input of
>
> Left err -> error ("Parse error: " ++ (show err))
>
> Right gml -> gml
>
>
>
> parseList = sepBy parseGml' spaces
>
>
>
> parseGml' =
>
> -- structures
>
> -- parseFunction
>
> -- <|> parseArray
>
> -- control operators
>
> parseControlOperators
>
> -- number operators
>
> <|> parseNumberOperators
>
> -- points
>
> <|> parsePointOperators
>
> -- arrays
>
> <|> parseArrayOperators
>
> -- types
>
> <|> parseBool
>
> <|> parseInteger
>
> <|> parseString
>
> -- environment
>
> <|> parseIdentifier
>
> <|> parseBinder
>
>
>
> parseArray = parseSequence '[' ']'
>
> parseFunction = parseSequence '{' '}'
>
> parseSequence start end = do char start
>
> array <- parseList
>
> char end
>
> return $ ArrayToken array
>
>
>
> parseControlOperators = parseApply <|> parseIf
>
> parseApply = do string "apply"
>
> return $ ApplyToken
>
> parseIf = do string "if"
>
> return $ IfToken
>
>
>
> parseNumberOperators = do string "addi"
>
> return $ AddiToken
>
> <|> do string "addf"
>
> return $ AddfToken
>
> <|> do string "acos"
>
> return $ ACosToken
>
> <|> do string "asind"
>
> return $ ASinToken
>
> <|> do string "clampf"
>
> return $ ClampfToken
>
> <|> do string "cos"
>
> return $ CosToken
>
> <|> do string "divi"
>
> return $ DiviToken
>
> <|> do string "divf"
>
> return $ DivfToken
>
> <|> do string "eqi"
>
> return $ EqiToken
>
> <|> do string "eqf"
>
> return $ EqfToken
>
> <|> do string "floor"
>
> return $ FloorToken
>
> <|> do string "frac"
>
> return $ FracToken
>
> <|> do string "lessi"
>
> return $ LessiToken
>
> <|> do string "lessf"
>
> return $ LessfToken
>
> <|> do string "modi"
>
> return $ ModiToken
>
> <|> do string "muli"
>
> return $ MuliToken
>
> <|> do string "mulf"
>
> return $ MulfToken
>
> <|> do string "negi"
>
> return $ NegiToken
>
> <|> do string "negf"
>
> return $ NegfToken
>
> <|> do string "real"
>
> return $ ToRealToken
>
> <|> do string "sin"
>
> return $ SinToken
>
> <|> do string "Sqrt"
>
> return $ SqrtToken
>
> <|> do string "subi"
>
> return $ SubiToken
>
> <|> do string "subf"
>
> return $ SubfToken
>
>
>
> parsePointOperators = do string "getx"
>
> return $ GetXToken
>
> <|> do string "gety"
>
> return $ GetYToken
>
> <|> do string "getz"
>
> return $ GetZToken
>
> <|> do string "point"
>
> return $ PointToken
>
>
>
> parseArrayOperators = do string "get"
>
> return $ GetToken
>
> <|> do string "length"
>
> return $ LengthToken
>
>
>
> parseInteger = liftM (IntToken . read) $ many1 digit
>
> parseBool = do string "true"
>
> return $ BoolToken True
>
> <|> do string "false"
>
> return $ BoolToken False
>
> parseString = do char '"'
>
> string <- many (noneOf "\"")
>
> char '"'
>
> return $ StringToken string
>
>
>
> parseIdentifier = do identifier <- many (noneOf " ")
>
> return $ IdentifierToken identifier
>
> parseBinder = do char '/'
>
> binder <- many (noneOf " ")
>
> return $ BinderToken binder
>
>
>
> parseComment = do char '%'
>
> many (noneOf "")
>
> newline
>
> return $ ()
>
>
>
> spaces = skipMany1 space
>
>
> After gluing all this together in my mind all worked well. But it
> doesn't. The test file for parsing looks like:
>
> { /self /n
> n 2 lessi
> { 1 }
> { n 1 subi self self apply n muli }
> if
> } /fact
>
> 12 fact fact apply
>
>
> * I think there is a problem with overlaping rules. There is a
> parser rule consuming "negi" and resulting in the NegiToken. A
> single "n" is a valid identifier. For the example file my parser
> says: unexpected " ", expecting "negi"
> * I think the same problem is present for "parseInteger" and
> "parseReal" (currently no in code but looks like "parseReal = do a
> <- many1 digit \n char '.' \n b <- many1 digit \n return $
> RealToken (read (a ++ "." ++ b)"
> * Something with "parseFunction" is going really wrong.
> * All parsig rules are designed with the condition that each
> "construct" is separated by whitespaces. For this before parsing
> the input will be preprocessed: insert spaces, removing
> whitespaces and so on. Now in the parsed result appears
> (IdentifierToken ""). I think my version is not the best way to
> parse a identifer:
>
> parseIdentifier = *do* identifier <- many (noneOf " ")
> return $ *IdentifierToken* identifier
>
> Please help me.
>
> Klaus
>
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list