[Haskell-cafe] Overlaping Parsec rules
Christian Maeder
Christian.Maeder at dfki.de
Mon Mar 7 14:17:40 CET 2011
Am 07.03.2011 13:48, schrieb Hauschild, Klaus (EXT):
> Thanks Christian,
>
> I adapted the "keyword" parser and now "n" <-> "negi" does not occur.
>
> But there are still other problems to solve. If I activate the parseFunction the parser will answer to fact.gml unexpected end of input, expecting space or "}".
Your redefinition of spaces (= skipMany1 space) is problematic and:
parseList = sepBy parseGml' spaces
Allow for the original (optional) spaces after parseGml':
parseGml'' = liftM2 const parseGml' spaces
parseList = many parseGml''
C.
P.S. why do you call? many (noneOf "")
manyTill anyChar newline
or just: many (noneOf "\n")
(a trailing newline will be skipped by spaces)
>
> Any ideas?
>
>
> -----Ursprüngliche Nachricht-----
> Von: Christian Maeder [mailto:Christian.Maeder at dfki.de]
> Gesendet: Montag, 7. März 2011 12:23
> An: Hauschild, Klaus (EXT)
> Cc: haskell-cafe at haskell.org
> Betreff: Re: Overlaping Parsec rules
>
> You should parse keywords using:
>
> keyword s = try (string s) >> notFollowedBy (letter <|> digit)
>
> C.
>
> Am 07.03.2011 11:34, schrieb Hauschild, Klaus (EXT):
>> Hi,
>>
>> to solve this ICFP task _http://www.cs.cornell.edu/icfp/task.htm_ I'm
>> currnetly working on the parser. With the hint from Thu (reading Phillip
>> Wadlers monadic parser paper) and consulting
>> _http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Parsing_
>> I produce a first working version of the parser.
>> After this great moment I completed the token definition and near all
>> parsing rules. For the complete code have a look at
>> _http://code.google.com/p/hgmltracer/source/browse/#svn%2Ftrunk%2FhGmlTracer_
>>
>> data GmlToken =
>>
>> -- structures
>>
>> FunctionToken TokenSequence |
>>
>> ArrayToken TokenSequence |
>>
>> -- control operators
>>
>> ApplyToken |
>>
>> IfToken |
>>
>> -- number operators
>>
>> AddiToken |
>>
>> AddfToken |
>>
>> ACosToken |
>>
>> ASinToken |
>>
>> ClampfToken |
>>
>> CosToken |
>>
>> DiviToken |
>>
>> DivfToken |
>>
>> EqiToken |
>>
>> EqfToken |
>>
>> FloorToken |
>>
>> FracToken |
>>
>> LessiToken |
>>
>> LessfToken |
>>
>> ModiToken |
>>
>> MuliToken |
>>
>> MulfToken |
>>
>> NegiToken |
>>
>> NegfToken |
>>
>> ToRealToken |
>>
>> SinToken |
>>
>> SqrtToken |
>>
>> SubiToken |
>>
>> SubfToken |
>>
>> -- points
>>
>> GetXToken |
>>
>> GetYToken |
>>
>> GetZToken |
>>
>> PointToken |
>>
>> -- arrays
>>
>> GetToken |
>>
>> LengthToken |
>>
>> -- environment
>>
>> IdentifierToken String |
>>
>> BinderToken String |
>>
>> -- types
>>
>> BoolToken Bool |
>>
>> IntToken Int |
>>
>> RealToken Double |
>>
>> StringToken String deriving Show
>>
>> And
>>
>> parseGml :: String -> [GmlToken]
>>
>> parseGml input = case parse parseList "gml" input of
>>
>> Left err -> error ("Parse error: " ++ (show err))
>>
>> Right gml -> gml
>>
>>
>>
>> parseList = sepBy parseGml' spaces
>>
>>
>>
>> parseGml' =
>>
>> -- structures
>>
>> -- parseFunction
>>
>> -- <|> parseArray
>>
>> -- control operators
>>
>> parseControlOperators
>>
>> -- number operators
>>
>> <|> parseNumberOperators
>>
>> -- points
>>
>> <|> parsePointOperators
>>
>> -- arrays
>>
>> <|> parseArrayOperators
>>
>> -- types
>>
>> <|> parseBool
>>
>> <|> parseInteger
>>
>> <|> parseString
>>
>> -- environment
>>
>> <|> parseIdentifier
>>
>> <|> parseBinder
>>
>>
>>
>> parseArray = parseSequence '[' ']'
>>
>> parseFunction = parseSequence '{' '}'
>>
>> parseSequence start end = do char start
>>
>> array <- parseList
>>
>> char end
>>
>> return $ ArrayToken array
>>
>>
>>
>> parseControlOperators = parseApply <|> parseIf
>>
>> parseApply = do string "apply"
>>
>> return $ ApplyToken
>>
>> parseIf = do string "if"
>>
>> return $ IfToken
>>
>>
>>
>> parseNumberOperators = do string "addi"
>>
>> return $ AddiToken
>>
>> <|> do string "addf"
>>
>> return $ AddfToken
>>
>> <|> do string "acos"
>>
>> return $ ACosToken
>>
>> <|> do string "asind"
>>
>> return $ ASinToken
>>
>> <|> do string "clampf"
>>
>> return $ ClampfToken
>>
>> <|> do string "cos"
>>
>> return $ CosToken
>>
>> <|> do string "divi"
>>
>> return $ DiviToken
>>
>> <|> do string "divf"
>>
>> return $ DivfToken
>>
>> <|> do string "eqi"
>>
>> return $ EqiToken
>>
>> <|> do string "eqf"
>>
>> return $ EqfToken
>>
>> <|> do string "floor"
>>
>> return $ FloorToken
>>
>> <|> do string "frac"
>>
>> return $ FracToken
>>
>> <|> do string "lessi"
>>
>> return $ LessiToken
>>
>> <|> do string "lessf"
>>
>> return $ LessfToken
>>
>> <|> do string "modi"
>>
>> return $ ModiToken
>>
>> <|> do string "muli"
>>
>> return $ MuliToken
>>
>> <|> do string "mulf"
>>
>> return $ MulfToken
>>
>> <|> do string "negi"
>>
>> return $ NegiToken
>>
>> <|> do string "negf"
>>
>> return $ NegfToken
>>
>> <|> do string "real"
>>
>> return $ ToRealToken
>>
>> <|> do string "sin"
>>
>> return $ SinToken
>>
>> <|> do string "Sqrt"
>>
>> return $ SqrtToken
>>
>> <|> do string "subi"
>>
>> return $ SubiToken
>>
>> <|> do string "subf"
>>
>> return $ SubfToken
>>
>>
>>
>> parsePointOperators = do string "getx"
>>
>> return $ GetXToken
>>
>> <|> do string "gety"
>>
>> return $ GetYToken
>>
>> <|> do string "getz"
>>
>> return $ GetZToken
>>
>> <|> do string "point"
>>
>> return $ PointToken
>>
>>
>>
>> parseArrayOperators = do string "get"
>>
>> return $ GetToken
>>
>> <|> do string "length"
>>
>> return $ LengthToken
>>
>>
>>
>> parseInteger = liftM (IntToken . read) $ many1 digit
>>
>> parseBool = do string "true"
>>
>> return $ BoolToken True
>>
>> <|> do string "false"
>>
>> return $ BoolToken False
>>
>> parseString = do char '"'
>>
>> string <- many (noneOf "\"")
>>
>> char '"'
>>
>> return $ StringToken string
>>
>>
>>
>> parseIdentifier = do identifier <- many (noneOf " ")
>>
>> return $ IdentifierToken identifier
>>
>> parseBinder = do char '/'
>>
>> binder <- many (noneOf " ")
>>
>> return $ BinderToken binder
>>
>>
>>
>> parseComment = do char '%'
>>
>> many (noneOf "")
>>
>> newline
>>
>> return $ ()
>>
>>
>>
>> spaces = skipMany1 space
>>
>>
>> After gluing all this together in my mind all worked well. But it
>> doesn't. The test file for parsing looks like:
>>
>> { /self /n
>> n 2 lessi
>> { 1 }
>> { n 1 subi self self apply n muli }
>> if
>> } /fact
>>
>> 12 fact fact apply
>>
>>
>> * I think there is a problem with overlaping rules. There is a
>> parser rule consuming "negi" and resulting in the NegiToken. A
>> single "n" is a valid identifier. For the example file my parser
>> says: unexpected " ", expecting "negi"
>> * I think the same problem is present for "parseInteger" and
>> "parseReal" (currently no in code but looks like "parseReal = do a
>> <- many1 digit \n char '.' \n b <- many1 digit \n return $
>> RealToken (read (a ++ "." ++ b)"
>> * Something with "parseFunction" is going really wrong.
>> * All parsig rules are designed with the condition that each
>> "construct" is separated by whitespaces. For this before parsing
>> the input will be preprocessed: insert spaces, removing
>> whitespaces and so on. Now in the parsed result appears
>> (IdentifierToken ""). I think my version is not the best way to
>> parse a identifer:
>>
>> parseIdentifier = *do* identifier <- many (noneOf " ")
>> return $ *IdentifierToken* identifier
>>
>> Please help me.
>>
>> Klaus
>>
>>
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list