[Haskell-cafe] Keeping a symbol table with Parsec
Albert Y. C. Lai
trebla at vex.net
Mon Apr 2 19:20:06 EDT 2007
Joel Reymont wrote:
> Meaning that if a keyword Output is followed by ":" and an identifier
> and then "(NumericSimple)" then add identifier to the symbol table as a
> Number and box it in a constructor.
>
> Then in my lexer I do a lookup to check if I have seen this identifier
> and if I have seen one of type TypeNumOut I return the token NUM instead
> of ID. This ensures that I can have rules with the token NUM as opposed
> to ID everywhere.
I use a set of strings for the symbol table (I don't record the types of
the identifiers, but you can add it back). I don't allow for whitespace,
but you can add it back. The parser returns a string rather than a
constructor with a string, but you can add it back.
It is necessary to fuse the lexer and the parser together, so that they
share state; but we can fuse them in a way that still leaves
recognizable boundary, e.g., in the below, string "blah", ident, num,
name, and numeric_simple are lexers (thus when you add back whitespace
you know who are the suspects), and p0 is a parser that calls the lexers
and do extra.
The name lexer returns a sum type, so you can use its two cases to
signify whether a name is in the table or not; then ident and num can
fail on the wrong cases. (Alternatively, you can eliminate the sum type
by copying the name code into the ident code and the num code.)
import Text.ParserCombinators.Parsec
import Monad(mzero)
import Data.Set as Set
main = do { input <- getLine
; print (runParser p0 Set.empty "stdin" input)
}
p0 = do { string "Output"
; string ":"
; i <- ident
; string "("
; numeric_simple
; string ")"
; updateState (Set.insert i)
; return i
}
numeric_simple = many digit
ident = do { n <- name
; case n of { ID i -> return i
; _ -> mzero
}
}
name = do { c0 <- letter
; cs <- many alphaNum
; let n = c0 : cs
; table <- getState
; return (if n `Set.member` table then NUM n else ID n)
}
data Name = NUM String | ID String
num = do { n <- name
; case n of { NUM i -> return i
; _ -> mzero
}
}
More information about the Haskell-Cafe
mailing list