[Haskell-cafe] Keeping a symbol table with Parsec

Albert Y. C. Lai trebla at vex.net
Mon Apr 2 19:20:06 EDT 2007

Joel Reymont wrote:
> Meaning that if a keyword Output is followed by ":" and an identifier 
> and then "(NumericSimple)" then add identifier to the symbol table as a 
> Number and box it in a constructor.
> Then in my lexer I do a lookup to check if I have seen this identifier 
> and if I have seen one of type TypeNumOut I return the token NUM instead 
> of ID. This ensures that I can have rules with the token NUM as opposed 
> to ID everywhere.

I use a set of strings for the symbol table (I don't record the types of 
the identifiers, but you can add it back). I don't allow for whitespace, 
but you can add it back. The parser returns a string rather than a 
constructor with a string, but you can add it back.

It is necessary to fuse the lexer and the parser together, so that they 
share state; but we can fuse them in a way that still leaves 
recognizable boundary, e.g., in the below, string "blah", ident, num, 
name, and numeric_simple are lexers (thus when you add back whitespace 
you know who are the suspects), and p0 is a parser that calls the lexers 
and do extra.

The name lexer returns a sum type, so you can use its two cases to 
signify whether a name is in the table or not; then ident and num can 
fail on the wrong cases. (Alternatively, you can eliminate the sum type 
by copying the name code into the ident code and the num code.)

import Text.ParserCombinators.Parsec
import Monad(mzero)
import Data.Set as Set

main = do { input <- getLine
           ; print (runParser p0 Set.empty "stdin" input)

p0 = do { string "Output"
         ; string ":"
         ; i <- ident
         ; string "("
         ; numeric_simple
         ; string ")"
         ; updateState (Set.insert i)
         ; return i

numeric_simple = many digit

ident = do { n <- name
            ; case n of { ID i -> return i
                        ; _ -> mzero

name = do { c0 <- letter
           ; cs <- many alphaNum
           ; let n = c0 : cs
           ; table <- getState
           ; return (if n `Set.member` table then NUM n else ID n)

data Name = NUM String | ID String

num = do { n <- name
          ; case n of { NUM i -> return i
                      ; _ -> mzero

More information about the Haskell-Cafe mailing list