[Haskell-beginners] Parsing Custom TeX
Jeffrey Drake
jeffd at techsociety.ca
Sat Nov 8 18:22:27 EST 2008
*This message was sent in reply to Jason Dusek but the reply went to him, not the list.
I plan to have Command String, CommandParams [String],
CommandParamsWithArgs [String] [String] or something to that effect. If
each of these are combinators, then I imagine you can use <|> with them.
But realistically, I have no idea how to get this to take a string input
(or something like getContents) and have [UnTeX] come out the other end.
The problem that I have also is that while Command* can come anywhere in
the text, everything goes back to Text unless it is a Paragraph. The
spaces after a Command* up till the next letter have to be ignored, and
superfluous spaces within Text itself also should be. I also can't have
two Paragraphs right beside each other, because that makes little sense.
So from what I can guess - I need a lexer, I think it was called a
lexeme lexer.
The syntax for Command* is like this:
\command
\command[arg]
\command[arg][arg][...]
\command[...]{body}
\command[...]{body1}{body2}{...}
The reason why this is necessary is because you could have something
like \frac{a}{b}. I am trying to be more consistent with my use of this
than LaTeX/TeX is.
I might need to implement something like a table generator eventually,
but this would be hopefully for the backend. Because I would like to
translate this stuff into HTML and other outputs eventually.
Thank you for your help,
Jeffrey.
On Sat, 2008-11-08 at 01:00 -0800, Jason Dusek wrote:
> Each one of the little combinators seems to work as
> advertized. Are you having trouble fitting them together?
>
> --
> _jsn
>
>
> module UnTeX where
>
> import Text.ParserCombinators.Parsec
> import Text.ParserCombinators.Parsec.Prim
> import Text.ParserCombinators.Parsec.Language
> import qualified Text.ParserCombinators.Parsec.Token as T
>
> data UnTeX
> = Command String [String] String
> | Text String
> | Paragraph
> deriving Show
>
>
> -- I don't remember TeX very well, so I'm not sure this is right.
> command :: Parser UnTeX
> command = do
> char '\\'
> cmd <- ident
> p <- orNot params []
> b <- orNot body ""
> return $ Command cmd p b
> where
> params = many1 $ between (char '[') (char ']') ident
> body = do
> char '{'
> text <- many1 $ noneOf "}"
> char '}'
> return text
> ident = many1 $ letter <|> digit
> orNot p n = choice [try p , return n]
>
>
> paragraph :: Parser UnTeX
> paragraph = do
> newline
> newline
> return $ Paragraph
>
> text :: Parser UnTeX
> text = do
> txt <- many1 (alphaNum <|> space)
> return $ Text txt
More information about the Beginners
mailing list