[Haskell-cafe] Network parsing and parsec
andrew at pimlott.net
Thu Sep 15 21:11:58 EDT 2005
On Thu, Sep 15, 2005 at 11:09:25AM -0500, John Goerzen wrote:
> The recent thread on binary parsing got me to thinking about more
> general network protocol parsing with parsec. A lot of network
> protocols these days are text-oriented, so seem a good fit for parsec.
> However, the difficulty I come up against time and again is: parsec normally
> expects to parse as much as possible at once.
> With networking, you must be careful not to attempt to read more data
> than the server hands back, or else you'll block.
> I've had some success with hGetContents on a socket and feeding it into
> extremely carefully-crafted parsers, but that is error-prone and ugly.
I don't see why this would be more error-prone than any other approach.
As for ugly, it might be somewhat more pleasant if Parsec could take
input from a monadic action, but hGetContents works, and if you want
more control (eg, reading from a socket fd directly), you can use
I wrote a parser for s-expressions that must not read beyond the final
')', and while I agree it is tricky, it's all necessary trickiness.
Note I use lexeme parsers as in the Parsec documentation, and use an "L"
suffix in their names.
-- do not eat trailing whitespace, because we want to process a request from
-- a lazy stream (eg socket) as soon as we see the closing paren.
sexpr :: Parser a -> Parser (Sexpr a)
sexpr p =   liftM Atom p
        <|> cons p

cons :: Parser a -> Parser (Sexpr a)
cons p = parens tailL where
    -- dotted pair: after the dot, the tail is a single s-expression
    tailL =   do dotL
                 sexprL p
          <|> liftM2 Cons (sexprL p) tailL
          <|> return Nil

sexprL :: Parser a -> Parser (Sexpr a)
sexprL p = lexeme (sexpr p)

consL :: Parser a -> Parser (Sexpr a)
consL p = lexeme (cons p)

top p = between whiteSpace eof p

-- run p, then skip any whitespace that follows it
lexeme p = do r <- p
              whiteSpace
              return r

whiteSpace = many space

dotL = lexeme (string ".")

-- NB: eats whitespace after opening paren, but not closing
parens p = between (lexeme (string "(")) (string ")") p
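
To see the "stop at the closing paren" behaviour in isolation, here is a minimal self-contained sketch. It is not part of the code above: the Sexpr data type, the atom parser, and the helpers requestAndRest and requests are assumptions made for illustration, and it is written against the parsec 3 modules (Text.Parsec) rather than the 2005 API. It parses one request with the non-lexeme cons parser and hands back the unconsumed input via getInput, so the remainder of a lazy stream is left untouched.

```haskell
import Control.Monad (liftM, liftM2)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumed data type; the message above does not show its definition.
data Sexpr a = Atom a | Cons (Sexpr a) (Sexpr a) | Nil
  deriving (Eq, Show)

whiteSpace :: Parser ()
whiteSpace = skipMany space

-- Run p, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = do r <- p
              whiteSpace
              return r

-- Eats whitespace after the opening paren, but not after the closing one.
parens :: Parser a -> Parser a
parens p = between (lexeme (string "(")) (string ")") p

sexpr :: Parser a -> Parser (Sexpr a)
sexpr p = liftM Atom p <|> cons p

cons :: Parser a -> Parser (Sexpr a)
cons p = parens tailL
  where
    tailL =   (lexeme (string ".") >> lexeme (sexpr p))
          <|> liftM2 Cons (lexeme (sexpr p)) tailL
          <|> return Nil

-- Hypothetical helper: parse one request, return it with the rest of the input.
-- Leading whitespace is skipped, trailing whitespace is deliberately not.
requestAndRest :: Parser a -> Parser (Sexpr a, String)
requestAndRest p = do whiteSpace
                      r <- cons p
                      rest <- getInput
                      return (r, rest)

-- Hypothetical helper: lazily split a stream into requests,
-- stopping at end of input or at the first parse error.
requests :: Parser a -> String -> [Sexpr a]
requests p input =
  case parse (requestAndRest p) "<stream>" input of
    Left _          -> []
    Right (r, rest) -> r : requests p rest

-- Hypothetical atom parser: a run of letters.
atom :: Parser String
atom = many1 letter
```

With these definitions, parsing "(a b) (c)" with requestAndRest atom should leave " (c)" as the remaining input, and the string returned by hGetContents on a handle could be passed to requests in the same way. Each element of the result becomes available once its closing paren has arrived, since the next read is only demanded when the consumer asks for the next request.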