[Haskell-cafe] Network parsing and parsec

Andrew Pimlott andrew at pimlott.net
Thu Sep 15 21:11:58 EDT 2005


On Thu, Sep 15, 2005 at 11:09:25AM -0500, John Goerzen wrote:
> The recent thread on binary parsing got me to thinking about more
> general network protocol parsing with parsec.  A lot of network
> protocols these days are text-oriented, so seem a good fit for parsec.
> 
> However, the difficulty I come up against time and again is: parsec
> normally expects to parse as much as possible at once.
> 
> With networking, you must be careful not to attempt to read more data
> than the server hands back, or else you'll block.
> 
> I've had some success with hGetContents on a socket and feeding it into
> extremely carefully-crafted parsers, but that is error-prone and ugly.

I don't see why this would be more error-prone than any other approach.
As for ugly, it might be somewhat more pleasant if Parsec could take its
input from a monadic action, but hGetContents works, and if you want
more control (e.g., reading from a socket fd directly), you can build the
lazy input stream yourself with unsafeInterleaveIO.
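
As a rough sketch of the unsafeInterleaveIO route (not code from my
program): the names lazyChars, handleChars and demo below are made up for
illustration, and the "socket" in demo is just an IORef holding a String.
The point is only that each character is fetched when the consumer forces
the corresponding list cell, which is what lets a careful parser stop at
the end of a request.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef, newIORef, readIORef, writeIORef)
import System.IO (Handle, hGetChar, hIsEOF)
import System.IO.Unsafe (unsafeInterleaveIO)

-- | Build a lazy String from an action that yields the next character,
-- or Nothing at end of input.  The action is run only when the
-- corresponding list cell is forced.  (Hypothetical helper.)
lazyChars :: IO (Maybe Char) -> IO String
lazyChars next = unsafeInterleaveIO $ do
  mc <- next
  case mc of
    Nothing -> return []
    Just c  -> do
      cs <- lazyChars next   -- returns at once; the read is deferred
      return (c : cs)

-- | The same idea for a Handle, roughly what hGetContents gives you,
-- minus buffering and handle closing.  (Hypothetical helper.)
handleChars :: Handle -> IO String
handleChars h = lazyChars $ do
  eof <- hIsEOF h
  if eof then return Nothing else fmap Just (hGetChar h)

-- | Count how many reads happen before and after forcing five
-- characters of the stream.  The IORef stands in for a socket.
demo :: IO (Int, String, Int)
demo = do
  source  <- newIORef "(a b) trailing data"
  counter <- newIORef (0 :: Int)
  let next = do
        s <- readIORef source
        case s of
          []       -> return Nothing
          (c : cs) -> do
            writeIORef source cs
            modifyIORef counter (+ 1)
            return (Just c)
  stream <- lazyChars next
  before <- readIORef counter          -- nothing has been read yet
  let prefix = take 5 stream
  _ <- evaluate (length prefix)        -- forces exactly five cells
  after <- readIORef counter
  return (before, prefix, after)
```

Running demo should give (0, "(a b)", 5): no reads until the stream is
consumed, and then one read per character demanded.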

I wrote a parser for s-expressions that must not read beyond the final
')', and while I agree it is tricky, it's all necessary trickiness.
Note that I use lexeme parsers (parsers that also consume trailing
whitespace), as in the Parsec documentation, and mark them with an "L"
suffix in their names.

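    import Control.Monad (liftM, liftM2)
    import Text.ParserCombinators.Parsec

    -- NB: the Sexpr type is not shown in this post; this definition is an
    -- assumption inferred from the constructors used below.
    data Sexpr a = Atom a | Cons (Sexpr a) (Sexpr a) | Nil

    -- Do not eat trailing whitespace, because we want to process a request from
    -- a lazy stream (e.g. a socket) as soon as we see the closing paren.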
    sexpr :: Parser a -> Parser (Sexpr a)
    sexpr p = liftM Atom p
          <|> cons p
    cons :: Parser a -> Parser (Sexpr a)
    cons p  = parens tailL where
      tailL = do  dotL
                  sexprL p 
          <|> liftM2 Cons (sexprL p) tailL
          <|> return Nil
    sexprL :: Parser a -> Parser (Sexpr a)
    sexprL p  = lexeme (sexpr p)
    consL :: Parser a -> Parser (Sexpr a)
    consL p   = lexeme (cons p)

    top p       = between whiteSpace eof p
    lexeme p    = do  r <- p
                      whiteSpace
                      return r
    whiteSpace  = many space
    dotL        = lexeme (string ".")
    -- NB: eats whitespace after opening paren, but not closing
    parens p    = between (lexeme (string "(")) (string ")") p
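
To illustrate the "must not read beyond the final ')'" property, here is a
self-contained sketch that repeats the parsers above in compact form. The
Sexpr definition, the atom parser, and the names request, firstRequest and
restAfterFirst are assumptions added for the example. The input in
firstRequest ends in a call to error, standing in for a socket read that
would block, so the parse can only succeed if nothing past the closing
paren is demanded.

```haskell
import Control.Monad (liftM, liftM2)
import Text.ParserCombinators.Parsec

-- Assumed representation, inferred from the constructors used.
data Sexpr a = Atom a | Cons (Sexpr a) (Sexpr a) | Nil
  deriving (Eq, Show)

-- Does not eat trailing whitespace after the closing paren.
sexpr :: Parser a -> Parser (Sexpr a)
sexpr p = liftM Atom p <|> cons p

cons :: Parser a -> Parser (Sexpr a)
cons p = parens tailL
  where
    tailL = (dotL >> sexprL p)
        <|> liftM2 Cons (sexprL p) tailL
        <|> return Nil

-- "L" variants also consume trailing whitespace.
sexprL :: Parser a -> Parser (Sexpr a)
sexprL p = lexeme (sexpr p)

lexeme :: Parser a -> Parser a
lexeme p = do
  r <- p
  _ <- whiteSpace
  return r

whiteSpace :: Parser String
whiteSpace = many space

dotL :: Parser String
dotL = lexeme (string ".")

-- Eats whitespace after the opening paren, but not after the closing one.
parens :: Parser a -> Parser a
parens p = between (lexeme (string "(")) (string ")") p

-- Hypothetical atom parser for the example: a run of letters.
atom :: Parser String
atom = many1 letter

-- Parse one request and hand back the unconsumed (still lazy) input,
-- so the caller can parse the next request from it later.
request :: Parser (Sexpr String, String)
request = do
  _    <- whiteSpace
  r    <- sexpr atom
  rest <- getInput
  return (r, rest)

-- The tail of the input is an error thunk; it is never forced.
firstRequest :: Either ParseError (Sexpr String)
firstRequest =
  fmap fst (parse request "socket" ("(a b)" ++ error "read past ')'"))

-- What is left over after the first request.
restAfterFirst :: Either ParseError String
restAfterFirst = fmap snd (parse request "socket" "(a b) (c)")
```

Here firstRequest should come out as Right (Cons (Atom "a") (Cons (Atom "b")
Nil)), and restAfterFirst as Right " (c)". Had request used sexprL or top
instead of sexpr, the parser would look at the character after ')' and hit
the error thunk (or, on a real socket, block).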

Andrew

