[Haskell-cafe] Brainstorming on how to parse IMAP
jgoerzen at complete.org
Tue Aug 5 11:22:26 EDT 2008
Thanks to those that responded on this -- certainly some libraries to
check out here.
One problem with that is that if I use specific parsing library foo,
then only others that are familiar with specific parsing library foo can
hack on it.
In general though, I think this speaks to more generic problems:
1) A lot of network protocols require reading data of arbitrary length
until a certain delimiter is found. Often that delimiter is \n.
Haskell is really weak at this. We can turn a Socket into a Handle and
use hGetLine, but this has a security weakness: it places no upper bound
on the amount of data read, so it is vulnerable to a resource-exhaustion
DoS from the remote end. There is, as far as I can tell, no
general-purpose "buffer until I see foo" framework in Haskell. (Note
that just reading character-by-character is too slow as well.)
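
A minimal sketch of what a bounded "buffer until I see \n" reader could look like, using only the bytestring package. The names Step, step and readLineBounded are hypothetical, not part of any existing library, and the caller is assumed to supply an action that returns the next chunk of input (empty at EOF):

```haskell
import qualified Data.ByteString.Char8 as B

-- | Outcome of looking for one line in the data buffered so far.
data Step
  = Line B.ByteString B.ByteString  -- ^ complete line (without '\n') and leftover bytes
  | NeedMore                        -- ^ no delimiter yet, still within the limit
  | TooLong                         -- ^ limit exceeded before a delimiter appeared
  deriving (Eq, Show)

-- | Look for '\n' in the buffer, refusing lines longer than maxLen bytes.
step :: Int -> B.ByteString -> Step
step maxLen buf =
  case B.elemIndex '\n' buf of
    Just i
      | i <= maxLen -> Line (B.take i buf) (B.drop (i + 1) buf)
      | otherwise   -> TooLong
    Nothing
      | B.length buf > maxLen -> TooLong
      | otherwise             -> NeedMore

-- | Read one line, fetching more data with readChunk as needed.
-- The buffer never grows beyond maxLen plus one chunk.
readLineBounded
  :: Int                 -- ^ maximum line length in bytes
  -> IO B.ByteString     -- ^ returns the next chunk; empty means EOF
  -> B.ByteString        -- ^ leftover bytes from a previous call
  -> IO (Either String (B.ByteString, B.ByteString))
readLineBounded maxLen readChunk = go
  where
    go buf = case step maxLen buf of
      Line l rest -> return (Right (l, rest))
      TooLong     -> return (Left "line exceeds limit")
      NeedMore    -> do
        chunk <- readChunk
        if B.null chunk
          then return (Left "EOF before end of line")
          else go (B.append buf chunk)
```

With a Handle h, readChunk could be something like Data.ByteString.hGetSome h 4096. The leftover bytes in the result are meant to be passed to the next call. For a CRLF protocol such as IMAP the returned line still ends in '\r'.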
1a) Even more generally, a "read one packet of data, however much
becomes available" notion is pretty weak. For a lazy ByteString, my
only two choices are to block until n bytes are available (where n is
specified in advance), or to not block at all. There is no "block until
some data, however much, becomes available, and return that chunk up to
a maximum size x." Well, Network.Socket.recv may do this, but it
returns a String. Is there even a way to do this with a ByteString?
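
For reference, strict ByteStrings do offer this behaviour. Data.ByteString.hGetSome blocks only until some data is available and returns at most the requested number of bytes. Current versions of the network package also provide Network.Socket.ByteString.recv, of type Socket -> Int -> IO ByteString. A small sketch using hGetSome on a Handle follows. The helpers foldChunks and countBytes are hypothetical names chosen for illustration:

```haskell
import qualified Data.ByteString as B
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- | Fold over the input of a Handle chunk by chunk. Each chunk is whatever
-- hGetSome returns: at least one byte, at most maxChunk bytes. An empty
-- chunk signals EOF and ends the fold.
foldChunks :: Int -> (a -> B.ByteString -> IO a) -> a -> Handle -> IO a
foldChunks maxChunk f = go
  where
    go acc h = do
      chunk <- B.hGetSome h maxChunk
      if B.null chunk
        then return acc
        else f acc chunk >>= \acc' -> go acc' h

-- | Example use: count the bytes in a file while holding at most
-- 4096 bytes of its content at a time.
countBytes :: FilePath -> IO Int
countBytes path =
  withBinaryFile path ReadMode
    (foldChunks 4096 (\n c -> return $! n + B.length c) 0)
```

The same loop should work on a Handle obtained from a socket, where the chunk sizes then follow whatever the remote end has sent so far.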
2) A lot of RFC protocols -- and IMAP in particular -- can involve
complex responses from the server with hierarchical data, and the parse
of, say, line 1 and of each successive line can indicate whether or not
to read more data from the server. Parsing of these lines is a stateful
process.
3) The linkage between Parsec and IO is weak. I cannot write an
"IMAPResponse" parser. I would have to write a set of parsers to parse
individual components of the IMAP response as part of the IO monad code
that reads the IMAP response, since the result of one dictates how much
network data I attempt to read.
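
One way to decouple the parser from IO is a resumable parser: a pure function that either finishes or returns a continuation asking for more input, with a small IO loop feeding it. The sketch below is hand-rolled and not Parsec. It handles only one IMAP feature, the literal syntax in which a line ending in {n} is followed by exactly n raw bytes. All names (Result, Piece, response, runWith) are hypothetical, and size limits are omitted for brevity:

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isDigit)

-- | Result of feeding input to a resumable parser.
data Result a
  = Done a B.ByteString                 -- ^ parsed value and unconsumed input
  | Partial (B.ByteString -> Result a)  -- ^ needs more input to continue
  | Failed String

-- | One piece of a response: text from a line, or the raw bytes of a literal.
data Piece = Text B.ByteString | Literal B.ByteString
  deriving (Eq, Show)

-- | Ask for more input; an empty chunk is treated as EOF.
needMore :: String -> (B.ByteString -> Result a) -> Result a
needMore what k = Partial $ \more ->
  if B.null more then Failed ("EOF in " ++ what) else k more

-- | If the line ends in "{n}", return n.
literalSize :: B.ByteString -> Maybe Int
literalSize l
  | B.null l || B.last l /= '}' = Nothing
  | otherwise =
      let (before, digits) = B.spanEnd isDigit (B.init l)
      in if not (B.null digits) && not (B.null before) && B.last before == '{'
           then fmap fst (B.readInt digits)
           else Nothing

-- | Drop one trailing '\r', if present.
stripCR :: B.ByteString -> B.ByteString
stripCR s = if not (B.null s) && B.last s == '\r' then B.init s else s

-- | Parse one response: lines joined by literals, ending at the first
-- line that does not announce a literal.
response :: B.ByteString -> Result [Piece]
response = line []
  where
    line acc buf = case B.elemIndex '\n' buf of
      Nothing -> needMore "line" (line acc . B.append buf)
      Just i ->
        let l    = stripCR (B.take i buf)
            rest = B.drop (i + 1) buf
        in case literalSize l of
             Nothing -> Done (reverse (Text l : acc)) rest
             Just n  -> literal (Text l : acc) n rest
    literal acc n buf
      | B.length buf >= n = line (Literal (B.take n buf) : acc) (B.drop n buf)
      | otherwise         = needMore "literal" (literal acc n . B.append buf)

-- | Drive a resumable parser from an action that yields chunks (empty at EOF).
runWith :: IO B.ByteString -> Result a -> IO (Either String (a, B.ByteString))
runWith _ (Done a rest)       = return (Right (a, rest))
runWith _ (Failed e)          = return (Left e)
runWith readChunk (Partial k) = readChunk >>= runWith readChunk . k
```

A call such as runWith readChunk (response B.empty) then reads only as much as the response itself asks for, and the decision about how much to read stays in the pure parser. Libraries such as attoparsec expose the same idea through a Partial result constructor.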