[Haskell-cafe] Brainstorming on how to parse IMAP

Donn Cave donn at avvanta.com
Sun Aug 3 04:52:08 EDT 2008

On Sat, 02 Aug 2008 21:04:28 -0500
John Goerzen <jgoerzen at complete.org> wrote:
> The braces mean that the given number of octets follows after the CRLF
> at the end of the given line.  We could even see:
> A283 SEARCH {4} {21}
> TEXTstring not in mailbox

I don't think it's quite that bad.  The literal count must immediately
precede the value -- {4}\r\nTEXT -- the way I read it.
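To make that reading concrete, here is a minimal sketch of a pure function that takes one literal off the front of a buffer. The names `Literal`, `NeedMore`, `NotLiteral` and `parseLiteral` are made up for illustration, and it assumes only the plain `{n}` CRLF form, with the count immediately followed by the octets.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isDigit)

-- | Outcome of trying to read one IMAP literal from the front of a buffer.
-- All names here are hypothetical, chosen for this sketch.
data Literal
  = Literal B.ByteString B.ByteString  -- ^ payload octets, unconsumed remainder
  | NeedMore                           -- ^ buffer ends before the literal does
  | NotLiteral                         -- ^ input does not start with {n} CRLF
  deriving (Eq, Show)

parseLiteral :: B.ByteString -> Literal
parseLiteral input
  | B.null input = NeedMore
  | B.head input /= '{' = NotLiteral
  -- "{" alone may still grow into "{4}..."; "{x" never will.
  | B.null digits = if B.null afterDigits then NeedMore else NotLiteral
  -- A partial "}\r\n" is fine as long as it is a prefix of the real thing.
  | B.length afterDigits < B.length terminator =
      if afterDigits `B.isPrefixOf` terminator then NeedMore else NotLiteral
  | not (terminator `B.isPrefixOf` afterDigits) = NotLiteral
  | otherwise =
      case B.readInt digits of
        Nothing -> NotLiteral            -- count does not fit in an Int
        Just (n, _)
          | B.length body < n -> NeedMore
          | otherwise -> Literal (B.take n body) (B.drop n body)
  where
    (digits, afterDigits) = B.span isDigit (B.tail input)
    terminator = B.pack "}\r\n"
    body = B.drop (B.length terminator) afterDigits
```

Applied to `{4}\r\nTEXT rest`, this yields the payload `TEXT` and leaves ` rest` for whatever parses the remainder of the line. Because it counts octets rather than scanning for a delimiter, it never consumes more than the literal itself.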

I think most servers use this mechanism for mail message data only, but
it seems to me there's at least one out there that may occasionally slip
a literal into LIST results.

> 1) Ideally I could parse stuff lazily.  I have tried this with FTP and
> it is more complex than it seems at first, due to making sure you never,
> never, never consume too much data.  But being able to parse lazily
> would make it so incredibly easy to issue a command saying "download all
> new mail", and things get written to disk as they come in, with no
> buffer at all.

I'm not sure what that means, but to start at the beginning, ideally the
IMAP parser would be pure, right?

> 2) Avoiding Strings wherever possible.

It certainly makes sense to me that message data would be bytestring,
not only the "body" of the message but header fields as well.  Flags,
for me, would be string, unless you want to parse the standard flags
into a sum type.  No big deal, maybe bytestrings are more convenient
than I realize.
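For what it's worth, the sum-type option could look something like the sketch below. The type `Flag`, the catch-all `OtherFlag` and the function `parseFlag` are hypothetical names; the six constructors correspond to the system flags in RFC 3501, and matching them case-insensitively is my reading of that spec rather than something tested against servers.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (toLower)
import Data.Maybe (fromMaybe)

-- | The RFC 3501 system flags, plus a catch-all that keeps the raw
-- bytes of keywords and any flag this sketch does not know about.
data Flag
  = Seen
  | Answered
  | Flagged
  | Deleted
  | Draft
  | Recent
  | OtherFlag B.ByteString
  deriving (Eq, Show)

-- | Lower-cased wire names of the system flags.
standardFlags :: [(B.ByteString, Flag)]
standardFlags =
  [ (B.pack "\\seen", Seen)
  , (B.pack "\\answered", Answered)
  , (B.pack "\\flagged", Flagged)
  , (B.pack "\\deleted", Deleted)
  , (B.pack "\\draft", Draft)
  , (B.pack "\\recent", Recent)
  ]

-- | Classify one flag token; unknown tokens are preserved unchanged.
parseFlag :: B.ByteString -> Flag
parseFlag raw =
  fromMaybe (OtherFlag raw) (lookup (B.map toLower raw) standardFlags)
```

So `map parseFlag (B.words (B.pack "\\Seen $Forwarded"))` gives `Seen` followed by an `OtherFlag` holding the untouched `$Forwarded` bytes.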

> 3) Avoiding complex buffering schemes where I have to manually buffer
> data packets.

This sounds to me like the application's problem, not the parser's?

I actually wrote the beginnings of an IMAP parser, for my own entertainment.
It is substantially incomplete in terms of support for the various IMAP
responses, but it works against a couple of IMAP servers, and it supports
GSSAPI authentication and SSL.

Well, of course the IMAP parsing code itself has no idea about GSSAPI
or SSL, that being the application's job, but I think it's worth looking
at how, for example, GSSAPI authentication works with IMAP while designing
the parser.

In the most general view, the parser may return either (response, remainder)
or (insufficient-data), and in the latter case the application gets more
data from the server and tries again with the longer buffer.
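As a rough sketch of that interface: the parser is a pure function from the buffer so far to a result, and a small driver owned by the application fetches more bytes when told to. Everything named here (`Result`, `parseLine`, `parseWith`, `chunkSource`) is invented for illustration, and the "response" is just one CRLF-terminated line standing in for a real IMAP response.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.IORef (newIORef, atomicModifyIORef')

-- | Either a parsed value with the unconsumed remainder, or a request
-- for more input.
data Result r
  = Done r B.ByteString
  | Insufficient
  deriving (Eq, Show)

-- | Pure stand-in parser: one CRLF-terminated line.
parseLine :: B.ByteString -> Result B.ByteString
parseLine buf =
  case B.breakSubstring (B.pack "\r\n") buf of
    (_, rest) | B.null rest -> Insufficient   -- no CRLF in the buffer yet
    (line, rest) -> Done line (B.drop 2 rest)

-- | Application-side driver: retry the pure parser on a growing buffer.
-- An empty chunk from 'fetch' is treated as end of input.
parseWith
  :: Monad m
  => m B.ByteString                -- ^ how the application gets more data
  -> (B.ByteString -> Result r)    -- ^ the pure parser
  -> B.ByteString                  -- ^ data already in hand
  -> m (Maybe (r, B.ByteString))
parseWith fetch parser buf =
  case parser buf of
    Done r rest -> return (Just (r, rest))
    Insufficient -> do
      chunk <- fetch
      if B.null chunk
        then return Nothing
        else parseWith fetch parser (B.append buf chunk)

-- | Test double for a socket: hands out the given chunks one at a time,
-- then empty strings.
chunkSource :: [B.ByteString] -> IO (IO B.ByteString)
chunkSource chunks = do
  ref <- newIORef chunks
  return $ atomicModifyIORef' ref $ \cs ->
    case cs of
      [] -> ([], B.empty)
      (c : rest) -> (rest, c)
```

The point of the split is that `fetch` can be a plain socket read, an SSL read, or something that unwraps GSSAPI-protected data, and the parser never needs to know which. Re-parsing from the start of the buffer on every retry is the simplest thing that works; a continuation-returning parser would avoid the repeated work at the cost of a more involved type.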

Donn Cave <donn at avvanta.com>
