[Haskell-cafe] Parsec to parse tree structures?

John Lato jwlato at gmail.com
Mon Mar 15 09:57:03 EDT 2010


Hi Dave,

> From: david fries <djf at gmx.ch>
>
> Hello Café
>
> Some time ago I wrote a parser for a project of one our customers. The
> format was proprietary and binary. The data was structured as a tree
> with tables pointing to sub tables farther in the file. (Well actually
> there was one or two cases where branches joined together, so I guess it
> was a directed graph.) Also it had no defined endianess, some tables
> were bigendian others were little endian and some were in their very own
> in-house-endianess.
> All in all, the typical binary data structure that has been in use and
> continuously extended for the last 15 years. You know the kind.
>
> My parser was written in C# and wasn't particularly elegant, but it
> worked reliably. I was wondering how you would parse tree-like
> structures with Parsec (or other functional parsers)? Up to know, all
> examples I've seen were of sequential data. To parse trees, you'd
> essentially need to be able to follow pointers, parse whatever is there
> and then jump back. I guess I'd have to mess around with the internal
> state of the parser to do that, which is something I'd rather avoid.

Have you seen Edward Kmett's Parsing Trifecta talk/slides, available
at http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ ?  I
think a similar approach would allow you to parse this data structure.
 The trick is to create a data source that can read from arbitrary
locations.

Also see http://okmij.org/ftp/Haskell/Iteratee/Tiff.hs for an
iteratee-based TIFF reader.  TIFF files have a similar structure, so
this code could apply pretty directly.  The parsing is quite
low-level, but you could use iteratee-parsec [1] or
attoparsec-iteratee [2] to use parser combinators with this technique.

[1] http://hackage.haskell.org/package/iteratee-parsec
[2] http://hackage.haskell.org/package/attoparsec-iteratee


More information about the Haskell-Cafe mailing list