[Haskell-cafe] Parsec to parse tree structures?
Stephen Tetley
stephen.tetley at gmail.com
Sun Mar 14 15:09:49 EDT 2010
Hi David
Ah ha - this form of binary file layout is quite common (e.g. PECOFF
object files and OpenType / TrueType fonts).
Parsec and other parsing libraries are perhaps not ideal for the task,
as they consume input as they parse. I have my own alternative to
Parsec - Kangaroo [1] - for parsing binary files. It moves a cursor
around inside the file (strictly speaking an array in memory from
reading the file), so you can parse within a sub-region of the file
and jump back out again.
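To make the cursor idea concrete, here is a minimal sketch of the technique. It is not Kangaroo's actual API: the names Cursor, word8, word16be and region are hypothetical, and it assumes the file has already been read into a strict ByteString. The parser never consumes input; it only moves an offset, so it can visit a sub-region and then carry on from where it was.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word16)

-- Hypothetical cursor parser: given the whole input and a position,
-- return a result and the new position. The input is never consumed.
newtype Cursor a = Cursor
  { runCursor :: B.ByteString -> Int -> Either String (a, Int) }

instance Functor Cursor where
  fmap f (Cursor p) = Cursor $ \bs i -> case p bs i of
    Left e       -> Left e
    Right (a, j) -> Right (f a, j)

instance Applicative Cursor where
  pure a = Cursor $ \_ i -> Right (a, i)
  Cursor pf <*> Cursor pa = Cursor $ \bs i -> case pf bs i of
    Left e       -> Left e
    Right (f, j) -> case pa bs j of
      Left e       -> Left e
      Right (a, k) -> Right (f a, k)

instance Monad Cursor where
  Cursor p >>= f = Cursor $ \bs i -> case p bs i of
    Left e       -> Left e
    Right (a, j) -> runCursor (f a) bs j

-- Read one byte at the cursor and advance by one.
word8 :: Cursor Word8
word8 = Cursor $ \bs i ->
  if i >= 0 && i < B.length bs
    then Right (B.index bs i, i + 1)
    else Left ("read past end of input at offset " ++ show i)

-- Read a big-endian 16-bit value.
word16be :: Cursor Word16
word16be = do
  hi <- word8
  lo <- word8
  pure (fromIntegral hi `shiftL` 8 .|. fromIntegral lo)

-- Run a parser inside the region [off, off + len), then put the cursor
-- back where it was. Offsets stay absolute. Truncating the input is a
-- simplification: it enforces the upper bound only, and a nested jump
-- beyond the region's end will fail.
region :: Int -> Int -> Cursor a -> Cursor a
region off len (Cursor p) = Cursor $ \bs i ->
  case p (B.take (off + len) bs) off of
    Left e       -> Left e
    Right (a, _) -> Right (a, i)

-- Made-up layout: byte 0 is the offset of a 2-byte table, byte 1 is a tag.
example :: Either String (Word16, Word8)
example = fmap fst (runCursor header (B.pack [0x04, 0xAA, 0x00, 0x00, 0x12, 0x34]) 0)
  where
    header = do
      off <- word8                                  -- offset of the table
      val <- region (fromIntegral off) 2 word16be   -- jump in, parse, jump out
      tag <- word8                                  -- continues at byte 1
      pure (val, tag)
```

Here example evaluates to Right (0x1234, 0xAA): the tag is read after the jump, from the position the cursor held before it. A consuming parser would have had to skip or buffer the intervening bytes to do the same.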
Although it's on Hackage, I wouldn't really recommend its use - it's now
fairly well documented but the API is not stable and I only work on it
sporadically. Because I didn't want any dependencies, the package is
quite a bit larger than it need be - if someone were interested in the
technique they might be better off using it as a starting point. The most
important bits are the 'intraparse' function and the monadic machinery
inside the Kangaroo.ParseMonad module.
Even when a binary format has a published standard, unfortunately the
standard might not be detailed enough to actually produce a parser from.
This is the case for TrueType and PECOFF, which I wrote Kangaroo for,
and as I don't have much enthusiasm for deriving a parser from another
open-source implementation, it's rather stalling any continued
development of Kangaroo.
[1] http://hackage.haskell.org/package/kangaroo
Best wishes
Stephen