[Haskell-cafe] Parsec to parse tree structures?
stephen.tetley at gmail.com
Thu Mar 18 06:18:50 EDT 2010
Where I've used it, random access does seem conceptually more
satisfactory than trying to avoid it.
Well designed binary formats are deterministic - so wherever you are
inside the file you should know what you are parsing. One example of
this determinism is that "local" parsing alternatives are generally
encoded in a single tag, whereas in a parser for a programming
language, choosing between alternatives might require lookahead and
possibly other disambiguation. Another example is that formats will
often have an "index table" at the start of the file giving the start
position and length of every sub-table - this saves you from having to
start each table with a tag and length.
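To make the index-table idea concrete, here is a minimal sketch. The file layout is invented purely for illustration (one count byte, then three-byte entries of tag, start offset and length); it is not PE/COFF or TrueType, and the names Entry, readIndex, tableBody and sample are hypothetical. It assumes only the bytestring package that ships with GHC.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical layout, invented for this example:
--   byte 0         : number of index entries n
--   then n entries : tag (1 byte), start offset (1 byte), length (1 byte)
--   rest           : table bodies, at the absolute offsets given in the index
data Entry = Entry { entryTag :: Word8, entryStart :: Int, entryLen :: Int }
  deriving (Show, Eq)

-- Read the index table at the start of the file.
readIndex :: B.ByteString -> Maybe [Entry]
readIndex bs = do
  (n, rest) <- B.uncons bs
  let count = fromIntegral n
  if B.length rest < 3 * count
    then Nothing
    else Just [ entryAt rest (3 * i) | i <- [0 .. count - 1] ]
  where
    entryAt r o = Entry (B.index r o)
                        (fromIntegral (B.index r (o + 1)))
                        (fromIntegral (B.index r (o + 2)))

-- Slice out one table body by tag, without parsing any other table.
tableBody :: Word8 -> B.ByteString -> Maybe B.ByteString
tableBody tag bs = do
  entries <- readIndex bs
  Entry _ start len <- lookupTag entries
  if start + len <= B.length bs
    then Just (B.take len (B.drop start bs))
    else Nothing
  where
    lookupTag es = case filter ((== tag) . entryTag) es of
      (e : _) -> Just e
      []      -> Nothing

-- Two tables: tag 0x41 at offset 7 (3 bytes), tag 0x42 at offset 10 (2 bytes).
sample :: B.ByteString
sample = B.pack [2, 0x41, 7, 3, 0x42, 10, 2, 1, 2, 3, 9, 9]

main :: IO ()
main = print (fmap B.unpack (tableBody 0x42 sample))  -- prints Just [9,9]
```

Because the index records where each body lives, the bodies themselves need no tag or length prefix, and a reader that only wants table 0x42 never looks at table 0x41.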
For complex formats, e.g. PE/COFF or TrueType, you might only want to
parse certain tables [*]. After parsing the index table you could
build a list of parsers to run sequentially on the body of the file
(including parsers that just drop unwanted tables), but this seems like
too much work (with too much to go wrong) when I can use a function
almost as simple as parseAt for the tables I'm interested in:
parseAt :: Start -> End -> Parser a -> Parser a
In practice, I made parseAt slightly more complicated so it could
encode where the cursor is moved to at the end of the parse.
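For illustration only, here is one way a parseAt with this signature could behave. It is a sketch, not the code I actually used: the tiny Parser type below is a stand-in (it is not Parsec's), and two choices are assumptions made for the example, namely that the range is half-open [start, end) and that the cursor is restored to where it was once the sub-parser finishes.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

type Start = Int
type End   = Int

-- Stand-in parser type: whole input and a cursor in, result and new cursor out.
newtype Parser a = Parser { runParser :: B.ByteString -> Int -> Maybe (a, Int) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \bs i -> case p bs i of
    Nothing      -> Nothing
    Just (a, i') -> Just (f a, i')

instance Applicative Parser where
  pure a = Parser $ \_ i -> Just (a, i)
  Parser pf <*> Parser pa = Parser $ \bs i -> case pf bs i of
    Nothing      -> Nothing
    Just (f, i') -> case pa bs i' of
      Nothing       -> Nothing
      Just (a, i'') -> Just (f a, i'')

instance Monad Parser where
  Parser p >>= k = Parser $ \bs i -> case p bs i of
    Nothing      -> Nothing
    Just (a, i') -> runParser (k a) bs i'

-- Read one byte at the cursor and advance by one.
word8 :: Parser Word8
word8 = Parser $ \bs i ->
  if i >= 0 && i < B.length bs then Just (B.index bs i, i + 1) else Nothing

-- Run p on the byte range [start, end), then put the cursor back (assumption).
-- Truncating the input at end stops p from reading past the table.
parseAt :: Start -> End -> Parser a -> Parser a
parseAt start end p = Parser $ \bs i ->
  if start < 0 || start > end || end > B.length bs
    then Nothing
    else case runParser p (B.take end bs) start of
      Nothing     -> Nothing
      Just (a, _) -> Just (a, i)

sample :: B.ByteString
sample = B.pack [10, 20, 30, 40, 50]

-- Read a byte, jump to offsets 3..4, then carry on from where we left off.
demo :: Parser (Word8, (Word8, Word8), Word8)
demo = do
  a <- word8
  b <- parseAt 3 5 ((,) <$> word8 <*> word8)
  c <- word8
  return (a, b, c)

main :: IO ()
main = print (fmap fst (runParser demo sample 0))  -- prints Just (10,(40,50),20)
```

A variant that reports or keeps the sub-parser's final cursor position would return it from the inner case instead of discarding it.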
[*] Certain tables in TrueType / OpenType are proprietary - it might be
unwise for an open source parser to even include parsing operations
for those tables.