[Haskell-cafe] ansi2html - one program, several issues
dan.doel at gmail.com
Sun Jul 20 02:55:10 EDT 2008
On Sunday 20 July 2008, John Meacham wrote:
> I do not believe that is the case, since the return type of runParser
> "Either ParseError a" means that before you can extract the result of
> the parse from the 'Right' branch, it must evaluate whether the result
> is 'Left' or 'Right' meaning it needs to parse the whole input in order
> to determine whether the parse was succesful.
> This was the reason I made frisby's main parsing routine just be
> > runPeg :: P a -> String -> a
> so you have to do something explicit like
> > runPegMaybe :: P a -> String -> Maybe a
> > runPegMaybe p s = runPeg (fmap Just p // return Nothing) s
> to force strictness in the parsing.
> Though, perhaps parsec is doing something more clever. I do know it uses
> the one token lookahead trick to determine which branch to take on
> alternation, but I don't think that solves the issue with parsing the
> entire file..
It doesn't stop it from parsing the entire file strictly. However, what it
does do is prevent the parser from backtracking out of arbitrary amounts of
lookahead. So, unless you use try (which allows for lookahead), when any
token is consumed by the parser, it can be garbage collected (assuming the
parser is the only thing pointing to the token stream). So, it consumes its
input strictly, but with limited overhead (ideally using try only for some
small bounded lookahead where it's needed).
By contrast, a naive parser combinator of the form:
p = foo <|> bar -- or p = try foo <|> bar in parsec
Might read the entire file into memory parsing foo, without any of it being
garbage collected until completion, in case foo fails and a backtrack to bar
Of course, this all assumes that the input to the parser can both be lazily
generated, and discarded in pieces (so, not the case if reading an entire
file into a strict byte string, for instance).
More information about the Haskell-Cafe