[Haskell-cafe] ansi2html - one program, several issues

Dan Doel dan.doel at gmail.com
Sun Jul 20 02:55:10 EDT 2008


On Sunday 20 July 2008, John Meacham wrote:
> I do not believe that is the case, since the return type of runParser
> "Either ParseError a" means that before you can extract the result of
> the parse from the 'Right' branch, it must evaluate whether the result
> is 'Left' or 'Right' meaning it needs to parse the whole input in order
> to determine whether the parse was succesful.
>
> This was the reason I made frisby's main parsing routine just be
> (roughly)
>
> > runPeg :: P a -> String -> a
>
> so you have to do something explicit like
>
> > runPegMaybe :: P a -> String -> Maybe a
> > runPegMaybe p s = runPeg (fmap Just p // return Nothing) s
>
> to force strictness in the parsing.
>
> Though, perhaps parsec is doing something more clever. I do know it uses
> the one token lookahead trick to determine which branch to take on
> alternation, but I don't think that solves the issue with parsing the
> entire file..

It doesn't stop it from parsing the entire file strictly. However, what it 
does do is prevent the parser from backtracking out of arbitrary amounts of 
lookahead. So, unless you use try (which allows for lookahead), when any 
token is consumed by the parser, it can be garbage collected (assuming the 
parser is the only thing pointing to the token stream). So, it consumes its 
input strictly, but with limited overhead (ideally using try only for some 
small bounded lookahead where it's needed).

By contrast, a naive parser combinator of the form:

    p = foo <|> bar -- or p = try foo <|> bar in parsec

Might read the entire file into memory parsing foo, without any of it being 
garbage collected until completion, in case foo fails and a backtrack to bar 
is required.

Of course, this all assumes that the input to the parser can both be lazily 
generated, and discarded in pieces (so, not the case if reading an entire 
file into a strict byte string, for instance).

-- Dan


More information about the Haskell-Cafe mailing list