[Haskell-cafe] ansi2html - one program, several issues

John Meacham john at repetae.net
Thu Jul 24 18:14:46 EDT 2008


On Sun, Jul 20, 2008 at 09:55:15AM -0400, Isaac Dupree wrote:
>> It doesn't stop it from parsing the entire file strictly. However, what 
>> it does do is prevent the parser from backtracking out of arbitrary 
>> amounts of lookahead. So, unless you use try (which allows for 
>> lookahead), when any token is consumed by the parser, it can be garbage 
>> collected (assuming the parser is the only thing pointing to the token 
>> stream). So, it consumes its input strictly, but with limited overhead 
>> (ideally using try only for some small bounded lookahead where it's 
>> needed).
>
> So with Parsec, you can keep the *input* from filling up memory, but if  
> you do, the *result* will still take up space (e.g. Right (value)).  For  
> a simple transformation where the output is a similar string to the  
> input, it will be just as large, so not much space is actually saved  
> (maybe a factor of 2 -- just keeping the output, not also the input), it  
> seems.

Yeah, this is my understanding. frisby combats this via 'irrefutable'
parser combinators. An irrefutable combinator is one that always
succeeds, a prime example is the 'many' combinator. Since 'many'
consumes only as many of its arguments as it can and is perfectly fine
consuming nothing, it inherently always succeeds so the parser can
immediately begin returning results (before consuming all of the input).
Ironically, this means frisby often uses less space than other parsers,
despite being based on PEGs which generally are known for taking a lot
of space.

It is not too hard to ensure your optimizer is irrefutable, for
instance, the parser for a simple language might be

> many statement <> eof

however, the 'eof' makes the parser non-irrefutabel. however it is easy
to gain back by doing

> many statement <> (eof // pure (error "unexpected data"))

frisbys static analysis realizes that (irrefutable // ... ) and ( ... //
irrefutable) are irrefutable. 

        John


-- 
John Meacham - ⑆repetae.net⑆john⑈


More information about the Haskell-Cafe mailing list