[Haskell-cafe] parsec manyTill stack overflow
Badea Daniel
badeadaniel at yahoo.com
Fri Jul 4 18:15:34 EDT 2008
The file I'm trying to parse contains mixed sections like:
...
<start_section=
... script including arithmetic expressions ...
/end_section>
...
so I defined two parsers: one for the 'outer' language and
the other one for the 'inner' language. I used (manyTill
inner_parser end_section_parser) but I got a stack overflow
because there's just too much text between section begin and
end.
With getInput I can switch from the outer parser to the inner
parser but this one tries to parse until eof and when it hits the
'/end_section>' it fails.
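
One possible way around this, sketched under assumptions rather than taken from the thread: cut the input at the end marker first, run the inner parser only on that slice, and hand the remainder back to the outer parser with setInput. The names splitAtMarker and section are hypothetical helpers, and the marker text is assumed to be exactly "/end_section>" and to never occur inside the script itself. Source positions are not updated here.

```haskell
import Data.List (isPrefixOf)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: split the input at the first occurrence of the
-- marker. The marker itself is dropped. Nothing means "marker not found".
-- The accumulator keeps the recursion in tail position.
splitAtMarker :: String -> String -> Maybe (String, String)
splitAtMarker marker = go []
  where
    go acc s
      | marker `isPrefixOf` s = Just (reverse acc, drop (length marker) s)
      | otherwise = case s of
          []       -> Nothing
          (c : cs) -> go (c : acc) cs

-- Hypothetical combinator: run the inner parser on the text before the
-- end marker only, then let the outer parser continue after the marker.
-- Note: the source position is left unchanged by setInput.
section :: Parser a -> Parser a
section inner = do
  input <- getInput
  case splitAtMarker "/end_section>" input of
    Nothing -> parserFail "missing /end_section>"
    Just (body, rest) ->
      -- The inner parser sees only the section body, so it may run to eof.
      case parse (inner <* eof) "section" body of
        Left err -> parserFail (show err)
        Right x  -> do
          setInput rest
          return x
```

Since the inner parser only ever sees the section body, it can safely parse until eof without running into the end marker.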
--- On Fri, 7/4/08, Derek Elkins <derek.a.elkins at gmail.com> wrote:
> From: Derek Elkins <derek.a.elkins at gmail.com>
> Subject: Re: [Haskell-cafe] parsec manyTill stack overflow
> To: "Badea Daniel" <badeadaniel at yahoo.com>
> Cc: haskell-cafe at haskell.org
> Date: Friday, July 4, 2008, 2:22 PM
> On Fri, 2008-07-04 at 13:31 -0700, Badea Daniel wrote:
> > I'm trying to parse a large file looking for instructions on each line
> > and for a section end marker but Parsec's manyTill function causes
> > stack overflow, as you can see in the following example (I'm using
> > ghci 6.8.3):
> >
> > > parse (many anyChar) "" ['a'|x<-[1..1024*64]]
> >
> > It almost immediately starts printing "aaaaaaaaaaa...." and runs to
> > completion.
> >
> > > parse (manyTill anyChar eof) "" ['a'|x<-[1..1024*1024]]
> > *** Exception: stack overflow
> >
> > I guess this happens because manyTill recursively accumulates output
> > from the first parser and returns only when it hits the 'end' parser.
> > Is it possible to write a version of 'manyTill' that works like 'many'
> > returning output from 'anyChar' as soon as it advances through the
> > list of tokens?
>
> No, manyTill doesn't know whether it is going to return anything at all
> until its second argument succeeds. I can make manyTill not stack
> overflow, but it will never immediately start returning results. For
> the particular case above you can use getInput and setInput to get a
> result that does what you want.
>
> parseRest = do
>     rest <- getInput
>     setInput []
>     return rest
>
> That should probably update the position as well though it's not so
> crucial in the likely use-cases of such a function.
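
For reference, a minimal sketch of what a position-updating variant of parseRest might look like. This assumes a String input stream (the Parser type from Text.Parsec.String) and uses updatePosString from Text.Parsec.Pos. It is an illustration of the remark above, not code from the thread.

```haskell
import Text.Parsec
import Text.Parsec.Pos (updatePosString)
import Text.Parsec.String (Parser)

-- Return all remaining input in one step and leave the parser at the
-- end of input, with the source position advanced past the returned text.
parseRest :: Parser String
parseRest = do
  rest <- getInput
  pos  <- getPosition
  setInput []
  -- Assumption: default position tracking (updatePosString) is what the
  -- surrounding parser uses for Char tokens.
  setPosition (updatePosString pos rest)
  return rest
```

With this definition, parse parseRest "" "abc" yields Right "abc", and a following eof succeeds because the remaining input is empty.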