Are Happy-generated parsers lazy?
Hal Daume III
hdaume@ISI.EDU
Wed, 8 May 2002 09:54:17 -0700 (PDT)
I don't know if this applies in your situation with XML, but I ran into
this wall a few months back. In my case, I'm reading a whole lot of
objects of the same type from a file. That is, my top level rule in my
parser looks something like:
TreeList : Tree TreeList   { $1 : $2 }
         | {- empty -}     { [] }
Unfortunately, often the files I'm parsing have thousands of large trees
and take up around a gigabyte of disk space. What I did to get around
this laziness thing was: I can identify at the lexing stage the tree
breaks (in my case there's a blank line followed by an id followed by a
tree without any blank lines). Then, I changed my lexer from String ->
[Token] to String -> [[Token]] where each element in the returned list is
the tokens for just one tree; I can then parse each lazily.
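To make the idea concrete, here is a minimal sketch of that chunked-lexing
trick. The token type, lexer, and "parser" below are illustrative stand-ins,
not Hal's actual code: the lexer emits a break token at each blank line, the
chunker turns String -> [[Token]] by splitting at those breaks, and a lazy
`map` then parses each tree only when it is demanded.

```haskell
module Main where

-- Illustrative token type: a word, or a break between trees (blank line).
data Token = TWord String | TBreak
  deriving (Eq, Show)

-- Hypothetical lexer: one token per word, a TBreak per blank line.
lexAll :: String -> [Token]
lexAll = concatMap lexLine . lines
  where
    lexLine "" = [TBreak]
    lexLine l  = map TWord (words l)

-- String -> [[Token]]: group the tokens of each tree, dropping the breaks.
lexChunks :: String -> [[Token]]
lexChunks = filter (not . null) . splitOn TBreak . lexAll
  where
    splitOn sep xs = case break (== sep) xs of
      (chunk, [])     -> [chunk]
      (chunk, _:rest) -> chunk : splitOn sep rest

-- Stand-in "parser" for one tree: here it just collects the words.
-- In the real setting this would be the Happy-generated parser,
-- applied to one tree's tokens at a time.
parseTree :: [Token] -> [String]
parseTree ts = [w | TWord w <- ts]

-- Because map is lazy, each tree is parsed only when demanded,
-- so the whole gigabyte file never has to be parsed up front.
parseAll :: String -> [[String]]
parseAll = map parseTree . lexChunks

main :: IO ()
main = print (parseAll "a b\n\nc\n\nd e f")
```

The key point is that only `splitOn` ever sees the token stream as a whole,
and it produces its result incrementally; the (strict) per-tree parser is
confined to one chunk at a time.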
I don't know if such a thing is applicable to your problem, but it worked
for me...
- Hal
--
Hal Daume III
"Computer science is no more about computers | hdaume@isi.edu
than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume
On Wed, 8 May 2002, Simon Marlow wrote:
> > I am running into trouble with the HaXml parser because it is
> > not lazy.
> > Hence I am considering abandoning XML as a data-exchange
> > format. Anyway, XML documents are too verbose.
> >
> > Now I wonder whether Happy-generated parsers are lazy?
>
> If you mean will it parse the input lazily and return a lazy parse
> tree, then the answer is no. Happy uses LALR(1) parsing, which means
> that it needs to see the end of a syntactic object before reducing it.
>
> Cheers,
> Simon
>
> _______________________________________________
> Haskell mailing list
> Haskell@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell
>