Are Happy-generated parsers lazy?

Hal Daume III hdaume@ISI.EDU
Wed, 8 May 2002 09:54:17 -0700 (PDT)


I don't know if this applies in your situation with XML, but I ran into
this wall a few months back.  In my case, I'm reading a whole lot of
objects of the same type from a file.  That is, my top level rule in my
parser looks something like:

TreeList : Tree TreeList { $1 : $2 }
         | {- empty -}   { [] }

Unfortunately, often the files I'm parsing have thousands of large trees
and take up around a gigabyte of disk space.  What I did to get around
this laziness thing was: I can identify at the lexing stage the tree
breaks (in my case there's a blank line followed by an id followed by a
tree without any blank lines).  Then, I changed my lexer from String ->
[Token] to String -> [[Token]] where each element in the returned list is
the tokens for just one tree; I can then parse each lazily.
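To make the idea concrete, here's a minimal sketch of that chunked-lexing
trick (all names here are hypothetical, and the "lexer" and "parser" are
trivial placeholders standing in for real Happy/Alex-generated ones).  The
input is split at blank lines, each chunk is lexed to its own token list,
and mapping the per-tree parser over the chunks means only the trees you
actually demand are ever lexed and parsed:

```haskell
module Main where

import Data.Char (isSpace)

type Token = String

-- Placeholder lexer for a single tree's text (a real one would be
-- Alex- or hand-generated).
lexTree :: String -> [Token]
lexTree = words

-- String -> [[Token]]: split the raw input at blank lines, then lex
-- each chunk separately.  Because (:) is lazy, consumers of the result
-- only force one chunk at a time.
lexChunks :: String -> [[Token]]
lexChunks = map (lexTree . unlines) . splitOnBlank . lines
  where
    splitOnBlank ls = case break (all isSpace) ls of
      (chunk, [])       -> [chunk | not (null chunk)]
      (chunk, _ : rest) -> [chunk | not (null chunk)] ++ splitOnBlank rest

-- Placeholder per-tree parser (a real one would be Happy-generated);
-- here it just counts tokens.
parseTree :: [Token] -> Int
parseTree = length

main :: IO ()
main = do
  let input = "a b c\n\nd e\n\nf"
  -- Each tree is lexed and parsed lazily as the list is consumed.
  print (map parseTree (lexChunks input))
```

The point is that even though each individual tree is still parsed
strictly (LALR(1) needs to see the whole tree before reducing), the file
as a whole is processed one tree at a time, so memory stays proportional
to one tree rather than the whole gigabyte.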

I don't know if such a thing is applicable to your problem, but it worked
for me...

 - Hal

--
Hal Daume III

 "Computer science is no more about computers    | hdaume@isi.edu
  than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume

On Wed, 8 May 2002, Simon Marlow wrote:

> > I am running into trouble with the HaXml parser because it is 
> > not lazy.
> > Hence I am considering abandoning the use of XML as a data-exchange
> > format. Anyway, XML documents are too verbose.
> > 
> > Now I wonder whether Happy-generated parsers are lazy?
> 
> If you mean will it parse the input lazily and return a lazy parse
> tree, then the answer is no.  Happy uses LALR(1) parsing, which means
> that it needs to see the end of a syntactic object before reducing it.
> 
> Cheers,
> 	Simon
> 
> _______________________________________________
> Haskell mailing list
> Haskell@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell
>