[Haskell-cafe] Re: Lazy HTML parsing with HXT, HaXML/polyparse, what else?

neez at freemail.hu neez at freemail.hu
Sat May 12 10:07:32 EDT 2007


Hi,

> Hi,
>
> > What results should a lazy parser return before emitting ⊥? At the time
> > you read the <html>-tag, you cannot know whether a syntax error far down
> > in the file makes it invalid. Thus, you may not return the top-most
> > <html>-tag until you see the closing </html>.
>
> But to return the topmost <html> you don't have to parse the data up to
> the </html> tag; it is enough to see it. Of course you need to read the
> whole file for that, but the parsing can be lazy.
>
I've just found that somebody wrote an article on a similar idea:
http://citeseer.ist.psu.edu/199634.html
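
To make the lazy part concrete, here is a rough sketch, not taken from the
article and not the HXT or HaXml API. The Token and Node types and the
forest and rootName functions are names I made up for illustration. It only
shows the lazy half of the idea: the root element is handed back as soon as
its opening tag is seen, and everything below it stays an unevaluated thunk.
It does not check that the matching </html> ever appears, which is exactly
the step that would force reading the whole file.

```haskell
module LazyTree where

-- Hypothetical toy types for illustration only.
data Token = Open String | Close String | Text String
  deriving (Show, Eq)

data Node = Element String [Node] | Leaf String
  deriving (Show, Eq)

-- Build a forest from a token stream, returning the nodes at this level
-- and the tokens left over after the enclosing element is closed.
-- The let-bound tuple patterns are lazy, so each constructor is produced
-- before the recursive calls are evaluated.
forest :: [Token] -> ([Node], [Token])
forest [] = ([], [])
forest (Close _ : rest) = ([], rest)  -- closes the current level; the name is not checked
forest (Text s : rest) =
  let (sibs, rest') = forest rest
  in (Leaf s : sibs, rest')
forest (Open name : rest) =
  let (kids, afterKids) = forest rest       -- children of this element
      (sibs, rest')     = forest afterKids  -- following siblings
  in (Element name kids : sibs, rest')

-- Name of the first top-level element, if the stream starts with one.
-- Only the first token is inspected.
rootName :: [Token] -> Maybe String
rootName ts = case fst (forest ts) of
  (Element name _ : _) -> Just name
  _                    -> Nothing
```

In GHCi, rootName (Open "html" : undefined) evaluates to Just "html" without
touching the undefined tail. The price is that malformed input is never
reported, so this sidesteps the question of when to emit ⊥ rather than
answering it.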

Real lazy evaluation fans should write their HTML breadth-first, with
forward pointers. :)

			Zoli

