[Haskell-cafe] Re: Lazy HTML parsing with HXT, HaXML/polyparse,
what else?
neez at freemail.hu
Sat May 12 10:07:32 EDT 2007
Hi,
> Hi,
>
> > What results should a lazy parser return before emitting ⊥? At the time
> > you read the <html>-tag, you cannot know whether a syntax error far down
> > in the file makes it invalid. Thus, you may not return the top-most
> > <html>-tag until you see the closing </html>.
>
> But to return the top-most <html> you don't have to parse the data up to
> the </html> tag; it is enough to see it. Of course you need to read the
> whole file for that, but the parsing itself can be lazy.
>
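To make the quoted idea concrete, here is a minimal sketch, assuming the input has already been split into a list of tokens. The Token and Node types and the functions forest and parseDoc are made up for this mail; they are not the API of HXT or HaXml. parseDoc walks the whole token list once to see the closing tag, but the children are only built when somebody looks at them. It does not check that inner tags match, so it sidesteps the question of when to emit ⊥ rather than answering it.

```haskell
-- Hypothetical token and tree types, for illustration only.
data Token = Open String | Close String | Text String
  deriving (Show, Eq)

data Node = Element String [Node] | TextNode String
  deriving (Show, Eq)

-- Build sibling nodes lazily. Returns the nodes up to the next
-- close tag (or end of input) together with the remaining tokens.
-- Close tag names are NOT checked against the open tag.
forest :: [Token] -> ([Node], [Token])
forest []               = ([], [])
forest (Close _ : rest) = ([], rest)
forest (Text s : rest)  =
  let (siblings, rest') = forest rest
  in  (TextNode s : siblings, rest')
forest (Open name : rest) =
  let (children, afterChildren) = forest rest
      (siblings, rest')         = forest afterChildren
  in  (Element name children : siblings, rest')

-- Return the root element only if the last token closes it.
-- 'last' traverses the whole list (the "read the whole file" part),
-- but the children stay unevaluated until they are demanded.
parseDoc :: [Token] -> Maybe Node
parseDoc (Open name : rest)
  | not (null rest), last rest == Close name =
      Just (Element name (fst (forest (init rest))))
parseDoc _ = Nothing

main :: IO ()
main = do
  let tokens = [ Open "html", Open "body"
               , Text (error "text not needed yet")
               , Close "body", Close "html" ]
  case parseDoc tokens of
    Just (Element root (Element child _ : _)) ->
      putStrLn (root ++ " > " ++ child)   -- prints "html > body"
    _ -> putStrLn "no root element"
```

The text inside <body> is never forced here, so the error value in the token list is harmless. A stricter parser that validated every tag before returning the root would have to touch it.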
I've just found that somebody wrote an article on a similar idea:
http://citeseer.ist.psu.edu/199634.html
Real lazy evaluation fans should code HTML in a breadth-first way, with
forward pointers. :)
Zoli