[Haskell-cafe] HTML library with DOM?

Neil Mitchell ndmitchell at gmail.com
Thu Oct 7 17:34:06 EDT 2010


Yes. I don't think I've officially announced it, but TagSoup has followed
the HTML 5 parsing rules as standard for the last few releases. The HTML 5
spec is still changing, so it's entirely possible that something is
incorrect in a corner case; if you find one, please let me know and
I'll fix it.
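
As noted in the quoted message below, tagsoup hands back a flat list of
Tags, so any tree has to be built on top of it. A minimal sketch of that
step follows. The Node type and the buildForest function are hypothetical
names made up for illustration and are not part of tagsoup; only parseTags
and the Tag constructors come from the library. It assumes well-nested
input and is not the HTML 5 tree construction algorithm: an unclosed
element such as <br> would swallow everything that follows it.

```haskell
import Text.HTML.TagSoup (Tag (..), parseTags)

-- Hypothetical tree type for illustration; not part of tagsoup.
data Node
  = Element String [(String, String)] [Node]  -- name, attributes, children
  | Text String
  deriving (Eq, Show)

-- Naive tree builder over tagsoup's flat tag list (an assumption-laden
-- sketch, not tagsoup's own API).
buildForest :: [Tag String] -> [Node]
buildForest = fst . go Nothing
  where
    -- The first argument is the name of the element currently open, if any.
    -- The result is the nodes at this level plus the unconsumed tags.
    go :: Maybe String -> [Tag String] -> ([Node], [Tag String])
    go _ [] = ([], [])
    go open (TagClose name : rest)
      | open == Just name = ([], rest)  -- closes the current element
      | otherwise = go open rest        -- stray close tag: skip it
    go open (TagOpen name attrs : rest) =
      let (children, rest') = go (Just name) rest
          (siblings, rest'') = go open rest'
       in (Element name attrs children : siblings, rest'')
    go open (TagText s : rest) =
      let (siblings, rest') = go open rest
       in (Text s : siblings, rest')
    go open (_ : rest) = go open rest   -- comments, warnings, positions

main :: IO ()
main = print (buildForest (parseTags "<p>Hello <b>world</b></p>"))
```

A real DOM builder would also need the HTML 5 rules for implied end tags
and void elements, which this sketch deliberately leaves out.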

Thanks, Neil

2010/10/7 Gregory Collins <greg at gregorycollins.net>:
> Michael Snoyman <michael at snoyman.com> writes:
>
>> As far as I know, Neil Mitchell's tagsoup[1] parses according to the
>> HTML 5 parsing rules, but it just generates a list of Tags[2], so
>> you'd have to build the DOM tree up from there. I personally have had
>> great experience with tagsoup. It's even the core of the HTML-scraping
>> technology powering searchonce[3].
>
> Yep, someone else wrote me privately to say this (that tagsoup respects
> the html5 lexing rules). So I'll be using this as the basis of an html5
> DOM parser. Stay tuned!
>
> G
> --
> Gregory Collins <greg at gregorycollins.net>

