[Haskell-cafe] Is XHT a good tool for parsing web pages?
malcolm.wallace at cs.york.ac.uk
Tue Apr 27 16:58:16 EDT 2010
> Is XHT a good tool for parsing web pages?
> I read that it fails if the XML isn't strict and I know a lot of web
> pages don't use strict XHTML.
Do you mean HXT rather than XHT?
I know that the HaXml library has a separate error-correcting HTML
parser that works around most of the common non-well-formedness bugs.
I believe HXT has a similar parser:
Indeed, some of the similarities suggest this parser was originally
lifted directly out of HaXml (as permitted by HaXml's licence),
although the two modules have now diverged significantly.