[Haskell-cafe] Parsing XML
Michal J Gajda
mgajda at mimuw.edu.pl
Wed Aug 4 13:08:08 UTC 2021
Dear John,
I recommend Xeno: https://gitlab.com/migamake/xeno
It was originally released on Hackage by Chris Done, and later
maintained by Marco Zocca.
It was benchmarked against the fastest parsers in both Haskell and other languages:
* https://arxiv.org/abs/2011.03536
* http://neilmitchell.blogspot.com/2016/12/new-xml-parser-hexml.html
It supports both SAX- and DOM-style processing. Its memory efficiency
is excellent, surpassed only by the PugiXML parser (which builds the
DOM in place, but occasionally crashes).
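
A minimal sketch of the DOM-style interface, assuming the `Xeno.DOM`
module of the `xeno` package on Hackage and its `parse`, `name` and
`children` functions. The sample document and the `elementNames` helper
are made up for illustration, and the SAX interface is not shown:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as BS
import qualified Xeno.DOM as X

-- Hypothetical helper: collect element names with a pre-order walk,
-- i.e. a node's own name first, then the names of its descendants.
elementNames :: X.Node -> [BS.ByteString]
elementNames node = X.name node : concatMap elementNames (X.children node)

main :: IO ()
main =
  -- The input is an illustrative document, not taken from the thread.
  case X.parse "<catalog><book><title>Example</title></book></catalog>" of
    Left err   -> putStrLn ("parse error: " ++ show err)
    Right root -> mapM_ BS.putStrLn (elementNames root)
```

The parser works on strict `ByteString` input, so a file would typically
be read with `Data.ByteString.readFile` before being passed to `parse`.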
It does not support some XML features (namespaces, entity
normalization etc.), but these can be added as postprocessing after
the DOM is built.
It is maintained, and there is an as-yet unreleased version that can
parse HTML documents and documents that are not well-formed.
Disclosure: I am the current maintainer.
--
Cheers
Michał J. Gajda