[Haskell-cafe] Parsing XML

Michal J Gajda mgajda at mimuw.edu.pl
Wed Aug 4 13:08:08 UTC 2021


Dear John,

I recommend Xeno: https://gitlab.com/migamake/xeno
It was released on Hackage some time ago by Chris Done, then
maintained by Marco Zocca.
It was benchmarked against the fastest parsers in both Haskell and other languages:

* https://arxiv.org/abs/2011.03536
* http://neilmitchell.blogspot.com/2016/12/new-xml-parser-hexml.html

It supports both SAX- and DOM-style processing. Its memory efficiency
is excellent, surpassed only by the PugiXML parser (which builds the
DOM in place, but occasionally crashes).
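
For illustration, here is a minimal sketch of DOM-style processing. It
assumes the Xeno.DOM module as published on Hackage (parse, name,
children, attributes); the sample document and the elementNames helper
are made up for this example, and the SAX side is not shown.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import           Data.ByteString       (ByteString)
import qualified Data.ByteString.Char8 as BS
import qualified Xeno.DOM              as DOM

-- Hypothetical helper: collect element names, parent before children.
elementNames :: DOM.Node -> [ByteString]
elementNames node = DOM.name node : concatMap elementNames (DOM.children node)

main :: IO ()
main =
  -- The input is a made-up sample document.
  case DOM.parse "<p key='val'><a><hr/>hi</a><b>sup</b>hi</p>" of
    Left err   -> putStrLn ("parse error: " ++ show err)
    Right root -> do
      mapM_ BS.putStrLn (elementNames root)  -- one element name per line
      print (DOM.attributes root)            -- attributes of the root element
```

The input is a strict ByteString, so a file would typically be read
with Data.ByteString.readFile before being passed to the parser.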

It does not support some XML features (namespaces, entity
normalization, etc.), but these can be added as postprocessing after
the DOM is built.
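
As a rough sketch of what such a postprocessing pass might look like,
the following decodes the five predefined XML entities in text and
attribute values. The Node and Content types here are a hypothetical
stand-in using String, not Xeno's actual types, and numeric character
references (such as &#38;) are deliberately left untouched.

```haskell
module Main (main) where

import Data.List (isPrefixOf)

-- Hypothetical stand-in for a parsed DOM; not Xeno's actual Node type.
data Node = Node
  { nodeName     :: String
  , nodeAttrs    :: [(String, String)]
  , nodeContents :: [Content]
  } deriving (Eq, Show)

data Content = Element Node | Text String
  deriving (Eq, Show)

-- The five entities predefined by the XML specification.
predefined :: [(String, Char)]
predefined =
  [ ("&amp;", '&'), ("&lt;", '<'), ("&gt;", '>')
  , ("&quot;", '"'), ("&apos;", '\'') ]

-- Replace predefined entity references in a single left-to-right pass;
-- anything else (including unknown entities) is copied unchanged.
decodeEntities :: String -> String
decodeEntities [] = []
decodeEntities s@(c:cs) =
  case [ (ch, drop (length ent) s)
       | (ent, ch) <- predefined, ent `isPrefixOf` s ] of
    ((ch, rest):_) -> ch : decodeEntities rest
    []             -> c : decodeEntities cs

-- Walk the tree, decoding attribute values and text nodes.
normalize :: Node -> Node
normalize (Node n as cs) =
  Node n [ (k, decodeEntities v) | (k, v) <- as ] (map go cs)
  where
    go (Element e) = Element (normalize e)
    go (Text t)    = Text (decodeEntities t)

main :: IO ()
main =
  print (normalize (Node "a" [("title", "x &lt; y")] [Text "Tom &amp; Jerry"]))
```

Because the pass is single-shot, an input like "&amp;lt;" decodes to
"&lt;" rather than "<", which is the behaviour XML expects.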

It is maintained, and there is an as-yet unreleased version that can
parse HTML documents and documents that are not well formed.

Disclosure: I am the current maintainer.
-- 
  Cheers
    Michał J. Gajda

