[Haskell-cafe] Space leak in hexpat-0.20.3/List-0.5.1

oleg at okmij.org oleg at okmij.org
Wed May 1 07:57:12 CEST 2013


Wren Thornton wrote:
> So I'm processing a large XML file which is a database of about 170k
> entries, each of which is a reasonable enough size on its own, and I only
> need streaming access to the database (basically printing out summary data
> for each entry). Excellent, sounds like a job for SAX.

Indeed a good job for a SAX-like parser. XMLIter is exactly such a
parser, and it generates an event stream quite like that of Expat. Also,
your application is somewhat similar to the following
        http://okmij.org/ftp/Haskell/Iteratee/XMLookup.hs

So, superficially, XMLIter seems to be up to the task. Can you
explain which elements you are counting? BTW, xml_enum already checks
the well-formedness of XML (including the start-end tag
balance, and many more criteria). One can assume that the XMLStream
corresponds to a well-formed document and count only the desired
start tags (or end tags, for that matter).
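
To illustrate the counting idea, here is a minimal sketch. It does not
use the XMLIter API: XMLEvent is a hypothetical stand-in for a SAX-style
event type, and countStartTags and sampleEvents are names made up for
this example. It assumes the events come from an already validated,
well-formed document, so counting start tags alone is enough.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a SAX-style event.
-- This is NOT the actual XMLStream element type of XMLIter.
data XMLEvent
  = StartTag String
  | EndTag String
  | CharData String
  deriving (Show, Eq)

-- Count the start tags with the given name.
-- foldl' forces the Int accumulator at every step, so no chain of
-- unevaluated (+1) thunks builds up while the events are consumed.
countStartTags :: String -> [XMLEvent] -> Int
countStartTags name = foldl' step 0
  where
    step n (StartTag t) | t == name = n + 1
    step n _                        = n

-- A tiny hand-written event sequence for a well-formed document:
-- <db><entry>a</entry><entry></entry></db>
sampleEvents :: [XMLEvent]
sampleEvents =
  [ StartTag "db"
  , StartTag "entry", CharData "a", EndTag "entry"
  , StartTag "entry", EndTag "entry"
  , EndTag "db"
  ]
```

Here countStartTags "entry" sampleEvents evaluates to 2. Since the
tags are balanced, counting EndTag "entry" instead would give the same
number.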





