[Haskell-cafe] Space leak in hexpat-0.20.3/List-0.5.1
oleg at okmij.org
Wed May 1 07:57:12 CEST 2013
Wren Thornton wrote:
> So I'm processing a large XML file which is a database of about 170k
> entries, each of which is a reasonable enough size on its own, and I only
> need streaming access to the database (basically printing out summary data
> for each entry). Excellent, sounds like a job for SAX.
Indeed a good job for a SAX-like parser. XMLIter is exactly such a
parser, and it generates an event stream quite like that of Expat. Also,
your application is somewhat similar to the following
http://okmij.org/ftp/Haskell/Iteratee/XMLookup.hs
So, superficially, it seems XMLIter should be up to the task. Can you
explain which elements you are counting? BTW, xml_enum already checks
for the well-formedness of XML (including the start-end tag
balance, and many more criteria). One can assume that the XMLStream
corresponds to a well-formed document and only count the desired
start tags (or end tags, for that matter).
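
To make the last point concrete, here is a minimal sketch of counting
only the start tags over an event stream that is assumed to be already
well-formed. The Event type, countStartTags and sampleEvents are
hypothetical names made up for this illustration; they are not the
XMLStream type or any other part of XMLIter, and a plain list stands in
for the real stream. The strict left fold keeps the running count
evaluated, so no chain of thunks builds up as events are consumed.

```haskell
import Data.List (foldl')

-- Hypothetical event type standing in for a SAX-like event stream.
-- This is an assumption for illustration, NOT XMLIter's XMLStream.
data Event
  = StartTag String   -- opening tag with its element name
  | EndTag String     -- closing tag with its element name
  | CharData String   -- character data between tags
  deriving (Eq, Show)

-- Count the start tags with the given element name.
-- foldl' forces the Int accumulator at every step, so the count
-- is kept as a number rather than as a growing chain of thunks.
countStartTags :: String -> [Event] -> Int
countStartTags name = foldl' step 0
  where
    step n (StartTag t) | t == name = n + 1
    step n _                        = n

-- A tiny, well-formed sample stream (hypothetical data).
sampleEvents :: [Event]
sampleEvents =
  [ StartTag "db"
  , StartTag "entry", CharData "a", EndTag "entry"
  , StartTag "entry", CharData "b", EndTag "entry"
  , EndTag "db"
  ]
```

Because the stream is assumed well-formed, there is no need to track
the matching end tags: countStartTags "entry" sampleEvents counts each
entry exactly once.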