HaXml, memory usage and segmentation fault
Joe English
jenglish@flightlab.com
Wed, 31 Oct 2001 17:37:50 -0800
An update on Dmitry's problems with HaXml memory usage:
+ Compiling HaXml and the driver program with ghc -O helps a *lot*.
+ Using the version of HaXml that comes preinstalled with
GHC (-package text) helps even more. There is a slight difference
in the 'Pretty' module (which is used to print the output) between
the two versions.
+ I wrote an adapter that converts my parser's XML representation
into HaXml's, so you can use it as a drop-in replacement.
This helps some, but not enough. The heap profile using
HaXml 1.02 has two large humps: the first from parsing the
input, and the second from pretty-printing the output.
(With the GHC version of HaXml, the second hump is about half
as tall as with the "official" HaXml version.)
With the new parser, only the smaller hump remains.
+ Figuring that using a pretty-printer is overkill, I replaced
it with a quick hack that converts the HaXml representation
_back_ into my representation and feeds it to a serializer
that I had previously written. This improves things some more:
the identity transformation 'processXmlWith keep' now has a
flat heap profile.
+ Unfortunately, Dmitry's original program still has a space leak.
I suspect that the HaXml combinators (or, more likely,
the HaXml internal representation) are not as space-efficient
as I had originally thought: when I rewrote Dmitry's test
case to use the new parser's internal representation directly,
I again got a flat heap profile, so there doesn't
seem to be anything wrong with the structure of the
original program.
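
To illustrate the "serializer instead of pretty-printer" idea from the
list above, here is a minimal sketch. It is not the code described in
this message: the Node type, escape, and serialize are hypothetical
stand-ins, and they are neither HaXml's representation nor the new
parser's. The point is only that a plain serializer makes no layout
decisions, so it can emit each piece of output as soon as it reaches
the corresponding node.

```haskell
-- Hypothetical, simplified XML tree (an assumption for illustration;
-- not HaXml's types and not the new parser's representation).
data Node
  = Elem String [(String, String)] [Node]  -- tag name, attributes, children
  | Text String                            -- character data

-- Escape the characters that are special in XML text and attribute values.
escape :: String -> String
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc '"' = "&quot;"
    esc c   = [c]

-- Serialize via ShowS (String -> String): output is built by function
-- composition and produced lazily, left to right, with no indentation
-- or line-breaking to compute.
serialize :: Node -> ShowS
serialize (Text s) = showString (escape s)
serialize (Elem name attrs kids) =
    showChar '<' . showString name . attrsS . showChar '>'
  . foldr (.) id (map serialize kids)
  . showString "</" . showString name . showChar '>'
  where
    attrsS = foldr (.) id (map attr attrs)
    attr (k, v) =
      showChar ' ' . showString k . showString "=\""
                   . showString (escape v) . showChar '"'

main :: IO ()
main = putStr (serialize doc "\n")
  where
    doc = Elem "a" [("href", "x&y")] [Text "1 < 2", Elem "b" [] []]
    -- prints: <a href="x&amp;y">1 &lt; 2<b></b></a>
```

Whether a serializer like this actually runs in constant space also
depends on the input tree being produced lazily and not retained
elsewhere, which is what the heap profiles above are measuring.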
The code will be ready to release Real Soon Now;
I'll keep you posted.
--Joe English
jenglish@flightlab.com