HaXml, memory usage and segmentation fault

Joe English jenglish@flightlab.com
Fri, 26 Oct 2001 13:15:10 -0700

Dmitry Astapov wrote:

> I have Hugs version February 2001, HaXml version 1.02 and this program:
>  [...]
> This program can process following file:
> <?xml version='1.0'?>
> <invoice>
>     [... one <customer> containing two <contract>s ... ]
> </invoice>
> Now increase the number of <customer>s to 10, and the number of <contract>s
> within each customer to 999. After that, "runhugs -h6000000 translate.hs
> invoice.xml invoice.html" dumps core :(
> What's the reason: bug in hugs, bug in HaXml, or my own bad programming
> techniques?

More an inappropriate use of Hugs than a bug -- 10 <customer>s with 999
<contract>s each makes a moderately large input file, and
the Hugs interpreter just isn't designed to work with large inputs.
Try compiling the program (with GHC, say) instead.

The other issue is that HaXml's XML parser is insufficiently lazy
(although the rest of HaXml has very nice strictness properties).
For instance, there's no reason why your program
shouldn't run in near-constant space, but due to the way the
parser is structured it won't begin producing any output
until the entire input document has been read.
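The space difference is easy to see in miniature. Here is a small sketch in
plain Haskell (no HaXml involved) contrasting a lazy producer, which can emit
output before it has consumed all of its input, with a strict one, which --
like a parser that builds the complete document tree before returning --
produces nothing until the whole input has been forced:

    import Data.List (foldl')

    -- Lazy: streams, so output appears incrementally and the
    -- already-consumed input can be garbage-collected.
    lazyDouble :: [Int] -> [Int]
    lazyDouble = map (*2)

    -- Strict: forces the entire input before yielding anything,
    -- so the whole list is live in the heap at once.
    strictDouble :: [Int] -> [Int]
    strictDouble xs =
      let n = foldl' (+) 0 xs   -- walks the whole input first
      in  n `seq` map (*2) xs

    main :: IO ()
    main = do
      print (take 3 (lazyDouble [1..]))       -- fine even on infinite input
      print (take 3 (strictDouble [1..1000])) -- input must be finite

Both calls print [2,4,6], but only the lazy one runs in constant space; the
strict one retains its entire input, which is essentially what a
build-the-whole-tree parser does to a large XML document.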

Try the identity transform 'main = processXmlWith keep'
on your sample document and see if that runs out of heap too.
If so, there's not much you can do short of replacing the
HaXml parser.
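For reference, that identity transform as a complete program might look like
the following. The module names are my assumption about the HaXml 1.0x layout
and may differ in your installation -- check where 'processXmlWith' and 'keep'
live in your copy:

    -- Identity transform: parse the input document, apply the 'keep'
    -- filter (which passes everything through unchanged), and print
    -- the result.  Module names below are assumptions.
    module Main where

    import XmlLib (processXmlWith)   -- assumed module name
    import XmlCombinators (keep)     -- assumed module name

    main :: IO ()
    main = processXmlWith keep

If even this runs out of heap on your invoice file, the transformation code
is exonerated and the parser itself is the bottleneck.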

--Joe English