HaXml, memory usage and segmentation fault

Dmitry Astapov adept@umc.com.ua
27 Oct 2001 01:44:50 +0300

>> What's the reason: bug in hugs, bug in HaXml, or my own bad programming
>> techniques?

 JE> More an inappropriate use of Hugs -- 10 <customer>s with 999
 JE> <contract>s each is a moderately large input file,
Almost 6 MB.

 JE> and the Hugs interpreter just isn't designed to work with large
 JE> inputs.  Try compiling the program instead.
Well, ghc-5.02 seems to dislike something inside XmlLib.hs: it could not
find the interface definition files for modules IOExts .. I plan to look
into it more deeply, though.

 JE> The other issue is that HaXml's XML parser is insufficiently lazy
 JE> (although the rest of HaXml has very nice strictness properties).  For
 JE> instance, there's no reason why your program shouldn't run in
 JE> near-constant space, but due to the way the parser is structured it
 JE> won't begin producing any output until the entire input document has
 JE> been read.
I suspected as much, and your comment encouraged me to look more deeply into
the code. And yes, it seems that examples like mine simply do not fit :(
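
To make the laziness point concrete, here is a minimal sketch. It is a toy
and not HaXml's parser: lazyWords and checkedWords are names made up for
this illustration. The first function hands out results while the input is
still being read; the second has to see all of the input before it can
return anything, which is the behaviour described above.

```haskell
module Main where

import Data.Char (isSpace)

-- Hypothetical toy "parser": yields each word as soon as it has been
-- scanned, so a consumer can run in near-constant space.
lazyWords :: String -> [String]
lazyWords s = case dropWhile isSpace s of
  "" -> []
  s' -> let (w, rest) = break isSpace s'
        in  w : lazyWords rest

-- Hypothetical checked variant: the Just/Nothing decision depends on
-- every word, so nothing is returned until the whole input is consumed.
checkedWords :: String -> Maybe [String]
checkedWords = mapM check . lazyWords
  where
    check w
      | all (/= '<') w = Just w
      | otherwise      = Nothing

main :: IO ()
main = do
  -- Works even on endless input: only three words are ever scanned.
  print (take 3 (lazyWords (cycle "contract ")))
  -- The checked variant needs finite input; on the endless string above
  -- it would never produce a result.
  print (fmap (take 3) (checkedWords (concat (replicate 1000 "contract "))))
```

Both calls print the same three words, but only the first could cope with
an input too large to hold in the heap.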

 JE> Try the identity transform 'main = processXmlWith keep' on your sample
 JE> document and see if that runs out of heap too.  If so, there's not
 JE> much you can do short of replacing the HaXml parser.

I got:

runhugs98 +sgt -h5000000 translate_invoices.hs invoice.xml invoice_small.html
runhugs: Error occurred
(47153895 reductions, 79953374 cells, 23 garbage collections)
{{Gc:3812956}}ERROR - Control stack overflow

I tried to put several "observe" statements in the code, but they seem to
be ignored in the case of "Control stack overflow".

Dmitry Astapov //ADEpt                               E-mail: adept@umc.com.ua
GPG KeyID/fprint: F5D7639D/CA36 E6C4 815D 434D 0498  2B08 7867 4860 F5D7 639D