[Haskell-cafe] Re: Re: hxt memory useage

Matthew Pocock matthew.pocock at ncl.ac.uk
Mon Jan 28 18:18:00 EST 2008


On Monday 28 January 2008, Rene de Visser wrote:
> It would be nice if HXT was incremental even when you are processing the
> whole tree.
>
> If I remember correctly, the data type of the tree in HXT is something like
>
> data Tree = Tree NodeData [Tree]
>
> which means that already processed parts of the tree can't be garbage
> collected because the parent node is holding onto them.
>
> If instead it was
>
> data Tree = Tree NodeData (IORef [Tree])
>
> We could remove each subtree as it was processed (well just before would
> probably be necessary, and we would need to rely on blackholing to remove
> the reference on the stack). This would perhaps allow already processed
> subtrees to be garbage collected. Together with the lazy evaluation this
> could lead to quite good memory usage.
>
> Rene.

Not so sure about this. For streaming processing, it would be nicer to have
something like StAX, with a stack of already entered elements kept about as
book-keeping (the tags + attribute sets to root). Let's face it, if you sign
up to a document model, you are signing up to a document and shouldn't be
surprised when it sits in memory.

I think the 'right' solution depends, at least in part, on the problem to be
solved. I'd be upset if we moved to something more complex where my code
breaks because something accidentally garbage collected data that I need to
back-track to.
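
To make the worry concrete, here is a minimal sketch of the IORef-based tree
from the quoted proposal, with String standing in for NodeData. The helper
names (mkTree, consume, childLabels) are assumptions for illustration, not
HXT functions. Once a traversal has detached the children, code that
back-tracks to the same node finds nothing there.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- The proposed shape: children live behind a mutable reference.
-- String stands in for NodeData here.
data Tree = Tree String (IORef [Tree])

-- Hypothetical helper: build a node from a label and its children.
mkTree :: String -> [Tree] -> IO Tree
mkTree label kids = Tree label <$> newIORef kids

-- Visit the tree in pre-order, detaching each node's children before
-- descending so that processed subtrees become collectable.
consume :: Tree -> IO [String]
consume (Tree label ref) = do
  kids <- readIORef ref
  writeIORef ref []          -- drop the parent's reference to the children
  rest <- mapM consume kids
  pure (label : concat rest)

-- Hypothetical helper: labels of the children currently attached.
childLabels :: Tree -> IO [String]
childLabels (Tree _ ref) = map (\(Tree l _) -> l) <$> readIORef ref
```

Calling childLabels on a node before consume returns its children; calling
it afterwards returns an empty list, which is exactly the kind of silent
breakage a back-tracking traversal would run into.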

Matthew

