[Haskell-cafe] Re: Re: hxt memory useage
Matthew Pocock
matthew.pocock at ncl.ac.uk
Mon Jan 28 18:18:00 EST 2008
On Monday 28 January 2008, Rene de Visser wrote:
> It would be nice if HXT was incremental even when you are processing the
> whole tree.
>
> If I remember correctly, the data type of the tree in HXT is something like
>
> data Tree = Tree NodeData [Tree]
>
> which means that already processed parts of the tree can't be garbage
> collected because the parent node is holding onto them.
>
> If instead it was
>
> data Tree = Tree NodeData (IORef [Tree])
>
> We could remove each subtree as it was processed (well just before would
> probably be necessary, and we would need to rely on blackholing to remove
> the reference on the stack). This would perhaps allow already processed
> subtrees to be garbage collected. Together with the lazy evaluation this
> could lead to quite good memory usage.
>
> Rene.
Not so sure about this. For streaming processing, it would be nicer to have
something like StAX with a stack of already entered elements kept about as
book-keeping (the tags + attribute sets to root). Let's face it, if you sign
up to a document model, you are signing up to a document and shouldn't be
surprised when it sits in memory.
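
To make the StAX-like idea concrete, here is a minimal sketch of a pull-style event stream where the only state kept is the stack of currently open elements (tag name plus attributes, innermost first). The Event type and the names step and textWithPath are made up for illustration; this is not HXT's API or any existing library's.

```haskell
-- Hypothetical event type, loosely modelled on StAX events.
data Event
  = StartElement String [(String, String)]  -- tag name, attributes
  | EndElement String
  | Characters String
  deriving (Show, Eq)

-- The book-keeping: open elements from the innermost one up to the root.
type Path = [(String, [(String, String)])]

-- Update the stack of open elements for a single event.
step :: Path -> Event -> Path
step path (StartElement name attrs) = (name, attrs) : path
step path (EndElement _)            = drop 1 path
step path (Characters _)            = path

-- Pair each text chunk with the names of its enclosing elements, root
-- first. The result list is produced lazily, so events that have already
-- been consumed are not retained by this function.
textWithPath :: [Event] -> [([String], String)]
textWithPath = go []
  where
    go _ [] = []
    go path (e : es) =
      let path' = step path e
      in case e of
           Characters t -> (reverse (map fst path), t) : go path' es
           _            -> go path' es
```

The memory held at any point is proportional to the nesting depth rather than to the size of the document, which is the trade you make for not being able to look back at siblings you have already passed.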
I think the 'right' solution at least in part goes with the problem to be
solved. I'd be upset if we moved to something more complex where my code
breaks because something accidentally garbage collected data that I need to
back-track to.
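
A small sketch of the kind of breakage I mean, using the IORef-based type proposed above. NodeData is replaced by a plain String label, and mkTree and consume are hypothetical helpers; none of this is HXT's actual definition.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- The proposed shape, with a String standing in for NodeData.
data Tree = Tree String (IORef [Tree])

-- Hypothetical helper: build a node from a label and its children.
mkTree :: String -> [Tree] -> IO Tree
mkTree label kids = Tree label <$> newIORef kids

-- Collect labels depth-first, clearing each node's child list just
-- before the children are visited so the parent no longer retains them.
consume :: Tree -> IO [String]
consume (Tree label ref) = do
  kids <- readIORef ref
  writeIORef ref []
  rest <- mapM consume kids
  pure (label : concat rest)
```

Run consume twice on the same root and the second pass sees only the root label, because the first pass emptied every child list. Any code that needs to revisit an earlier subtree has to know that nobody has walked it yet.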
Matthew