[Haskell-cafe] Re: Lazy XML handling

MR K P SCHUPKE k.schupke at imperial.ac.uk
Mon May 17 12:37:31 EDT 2004


>I'd say it's still less easy (through clearly possible) for an application
>to adopt alternative traversal patterns.

Yes, thats true, but thats the order in which the tokenizer outputs
tags... This is the order tags must be processed if you wish to 
stream - otherwise you need a heap of tags to keep the ones you are
not ready to process from the tokenizer. In effect we can transform
an algorithm that requires a different traversal into one that uses
this traversal, at a possible slight increase in complexity. The gain
is sequential stream based processing.

Can you give me an example where processing HAS to be in a different order?

In effect HaXml's representation actually simplifies to the above
anyway... if you think about...

[ A [B,C,D [E,F], G] H [I,J] ] => [(0,A),(1,B),(1,C),(1,D),(2,E),(2,F),(1,G),(0,H),(1,I),(1,J)]

The nested lists does not allow any extra flexibility in traversal, at the cost of
greater complexity. The one case it is simpler is if you want to look up a single
node by its 'tree address' - This is more complex with the list-rep, as you
have to count nodes at a given level, skipping irrelevent nodes...

The list rep is very good at data extraction, if you want to extract a sub-tree
of a given type you just ignore all tags until you find one of the correct
type, then output all tags until the depth returns to the level of the initial
tag. 

Regards,
Keean.




More information about the Haskell-Cafe mailing list