[Haskell-cafe] Capturing the parent element as I parse XML using parsec

Richard O'Keefe ok at cs.otago.ac.nz
Mon Jul 30 00:44:11 CEST 2012


On 29/07/2012, at 6:21 PM, C K Kashyap wrote:
> I am struggling with an idea though - How can I capture the parent element of each element as I parse? Is it possible or would I have to do a second pass to do the fixup?

Why do you *want* the parent element of each element?
One of the insanely horrible aspects of the Document Object Model is that every
element is nailed in place by pointers everywhere, with the result that you
cannot share elements, and even moving an element is painful.
I still do a fair bit of SGML/XML processing in C using a "Document Value Model"
library that uses hash consing, and it's so much easier it isn't funny.

While you are traversing a document tree it is useful to keep track of the
path from the root.  Given

    data XML
       = Element String [(String,String)] [XML]
       | Text String

you do something like

    traverse :: ([XML] -> [a] -> a) -> ([XML] -> String -> a) -> XML -> a
    traverse f g xml = loop [] xml
      where loop ancs (Text s)           = g ancs  s
            loop ancs e@(Element _ _ ks) = f ancs' (map (loop ancs') ks)
                                           where ancs' = e:ancs

(This is yet another area where Haskell's non-strictness pays off.)
If you do that, then you have the parent information available without
it being stored in the tree.
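
As a rough illustration of how the ancestor list might be used, here is a
small sketch built on the definitions above.  Note that f is handed a list
whose head is the element itself (its parent comes second), while g is
handed a list whose head is the text node's parent.  The names name,
textPaths and sample are made up for this example, and the Prelude import
is only there because current GHC Preludes export an unrelated traverse.

```haskell
-- Recent Preludes export a different 'traverse'; hide it so ours is unambiguous.
import Prelude hiding (traverse)

data XML
   = Element String [(String,String)] [XML]
   | Text String

traverse :: ([XML] -> [a] -> a) -> ([XML] -> String -> a) -> XML -> a
traverse f g xml = loop [] xml
  where loop ancs (Text s)           = g ancs s
        loop ancs e@(Element _ _ ks) = f ancs' (map (loop ancs') ks)
          where ancs' = e:ancs

-- Hypothetical helper: the tag name of a node ("#text" for text nodes).
name :: XML -> String
name (Element n _ _) = n
name (Text _)        = "#text"

-- Hypothetical example: pair every text node with the path of element
-- names leading to it.  The ancestor list is nearest-first, so it is
-- reversed to print the path from the root downwards.
textPaths :: XML -> [(String, String)]
textPaths = traverse (\_ kids -> concat kids)
                     (\ancs s -> [(path ancs, s)])
  where path = concatMap (\e -> '/' : name e) . reverse

-- Hypothetical sample document.
sample :: XML
sample = Element "doc" []
  [ Element "title" [] [Text "Hello"]
  , Element "body" [("class","main")]
      [ Element "p" [] [Text "one"], Text "two" ]
  ]
```

With these definitions, textPaths sample evaluates to
[("/doc/title","Hello"),("/doc/body/p","one"),("/doc/body","two")],
and no node in the tree ever had to hold a pointer to its parent.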






