[Haskell-cafe] Incremental XML parsing with namespaces?
Malcolm Wallace
malcolm.wallace at cs.york.ac.uk
Mon Jun 8 16:44:35 EDT 2009
On 8 Jun 2009, at 19:39, John Millikin wrote:
> + HaXml and hexpat seem to disregard namespaces entirely -- that is,
> the root element is parsed to "doc" instead of
> ("org:myproject:mainns", "doc"), and the second child is "x:ref"
> instead of ("org:myproject:otherns", "ref").
Yes, HaXml makes no special effort to deal with namespaces. However,
that does not mean that dealing with namespaces is "impossible" - it
just requires a small amount of post-processing, that is all.
For instance, it would not be difficult to start from the SAX-like
parser
    http://hackage.haskell.org/packages/archive/HaXml/1.19.7/doc/html/Text-XML-HaXml-SAX.html
taking e.g. a constructor value
    SaxElementOpen Name [Attribute]
and converting it to your corresponding constructor value
    EventElementBegin Namespace LocalName [Attribute]
Just filter the [Attribute] of the first type for the attribute name
"xmlns", and pull that attribute value out to become your new
Namespace value.
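A minimal sketch of that first step might look like the following. The
types here are simplified stand-ins written for illustration, not the
real HaXml definitions (HaXml's attribute values are richer than a plain
String), and convertOpen is a hypothetical name. It handles only a
default-namespace declaration on the element itself.

```haskell
import Data.List (partition)

-- Simplified stand-ins for the types mentioned above. These are
-- assumptions for illustration, not the actual HaXml definitions.
type Name      = String
type Namespace = String
type LocalName = String
type Attribute = (Name, String)

data SaxElement = SaxElementOpen Name [Attribute]
  deriving (Eq, Show)

data XMLEvent = EventElementBegin Namespace LocalName [Attribute]
  deriving (Eq, Show)

-- Pull a default-namespace declaration (xmlns="...") out of the
-- attribute list and use its value as the element's namespace.
-- Elements without such an attribute get the empty namespace here;
-- inheritance from enclosing elements and prefixed names are ignored.
convertOpen :: SaxElement -> XMLEvent
convertOpen (SaxElementOpen name attrs) =
  EventElementBegin ns name rest
  where
    (decls, rest) = partition ((== "xmlns") . fst) attrs
    ns = case decls of
           ((_, uri) : _) -> uri
           []             -> ""
```

The remaining attributes are passed through unchanged, so the
declaration itself no longer appears in the output event.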
Obviously there is a bit more to it than that, since namespace
*defining* attributes, like your example xmlns:x="...", have a
lexical scope. You will need some kind of state to track the scope,
possibly in the parser itself, or again possibly in a post-processing
step over the list of output XMLEvents.
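As a rough sketch of the post-processing variant, one could carry a
stack of prefix bindings over the event list, pushing on each open tag
and popping on each close tag. Everything below is an assumption made
for illustration: the types are simplified stand-ins rather than HaXml's
own, the names (Scope, resolveAll, and so on) are hypothetical, and
attribute names are left unresolved.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe, isNothing, mapMaybe)

-- Simplified stand-in types (assumptions, not the HaXml definitions).
type Attribute = (String, String)

data SaxElement
  = SaxElementOpen String [Attribute]
  | SaxElementClose String
  deriving (Eq, Show)

data XMLEvent
  = EventElementBegin String String [Attribute]  -- namespace, local name
  | EventElementEnd String String
  deriving (Eq, Show)

-- Prefix-to-URI bindings in force; "" stands for the default namespace.
-- Newer bindings are prepended, so lookup finds the innermost one.
type Scope = [(String, String)]

-- Recognise xmlns="..." and xmlns:p="..." declarations.
declaration :: Attribute -> Maybe (String, String)
declaration (name, uri)
  | name == "xmlns" = Just ("", uri)
  | otherwise       = fmap (\p -> (p, uri)) (stripPrefix "xmlns:" name)

-- Split "x:ref" into ("x", "ref"); unprefixed names get prefix "".
splitQName :: String -> (String, String)
splitQName q = case break (== ':') q of
  (p, ':' : local) -> (p, local)
  _                -> ("", q)

-- Unbound prefixes resolve to the empty namespace in this sketch.
resolve :: Scope -> String -> (String, String)
resolve scope q = (fromMaybe "" (lookup p scope), local)
  where (p, local) = splitQName q

-- Walk the events, keeping one Scope per currently open element.
resolveAll :: [SaxElement] -> [XMLEvent]
resolveAll = go [[]]
  where
    top (s : _) = s
    top []      = []

    go _ [] = []
    go stack (SaxElementOpen q attrs : rest) =
      let scope       = mapMaybe declaration attrs ++ top stack
          others      = filter (isNothing . declaration) attrs
          (ns, local) = resolve scope q
      in EventElementBegin ns local others : go (scope : stack) rest
    go stack (SaxElementClose q : rest) =
      let (ns, local) = resolve (top stack) q
      in EventElementEnd ns local : go (drop 1 stack) rest
```

Because each open tag pushes its own scope and the matching close tag
pops it, a declaration made on a child element stops applying once that
child is closed, which is the lexical scoping described above.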
Regards,
Malcolm