[Haskell-cafe] Incremental XML parsing with namespaces?
John Millikin
jmillikin at gmail.com
Mon Jun 8 18:04:35 EDT 2009
On Mon, Jun 8, 2009 at 1:44 PM, Malcolm Wallace
<malcolm.wallace at cs.york.ac.uk> wrote:
> Yes, HaXml makes no special effort to deal with namespaces. However, that
> does not mean that dealing with namespaces is "impossible" - it just
> requires a small amount of post-processing, that is all.
>
> For instance, it would not be difficult to start from the SAX-like parser
>
> http://hackage.haskell.org/packages/archive/HaXml/1.19.7/doc/html/Text-XML-HaXml-SAX.html
>
> taking e.g. a constructor value
> SaxElementOpen Name [Attribute]
>
> and converting it to your corresponding constructor value
> EventElementBegin Namespace LocalName [Attribute]
>
> Just filter the [Attribute] of the first type for the attribute name
> "xmlns", and pull that attribute value out to become your new Namespace
> value.
>
> Obviously there is a bit more to it than that, since namespace *defining*
> attributes, like your example xmlns:x="...", have a lexical scope. You
> will need some kind of state to track the scope, possibly in the parser
> itself, or again possibly in a post-processing step over the list of output
> XMLEvents.
>
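For reference, the post-processing step described above might look roughly
like the sketch below. It is only an illustration: SaxEvent and Event are
simplified stand-ins invented here (plain String pairs for attributes), not
HaXml's actual SaxElement or Attribute types, and resolveEvents is a
hypothetical name. A stack of prefix-to-URI maps tracks the lexical scope of
the xmlns declarations.

```haskell
import qualified Data.Map as Map

-- Simplified stand-in for SAX-style input events (assumption: not HaXml's types).
data SaxEvent
  = SaxOpen String [(String, String)]   -- qualified name, raw attributes
  | SaxClose String
  | SaxText String
  deriving (Eq, Show)

-- Namespace-aware output events, in the shape discussed in the thread.
data Event
  = EventElementBegin String String [(String, String)]  -- namespace URI, local name, attributes
  | EventElementEnd String String
  | EventText String
  deriving (Eq, Show)

-- Maps a prefix to its namespace URI; the empty prefix is the default namespace.
type Scope = Map.Map String String

-- Split "x:foo" into ("x", "foo"); an unprefixed name gets the empty prefix.
splitQName :: String -> (String, String)
splitQName name = case break (== ':') name of
  (prefix, ':' : local) -> (prefix, local)
  _                     -> ("", name)

-- True for xmlns="..." and xmlns:prefix="..." attributes.
isDeclaration :: (String, String) -> Bool
isDeclaration (key, _) = key == "xmlns" || fst (splitQName key) == "xmlns"

-- Extend a scope with the declarations found in one element's attributes.
declare :: Scope -> [(String, String)] -> Scope
declare = foldl step
  where
    step scope (key, value)
      | key == "xmlns" = Map.insert "" value scope
      | otherwise = case splitQName key of
          ("xmlns", prefix) -> Map.insert prefix value scope
          _                 -> scope

-- Resolve a qualified name; an undeclared prefix yields the empty URI
-- (a real implementation would probably report an error instead).
resolve :: Scope -> String -> (String, String)
resolve scope name = (Map.findWithDefault "" prefix scope, local)
  where (prefix, local) = splitQName name

-- Lazily rewrite events, pushing a scope on open and popping it on close.
-- Attribute names are passed through without namespace resolution.
resolveEvents :: [SaxEvent] -> [Event]
resolveEvents = go [Map.empty]
  where
    current (scope : _) = scope
    current []          = Map.empty

    go _ [] = []
    go stack (ev : rest) = case ev of
      SaxOpen name attrs ->
        let scope        = declare (current stack) attrs
            (uri, local) = resolve scope name
            plain        = filter (not . isDeclaration) attrs
        in EventElementBegin uri local plain : go (scope : stack) rest
      SaxClose name ->
        let (uri, local) = resolve (current stack) name
        in EventElementEnd uri local : go (drop 1 stack) rest
      SaxText text -> EventText text : go stack rest
```

Because resolveEvents produces its output lazily, it can in principle sit
behind any parser that yields its events incrementally.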
The interface you linked to doesn't seem to have a way to "resume"
parsing. That is, I can't feed it chunks of text and have it generate
a (ParserState, [Event]) tuple for each chunk. Perhaps this is
possible in Haskell without explicit state management? I've tried to
write a test application to listen on a socket and print events as they
arrive, but with no luck.
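To make the shape of the interface I mean concrete, here is a toy sketch. It
is not HaXml's API and not a real XML parser: ParserState, Event, feed, and
feedAll are all names made up for illustration, and the tokenizer only
splits on '<' and '>' (it ignores comments, CDATA, and '>' inside attribute
values). The point is just that the state carries any unfinished input over
to the next chunk.

```haskell
import Data.List (mapAccumL)

-- Hypothetical parser state: input received so far that cannot be consumed yet.
newtype ParserState = ParserState String
  deriving (Eq, Show)

-- Toy events: the raw contents of a tag, or a run of character data.
data Event
  = Tag String
  | Text String
  deriving (Eq, Show)

initialState :: ParserState
initialState = ParserState ""

-- Feed one chunk; return the new state and the events completed by this chunk.
-- Character data split across chunks comes out as several Text events.
feed :: ParserState -> String -> (ParserState, [Event])
feed (ParserState pending) chunk = go (pending ++ chunk)
  where
    go "" = (ParserState "", [])
    go input@('<' : rest) = case break (== '>') rest of
      (body, '>' : after) -> emit (Tag body) (go after)
      _                   -> (ParserState input, [])  -- incomplete tag: wait for more input
    go input = case break (== '<') input of
      (text, after) -> emit (Text text) (go after)

    emit ev (state, evs) = (state, ev : evs)

-- Thread the state through a list of chunks and collect every event.
feedAll :: [String] -> [Event]
feedAll chunks = concat (snd (mapAccumL feed initialState chunks))
```

A socket loop would call feed once per received chunk, print the returned
events, and keep the returned ParserState for the next call.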
Manually re-parsing the events isn't attractive, because it would
require writing at least part of the parser manually. I had hoped to
re-use an existing XML parser, rather than writing a new one.