[Haskell-cafe] Re: HXT Namespaces and XPath

Uwe Schmidt si at fh-wedel.de
Tue Apr 6 10:18:46 EDT 2010


Hi Mads,

> In HXT, namespace prefixes bound by an XML document are valid in the
> context of an XPath. How do I avoid that?
> 
> An example program will clarify:
> 
> simpleXml :: String
> simpleXml = "<soap:Body xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\"/>"
> 
> nsEnv :: [(String, String)]
> nsEnv = [ ("s"    , "http://www.w3.org/2003/05/soap-envelope") ]
> 
> evalXPath :: String -> String -> [XmlTree]
> evalXPath xpath xml =
>   runLA ( xread
>           >>> propagateNamespaces
>           >>> getXPathTreesWithNsEnv nsEnv xpath
>         ) xml
> 
> Here:
> 
> evalXPath "//s:Body" simpleXml     ==
> evalXPath "//soap:Body" simpleXml
> 
> Even though I only mention the prefix "s" (and not "soap") in the
> function nsEnv.

When working with namespaces in XML, the prefixes are no longer significant.
After namespace propagation, every name in the XML document is identified
by a qualified name: a pair consisting of the namespace URI and the local part.
The prefixes become irrelevant.
A namespace-aware XPath expression needs a namespace environment,
as given in your example, to construct these qualified names for the names
in the XPath expression.
So the results of both evalXPath calls in your example must be the same.
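
To make the "pair of namespace URI and local part" idea concrete, here is a
small toy model. It is only a sketch and deliberately does not use HXT: the
names NsEnv, QualName and resolve are made up for illustration, and unprefixed
names are simply given an empty namespace URI.

```haskell
-- Toy model only: these are NOT HXT's types or functions.

-- A namespace environment maps prefixes to namespace URIs.
type NsEnv = [(String, String)]

-- A qualified name is the pair (namespace URI, local part); no prefix is stored.
data QualName = QualName { nsUri :: String, localPart :: String }
  deriving (Eq, Show)

-- Resolve a name such as "soap:Body" against an environment.
-- An unbound prefix yields Nothing; an unprefixed name gets an empty URI.
resolve :: NsEnv -> String -> Maybe QualName
resolve env name =
  case break (== ':') name of
    (prefix, ':' : local) -> fmap (\uri -> QualName uri local) (lookup prefix env)
    (local, _)            -> Just (QualName "" local)

soapUri :: String
soapUri = "http://www.w3.org/2003/05/soap-envelope"

-- Bindings as declared in the document and as passed to the XPath evaluation.
docEnv, xpathEnv :: NsEnv
docEnv   = [("soap", soapUri)]
xpathEnv = [("s", soapUri)]
```

In this model, resolve docEnv "soap:Body" == resolve xpathEnv "s:Body" is True:
both sides are the same (URI, local part) pair, and the prefix used to get
there is forgotten.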

> I do not want the XPath to see prefixes declared in the xml-document, as
> it means that two semantically similar XML documents can get different
> results when applied to the same XPath.

If you intend the prefixes to be significant, then you should not work
with namespace propagation. If namespaces are not propagated,
all names occurring in the document are taken as they are. That means
"s:Body" is different from "soap:Body".
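
A minimal sketch of matching on the names "as they are", assuming the hxt
package with its umbrella module Text.XML.HXT.Core (older releases export the
same arrows from Text.XML.HXT.Arrow). The function name bodiesByPrefix is
made up for this example.

```haskell
import Text.XML.HXT.Core

-- Hypothetical helper: select elements by the raw name written in the
-- document, without propagateNamespaces. The prefix is part of the match.
bodiesByPrefix :: String -> String -> [XmlTree]
bodiesByPrefix rawName =
  runLA ( xread
          >>> deep (isElem >>> hasName rawName)
        )
```

With the simpleXml string from your example, bodiesByPrefix "soap:Body" should
find the element, while bodiesByPrefix "s:Body" should find nothing, because
only the literal name is compared.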

There are two ways to solve your task:

1. Don't use namespaces: no propagateNamespaces and no getXPathTreesWithNsEnv;
 match on the prefixes instead. Disadvantage: the prefixes become significant.
 If the prefixes change, the semantics change.

2. Use propagateNamespaces and getXPathTreesWithNsEnv, but select
 all nodes via the namespace URIs, not via the prefixes. If you want to
 output the manipulated XML parts and use standard prefixes for
 the namespaces, then use the namespace manipulation functions
 in http://hackage.haskell.org/packages/archive/hxt/latest/doc/html/Text-XML-HXT-Arrow-Namespace.html
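
The second way can also be written without XPath, by filtering on the
namespace URI and the local part directly. This is only a sketch, assuming
the hxt package and its module Text.XML.HXT.Core (Text.XML.HXT.Arrow in older
releases); the names soapUri and soapBodies are made up for this example.

```haskell
import Text.XML.HXT.Core

soapUri :: String
soapUri = "http://www.w3.org/2003/05/soap-envelope"

-- Hypothetical helper: select Body elements in the SOAP namespace,
-- whatever prefix the document happens to use for that namespace.
soapBodies :: String -> [XmlTree]
soapBodies =
  runLA ( xread
          >>> propagateNamespaces   -- attach namespace URIs to all names
          >>> deep ( isElem
                     >>> hasNamespaceUri soapUri
                     >>> hasLocalPart "Body"
                   )
        )
```

Here no prefix appears in the selection at all, so two documents that differ
only in their choice of prefix should give the same result.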

Cheers,

  Uwe


