Q. about XML support

oleg@pobox.com
Fri, 21 Feb 2003 20:37:59 -0800 (PST)


Joe English wrote:

>     case nodeName node of
>         "html:p" -> ...
>         "html:h1" -> ...
>         "html:pre" -> ...

> The approach I'm thinking of is to let the application programmer
> define an "internal" namespace environment, then rewrite
> element and attribute names in the parsed document to
> use the locally-defined prefixes.

SXML [1] treats namespaces in precisely this way. All element and
attribute names are fully resolved expanded names, that is, they carry
the namespace URI if there is one, e.g.,
	http://www.w3.org/1999/xhtml:p

All the names are symbols (a.k.a. interned strings), which means that
multiple occurrences of the same seemingly long name share all of the
name's characters. However, it is still awkward to read such
names. Therefore, SXML provides for so-called user ns-shortcuts. They
look like XML Namespace prefixes, with an important distinction:
ns-shortcuts are in a one-to-one correspondence with the namespace
URIs. XML Namespace prefixes do not have this property. Furthermore,
ns-shortcuts are determined by the author of the processing
application whereas XML Namespace prefixes are chosen by the author of
a document.
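
To make the distinction concrete, here is a minimal Haskell sketch of the
rewriting step: document-chosen prefixes are resolved to namespace URIs, and
the URIs are then rendered with application-chosen shortcuts. All type and
function names (ExpName, resolve, render, localName) are hypothetical and
are not part of SXML or of any existing library. The sketch covers element
names only, since the default namespace does not apply to attributes.

```haskell
import qualified Data.Map as Map

-- Hypothetical types for this sketch; not SXML's or any library's API.
type Prefix   = String
type URI      = String
type Shortcut = String

-- An expanded name: optional namespace URI plus local part.
data ExpName = ExpName (Maybe URI) String
  deriving (Eq, Show)

-- Split "html:p" into ("html", "p"); an unprefixed name gets prefix "".
splitQName :: String -> (Prefix, String)
splitQName q = case break (== ':') q of
  (pre, ':' : local) -> (pre, local)
  (local, _)         -> ("", local)

-- Resolve an element name against the document's prefix bindings.
-- The key "" stands for the default namespace, if one is declared.
-- An unbound non-empty prefix is an error, reported as Nothing.
resolve :: Map.Map Prefix URI -> String -> Maybe ExpName
resolve env q = case splitQName q of
  ("", local)  -> Just (ExpName (Map.lookup "" env) local)
  (pre, local) -> (\u -> ExpName (Just u) local) <$> Map.lookup pre env

-- Render with the application's shortcuts: one shortcut per URI.
-- A URI without a shortcut is kept in full.
render :: Map.Map URI Shortcut -> ExpName -> String
render _  (ExpName Nothing local)    = local
render sc (ExpName (Just uri) local) =
  Map.findWithDefault uri uri sc ++ ":" ++ local

xhtml :: URI
xhtml = "http://www.w3.org/1999/xhtml"

-- Two documents that bind the same namespace in different ways.
docA, docB :: Map.Map Prefix URI
docA = Map.fromList [("h", xhtml)]   -- xmlns:h="..."
docB = Map.fromList [("", xhtml)]    -- xmlns="..."

-- The application's own choice of shortcut.
appShortcuts :: Map.Map URI Shortcut
appShortcuts = Map.fromList [(xhtml, "html")]

-- Document name -> application-local name.
localName :: Map.Map Prefix URI -> String -> Maybe String
localName env q = render appShortcuts <$> resolve env q
```

With these definitions, localName docA "h:p" and localName docB "p" both
give Just "html:p", so the application can match on "html:p" regardless
of the prefix the document author picked.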

The SXML specification discusses the issue of XML Namespaces in an XML
AST at great length, and points to further discussion of dealing with
namespaces in XPath and XSLT.

Incidentally, SXML is a language-neutral specification. One can easily
use Expat to produce SXML. There are parsers that convert XML and
(possibly ill-formed) HTML to SXML.

The example code [2] demonstrates parsing and unparsing of a
namespace-rich document: a DAML RDF file. DAML (DARPA Agent Markup
Language) ontologies typically use a great number of namespaces. The
example code demonstrates that
	parse . unparse . parse === parse
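
One reading of this property is that the round trip is stated on the parsed
form rather than on the text, because the unparser is free to pick its own
prefixes. The toy Haskell model below illustrates that reading. It is only a
sketch under stated assumptions, not the code in [2]: the Doc and Parsed
types, the generated "nsN" prefixes, and the example.org URIs are all
hypothetical, and parse returns a Maybe, so the composition is written
with >>=.

```haskell
import qualified Data.Map as Map
import Data.List (nub)

-- Toy model (hypothetical): a document is a table of (prefix, URI)
-- bindings plus a flat list of prefixed names; the parsed form is a
-- list of (namespace URI, local name) pairs.
type Doc    = ([(String, String)], [String])
type Parsed = [(String, String)]

-- Resolve every prefixed name; fail on an unprefixed or unbound one.
parse :: Doc -> Maybe Parsed
parse (bindings, names) = traverse resolve names
  where
    env = Map.fromList bindings
    resolve q = case break (== ':') q of
      (pre, ':' : local) -> (\u -> (u, local)) <$> Map.lookup pre env
      _                  -> Nothing

-- Serialize with freshly generated prefixes ns1, ns2, ...
-- The original document's prefixes are not recoverable, and not needed.
unparse :: Parsed -> Doc
unparse ps = (bindings, [ prefixOf Map.! u ++ ":" ++ l | (u, l) <- ps ])
  where
    uris     = nub (map fst ps)
    bindings = zip [ "ns" ++ show i | i <- [1 :: Int ..] ] uris
    prefixOf = Map.fromList [ (u, p) | (p, u) <- bindings ]

-- parse . unparse . parse === parse, lifted through Maybe.
roundTrips :: Doc -> Bool
roundTrips d = (parse d >>= parse . unparse) == parse d

sample :: Doc
sample =
  ( [ ("a", "http://example.org/a"), ("b", "http://example.org/b") ]
  , [ "a:Class", "b:type", "a:Property" ]
  )
```

Here roundTrips sample holds, even though unparse applied to the parsed
sample yields a document that differs from sample in its prefixes.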

[1] http://pobox.com/~oleg/ftp/Scheme/SXML.html

[2] http://cvs.sf.net/cgi-bin/viewcvs.cgi/ssax/SSAX/examples/daml-parse-unparse.scm
(written in Scheme, sorry)