HaXml revisions (continued)
gk at ninebynine.org
Thu Jun 17 13:48:32 EDT 2004
Further to my previous message, I've created a new snapshot of my
developments to the HaXml package. Browseable source code is at . It
relies on a version of Network functions that are at . The main feature
of this release is the addition of a "filter" to perform general entity
substitution in the parsed XML, and some fixes to the parameter entity handling.
The additions have necessitated some further reorganization of the handling
of entity definitions, in order that some of the more subtle examples noted
in the XML specification work as documented.
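To illustrate the idea of an entity-substitution filter, here is a minimal sketch. It does not use the real HaXml definitions: the Content type, the EntityTable alias and the substEntities function are simplified, hypothetical stand-ins, and only the general shape of a filter (one content item in, a list of items out) is taken from the package. The sketch simply stops expanding when a definition refers to itself; a real implementation would presumably report that as an error.

```haskell
-- Simplified stand-ins for the content model (hypothetical, not HaXml's own types).
data Content
  = CElem String [Content]   -- element name and child content
  | CString String           -- character data
  | CRef String              -- unresolved general entity reference, e.g. &name;
  deriving (Eq, Show)

-- A filter maps one content item to zero or more content items.
type CFilter = Content -> [Content]

-- General entity definitions: entity name paired with its replacement content.
type EntityTable = [(String, [Content])]

-- Replace entity references using the table, recursing into elements and
-- into replacement content. Unknown references are left in place.
substEntities :: EntityTable -> CFilter
substEntities table = go []
  where
    go seen (CElem name kids) = [CElem name (concatMap (go seen) kids)]
    go seen ref@(CRef name)
      | name `elem` seen = [ref]   -- self-referential definition: stop expanding
      | otherwise =
          case lookup name table of
            Just body -> concatMap (go (name : seen)) body
            Nothing   -> [ref]
    go _ other = [other]
```

Because the replacement content is itself passed back through the filter, an entity whose definition mentions another entity is expanded fully in one pass.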
The code still needs some tidying up, but it works on all the test cases
I've assembled to date.
Next steps are:
- create a test suite for XML validation, based on the W3C conformance test
suite, and make sure the validation functions (still) work.
- tidy up the code, in particular with a view to pruning out deadwood.
- create a filter to perform namespace processing (that which got me
started on all this in the first place).
There's one change I've made which I'm not entirely happy with: in order
to be able to collect diagnostic information from XML filter (CFilter)
processing, I've added an option CErr to the XML content model. A cleaner
solution would, I think, be to extend the return type of a CFilter value,
but this would be a significant change to the package interface in an area
that I assume is particularly used by applications.
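The trade-off can be sketched roughly as follows. This is not the actual HaXml code: the Content type is cut down to three constructors, and checkNames, diagnostics, CFilterE and checkNamesE are hypothetical names chosen for illustration. The first half shows diagnostics carried in-band through a CErr constructor, the second half shows a filter whose return type carries them separately.

```haskell
-- Cut-down content model with an in-band diagnostic constructor.
data Content
  = CElem String [Content]
  | CString String
  | CErr String              -- diagnostic message embedded in the content
  deriving (Eq, Show)

-- Design A: the filter type is unchanged; errors appear as content.
type CFilter = Content -> [Content]

-- Hypothetical filter that rejects elements with an empty name.
checkNames :: CFilter
checkNames (CElem "" _)   = [CErr "element with empty name"]
checkNames (CElem n kids) = [CElem n (concatMap checkNames kids)]
checkNames other          = [other]

-- Diagnostics are recovered afterwards by walking the filter output.
diagnostics :: [Content] -> [String]
diagnostics = concatMap collect
  where
    collect (CErr msg)     = [msg]
    collect (CElem _ kids) = diagnostics kids
    collect _              = []

-- Design B: the return type carries diagnostics alongside the content.
type CFilterE = Content -> ([String], [Content])

checkNamesE :: CFilterE
checkNamesE (CElem "" _)   = (["element with empty name"], [])
checkNamesE (CElem n kids) =
  let (msgs, kids') = unzip (map checkNamesE kids)
  in  (concat msgs, [CElem n (concat kids')])
checkNamesE other          = ([], [other])
```

Design A leaves every existing filter and combinator type-compatible, at the cost of a constructor that ordinary content consumers must remember to handle. Design B keeps the content model clean but changes the type that every filter combinator works with.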
I've created separate code paths for internal (non-IO) and external
(IO-using) entity processing, but currently external entities are read using
unsafePerformIO. I've thought a little about trying to have the "external"
interfaces return an IO value, and avoid using unsafePerformIO, but I can't
currently see how to do that without sacrificing a very high degree of code
sharing that is currently achieved.
As I write this, I think I've just realized how to do this. If all the
relevant shared code runs in some unspecified monad (i.e. is polymorphic in
a monadic return type), then the non-IO code can use an identity monad and
simply pick out the resulting value as a pure value, but code which depends
on IO will be forced to return an IO value (or use unsafe...). Does this
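A small sketch of the monad-polymorphic approach described above, under stated assumptions: Token, Resolver, expandTokens and the two resolvers are hypothetical names invented here, not part of HaXml, and the "external" resolver just reads a file named in a lookup table. The shared code is written once against an arbitrary monad; the internal path instantiates it at Identity and extracts a pure value, while the external path instantiates it at IO.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Input is a sequence of literal text and entity references.
data Token = Text String | Ref String

-- How to obtain replacement text for an entity name, in some monad m.
type Resolver m = String -> m (Maybe String)

-- Shared code: polymorphic in the monad, so it cannot perform IO by itself.
-- Unresolved references are written back out as &name;.
expandTokens :: Monad m => Resolver m -> [Token] -> m String
expandTokens resolve toks = concat <$> mapM expand toks
  where
    expand (Text s)   = return s
    expand (Ref name) = do
      found <- resolve name
      return (maybe ("&" ++ name ++ ";") id found)

-- Internal entities: a pure table lookup, run in the Identity monad.
internalResolver :: [(String, String)] -> Resolver Identity
internalResolver table name = return (lookup name table)

expandPure :: [(String, String)] -> [Token] -> String
expandPure table = runIdentity . expandTokens (internalResolver table)

-- External entities: replacement text is read from a file, so the
-- result is necessarily an IO value.
externalResolver :: [(String, FilePath)] -> Resolver IO
externalResolver paths name =
  case lookup name paths of
    Just path -> Just <$> readFile path
    Nothing   -> return Nothing

expandIO :: [(String, FilePath)] -> [Token] -> IO String
expandIO paths = expandTokens (externalResolver paths)
```

Here expandPure has an ordinary pure type, while expandIO cannot be given one without resorting to unsafePerformIO; both share expandTokens unchanged.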