gk at ninebynine.org
Tue May 18 12:19:05 EDT 2004
For your information...
I'm starting a push to bring a version of the HaXml library code to the
point of meeting my requirements. First step is to sort out the error
handling and diagnostic reporting. (I've made the changes to the error
handling, and the parser still works but the effect is that error
diagnostics are not very good: everything reverts back to a failure of the
The changes are looking a bit deeper than I first expected, so my current
(a) beef up the test case library. I plan to track down a set of test data
that you mentioned previously. (I can't get to the CVS to find these right
My test cases are based on (1) success/failure to parse a given
collection of data, and (2) pretty-print and text comparison of the result
with a stored (and previously checked) version. At some stage, the test
code may be improved to be less sensitive to formatting detail, but it's
serving my purposes well enough for the present. It's pretty easy to add
new test cases.
(b) refactor the parser combinator code so that the parser combinators
access the token information (e.g. position and other diagnostic
information) via a new type class. This is to keep the combinators as
independent as possible of the lexer details.
(c) enhance the error handling structure that I've already started on so
that I can distinguish between parse failure (failure of an alternative)
and definite failure (e.g. lexing error or tag mismatch), and also to make
sure the right diagnostic information is propagated back to the calling
Apart from adding the extended diagnostic interface I've already created
(modulo some name changes to make it more obvious what it does), I plan to
keep the existing external interface unchanged.
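As a rough illustration of points (b) and (c), here is a minimal sketch of how a token type class and a three-way parse result might fit together. Every name in it (Pos, Token, Result, Parser, orElse, satisfy, commit, SimpleTok) is hypothetical and chosen for this example only; none of it is taken from HaXml or HuttonMeijerWallace.hs.

```haskell
-- Sketch only: all names here are hypothetical, not HaXml's API.

-- | Source position carried by a token.
data Pos = Pos { posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

-- | The only things the combinators need to know about a token.
-- Any lexer's token type can be plugged in by giving an instance.
class Token t where
  tokenPos  :: t -> Pos
  tokenText :: t -> String

-- | Three-way result: a soft failure lets another alternative be
-- tried, while a definite failure aborts the whole parse.
data Result a
  = Ok a
  | NoMatch String   -- failure of an alternative
  | Fatal String     -- definite failure, e.g. lexing error or tag mismatch
  deriving (Eq, Show)

newtype Parser t a = Parser { runParser :: [t] -> Result (a, [t]) }

-- | Try p; fall back to q only if p failed softly.
orElse :: Parser t a -> Parser t a -> Parser t a
orElse p q = Parser $ \ts -> case runParser p ts of
  NoMatch _ -> runParser q ts
  r         -> r

-- | Accept one token satisfying the predicate; on a mismatch, report
-- the position obtained through the Token class.
satisfy :: Token t => String -> (t -> Bool) -> Parser t t
satisfy what ok = Parser $ \ts -> case ts of
  (t:rest) | ok t -> Ok (t, rest)
  (t:_)           -> NoMatch (describe (tokenPos t) ++ ": expected " ++ what
                              ++ ", found " ++ show (tokenText t))
  []              -> NoMatch ("end of input: expected " ++ what)

-- | Promote a soft failure to a definite one, so that no further
-- alternatives are tried.
commit :: Parser t a -> Parser t a
commit p = Parser $ \ts -> case runParser p ts of
  NoMatch msg -> Fatal msg
  r           -> r

describe :: Pos -> String
describe (Pos l c) = "line " ++ show l ++ ", column " ++ show c

-- | Collapse to an (Either String a) value for callers that only
-- want a diagnostic message or a result.
toEither :: Result (a, [t]) -> Either String a
toEither (Ok (x, _)) = Right x
toEither (NoMatch m) = Left m
toEither (Fatal m)   = Left m

-- | A stand-in token type, purely for demonstration.
data SimpleTok = SimpleTok Pos String
  deriving (Eq, Show)

instance Token SimpleTok where
  tokenPos  (SimpleTok p _) = p
  tokenText (SimpleTok _ s) = s
```

The combinators only touch tokens through the class methods, so the lexer's token representation can change without affecting them. Wrapping a parser in commit marks the point after which a mismatch is reported as an error rather than treated as "try the next alternative".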
(BTW, I've noticed some situations where I think the current code is not
fully XML compliant with respect to recognizing alternative case variants
such as XML, Xml, xMl, xml, etc. I'll try and add comments to them as I
www.haskell.org seems to be down, so I can't locate the CVS to find your
But I've grabbed the W3C XML test suite, and added some 300 test cases
based on James Clark's XMLTest collection for valid and not-well-formed
stand-alone documents. So far, I'm just performing parse/fail tests, not
checking the output. I'm getting about 6 failures for valid documents, and
lots of failures to reject not-well-formed documents. It also appears that
formfeeds in the input document cause the parser all sorts of grief. The
valid-document failures are all to do with odd characters (did you notice
that ':' is a valid attribute name? I didn't!)
Anyway, I reckon the c. 200 remaining tests probably make a fairly good
regression test suite for my refactoring work to proceed.
Over time, I should be able to hook in the entire applicable W3C test suite.
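For concreteness, a parse/fail test run of the kind described above might look roughly like the following sketch. The names (Expect, TestCase, failures, toyParse) are hypothetical, and toyParse is only a stand-in that checks angle-bracket pairing; it is not the HaXml parser.

```haskell
-- Sketch only: hypothetical names, with a toy parser standing in
-- for the real one.

data Expect = ShouldParse | ShouldFail
  deriving (Eq, Show)

data TestCase = TestCase
  { caseName   :: String
  , caseInput  :: String
  , caseExpect :: Expect
  }

-- | Run every case through the supplied parser and return the names
-- of the cases whose outcome disagrees with the expectation.
failures :: (String -> Either String a) -> [TestCase] -> [String]
failures parse cases =
  [ caseName c | c <- cases, outcome (parse (caseInput c)) /= caseExpect c ]
  where
    outcome (Left _)  = ShouldFail
    outcome (Right _) = ShouldParse

-- | Toy stand-in parser: succeeds when every '<' is closed by a '>'
-- before the next '<' or the end of input.
toyParse :: String -> Either String Int
toyParse s = go 0 s
  where
    go :: Int -> String -> Either String Int
    go 0 []       = Right (length s)
    go _ []       = Left "unclosed '<'"
    go n ('<':cs) | n == 0    = go 1 cs
                  | otherwise = Left "nested '<'"
    go n ('>':cs) | n == 1    = go 0 cs
                  | otherwise = Left "stray '>'"
    go n (_:cs)   = go n cs
```

Because failures takes the parser as an argument, the same list of cases can be rerun unchanged against the parser before and after a refactoring.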
At 13:31 23/03/04 +0000, you wrote:
> > HuttonMeijerWallace.hs modified to include an option to return a
> > diagnostic message or parser result, via an (Either String a) value.
> > The original interface is (mostly) preserved, and new functions added
> > to support the extended return values (e.g. papplydiag).
>If the original interface is largely preserved, I'll happily include
>your modifications once I've had a look through them. By the way,
>what is the mnemonic value of 'diag' in your additions? It suggests
>"diagonalised" to me, but I don't quite see how that fits.
>P.S. I've seen your message to the libraries list stating that you
>are going to continue your project using HXToolkit rather than HaXml,
>so thanks for the effort you have already contributed, and all the
>best with HXT. Perhaps one day I will get the chance to sit down
>and compare the two systems myself and possibly develop a merger of
>their good points.