Graham Klyne gk at ninebynine.org
Tue May 18 12:19:05 EDT 2004


For your information...

I'm starting a push to bring a version of the HaXml library code to the 
point of meeting my requirements.  First step is to sort out the error 
handling and diagnostic reporting.  (I've made the changes to the error 
handling, and the parser still works but the effect is that error 
diagnostics are not very good:  everything reverts back to a failure of the 
top-level production.)

The changes are looking a bit deeper than I first expected, so my current 
plan is:

(a) beef up the test case library.  I plan to track down a set of test data 
that you mentioned previously.  (I can't get to the CVS to find these right 
now.)

   My test cases are based on (1) success/failure to parse a given 
collection of data, and (2) pretty-print and text comparison of the result 
with a stored (and previously checked) version.  At some stage, the test 
code may be improved to be less sensitive to formatting detail, but it's 
serving my purposes well enough for the present.  It's pretty easy to add 
new test cases.
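
A minimal sketch of the two kinds of test described above, written against any parser that returns Either.  All the names here (Expect, TestCase, runCase, runAll, toyParse) are made up for illustration and are not part of HaXml; toyParse is a stand-in integer parser so the example is self-contained.

```haskell
-- What a test input is expected to do.
data Expect
  = ShouldParse String   -- stored (previously checked) pretty-printed output
  | ShouldFail           -- input must be rejected
  deriving Show

data TestCase = TestCase
  { tcName   :: String
  , tcInput  :: String
  , tcExpect :: Expect
  }

-- Run one case against a caller-supplied parser and pretty-printer.
-- Right () means the case passed; Left carries a failure message.
runCase :: (String -> Either String a) -> (a -> String)
        -> TestCase -> Either String ()
runCase parse render tc =
  case (parse (tcInput tc), tcExpect tc) of
    (Left _,  ShouldFail)    -> Right ()
    (Left e,  ShouldParse _) ->
      Left (tcName tc ++ ": unexpected parse failure: " ++ e)
    (Right _, ShouldFail)    ->
      Left (tcName tc ++ ": parse succeeded but should have failed")
    (Right v, ShouldParse ref)
      | render v == ref -> Right ()
      | otherwise       ->
          Left (tcName tc ++ ": output differs from stored version")

-- Run every case and collect the failure messages.
runAll :: (String -> Either String a) -> (a -> String)
       -> [TestCase] -> [String]
runAll parse render cases =
  [ msg | Left msg <- map (runCase parse render) cases ]

-- Stand-in parser (an assumption, not the XML parser): accepts an integer
-- that consumes the whole input.
toyParse :: String -> Either String Int
toyParse s = case reads s of
  [(n, "")] -> Right n
  _         -> Left ("not an integer: " ++ show s)
```

The comparison is plain string equality, so it is as sensitive to formatting detail as the scheme described above.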

(b) refactor the parser combinator code so that the parser combinators 
access the token information (e.g. position and other diagnostic 
information) via a new type class.  This is to keep the combinators as 
independent as possible of the lexer details.
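
A rough sketch of what such a type class might look like.  The class name TokenInfo, its methods, and the Pos and Tok types are all hypothetical, chosen only to show the idea; they are not taken from the HaXml source.

```haskell
-- Source position of a token (field names are illustrative).
data Pos = Pos { posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

-- Hypothetical class: everything the combinators need to know about a
-- token for diagnostic purposes, and nothing else.
class TokenInfo t where
  tokenPos  :: t -> Pos
  tokenText :: t -> String

-- A toy lexer token; a real lexer would supply its own instance.
data Tok = Tok Pos String
  deriving Show

instance TokenInfo Tok where
  tokenPos  (Tok p _) = p
  tokenText (Tok _ s) = s

-- Diagnostic text written purely against the class.
describeToken :: TokenInfo t => t -> String
describeToken t =
  show (tokenText t) ++ " at line " ++ show (posLine p)
    ++ ", column " ++ show (posCol p)
  where
    p = tokenPos t

-- Helper a combinator could call on the remaining input when a match fails.
expected :: TokenInfo t => String -> [t] -> String
expected what []      = "expected " ++ what ++ " but reached end of input"
expected what (t : _) = "expected " ++ what ++ " but found " ++ describeToken t
```

Because describeToken and expected mention only the class, swapping in a different lexer means writing one new instance and leaving the combinators alone.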

(c) enhance the error handling structure that I've already started on so 
that I can distinguish between parse failure (failure of an alternative) 
and definite failure (e.g. lexing error or tag mismatch), and also to make 
sure the right diagnostic information is propagated back to the calling 
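
One possible shape for the distinction in (c), sketched with a three-way result type.  The names Result, Parser, orElse, commit and tok are assumptions made for this illustration and do not describe the actual HuttonMeijerWallace code.

```haskell
-- Outcome of running a parser.
data Result a
  = Ok a          -- parse succeeded
  | Fail String   -- soft failure: this alternative did not match
  | Error String  -- definite failure: stop and report
  deriving (Eq, Show)

newtype Parser t a = Parser { runParser :: [t] -> Result (a, [t]) }

-- Choice: only a soft failure falls through to the second alternative.
-- A definite failure is passed straight back to the caller.
orElse :: Parser t a -> Parser t a -> Parser t a
orElse p q = Parser $ \ts ->
  case runParser p ts of
    Fail _ -> runParser q ts
    other  -> other

-- Promote a soft failure to a definite one, e.g. once an opening tag has
-- been seen and the matching close tag is mandatory.
commit :: String -> Parser t a -> Parser t a
commit msg p = Parser $ \ts ->
  case runParser p ts of
    Fail _ -> Error msg
    other  -> other

-- Match one specific token.
tok :: Eq t => t -> Parser t t
tok x = Parser $ \ts ->
  case ts of
    (y : rest) | y == x -> Ok (y, rest)
    _                   -> Fail "token mismatch"

-- Collapse to the simpler Either String form for callers that want it.
toEither :: Result a -> Either String a
toEither (Ok a)    = Right a
toEither (Fail e)  = Left e
toEither (Error e) = Left e
```

With this shape, a message produced at the point of a definite failure reaches the caller unchanged instead of being lost when the top-level production fails.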

Apart from adding the extended diagnostic interface I've already created 
(modulo some name changes to make it more obvious what it does), I plan to 
keep the existing external interface unchanged.

(BTW, I've noticed some situations where I think the current code is not 
fully XML compliant with respect to recognizing alternative case variants 
such as XML, Xml, xMl, xml, etc.  I'll try and add comments to them as I 
see them.)



www.haskell.org seems to be down, so I can't locate the CVS to find your 
bug documents.

But I've grabbed the W3C XML test suite, and added some 300 test cases 
based on James Clark's XMLTest collection for valid and not-well-formed 
stand-alone documents. So far, I'm just performing parse/fail tests, not 
checking the output.  I'm getting about 6 failures for valid documents, and 
lots of failures to reject not-well-formed documents.  It also appears that 
formfeeds in the input document cause the parser all sorts of grief.   The 
valid-document failures are all to do with odd characters (did you notice 
that ':' is a valid attribute name?  I didn't!)
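
For reference, a small sketch of the XML 1.0 Name check restricted to ASCII characters.  The function names are made up here, and the non-ASCII letter, combining-character and extender ranges from the spec are deliberately left out, so this is not a complete implementation of the Name production.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- ASCII subset of the characters allowed at the start of an XML Name.
isNameStart :: Char -> Bool
isNameStart c = c == ':' || c == '_' || isAsciiUpper c || isAsciiLower c

-- ASCII subset of the characters allowed after the first one.
isNameChar :: Char -> Bool
isNameChar c = isNameStart c || isDigit c || c == '-' || c == '.'

-- A Name is one start character followed by any number of name characters.
isXmlName :: String -> Bool
isXmlName []       = False
isXmlName (c : cs) = isNameStart c && all isNameChar cs
```

Under this check isXmlName ":" is True, which is the surprising case noted above, while a name beginning with a digit is rejected.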

Anyway, I reckon the c. 200 remaining tests probably make a fairly good 
regression test suite for my refactoring work to proceed.

Over time, I should be able to hook in the entire applicable W3C test suite.


At 13:31 23/03/04 +0000, you wrote:
> > HuttonMeijerWallace.hs modified to include an option to return a
> > diagnostic message or parser result, via an (Either String a) value.
> > The original interface is (mostly) preserved, and new functions added
> > to support the extended return values (e.g.  papplydiag).
>If the original interface is largely preserved, I'll happily include
>your modifications once I've had a look through them.  By the way,
>what is the mnemonic value of 'diag' in your additions?  It suggests
>"diagonalised" to me, but I don't quite see how that fits.
>     Malcolm
>P.S.  I've seen your message to the libraries list stating that you
>are going to continue your project using HXToolkit rather than HaXml,
>so thanks for the effort you have already contributed, and all the
>best with HXT.  Perhaps one day I will get the chance to sit down
>and compare the two systems myself and possibly develop a merger of
>their good points.

Graham Klyne