[Haskell] understanding HaXml and escaping

Graham Klyne GK at ninebynine.org
Thu Oct 28 10:00:28 EDT 2004


Hmmm... it's not strictly an entry point into mainstream HaXml, but I have
unit test code for my modified version of HaXml, which might be of some use:

   http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/
   http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/TestXml.hs

The test cases include some round-tripping, e.g. using doXmlParseFormat.
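
As a rough illustration of the escaping at issue in the quoted example
below, here is a minimal sketch in plain Haskell. It does not use HaXml at
all: the names escapeXml, unescapeXml and renderElem are hypothetical
helpers invented for this example, and only the five predefined XML
entities are handled. It shows the output the quoted example presumably
wants, plus a simple escape/unescape round trip.

```haskell
import Data.List (stripPrefix)

-- NOTE: nothing here is HaXml API; all names are illustrative assumptions.

-- | The five predefined XML entities.
entities :: [(Char, String)]
entities =
  [ ('<', "&lt;"), ('>', "&gt;"), ('&', "&amp;")
  , ('"', "&quot;"), ('\'', "&apos;") ]

-- | Escape text for use in element content or a quoted attribute value.
escapeXml :: String -> String
escapeXml = concatMap (\c -> maybe [c] id (lookup c entities))

-- | Inverse of 'escapeXml' (predefined entities only, no numeric references).
unescapeXml :: String -> String
unescapeXml [] = []
unescapeXml s@(c:cs) =
  case [ (ch, rest) | (ch, ent) <- entities, Just rest <- [stripPrefix ent s] ] of
    ((ch, rest):_) -> ch : unescapeXml rest
    []             -> c : unescapeXml cs

-- | Render an element with one attribute and text content, escaping both.
renderElem :: String -> (String, String) -> String -> String
renderElem name (attr, val) body =
  "<" ++ name ++ " " ++ attr ++ "=\"" ++ escapeXml val ++ "\">"
      ++ escapeXml body ++ "</" ++ name ++ ">"
```

With these definitions, renderElem "root" ("attr", "v\"al") "<<<<<>>&&&"
yields

   <root attr="v&quot;al">&lt;&lt;&lt;&lt;&lt;&gt;&gt;&amp;&amp;&amp;</root>

and unescapeXml (escapeXml s) == s gives a simple round-trip check.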

#g
--

At 08:48 28/10/04 -0400, S. Alexander Jacobson wrote:
>Is there a good entry point into HaXml?
>I've now spent some time trying to understand it
>and feel like I've gotten nowhere.
>
>The Haddock documentation enumerates what each
>function does, but I still don't know how to
>produce a valid XML document.
>
>For example, this is obviously the wrong way to
>go:
>
>   simp2 = document $ Document (Prolog Nothing [] Nothing []) [] $
>                 Elem "root" [("attr",AttValue [Left "v\"al"])]
>                 [CString False "<<<<<>>&&&"]
>
>because it produces the obviously wrong output:
>
>   <root attr="v"al"><<<<<>>&&&</root>
>
>I assume/hope that the combinators properly
>encode/escape attribute values and CDATA, but
>can't figure out how to generate even the
>simple XML above.
>
>And once I've done so, is there a way to put PIs
>in via the combinators or do I have to import
>Types and risk having unescaped stuff in my
>document?
>
>-Alex-
>
>
>On Thu, 28 Oct 2004, Malcolm Wallace wrote:
>
> > "S. Alexander Jacobson" <alex at alexjacobson.com> writes:
> >
> > > I modified the Prolog type to be
> > >    data Prolog = Prolog (Maybe XMLDecl) [Misc] (Maybe DocTypeDecl) [Misc]
> > > and then modified the Prolog parser
> >
> > Thanks for spotting this bug and providing a fix.  I also note that
> > the XML spec allows "Misc*" to follow the document top-level element:
> >
> >     document     ::=          prolog element Misc*
> >
> > and this too is incorrect in HaXml.  There may well be other
> > occurrences of the same omission.
> >
> > > Given that this fix was so very easy and given
> > > that the parser was already spec consistent, I now
> > > have to assume that there was good reason for the
> > > Prolog to be spec inconsistent, but I don't know
> > > what it is...
> >
> > I originally assumed that Misc items were unimportant and could be
> > discarded, like comments are discarded by a compiler.  I failed to
> > notice that PIs should be passed through to the application.
> >
> > > Implementation question: Why is there so much
> > > replicated code in HaXML/Html (parse.hs and
> > > pretty.hs)
> >
> > The HTML parser does some correction of malformed input, which
> > is not otherwise permitted by the XML spec.  Likewise, the HTML
> > pretty-printer makes some wild and unjustified assumptions about the
> > way that humans like to format their documents, whereas the XML pp
> > is more strictly-conforming.  Once XHTML becomes common, the HTML
> > parser/pp will be obsolete.
> >
> > Regards,
> >     Malcolm
> >
>
>______________________________________________________________
>S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact


