[Haskell] understanding HaXml and escaping
Graham Klyne
GK at ninebynine.org
Thu Oct 28 10:00:28 EDT 2004
Hmmm... it's not strictly an entry point into mainstream HaXml, but I have
unit test code for my modified version of HaXml, which might be of some use:
http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/
http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/test/TestXml.hs
The test cases include some round-tripping, e.g. using doXmlParseFormat.
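
The escaping asked about below can be sketched independently of HaXml. What
follows is a minimal standalone sketch, not HaXml's API: escapeText,
escapeAttr, unescape and renderElem are hypothetical names introduced purely
for illustration. It escapes markup characters in element text and in
double-quoted attribute values, and unescape allows a simple round-trip check
in the same spirit as the tests above.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helpers for illustration only; none of these are HaXml functions.

-- | Escape character data for use as element content.
escapeText :: String -> String
escapeText = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- | Escape a value that will be written between double quotes.
escapeAttr :: String -> String
escapeAttr = concatMap esc
  where
    esc '"' = "&quot;"
    esc '<' = "&lt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- | The five predefined XML entities and the characters they stand for.
entities :: [(String, Char)]
entities =
  [ ("&lt;", '<'), ("&gt;", '>'), ("&amp;", '&')
  , ("&quot;", '"'), ("&apos;", '\'') ]

-- | Single-pass inverse of the escaping above (predefined entities only).
unescape :: String -> String
unescape [] = []
unescape s@(c:cs) =
  case [ (ch, rest) | (ent, ch) <- entities, Just rest <- [stripPrefix ent s] ] of
    ((ch, rest):_) -> ch : unescape rest
    []             -> c : unescape cs

-- | Render one element with attributes and text-only content.
renderElem :: String -> [(String, String)] -> String -> String
renderElem name attrs body =
  "<" ++ name ++ concatMap attr attrs ++ ">" ++ escapeText body ++ "</" ++ name ++ ">"
  where
    attr (k, v) = " " ++ k ++ "=\"" ++ escapeAttr v ++ "\""
```

With these definitions, renderElem "root" [("attr", "v\"al")] "<<<<<>>&&&"
yields <root attr="v&quot;al">&lt;&lt;&lt;&lt;&lt;&gt;&gt;&amp;&amp;&amp;</root>,
and unescape applied to either escaped string gives back the original.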
#g
--
At 08:48 28/10/04 -0400, S. Alexander Jacobson wrote:
>Is there a good entry point into HaXml?
>I've now spent some time trying to understand it
>and feel like I've gotten nowhere.
>
>The Haddock documentation enumerates what each
>function does, but I still don't know how to
>produce a valid XML document.
>
>For example, this is obviously the wrong way to
>go:
>
> simp2 = document $ Document (Prolog Nothing [] Nothing []) [] $
>           Elem "root" [("attr",AttValue [Left "v\"al"])]
>                [CString False "<<<<<>>&&&"]
>
>Because it produces the obviously wrong:
>
> <root attr="v"al"><<<<<>>&&&</root>
>
>I assume/hope that the combinators properly
>encode/escape attribute values and CDATA, but
>can't figure out how to generate even the
>simple XML above.
>
>And once I've done so, is there a way to put PIs
>in via the combinators or do I have to import
>Types and risk having unescaped stuff in my
>document?
>
>-Alex-
>
>On Thu, 28 Oct 2004, Malcolm Wallace wrote:
>
> > "S. Alexander Jacobson" <alex at alexjacobson.com> writes:
> >
> > > I modified the Prolog type to be
> > > data Prolog = Prolog (Maybe XMLDecl) [Misc] (Maybe DocTypeDecl) [Misc]
> > > and then modified the Prolog parser
> >
> > Thanks for spotting this bug and providing a fix. I also note that
> > the XML spec allows "misc*" to follow the document top-level element:
> >
> > document ::= prolog element Misc*
> >
> > and this too is incorrect in HaXml. There may well be other
> > occurrences of the same omission.
> >
> > > Given that this fix was so very easy and given
> > > that the parser was already spec consistent, I now
> > > have to assume that there was good reason for the
> > > Prolog to be spec inconsistent, but I don't know
> > > what it is...
> >
> > I originally assumed that Misc's were unimportant and could be
> > discarded, like comments are discarded by a compiler. I failed to
> > notice that PI's should be passed through to the application.
> >
> > > Implementation question: Why is there so much
> > > replicated code in HaXML/Html (parse.hs and
> > > pretty.hs)
> >
> > The HTML parser does some correction of mal-formed input, which
> > is not otherwise permitted by the XML spec. Likewise, the HTML
> > pretty-printer makes some wild and unjustified assumptions about the
> > way that humans like to format their documents, whereas the XML pp
> > is more strictly-conforming. Once XHTML becomes common, the HTML
> > parser/pp will be obsolete.
> >
> > Regards,
> > Malcolm
> >
>
>______________________________________________________________
>S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com
------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact