[Haskell-cafe] Elementary HaXml question
Koen.Roelandt at mineco.fgov.be
Thu Feb 9 10:13:28 EST 2006
> I'm new to Haskell and HaXml and I'm playing around with the latter to
> clean some (well-formed) 'legacy' html. This works fine except for the
> following cases. Some of the elements to be cleaned are:
>
> <font size="4"><i>Hello World</i></font>
> <i><font size="4">Hello World</font></i>
>
> This should become:
>
> <h1 class="subtitle">Hello World</h1>
>
> From what I could gather from the documentation, it should be something
> like:
>
> foldXml (txt ?> keep :>
> (attrval("size",AttValue[Left "4"]) `o` tag "font")
> /> tag "i" ?> replaceTag "h1" :> children)
Is the bracketing correct? I can't remember the precedence of the
operators offhand, but perhaps it should be
foldXml (txt ?> keep :>
         (((attrval("size",AttValue[Left "4"]) `o` tag "font")
             /> tag "i") ?> replaceTag "h1" :> children))
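One way to check the bracketing without the library at hand is a toy model whose operators only record how an expression was parsed. The sketch below is not HaXml: the operators build strings, and the fixities (`infixl 5 />`, `infixr 3 ?>, :>`) are an assumption about what HaXml's Combinators module declares, so they are worth verifying against the installed version. The name `parsed` is made up for the illustration.

```haskell
-- Toy operators that only record the parse; NOT the HaXml combinators.
-- Assumption: these fixities mirror the ones HaXml declares.
infixl 5 />
infixr 3 ?>, :>

-- Same shape as a then/else pair: the two branches of a conditional.
data ThenElse a = a :> a

-- "Interior" operator: here it just parenthesises its operands.
(/>) :: String -> String -> String
a /> b = "(" ++ a ++ " /> " ++ b ++ ")"

-- Conditional operator: predicate, then-branch, else-branch.
(?>) :: String -> ThenElse String -> String
p ?> (t :> e) = "(" ++ p ++ " ?> " ++ t ++ " :> " ++ e ++ ")"

-- The expression from the question, with each filter replaced by a label.
parsed :: String
parsed = "txt" ?> "keep" :>
         "fontSize4" /> "tag i" ?> "replaceTag h1" :> "children"
```

Under those assumed fixities, `putStrLn parsed` in GHCi shows `/>` grouping its operands before `?>` sees them, which matches the explicitly bracketed version above.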
Yes, the bracketing is correct, since the following code:
foldXml (txt ?> keep :>
         fontSize4 /> tag "em" ?> mkSubtitle :>
         children)

fontSize4 = (attrval("size",AttValue[Left "4"]) `o` tag "font")
mkSubtitle = mkElemAttr "h1" [("class", ("subtitle"!))]
                             [children]
now transforms
<font size="4"><em>Hello World</em></font>
into
<h1 class="subtitle"><em>Hello World</em></h1>
which I'm satisfied with (hurray!). The <em> appears because I 'keep' it
higher up in the original switch; the example above is just an extract for
brevity's sake.
I still have a problem with the other example:
<em><font size="4">Hello World</font></em>
I _think_ the line in the .hs file should be:
tag "em" /> fontSize4 ?> mkSubtitle :>
Which doesn't work, and I don't know why. If the first example works,
shouldn't the second, too? I also tried
tag "em" /> font ?> mkSubtitle :>
Which doesn't work either. I transformed it into
tag "font" `o` children `o` tag "em" ?> mkSubtitle :>
to no avail. Again, any ideas are welcome. (btw: I like HaXml, Malcolm,
nice work!)
Koen.
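
One possible explanation, offered as a guess since the full switch is not shown: foldXml, as I understand it, applies the filter bottom-up, so the inner <font> is rewritten before the enclosing <em> is ever tested. The sketch below is a toy re-implementation of a few HaXml-style combinators, not the library itself. `Content`, `rule`, `ex1` and `ex2` are made up for the illustration, attributes are plain string pairs, the `tag "em" ?> keep` line stands in for the 'keep' mentioned above, and the definitions of `/>`, `?>`, `chip` and `foldXml` are assumptions about how HaXml behaves.

```haskell
-- Toy model of HaXml-style content filters; NOT the HaXml library.
data Content = Elem String [(String, String)] [Content]
             | Text String
             deriving (Eq, Show)

type CFilter = Content -> [Content]

infixr 5 `o`
infixl 5 />
infixr 3 ?>, :>

data ThenElse a = a :> a

keep, children, txt :: CFilter
keep c = [c]
children (Elem _ _ cs) = cs
children _             = []
txt c@(Text _) = [c]
txt _          = []

tag :: String -> CFilter
tag t c@(Elem n _ _) | n == t = [c]
tag _ _ = []

attrval :: (String, String) -> CFilter
attrval av c@(Elem _ attrs _) | av `elem` attrs = [c]
attrval _ _ = []

-- Filter composition: run g first, then f on each result.
o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

-- "Interior": the children of f's results that also satisfy g.
(/>) :: CFilter -> CFilter -> CFilter
f /> g = g `o` children `o` f

-- Conditional: if p yields anything, apply f, otherwise g.
(?>) :: CFilter -> ThenElse CFilter -> CFilter
p ?> (f :> g) = \c -> if null (p c) then g c else f c

-- Apply a filter to the children of an element, in place.
chip :: CFilter -> CFilter
chip f (Elem n attrs cs) = [Elem n attrs (concatMap f cs)]
chip _ c                 = [c]

-- Bottom-up fold: children are rewritten before their parent is tested.
foldXml :: CFilter -> CFilter
foldXml f = f `o` chip (foldXml f)

fontSize4, mkSubtitle :: CFilter
fontSize4 = attrval ("size", "4") `o` tag "font"
mkSubtitle c = [Elem "h1" [("class", "subtitle")] (children c)]

rule :: CFilter
rule = txt ?> keep :>
       fontSize4 /> tag "em" ?> mkSubtitle :>
       tag "em" /> fontSize4 ?> mkSubtitle :>
       tag "em" ?> keep :>     -- stand-in for the 'keep' in the full switch
       children

ex1, ex2 :: Content
ex1 = Elem "font" [("size", "4")] [Elem "em" [] [Text "Hello World"]]
ex2 = Elem "em" [] [Elem "font" [("size", "4")] [Text "Hello World"]]
```

In this model `foldXml rule ex1` yields the <h1> wrapping the kept <em>, while `foldXml rule ex2` yields only the <em> around the text. The inner <font> matched none of the rules, fell through to `children` and was unwrapped, so the `tag "em" /> fontSize4` test no longer finds a <font> child when it reaches the <em>. If the real switch behaves the same way, the second pattern would need the <font> to survive the inner pass, or the rule to be applied top-down.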