[Haskell-cafe] HXT: Replace an element with its text

Ivan Perez ivanperezdominguez at gmail.com
Tue Jun 26 11:15:12 CEST 2012


Hi,
Your code fails because a link is an element node, not a text node, so
getText produces no result for it and the whole node is dropped, I think.
What you want is to get the text from the child nodes of the anchor node.
I think the following should work:

is_link :: (ArrowXml a) => a XmlTree XmlTree
is_link = hasName "a"

process_link :: (ArrowXml a) => a XmlTree XmlTree
process_link = getChildren >>> getText >>> mkText

replace_links_with_their_text :: (ArrowXml a) => a XmlTree XmlTree
replace_links_with_their_text =
  processTopDown $ process_link `when` is_link
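
To try this without reading a file, something like the sketch below may
help. It repeats the three arrows above and adds a helper, stripLinks,
which is a hypothetical name of mine and not part of HXT. It assumes the
hxt package is installed, and uses xread to parse the string and xshow to
render the result back to a string.

```haskell
import Text.XML.HXT.Core

-- Succeeds only on elements named "a".
is_link :: (ArrowXml a) => a XmlTree XmlTree
is_link = hasName "a"

-- Emits one new text node per text child of the current node.
-- getText fails on non-text children, so those are silently skipped.
process_link :: (ArrowXml a) => a XmlTree XmlTree
process_link = getChildren >>> getText >>> mkText

-- Walks the tree top-down, rewriting every anchor it meets.
replace_links_with_their_text :: (ArrowXml a) => a XmlTree XmlTree
replace_links_with_their_text =
  processTopDown $ process_link `when` is_link

-- Hypothetical helper (not from HXT): parse a string, strip the links,
-- and serialize whatever trees come out into a single string.
stripLinks :: String -> [String]
stripLinks = runLA (xshow (xread >>> replace_links_with_their_text))
```

In GHCi, stripLinks "<body><a href=\"#\">foo</a></body>" should give back
the <body>foo</body> you are after. Note that only the direct text
children of the anchor survive, so <a><b>foo</b></a> would lose its
text. If you need to keep nested markup, using getChildren on its own in
process_link may be closer to what you want.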

Cheers,
Ivan.

On 26 June 2012 06:58, Michael Orlitzky <michael at orlitzky.com> wrote:
> I would like to replace,
>
>   <body><a href="#">foo</a></body>
>
> with,
>
>   <body>foo</body>
>
> using HXT. So far, the closest I've come is to parse the HTML and apply
> the following stuff:
>
>   is_link :: (ArrowXml a) => a XmlTree XmlTree
>   is_link =
>     hasName "a"
>
>   replace_links_with_their_text :: (ArrowXml a) => a XmlTree XmlTree
>   replace_links_with_their_text =
>     processTopDown $ (getText >>> mkText) `when` is_link
>
> Unfortunately, this just removes the "a" element and its text entirely.
> The other-closest solution is,
>
>   replace_links_with_their_text :: (ArrowXml a) => a XmlTree XmlTree
>   replace_links_with_their_text =
>     processTopDown $ (txt "foo") `when` is_link
>
> Of course, I don't want to hard-code the value "foo", and I can't figure
> out a way to feed the element's text back into 'txt'.
>
> Anyone tried this before?


