[Haskell-beginners] Re: remove XML tags using Text.Regex.Posix
Christian Maeder
Christian.Maeder at dfki.de
Wed Sep 30 08:48:58 EDT 2009
I think regexes are a pain and would suggest the xml-light package for
your purpose, which is the smallest XML library. (Or use take, drop,
isPrefixOf and isSuffixOf to chop off your tags manually.)
http://hackage.haskell.org/package/xml
Cheers Christian
Prelude Text.XML.Light> concatMap strContent . onlyElems $ parseXML "<tag>123</tag>"
"123"
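
The manual alternative mentioned above (take, drop, isPrefixOf and
isSuffixOf) could look roughly like the following. This is only a
minimal sketch: the name stripTag and its Maybe result are illustrative
choices, not part of any library, and it assumes exactly one known tag
pair with no attributes and no nesting, as in the original question.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- | Hypothetical helper: strip one known tag pair from a string.
-- Returns Nothing when the input is not wrapped in exactly that pair.
stripTag :: String -> String -> Maybe String
stripTag name s
  | open `isPrefixOf` s
  , close `isSuffixOf` s
  , length s >= length open + length close  -- the two tags must not overlap
  = Just (take bodyLen (drop (length open) s))
  | otherwise = Nothing
  where
    open    = "<" ++ name ++ ">"
    close   = "</" ++ name ++ ">"
    bodyLen = length s - length open - length close
```

With this, stripTag "tag" "<tag>123</tag>" gives Just "123", and unlike
the regex attempt quoted below it does not depend on how many characters
the data contains.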
Robert Ziemba wrote:
> I have been working with the regular expression package
> (Text.Regex.Posix). My hope was to find a simple way to remove a pair
> of XML tags from a short string.
>
> I have something like this "<tag>Data</tag>" and would like to extract
> 'Data'. There is only one tag pair, no nesting, and I know exactly what
> the tag is.
>
> My first attempt was this:
>
> "<tag>123</tag>" =~ "[^<tag>].+[^</tag>]"::String
>
> result: "123"
>
> Upon further experimenting I realized that it only works with more than
> 2 digits in 'Data'. It occurred to me that my thinking on how this
> regular expression works was not correct - but I don't understand why it
> works at all for 3 or more digits.
>
> Can anyone help me understand this result and perhaps suggest another
> strategy? Thank you.