[Haskell-beginners] remove XML tags using Text.Regex.Posix

Magnus Therning magnus at therning.org
Wed Sep 30 03:59:35 EDT 2009


On Wed, Sep 30, 2009 at 6:58 AM, Magnus Therning <magnus at therning.org> wrote:
[..]
> Personally I would have used tagsoup for this sort of thing.  Keep in mind the
> eternal words
>
>  Some people, when confronted with a problem, think 'I know, I'll use
>  regular expressions.' Now they have two problems.
>       -- Jamie Zawinski
>
> As you so nicely demonstrated yourself ;-)

Here's a quick and dirty solution using tagsoup:

% cat file.xml
<tag>123</tag>
<tag>456</tag>
<tag>789</tag>

Text.HTML.Download Text.HTML.TagSoup> tags <- openItem "file.xml"
Text.HTML.Download Text.HTML.TagSoup> map (fromTagText . head . tail) $ partitions (TagOpen "tag" [] ~==) (parseTags tags)
["123","456","789"]
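
For anyone who would rather have this as a compiled program than a GHCi session, here is a minimal sketch of the same idea. It assumes the tagsoup package is installed. tagContents is a hypothetical helper name of my own, not something tagsoup provides, and the sample input is inlined so the program does not depend on file.xml. It uses innerText rather than fromTagText . head . tail, so an empty element such as <tag></tag> yields "" instead of an error.

```haskell
module Main where

import Text.HTML.TagSoup
  ( Tag (TagClose, TagOpen)
  , innerText
  , parseTags
  , partitions
  , (~/=)
  , (~==)
  )

-- | Collect the text found inside every <name>...</name> element.
-- Hypothetical helper for illustration; not part of tagsoup itself.
tagContents :: String -> String -> [String]
tagContents name input =
  [ -- keep only the tags before the matching close tag, then
    -- concatenate whatever text nodes are among them
    innerText (takeWhile (~/= TagClose name) section)
  | -- start a new section at every opening <name> tag
    section <- partitions (~== TagOpen name []) (parseTags input)
  ]

main :: IO ()
main = print (tagContents "tag" sample)
  where
    -- same contents as file.xml above, inlined to keep the sketch self-contained
    sample = "<tag>123</tag>\n<tag>456</tag>\n<tag>789</tag>\n"
```

Running it should print ["123","456","789"], matching the GHCi result above. To read the real file, replace sample with the result of readFile "file.xml". This is still quick and dirty: nested elements with the same name are not handled.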

/M

-- 
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus@therning.org          Jabber: magnus@therning.org
http://therning.org/magnus         identi.ca|twitter: magthe

