[Haskell-beginners] remove XML tags using Text.Regex.Posix
Lyndon Maydwell
maydwell at gmail.com
Wed Sep 30 02:27:08 EDT 2009
HXT should be able to do what you're after quite easily from what I've seen.
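
On the regex question itself, here is a possible explanation, offered as a sketch rather than a definitive answer. In a POSIX regex, square brackets form a bracket expression, so "[^<tag>]" does not mean "not the string <tag>". It matches any single character other than '<', 't', 'a', 'g' or '>'. The original pattern therefore asks for one such character, then at least one arbitrary character (".+"), then one character outside the set '<', '/', 't', 'a', 'g', '>'. That needs at least three characters, of which the first and last must be outside those sets, which is why "123" matches and a one- or two-digit payload does not. It would also fail on data that contains the letters t, a or g in the wrong place.

If you want to stay with Text.Regex.Posix, one alternative is to match the tags literally and pull the contents out with a capture group. The code below assumes the regex-posix package is installed; extractTag is a hypothetical helper name made up for this example, not something the library provides.

```haskell
import Text.Regex.Posix ((=~))

-- extractTag is a hypothetical helper for this example only.
-- It matches the literal tags and returns the text captured by
-- the parenthesised group, or Nothing if the pattern does not match.
extractTag :: String -> Maybe String
extractTag s =
  case (s =~ "<tag>(.*)</tag>" :: (String, String, String, [String])) of
    -- The tuple is (before, matched, after, captured groups).
    -- On a successful match there is exactly one captured group.
    (_, _, _, [inner]) -> Just inner
    _                  -> Nothing

main :: IO ()
main = do
  print (extractTag "<tag>123</tag>")  -- Just "123"
  print (extractTag "<tag>1</tag>")    -- Just "1"
  print (extractTag "no tags here")    -- Nothing
  -- The original pattern, for comparison: three digits match,
  -- two digits give no match (the String result is then empty).
  print ("<tag>123</tag>" =~ "[^<tag>].+[^</tag>]" :: String)
  print ("<tag>12</tag>" =~ "[^<tag>].+[^</tag>]" :: String)
```

This only covers the case described in the question: a single, known tag pair with no nesting and no attributes. For anything closer to real XML, a parser such as HXT or tagsoup is likely the safer choice.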
On Wed, Sep 30, 2009 at 1:58 PM, Magnus Therning <magnus at therning.org> wrote:
> On Tue, Sep 29, 2009 at 12:25:07PM -0700, Robert Ziemba wrote:
>> I have been working with the regular expression package (Text.Regex.Posix).
>> My hope was to find a simple way to remove a pair of XML tags from a short
>> string.
>>
>> I have something like this "<tag>Data</tag>" and would like to extract
>> 'Data'. There is only one tag pair, no nesting, and I know exactly what the
>> tag is.
>>
>> My first attempt was this:
>>
>> "<tag>123</tag>" =~ "[^<tag>].+[^</tag>]"::String
>>
>> result: "123"
>>
>> Upon further experimenting I realized that it only works with more than 2
>> digits in 'Data'. It occurred to me that my thinking on how this regular
>> expression works was not correct - but I don't understand why it works at
>> all for 3 or more digits.
>>
>> Can anyone help me understand this result and perhaps suggest another
>> strategy? Thank you.
>
> Personally I would have used tagsoup for this sort of thing. Keep in mind the
> eternal words
>
> Some people, when confronted with a problem, think 'I know, I'll use
> regular expressions.' Now they have two problems.
> -- Jamie Zawinski
>
> As you so nicely demonstrated yourself ;-)
>
> /M
>
> --
> Magnus Therning (OpenPGP: 0xAB4DFBA4)
> magnus@therning.org Jabber: magnus@therning.org
> http://therning.org/magnus identi.ca|twitter: magthe