[Haskell-beginners] subRegex https? with anchor href tags
Daniel Fischer
daniel.is.fischer at googlemail.com
Sat Nov 12 15:50:38 CET 2011
On Saturday 12 November 2011, 14:23:30, Shakthi Kannan wrote:
> Hi,
>
> --- On Sat, Nov 12, 2011 at 5:44 PM, Daniel Fischer
>
> <daniel.is.fischer at googlemail.com> wrote:
> | Maybe the backreferences numbering starts at 0?
Not backreferences, but who cares?
> | Worth a try.
>
> \--
>
> \0 represents the entire string match:
The entire *match*, that is, the part of the input matched by the regexp.
The other entries (\1, \2, ...) correspond to the parts matched by the
parenthesised subregexen in the match.
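To make the numbering concrete, here is a small sketch (not from the docs,
just an illustration) that prints the whole match next to two groups. The
function name markParts and the pattern are my own invention, and it assumes
the regex-compat package is installed.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Replace each URL-like match with "[whole|scheme|host]" so the
-- numbering is visible: \0 is the whole match, \1 and \2 are the
-- first and second parenthesised groups.
-- markParts is a hypothetical helper, used only for illustration.
markParts :: String -> String
markParts input = subRegex pattern input "[\\0|\\1|\\2]"
  where
    pattern = mkRegex "(https?)://([^/[:space:]]+)"
```

In GHCi, markParts "see http://haskell.org now" should show the full URL
first, then the scheme, then the host name.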
>
> http://cvs.haskell.org/Hugs/pages/libraries/base/Text-Regex.html
May I suggest using the docs on Hackage? Hugs hasn't had a release since
2006, and I don't think its docs are up to date. Unless you're actually
using Hugs, in which case I suggest switching to GHC.
http://hackage.haskell.org/package/regex-compat
>
> Prelude Text.Regex> subRegex (mkRegex "e") "hello" "\\0"
> "hello"
Heh, I didn't see it immediately either ;)
Prelude Text.Regex> subRegex (mkRegex "e") "hello" ">\\0<"
"h>e<llo"
Of course, if you replace a substring with itself, it doesn't change
anything.
Prelude Text.Regex> subRegex (mkRegex "https?[^\\s\n\r]+") "The best is http://haskell.org" "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://ha\">there</a>skell.org"
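Not what you want. Character classes don't work that way: inside a bracket
expression the backslash is an ordinary character, so the class above
excludes a literal backslash and the letter 's' (as well as newline and
carriage return), and the match stops at the 's' in "haskell". Use a POSIX
class instead:
Prelude Text.Regex> subRegex (mkRegex "https?[^[:space:]]+") "The best is http://haskell.org\n" "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://haskell.org\">there</a>\n"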
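If you want that outside GHCi, a minimal sketch could look like the
following. The name linkify is made up for this example, the link text
"there" is just the placeholder used in the session, and the code assumes
the regex-compat package.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Wrap every run of non-whitespace characters that starts with
-- "http" or "https" in an anchor tag. \0 in the replacement is the
-- whole match. linkify is a hypothetical name for this sketch.
linkify :: String -> String
linkify input = subRegex urlRegex input "<a href=\"\\0\">there</a>"
  where
    -- [:space:] is the POSIX class for whitespace; writing \s inside
    -- a bracket expression would not work here.
    urlRegex = mkRegex "https?[^[:space:]]+"
```

Note that the pattern is deliberately loose: it does not check for "://"
and will happily match any word beginning with "http".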
However, watch what happens when the URL ends a sentence:
Prelude Text.Regex> subRegex (mkRegex "https?[^[:space:]]+") "The best is http://haskell.org." "<a href=\"\\0\">there</a>"
"The best is <a href=\"http://haskell.org.\">there</a>"
Please make an effort not to include final punctuation in the href; it's
rather annoying how many 404s I get from that.
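One possible way to do that, as a rough sketch using only base: split the
trailing punctuation off each matched URL before building the tag. The
names trailingPunct, splitTrailing and anchor are hypothetical, and the set
of characters treated as punctuation is an assumption you may want to change.

```haskell
import Data.List (dropWhileEnd)

-- Characters treated as sentence punctuation when they end a URL.
-- This set is an assumption; adjust it to taste.
trailingPunct :: String
trailingPunct = ".,;:!?"

-- Split a matched URL into the part to link and the punctuation
-- that should stay outside the link.
splitTrailing :: String -> (String, String)
splitTrailing url = (core, drop (length core) url)
  where
    core = dropWhileEnd (`elem` trailingPunct) url

-- Build the anchor for one matched URL, keeping any trailing
-- punctuation after the closing tag.
anchor :: String -> String
anchor url = "<a href=\"" ++ core ++ "\">there</a>" ++ rest
  where
    (core, rest) = splitTrailing url
```

subRegex cannot apply a function to each match, so to use this you would
have to walk the input yourself, for instance with matchRegexAll from
Text.Regex, and call anchor on each matched piece. Be aware that a URL which
legitimately ends in one of those characters would be cut short.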