[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

Don Stewart dons at galois.com
Sat Mar 14 19:12:31 EDT 2009


Also, consider stealing the regex subst code from:

    http://shootout.alioth.debian.org/u64q/benchmark.php?test=regexdna&lang=ghc&id=4
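
For readers who cannot reach that link, the core of a substitution routine can be sketched independently of the regex backend. This is not the shootout code, only a minimal illustration: the name `replaceSpans` is hypothetical, and it assumes the backend has already produced a sorted, non-overlapping list of (offset, length) match spans.

```haskell
-- Hypothetical helper, not taken from any regex package.
-- Replace every (offset, length) span of the input with a fixed string.
-- Assumes the spans are sorted by offset and do not overlap.
replaceSpans :: [(Int, Int)] -> String -> String -> String
replaceSpans spans repl input = go 0 spans input
  where
    -- 'pos' is the offset in the original input where 'rest' begins.
    go _ [] rest = rest
    go pos ((off, len) : more) rest =
      let (before, atMatch) = splitAt (off - pos) rest  -- text before the match
          after             = drop len atMatch          -- skip the matched text
      in before ++ repl ++ go (off + len) more after
```

With the test string from the quoted message, the single newline sits at offset 4, so `replaceSpans [(4, 1)] "" "asdf\n \n\n\nadsf"` drops just that character. A ByteString or Seq version would need its own splitting functions, which is part of why the design space is large.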

tphyahoo:
> So, I tweaked Text.Regex to have the behavior I need.
> 
> http://patch-tag.com/repo/haskell-learning/browse/regexStuff/pcreReplace.hs
> 
> FWIW, the problem I was trying to solve was deleting single newlines
> but not strings of newlines in a document. Dead simple for pcre-regex
> with lookaround. But, I think, impossible with posix regex.
> 
> -- replace single newlines, but not strings of newlines (requires pcre
> look-around (lookaround, lookahead, lookbehind, for googlebot))
> 
> http://perldoc.perl.org/perlre.html
> 
> testPcre = ( subRegex (mkRegex "(?<!\n)\n(?!\n)") "asdf\n \n\n\nadsf"
> "" ) == "asdf \n\n\nadsf"
> 
> Can I lobby for this to make its way into the Regex distribution?
> Really, I would argue that every regex flavor should have all the
> functions that Text.Regex gets, not just posix. (subRegex is just the
> most important, to my mind)
> 
> Otherwise I'll make my own RegexHelpers hackage package or something.
> 
> Hard for me to see how to do this in an elegant way since the pcre
> packages are so polymorphic-manic. I'm sure there is a way though.
> 
> Or if you point me to the darcs head of regex I'll patch that directly.
> 
> 2009/3/14 Thomas Hartman <tphyahoo at gmail.com>:
> > Right, I'm just saying that a "subRegex" that worked on pcre regex
> > matches would be great for people used to perl regexen and unused to
> > posix -- even if it only allowed a string replacement, and didn't have
> > all the bells and whistles of =~ s../../../ in perl.
> >
> > 2009/3/12 ChrisK <haskell at list.mightyreason.com>
> >> Thomas Hartman wrote:
> >>>
> >>> Is there something like subRegex... something like =~ s/.../.../ in
> >>> perl... for haskell pcre Regexen?
> >>>
> >>> I mean, subRegex from Text.Regex of course:
> >>> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat
> >>>
> >>> Thanks for any advice,
> >>>
> >>> thomas.
> >>
> >> Short answer: No.
> >>
> >> This is a FAQ.  The usual answer to your follow up "Why not?" is that the
> >> design space is rather huge.  Rather than justify this statement, I will
> >> point at the complicated module:
> >>
> >> http://hackage.haskell.org/packages/archive/split/0.1.1/doc/html/Data-List-Split.html
> >>
> >> The above module is "a wide range of strategies for splitting lists", which
> >> is a much simpler problem than your subRegex request, and only works on
> >> lists.  A subRegex library should also work on bytestrings (and Seq).
> >>
> >> At the cost of writing your own routine you get exactly what you want in a
> >> screen or less of code, see
> >> http://hackage.haskell.org/packages/archive/regex-compat/0.92/doc/html/src/Text-Regex.html#subRegex
> >> for "subRegex" which is 30 lines of code.
> >>
> >> Cheers,
> >>  Chris
> >>
> >
> 
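
For the specific problem quoted above (deleting single newlines while keeping runs of newlines), a regex-free sketch is also possible. It uses only `Data.List.group` from base; the name `dropSingleNewlines` is made up for illustration and is not part of any regex package.

```haskell
import Data.List (group)

-- Hypothetical helper: delete newlines that stand alone, keep runs of
-- two or more. 'group' splits the input into runs of equal characters,
-- so a lone newline shows up as the one-character run "\n".
dropSingleNewlines :: String -> String
dropSingleNewlines = concatMap keep . group
  where
    keep "\n" = ""   -- a run of exactly one newline is dropped
    keep run  = run  -- longer newline runs and all other text are kept
```

On the test input from the quoted message, `dropSingleNewlines "asdf\n \n\n\nadsf"` agrees with the lookaround version and yields `"asdf \n\n\nadsf"`. This only covers the one case, of course; it is no substitute for a general subRegex over pcre.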

