[Haskell-cafe] Re: Has anybody replicated =~ s/../../ or even something more basic for doing replacements with pcre haskell regexen?

Thomas Hartman tphyahoo at gmail.com
Sat Mar 14 19:01:44 EDT 2009


So, I tweaked Text.Regex to have the behavior I need.

http://patch-tag.com/repo/haskell-learning/browse/regexStuff/pcreReplace.hs

FWIW, the problem I was trying to solve was deleting single newlines,
but not runs of newlines, in a document. That is dead simple with PCRE
lookaround but, I think, impossible with POSIX regex.
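
For comparison, the same deletion can be sketched without any regex
library, using only Data.List from base. This is not the approach taken
in pcreReplace.hs; the name deleteSingleNewlines is made up for
illustration. It is only meant as a cross-check on what the lookaround
pattern should produce.

```haskell
import Data.List (group)

-- | Delete every newline that stands alone, keeping runs of two or more.
-- 'group' splits the input into runs of equal characters, so a lone
-- newline shows up as the one-element run "\n" and can be filtered out.
-- (Hypothetical helper name; not part of Text.Regex or any pcre package.)
deleteSingleNewlines :: String -> String
deleteSingleNewlines = concat . filter (/= "\n") . group

-- The same input and expected result as the regex test in this message.
checkDeleteSingleNewlines :: Bool
checkDeleteSingleNewlines =
  deleteSingleNewlines "asdf\n \n\n\nadsf" == "asdf \n\n\nadsf"
```

This only covers the fixed "single newline" case; a general substitution
still needs a regex-driven replace.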

-- replace single newlines, but not strings of newlines (requires pcre
look-around (lookaround, lookahead, lookbehind, for googlebot))

http://perldoc.perl.org/perlre.html

testPcre = ( subRegex (mkRegex "(?<!\n)\n(?!\n)") "asdf\n \n\n\nadsf" "" )
           == "asdf \n\n\nadsf"

Can I lobby for this to make its way into the Regex distribution?
Really, I would argue that every regex flavor should have all the
functions that Text.Regex provides, not just the POSIX one. (subRegex
is just the most important, to my mind.)

Otherwise I'll make my own RegexHelpers hackage package or something.

Hard for me to see how to do this in an elegant way since the pcre
packages are so polymorphic-manic. I'm sure there is a way though.

Or if you point me to the darcs head of regex I'll patch that directly.

2009/3/14 Thomas Hartman <tphyahoo at gmail.com>:
> Right, I'm just saying that a "subRegex" that worked on pcre regex
> matches would be great for people used to perl regexen and unused to
> posix -- even if it only allowed a string replacement, and didn't have
> all the bells and whistles of =~ s/../../ in perl.
>
> 2009/3/12 ChrisK <haskell at list.mightyreason.com>
>> Thomas Hartman wrote:
>>>
>>> Is there something like subRegex... something like =~ s/.../.../ in
>>> perl... for haskell pcre Regexen?
>>>
>>> I mean, subRegex from Text.Regex of course:
>>> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/regex-compat
>>>
>>> Thanks for any advice,
>>>
>>> thomas.
>>
>> Short answer: No.
>>
>> This is a FAQ.  The usual answer to your follow-up "Why not?" is that the
>> design space is rather huge.  Rather than justify this statement, I will
>> point at the complicated module:
>>
>> http://hackage.haskell.org/packages/archive/split/0.1.1/doc/html/Data-List-Split.html
>>
>> The above module is "a wide range of strategies for splitting lists", which
>> is a much simpler problem than your subRegex request, and only works on
>> lists.  A subRegex library should also work on bytestrings (and Seq).
>>
>> At the cost of writing your own routine you get exactly what you want in a
>> screen or less of code, see
>> http://hackage.haskell.org/packages/archive/regex-compat/0.92/doc/html/src/Text-Regex.html#subRegex
>> for "subRegex" which is 30 lines of code.
>>
>> Cheers,
>>  Chris
>>
>
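
The "write your own routine" suggestion quoted above can be sketched
independently of any particular regex backend: given the (offset, length)
spans of the matches, splice a fixed replacement string into the input.
This is a minimal sketch, not the code in regex-compat or in
pcreReplace.hs. The name replaceSpans is hypothetical, and it assumes the
spans are sorted, non-overlapping, and counted in characters. It handles
plain string replacement only, with no \1-style group references.

```haskell
-- | Replace each (offset, length) span of the input with 'repl'.
-- Assumes spans are sorted by offset, do not overlap, and are measured
-- in characters of the input String. (Hypothetical helper; a real
-- backend would supply the spans from its own match function.)
replaceSpans :: [(Int, Int)] -> String -> String -> String
replaceSpans spans repl = go 0 spans
  where
    -- 'pos' is how many characters of the original input are consumed.
    go _ [] rest = rest
    go pos ((off, len) : more) rest =
      let (before, fromMatch) = splitAt (off - pos) rest
      in before ++ repl ++ go (off + len) more (drop len fromMatch)

-- The single newline in the test string sits at offset 4 with length 1.
-- That span is written by hand here rather than produced by a regex.
exampleReplaceSpans :: Bool
exampleReplaceSpans =
  replaceSpans [(4, 1)] "" "asdf\n \n\n\nadsf" == "asdf \n\n\nadsf"
```

Keeping the splice separate from the matching is one way to get the same
helper for every regex flavor, since only the span-producing step differs.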

