Matching word boundaries in Text.Regexp
haskell at list.mightyreason.com
Tue Jan 16 06:28:23 EST 2007
Bernd Holzmüller wrote:
> I would like to match word boundaries in a regular expression but this
> doesn't seem to work with Text.Regex in GHC 6.4.2.
> The regular expression looks something like: "\\b(send|receive)\\b" to
> match either the keyword send or the keyword receive but not the word
> sending. Neither works \< and \> for matching the beginning and end of a
> word.
> Thanks for any help,
What you want to do is not POSIX regular expression syntax.
What you are asking for is Perl syntax, as implemented by the
Perl-Compatible Regular Expressions (PCRE) library.
PCRE is available from Haskell. You will first need to ensure you have the PCRE
library. You may already have libpcre; if not, you can get it from
http://www.pcre.org/ where it is developed.
I have the newest wrapper for calling this from Haskell.
You will need the regex-base and regex-pcre packages from:
darcs get --partial http://darcs.haskell.org/packages/regex-base/
darcs get --partial http://darcs.haskell.org/packages/regex-pcre/
It works on both String and Data.ByteString (great performance), and with a bit
of .cabal file editing (to point to libpcre) it should compile and run with GHC
6.4.2 (which I have done on OS X). To easily compile the packages you will also
need the Data.ByteString module which is provided by Don's fps package:
darcs get --partial http://www.cse.unsw.edu.au/~dons/code/fps
The Text.Regex module is the old POSIX API. You don't want that. You want the
new API exported by Text.Regex.Base and Text.Regex.PCRE, which provides the
operators (=~) and (=~~) and the classes RegexOptions, RegexMaker, RegexLike,
and RegexContext.
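As a rough sketch of how the pattern from the question might be used with
(=~), assuming a reasonably recent regex-pcre is installed (the function names
hasKeyword and keywords are made up for illustration and are not part of the
library):

```haskell
import Text.Regex.PCRE ((=~))

-- The pattern from the question: "send" or "receive" as whole words only.
keywordPattern :: String
keywordPattern = "\\b(send|receive)\\b"

-- Hypothetical helper: True if the input contains either keyword as a whole
-- word. Asking (=~) for a Bool result gives a simple yes/no match test.
hasKeyword :: String -> Bool
hasKeyword input = input =~ keywordPattern

-- Hypothetical helper: every whole-word keyword in the input, in order.
-- Asking (=~) for [[String]] yields one list per match, with the full match
-- first and the captured groups after it.
keywords :: String -> [String]
keywords input = [whole | (whole : _) <- allMatches]
  where
    allMatches :: [[String]]
    allMatches = input =~ keywordPattern
```

With these definitions, hasKeyword "please send it" should be True while
hasKeyword "sending now" should be False, because \b does not match between
the "d" and the "i" of "sending".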
If you upgrade to GHC 6.6, then regex-base and Data.ByteString are already
installed, and you would only need regex-pcre.
Please continue to ask for help on this mailing list or haskell-cafe.
(This was based on the older JRegex libpcre wrapper, which for reference is at