[Haskell-cafe] regex and Regular Expressions Libraries

Evan Laforge qdunkan at gmail.com
Fri Mar 10 17:08:34 UTC 2017


On Fri, Mar 10, 2017 at 2:24 AM, Chris Dornan <chris at chrisdornan.com> wrote:
> By the sounds of it regex should help with this – each match operator being
> available in an un-overloaded format. Does this API work for you?

I looked at the tutorial and... maybe not so much?  I hardcode to
Text + PCRE since that's all I need, but that combination seems to be
unsupported.  As a light user of regexes, I won't remember much of the
API between uses, so I'm just looking to find the 'Regex -> Text ->
Bool' function as fast as possible, and a bunch of polymorphic
operators I'll never remember would just get in the way.  Also for the
same reason I'd be worried about any deviation from "standard" PCRE,
e.g. $(..) for groups.  However, I'm a lightweight user, so don't take
me too seriously, and I made my own tiny little bikeshed anyway.
Which is to say, don't let me rain on your parade :)

For what it's worth, I mostly used regexes in Python, and it gets
along fine with hardcoded Text + PCRE, no operators, and basically
three functions: match, get groups, and substitute groups.  So it's no
surprise my wrapper basically looks like that:

compileOptions :: [Option] -> String -> Either String Regex

matches :: Regex -> Text -> Bool

-- | Return (complete_match, [group_match]).
groups :: Regex -> Text -> [(Text, [Text])]

-- | Half-open ranges of where the regex matches.
groupRanges :: Regex -> Text -> [((Int, Int), [(Int, Int)])]
    -- ^ (entire, [group])

substitute :: Regex -> (Text -> [Text] -> Text)
    -- ^ (complete_match -> groups -> replacement)
    -> Text -> Text
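
For illustration only, here is a rough sketch of how a substitute like
the one above could be layered on the half-open ranges that groupRanges
returns.  The name substituteRanges is hypothetical and is not part of
the wrapper described here.  It assumes the ranges are character
offsets into the Text, sorted and non-overlapping (raw PCRE bindings
usually report byte offsets, which would need converting first):

```haskell
import Data.Text (Text)
import qualified Data.Text as Text

-- | Hypothetical helper: rebuild the text, replacing each matched range.
-- Assumes half-open (start, end) character offsets, sorted and
-- non-overlapping, in the same shape that groupRanges returns.
substituteRanges :: [((Int, Int), [(Int, Int)])]
    -> (Text -> [Text] -> Text)
    -- ^ (complete_match -> groups -> replacement)
    -> Text -> Text
substituteRanges ranges replace text = Text.concat (go 0 ranges)
  where
    -- No matches left: keep the remainder of the text unchanged.
    go pos [] = [Text.drop pos text]
    go pos (((start, end), groupRs) : rest) =
        slice (pos, start)  -- unmatched text before this match
        : replace (slice (start, end)) (map slice groupRs)
        : go end rest
    -- Half-open slice: includes index s, excludes index e.
    slice (s, e) = Text.take (e - s) (Text.drop s text)
```

Because the ranges are half-open, the end of one match can be used
directly as the start of the next unmatched stretch, with no off-by-one
adjustment.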

I also added a Show instance that shows the regex rather than hex and
the mysteriously missing:

-- | Escape a string so the regex matches it literally.
escape :: String -> String
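
A minimal sketch of what such an escape could look like, using only the
Prelude.  This is an assumption about a reasonable implementation, not
the actual code from the wrapper; the metaChars list is chosen for
illustration, and it does not handle PCRE's extended mode, where
whitespace and '#' are also significant:

```haskell
-- | Sketch only: escape a string so the regex matches it literally.
escape :: String -> String
escape = concatMap escapeChar
  where
    -- Prefix each metacharacter with a backslash; leave the rest alone.
    escapeChar c
        | c `elem` metaChars = ['\\', c]
        | otherwise = [c]
    -- Assumed set of characters to escape.  PCRE treats a backslash
    -- before any non-alphanumeric character as that literal character,
    -- so escaping a few extra ones such as ']' and '}' is harmless.
    metaChars = "\\^$.|?*+()[]{}"
```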


The QuasiQuote stuff seems neat, but I'm sort of scared of TH, and if
the regex gets complicated enough to make that worth it, I've probably
already switched to a parser.  Or I get the regexes from user input,
since they're so succinct, and that's at runtime anyway.

