[Haskell-cafe] regex and Regular Expressions Libraries
Evan Laforge
qdunkan at gmail.com
Fri Mar 10 17:08:34 UTC 2017
On Fri, Mar 10, 2017 at 2:24 AM, Chris Dornan <chris at chrisdornan.com> wrote:
> By the sounds of it regex should help with this – each match operator being
> available in an un-overloaded format. Does this API work for you?
I looked at the tutorial and... maybe not so much? I hardcode to
Text + PCRE since that's all I need, but that combination seems to be
unsupported. As a light user of regexes, I won't remember much of the
API between uses, so I'm just looking to find the 'Regex -> Text ->
Bool' function as fast as possible, and a bunch of polymorphic
operators I'll never remember would just get in the way. Also for the
same reason I'd be worried about any deviation from "standard" PCRE,
e.g. $(..) for groups. However, I'm a lightweight user, so don't take
me too seriously, and I made my own tiny little bikeshed anyway.
Which is to say don't let me rain on your parade :)
For what it's worth, I mostly used regexes in Python, and it gets
along fine with hardcoded Text + PCRE, no operators, and basically
three functions: match, get groups, and substitute groups. So it's no
surprise my wrapper basically looks like that:
    compileOptions :: [Option] -> String -> Either String Regex

    matches :: Regex -> Text -> Bool

    -- | Return (complete_match, [group_match]).
    groups :: Regex -> Text -> [(Text, [Text])]

    -- | Half-open ranges of where the regex matches.
    groupRanges :: Regex -> Text -> [((Int, Int), [(Int, Int)])]
        -- ^ (entire, [group])

    substitute :: Regex -> (Text -> [Text] -> Text)
        -- ^ (complete_match -> groups -> replacement)
        -> Text -> Text
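
To illustrate how the range-based and substitution functions above relate, here is a minimal sketch of how a substitute-style function could be layered on the output of a groupRanges-style function. It is not the wrapper's actual code: `Range`, `slice`, and `substituteRanges` are hypothetical names, and the sketch assumes the match ranges are non-overlapping, in ascending order, and counted in characters.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- | Half-open (start, end) character offsets into the subject text.
-- Hypothetical alias, not part of the wrapper described above.
type Range = (Int, Int)

-- | Extract the text covered by a half-open range.
slice :: Range -> Text -> Text
slice (start, end) = T.take (end - start) . T.drop start

-- | Rebuild the text, replacing each complete match with the result of
-- the callback. Assumes the matches are sorted and do not overlap.
substituteRanges
    :: [(Range, [Range])]
    -- ^ (entire, [group]) ranges, as a groupRanges-like function would return
    -> (Text -> [Text] -> Text)
    -- ^ (complete_match -> groups -> replacement)
    -> Text -> Text
substituteRanges matchRanges replace text = T.concat (go 0 matchRanges)
  where
    -- No matches left: keep the rest of the text unchanged.
    go pos [] = [T.drop pos text]
    -- Keep the unmatched gap before this match, then emit the replacement.
    go pos ((whole@(start, end), groupRs) : rest) =
        slice (pos, start) text
            : replace (slice whole text) (map (`slice` text) groupRs)
            : go end rest
```

Keeping the callback's shape as (complete_match -> groups -> replacement) means the caller never has to deal with offsets, only with the matched text.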
I also added a Show instance that shows the regex rather than hex and
the mysteriously missing:
    -- | Escape a string so the regex matches it literally.
    escape :: String -> String
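
A function with this signature could be sketched as follows. This is an assumption about one reasonable implementation, not the wrapper's actual code: it backslash-escapes the characters PCRE treats specially outside a character class, relying on the PCRE rule that a backslash before a non-alphanumeric character makes it literal. It does not handle whitespace or `#` under the extended option.

```haskell
-- | Escape a string so the regex matches it literally.
-- Sketch only: covers PCRE metacharacters outside character classes.
escape :: String -> String
escape = concatMap escapeChar
  where
    escapeChar c
        | c `elem` metachars = ['\\', c]  -- prefix special characters with a backslash
        | otherwise = [c]                 -- everything else passes through unchanged
    -- The closing ']' and '}' are not strictly special, but escaping them is harmless.
    metachars = "\\^$.|?*+()[]{}"
```

With this, user-supplied literal text can be spliced into a larger pattern before it goes to compileOptions.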
The QuasiQuote stuff seems neat, but I'm sort of scared of TH, and if
the regex gets complicated enough to make it worth it, I've
probably already switched to a parser. Or I get regexes from user
input because of how succinct they are and that's runtime anyway.