[Haskell-cafe] regex and Regular Expressions Libraries

Chris Dornan chris at chrisdornan.com
Fri Mar 10 17:30:13 UTC 2017


Thanks Evan, 

That feedback is really valuable and I understand why you would have no reason
to switch to regex.

On the use of ‘$’, as far as I know this extension will not clash with any of
the PCRE extensions (if anybody knows of any problems please give me a shout),
though for sure you will have to fill out the numbers when converting between 
the two text-replacement schemes.

As for Text and PCRE – that is at the top of my list but it will need some
coordination with the upstream regex-pcre maintainers.

I do have escape functions though I haven’t included them in the tutorial yet.
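
To give a rough idea of what such a function involves, here is a minimal
sketch. It is not the regex package's API: the names metaChars and escape are
hypothetical, and it assumes PCRE syntax outside a character class with
extended mode off (in extended mode, whitespace and '#' would need escaping
too).

```haskell
-- Hypothetical sketch only; not taken from regex or regex-pcre.

-- | Characters PCRE treats specially outside a character class
-- (assumption: extended mode is not enabled).
metaChars :: String
metaChars = "\\^$.[]|()?*+{}"

-- | Prefix each metacharacter with a backslash so that the resulting
-- pattern matches the input string literally.
escape :: String -> String
escape = concatMap escapeChar
  where
    escapeChar c
      | c `elem` metaChars = ['\\', c]
      | otherwise          = [c]
```

For example, escape "a.b*" yields the pattern a\.b\* (written "a\\.b\\*" as a
Haskell string literal), while a string with no metacharacters comes back
unchanged.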

Being able to recover the text of the REs would be great and I would like to include
it in a future release, but again that will need some coordination with the regex-base
maintainers.

I will raise those issues.

Fantastic feedback!

Cheers,

Chris



On 10/03/2017, 17:08, "Evan Laforge" <qdunkan at gmail.com> wrote:

    On Fri, Mar 10, 2017 at 2:24 AM, Chris Dornan <chris at chrisdornan.com> wrote:
    > By the sounds of it regex should help with this – each match operator being
    > available in an un-overloaded format. Does this API work for you?
    
    I looked at the tutorial and.... maybe not so much?  I hardcode to
    Text + PCRE since that's all I need, but that combination seems to be
    unsupported.  As a light user of regexes, I won't remember much of the
    API between uses, so I'm just looking to find the 'Regex -> Text ->
    Bool' function as fast as possible, and a bunch of polymorphic
    operators I'll never remember would just get in the way.  Also for the
    same reason I'd be worried about any deviation from "standard" PCRE,
    e.g. $(..) for groups.  However, I'm a lightweight user, so don't take
    me too seriously, and I made my own tiny little bikeshed anyway.
    Which is to say don't let me rain on your parade :)
    
    For what it's worth, I mostly used regexes in Python, and it gets
    along fine with hardcoded Text + PCRE, no operators, and basically
    three functions: match, get groups, and substitute groups.  So it's no
    surprise my wrapper basically looks like that:
    
    compileOptions :: [Option] -> String -> Either String Regex
    
    matches :: Regex -> Text -> Bool
    
    -- | Return (complete_match, [group_match]).
    groups :: Regex -> Text -> [(Text, [Text])]
    
    -- | Half-open ranges of where the regex matches.
    groupRanges :: Regex -> Text -> [((Int, Int), [(Int, Int)])]
        -- ^ (entire, [group])
    
    substitute :: Regex -> (Text -> [Text] -> Text)
        -- ^ (complete_match -> groups -> replacement)
        -> Text -> Text
    
    I also added a Show instance that shows the regex rather than hex and
    the mysteriously missing:
    
    -- | Escape a string so the regex matches it literally.
    escape :: String -> String
    
    
    The QuasiQuote stuff seems neat, but I'm sort of scared of TH, and if
    the regex gets complicated enough that would make it worth it, I
    probably already switched to a parser.  Or I get regexes from user
    input because of how succinct they are and that's runtime anyway.
    



