[Haskell-cafe] Perl-ish =~ operator
ozone at algorithm.com.au
ozone at algorithm.com.au
Tue Feb 24 15:06:09 EST 2004
On 24/02/2004, at 1:30 AM, ozone at algorithm.com.au wrote:
> In my effort to turn Haskell into a language more like Perl
> (muahaha)[1], I got a bit fed up and implemented something like Perl
> 5's =~ binding operator (a.k.a. "regex" operator); I thought maybe
> somebody else here might find it useful. Perl has the concept of
> 'contexts': a function does something different depending on what type
> its caller expects back from the function. Sounds like a perfect
> abuse of type classes, to me :).
Bonus round: I've managed to hack up something which simulates Perl's
s/// operator, and also something to emulate its /e modifier
(evaluate), so that you can run a function over the resulting
subexpression matches. (You don't really _need_ the latter, because
all you have to do is use the =~ operator in a [String] context and run
your function over that, but hey, it's cool, and I get to abuse type
classes even more :).
Now, =~ has to change its behaviour not only depending on its context,
but also depending on what "operation" you do with it: matching,
substitution, etc. We differentiate each of the operations by giving
them a different type: a simple match operation takes in a String (the
regex to match), whereas substitution requires two strings: the string
to match against, and the substitution string, i.e. (String, String).
---
{-# OPTIONS -fglasgow-exts #-}
-- Need this for "instance Foo [String]" declarations (look, don't even
-- need undecidable or overlapping instances :)
module PLRE where
-- Perl-Like Regular Expressions
import Maybe
import Text.Regex
-- Perl-Like =~ operator, which changes behaviour depending on its
calling
-- context
class Bind op context where
(=~) :: String -> op -> context
-- (=~) :: String -> String -> Bool
-- returns whether the regex matched or not
instance Bind String Bool where
s =~ re = case matchRegex (mkRegex re) s of
Nothing -> False
Just x -> True
-- (=~) :: String -> String -> [String]
-- returns a list of subexpression matches
instance Bind String [String] where
s =~ re = case matchRegex (mkRegex re) s of
Nothing -> []
Just x -> x
-- (=~) :: String -> (String, String) -> String
-- substitution: "foo" =~ ("f", "g") = "goo"
instance Bind (String, String) String where
s =~ (re, sub) = case matchRegexAll (mkRegex re) s of
Nothing -> []
Just (before, _, after, _) -> before ++ sub ++ after
-- perl's /e modifier. We expect a function that takes in an argument
of
-- type [String] (and can output any type): in that argument, index 0 of
-- the list is the original string to match against, index 1 (if it
-- exists) is the first subexpression match, index 2 is the second
-- subexpression match, etc.
instance Bind (String, ([String] -> context)) context where
s =~ (re, fn) = case matchRegex (mkRegex re) s of
Just matches -> fn (s:matches)
Nothing -> fn [] -- or maybe this should be an error?
boolContextTest string regEx =
case string =~ regEx of
True -> print True
False -> print False
stringListContextTest string regEx =
case string =~ regEx of
(a:x) -> print ("First match: " ++ a)
_ -> error "No subexpression matches"
---
For an example of how to use the /e-like operator:
PLRE> "foo" =~ ("^(..)", \l -> map Char.ord (l!!1) ) :: [Int]
[102,111]
i.e. it (vaguely) resembles something like $foo =~ s/^(..)/ord $1/;
One thing which would be really nice is to use implicit parameters for
the subexpression match instead of passing the list of subexpression
matches explicitly to the function, so that you could instead write:
PLRE> "foo" =~ ("^(..)", map Char.ord ?_1 ) :: [Int]
[102,111]
So ?_n maps nicely on to Perl's $n match variable (or \n, if you're a
sed foo). I couldn't find any way for this to work, though, since
implicit parameters aren't allowed in an instance declaration.
--
% Andre Pang : trust.in.love.to.save
More information about the Haskell-Cafe
mailing list