[Haskell-cafe] Efficient parallel regular expressions
roger peppe
rogpeppe at gmail.com
Wed Nov 5 09:02:05 EST 2008
On Tue, Nov 4, 2008 at 6:44 PM, i wrote:
> i'm sorry if this is obviously wrong (i haven't used Text.Regex), but
> can't you do this with submatches?
rights or wrongs of regexps aside, i just checked that the above
approach *is* feasible with Text.Regex
here's some code:
>module Multimatch(multimatch) where
> import Text.Regex
> import qualified Data.List as DL
> import qualified Data.Maybe as DM
>
> brcount :: String -> Int
> brcount ('\\' : _ : s) =
> brcount s
> brcount ('(' : s) =
> 1 + brcount s
> brcount (_ : s) =
> brcount s
> brcount [] =
> 0
>
> -- given a list of strings representing regular expressions,
> -- each associated with a tag, match against a string
> -- and return the match, along with the associated tag,
> -- or Nothing if there's no match.
> multimatch :: [(tag, String)] -> (String -> Maybe (tag, String))
> multimatch rs =
> let re = mkRegex $ DL.intercalate "|" $ map ((\s -> "(x(" ++ s ++ "))") . snd) rs in
> let tags = submatches rs in
> (\ s ->
> do
> ms <- matchRegex re ("x" ++ s)
> (tag, m) <- DL.find (\(_, m) -> not (null m)) (zip tags ms)
> return (DM.fromJust tag, tail m))
> submatches [] =
> []
> submatches ((tag, r) : rs) =
> (Just tag : take (brcount r + 1) (repeat Nothing)) ++ submatches rs
i'm sure there's a more compact implementation in there somewhere:
i'm just a haskell newbie.
cheers,
rog.
More information about the Haskell-Cafe
mailing list