[Haskell-cafe] Regular Expressions without GADTs
oleg at pobox.com
oleg at pobox.com
Wed Oct 19 00:39:21 EDT 2005
Conor McBride wrote:
> Neither Oleg nor Bruno translated my code; they threw away my
> structurally recursive on-the-fly automaton and wrote combinator parsers
> instead. That's why there's no existential, etc. The suggestion that
> removing the GADT simplifies the code would be better substantiated if
> like was compared with like.
...
> I'm sure the program I actually wrote can be expressed with the type
> class trick, just by cutting up my functions and pasting the bits into
> individual instances; the use of the existential is still available. I
> don't immediately see how to code this up in Bruno's style, but that
> doesn't mean it can't be done. Still, it might be worth comparing like
> with like.
Please see the enclosed code. It is still in Haskell98 -- and works in
Hugs.
> I suspect that once you start producing values with the
> relevant properties (as I do) rather than just consuming them (as Oleg
> and Bruno do), the GADT method might work out a little neater.
Actually, the code is pretty much your original code, with downcased
identifiers. It faithfully implements that parser division
approach. Still, there are no existentials. I wouldn't say that GADT
code is so much different. Perhaps the code below is a bit neater due
to the absence of existentials, `case' statements, and local type
annotations.
{-- Haskell98! --}
module RegExps where
import Monad
newtype Zero = Zero Zero -- Zero type in Haskell 98
-- Bruno.Oliveira's type class
class RegExp g where
zero :: g tok Zero
one :: g tok ()
check :: (tok -> Bool) -> g tok tok
plus :: g tok a -> g tok b -> g tok (Either a b)
mult :: g tok a -> g tok b -> g tok (a,b)
star :: g tok a -> g tok [a]
data Parse tok t = Parse { empty :: Maybe t
, divide :: tok -> Parse tok t}
parse :: Parse tok a -> [tok] -> Maybe a
parse p [] = empty p
parse p (t:ts) = parse (divide p t) ts
liftP f p = Parse{empty = liftM f (empty p),
divide = \tok -> liftP f (divide p tok)}
liftP2 mf p1 p2 = Parse{empty = mf (empty p1) (empty p2),
divide = \tok -> liftP2 mf (divide p1 tok)
(divide p2 tok)}
lsum x y = (liftM Left x) `mplus` (liftM Right y)
lprod x y = liftM2 (,) x y
-- Conor McBride's parser division algorithm
instance RegExp Parse where
zero = Parse mzero (\_ -> zero)
one = Parse (return ()) (\_ -> liftP (const ()) zero)
check p = Parse mzero (\t -> if p t
then liftP (const t) one
else liftP (const t) zero)
plus r1 r2 = Parse (lsum (empty r1) (empty r2))
(\t -> plus (divide r1 t) (divide r2 t))
mult r1 r2 = Parse (lprod (empty r1) (empty r2))
(\t -> let (q1,q2) = (divide r1 t, divide r2 t)
lp x1 = liftP (either id ((,) x1))
in maybe
(mult q1 r2)
(\x1 -> lp x1 (plus (mult q1 r2) q2))
(empty r1))
star r = Parse (return [])
(\t-> liftP (uncurry (:)) (mult (divide r t) (star r)))
p1 :: RegExp g => g Char ([Char], [Char])
p1 = mult (star (check (== 'a'))) (star (check (== 'b')))
testp = parse (star (mult (star (check (== 'a'))) (star (check (== 'b')))))
"abaabaaabbbb"
{-
*RX> testp
Just [("a","b"),("aa","b"),("aaa","bbbb")]
-}
testc = parse (star one) "abracadabra"
-- Parsing the list of integers
ieven = even :: Int->Bool
iodd = odd :: Int->Bool
p2 :: RegExp g => g Int (Either (Int, (Int, [Int])) (Int, [Int]))
p2 = plus (mult (check iodd) (mult (check iodd) (star (check ieven))))
(mult (check ieven) (star (check iodd)))
test2 = parse (star p2) [1::Int,1,2,3,3,4]
More information about the Haskell-Cafe
mailing list