[Haskell-cafe] Regular Expressions without GADTs

Ross Paterson ross at soi.city.ac.uk
Wed Oct 19 08:58:13 EDT 2005


On Tue, Oct 18, 2005 at 05:51:13PM +0100, Conor McBride wrote:
| Neither Oleg nor Bruno translated my code; they threw away my 
| structurally recursive on-the-fly automaton and wrote combinator parsers 
| instead. That's why there's no existential, etc. The suggestion that 
| removing the GADT simplifies the code would be better substantiated if 
| like was compared with like.

Here's a version in H98 + existentials.  I'm not claiming it proves
anything.

> module RegExp where

> import Control.Monad

> data RegExp tok a
>	= Zero
>	| One a
>	| Check (tok -> a) (tok -> Bool)
>	| Plus (RegExp tok a) (RegExp tok a)
>	| forall b. Mult (RegExp tok (b -> a)) (RegExp tok b)
>	| forall e. Star ([e] -> a) (RegExp tok e)

I could just use Prod (One f) instead of fmap f, but the functor
instance is mechanical:

> instance Functor (RegExp tok) where
>	fmap f Zero = Zero
>	fmap f (One v) = One (f v)
>	fmap f (Check k p) = Check (f.k) p
>	fmap f (Plus r1 r2) = Plus (fmap f r1) (fmap f r2)
>	fmap f (Mult r1 r2) = Mult (fmap (f.) r1) r2
>	fmap f (Star k r) = Star (f.k) r

Constructors

> one :: RegExp tok ()
> one = One ()

> check :: (tok -> Bool) -> RegExp tok tok
> check = Check id

> plus :: RegExp tok a -> RegExp tok b -> RegExp tok (Either a b)
> plus r1 r2 = Plus (fmap Left r1) (fmap Right r2)

> mult :: RegExp tok a -> RegExp tok b -> RegExp tok (a,b)
> mult r1 r2 = Mult (fmap (,) r1) r2

> star :: RegExp tok a -> RegExp tok [a]
> star = Star id

Parsing

> parse :: RegExp tok a -> [tok] -> Maybe a
> parse r []       = empty r
> parse r (t : ts) = parse (divide t r) ts

> empty :: RegExp tok a -> Maybe a
> empty Zero		= mzero
> empty (One v)		= return v
> empty (Check _ _)	= mzero
> empty (Plus r1 r2)	= empty r1 `mplus` empty r2
> empty (Mult r1 r2)	= empty r1 `ap` empty r2
> empty (Star k _)	= return (k [])

> divide :: tok -> RegExp tok a -> RegExp tok a
> divide t Zero		= Zero
> divide t (One v)	= Zero
> divide t (Check k p)
>   | p t		= One (k t)
>   | otherwise		= Zero
> divide t (Plus r1 r2)	= Plus (divide t r1) (divide t r2)
> divide t (Mult r1 r2)	= case empty r1 of
>	Nothing	-> Mult (divide t r1) r2
>	Just f	-> Plus (Mult (divide t r1) r2) (fmap f (divide t r2))
> divide t (Star k r)	= Mult (fmap k_cons (divide t r)) (Star id r)
>   where k_cons x xs = k (x:xs)



More information about the Haskell-Cafe mailing list