[Haskell-cafe] Regular Expressions without GADTs

Conor McBride ctm at Cs.Nott.AC.UK
Fri Oct 21 08:58:07 EDT 2005


Hi

Ross Paterson wrote:

>On Wed, Oct 19, 2005 at 03:17:41PM +0100, Conor McBride wrote:
>  
>
>>It's nice that the syntax of regular expressions survives, but it
>>has been somewhat spun in favour of the functionality required in
>>the example.
>>    
>>
>
>I'm not sure about that either -- the existential coding supports
>elimination too, if the result type is functorial (again specializing
>the general scheme):
>  
>

So, you certainly use the weakening of the equality constraint to cache 
semantic information inside the datatype of regular expressions. 
Correspondingly, you acquire this need for a functorial result type in 
the elimination scheme. That's a big ask: normally the result type need 
only respect equality, which is trivial, morally at least.

>caseRegExp
>	:: Functor r
>	=> r Empty
>	-> r ()
>	-> ((tok -> Bool) -> r tok)
>	-> (forall a b. RegExp tok a -> RegExp tok b -> r (Either a b))
>	-> (forall a b. RegExp tok a -> RegExp tok b -> r (a, b))
>	-> (forall a. RegExp tok a -> r [a])
>	-> RegExp tok v -> r v
>caseRegExp fZero fOne fCheck fPlus fMult fStar x = case x of
>	Zero		-> fmap (const bot)	fZero
>	One v		-> fmap (const v)	fOne
>	Check k p	-> fmap k		(fCheck p)
>	Plus r1 r2	-> fmap (either id id)	(fPlus r1 r2)
>	Mult r1 r2	-> fmap (uncurry id)	(fMult r1 r2)
>	Star k r	-> fmap k		(fStar r)
>  where bot = error "can't happen"
>
>Not that this is comfortable to use, but it does show that existentials
>are as expressive, assuming the functor: they can't do RegExp t a -> a -> a.
>  
>

Indeed: you get what you pay for. By the way, if you set things up 
properly, substituting equal for equal should have no run-time overhead; 
fmap may well cost...

>>I would say that the availability of 'this workaround for this example, 
>>that workaround for that example', is evidence in *favour* of adopting 
>>the general tools which are designed to support many examples. If a 
>>job's worth doing, it's worth doing well. Is there a *uniform* 
>>workaround which delivers all the GADT functionality cheaply and 
>>cleanly? Kind of subjective, I suppose, but I haven't seen anything 
>>cheap enough for me.
>>    
>>
>
>It seems the existential coding is uniform, if everything is covariant.
>

Yes, the availability of a uniform coding wasn't in doubt; it's a 
question of convenience. I'm inclined to agree that things aren't too 
bad here *yet*, especially if you code up the elimination scheme with 
the help of Bruno's type class. Where the wheel really starts to wobble 
is when you need to eliminate a more specific instance of the family. 
How do you write a function in RegExp tok (Either Empty x) -> MyReturn 
x? Just eliminating directly leaves you with the problem of getting rid 
of the non-Plus cases and refining the Plus case. You end up needing a 
return type loaded with 'Henry Ford' equations

  newtype LoadedReturn a = MkLoadedReturn (forall x. Eq a (Either Empty 
x) -> MyReturn x)

for some suitable definition of Eq (like (->)?).

Never mind the non-Plus cases, the Plus case gives you some RegExp tok 
a, some RegExp tok b, and a handy fact, viz Eq (Either a b) (Either 
Empty x), from which you need a MyReturn x. Now the hard work starts. 
I'm not saying that it's impossible, nor that it's always this hard. I'm 
just saying that it's suffering unnecessarily to crank this machinery by 
hand.

The only point I'm trying to underline here is that, in general, GADTs 
yield a non-trivial saving over their various encodings, sufficient to 
be worth tidying up and keeping. Obviously I'm biased, but I think 
they're a big step in the right direction.

Cheers

Conor

This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.



More information about the Haskell-Cafe mailing list