[Haskell-cafe] Typeclasses and GADT [Was: Regular Expressions without GADTs]

Wed Oct 19 02:51:48 EDT 2005

Ralf Hinze wrote:
> To me replacing a GADT by class and instance declarations seems the
> wrong way round. We should not forget that the DT in GADT stands for
> `data type'.

Francois Pottier enumerated some problems with type inference of GADT
code during his ICFP'05 invited talk. Various extensions turn out
necessary, such as local inference, shape inference, ad hoc
elaboration. GADT implementation in GHC 6.4 uses a different
extension, wobbly types. However well these new type system extensions
may work on their own, they must co-exist with the other features of
the language, in particular type classes. The lifting of type class
constraints out of the clauses that pattern-match GADT might no longer
be a good idea: legitimate code can no longer type check because the
combined constraints are contradictory. One may think that GADTs
subsume the regular algebraic data types (ADT), but they do not:
irrefutable pattern matching is not possible with GADT in their
current implementation. Otherwise, unsoundness ensues. One may feel
that type classes are more mature and safer.

> One could certainly argue that the gist of functional programming is
> to define a collection of data types and functions that operate on
> these types. The fact that with GADTs constructors can have more
> general types just allows us to use the functional design pattern more
> often

It seems that type classes offer just the same programming pattern,
only more type-explicit. Indeed, we can compare code that
pattern-match over GADT

 > empty :: RegExp tok a -> Maybe a
 > empty Zero         = mzero
 > empty (Plus r1 r2) = (return Left  `ap` empty r1) `mplus`
 >                      (return Right `ap` empty r2)

with the corresponding type-class based code:

> instance RegExp Zero tok Empty where
>     empty  _ _ = mzero
> instance (RegExp r1 tok a, RegExp r2 tok b) 
>     => RegExp (Plus r1 r2) tok (Either a b) where
>     empty (Plus r1 r2) t = (liftM Left  $ empty r1 t) `mplus`
>                            (liftM Right $ empty r2 t)

which does essentially the same pattern-match, and which also
associates constraints with various alternatives. Because
typeclass-based pattern-matching is `type explicit', we get a better
feedback from the typechecker. For example, `type-pattern-match'
failure becomes a type error rather than a run-time error. The
typeclass-based `pattern matching' is also more expressive, because we
can match on higher-order types and the patterns may be non-linear.
Regular-expression `patterns' implemented with the help of type
classes have more explicit types, e.g.,

  *RX> :t p
  p :: Star (Mult (Star (Check Char)) (Star (Check Char)))

We can easily extend these types with Digit, Alphabetic, etc.

I'm glad Martin Sulzmann mentioned extensibility. The day after the
generic value parser code was posted, Huong Nguyen asked to extend it
for his particular data type:

> data HType = 
>     C1
>     { x :: [String]
>     , y :: HType}
>  | C2 {x1 :: String}
>  | C3 {y1 :: Bool} deriving Show

The extension was quite straightforward and required no modifications
to the previously written code:

> instance Type HType where parse = parseHType
> instance Type Bool  where parse = reads
>
> parseHType  
>     =   readParen True (C1   <$$> (parseList (unStr <$> parse),parse))
>     <+> C2   <$> (unStr <$> parse)
>     <+> C3   <$> parse
>
> (f <$$> (p1,p2)) s  =  [ (f a b, t') | (a, t) <- p1 s, (b,t') <- p2 t ]

Ralf Hinze wrote:
> freeing us from the need or temptation
> to resort to type class hackery with multiple parameter type
> classes, undecidable instances, functional dependencies etc.

Incidentally none of the parsing code posted recently used
undecidable instances let alone functional dependencies. Some of the
pieces of code under discussion was Haskell98.

One should note that P. J. Stuckey and Martin Sulzmann's paper ``A
theory of overloading'' (the journal version) has formalized
multi-parameter type classes with functional dependencies and even
overlapping instances, and developed their meta-theory. The paper
proved various useful properties such as coherence. One can't help but
recall the saying by John Reynolds that the pointer XORing trick is no
longer a disgusting hack because he had a clean proof of correctness
for it.

   Multi-parameter type classes with functional dependencies and
   overlapping instances are not a hack!!

Conor McBride wrote:
> So what's the point? GADTs are a very convenient way to provide data
> which describe types. Their encoding via type classes, or
> whatever, may be possible, but this encoding is not necessarily as
> convenient as the GADT.

First of all, the precise relationship between GADTs and the
typeclasses may be a legitimate topic. Do GADTs bring new expressive
power or mere convenience? Another interesting question concerns
type-checking: if GADTs are so closely related with type classes,
there may exist a deep relationship between typeclass overloading
resolution (a quite mature topic) and wobbly types, shape inference,
etc.

I never argued about convenience of GADTs. They can be quite handy
when dealing with existentials: GADT embody a safe cast and so spare
us form writing the boring casting code ourselves. And perhaps this is
the only compelling case for GADTs.