Exceptions

Wed Sep 6 13:17:46 EDT 2006

Bulat Ziganshin wrote:
> Friday, September 1, 2006, 2:27:34 PM, you wrote:
> 
> > Thanks for your interest in open data types. As one of the authors of
> > the "open data types" paper, I'd like to comment on the current
> > discussion.
> 
> i'm not yet read about this, but may be open types have something in
> common with type families already implemented by Manuel Chakravarty?
> 
> http://hackage.haskell.org/trac/ghc/wiki/TypeFunctions

Löh/Hinze-style open data types are orthogonal to the type indexed data
types described at the above wiki page.  Instances of type indexed data
types (or indexed type families as I am tending to call them currently)
may not overlap.  For example, while it is fine to write

        data family T a :: *
        data instance T Bool = TBool
        data instance T Int  = TInt

the following two instances are bad:

        data instance T (Int, a) = TL	-- BAD
        data instance T (a, Int) = TR	-- DEFINITION

as they overlap at T (Int, Int).

In contrast, Löh/Hinze open data types are about *fully* overlapping
data declarations.  So, in their proposal (using a slightly different
syntax)

        open data S :: *
        S1 :: S
        S2 :: S

is a perfectly fine definition.  This distinction between overlapping
and non-overlapping definitions continues on the value level (ie, with
functions operating on these data types).   Given the indexed type
family T above, I can write a function

        foo :: T Bool -> ()
        foo TBool = ()

but I cannot write a *toplevel* function pattern matching on more than
one data instance at once; ie, the following gives a type error:

        foo :: T a -> a
        foo TBool = False	-- BAD
        foo TInt  = 0		-- DEFINITION

If I want to write such a function, I need to use a type class, as
follows:

        class Foo a where
          foo :: T a -> a
        instance Foo Bool where
          foo TBool = False
        instance Foo Int where
          foo TInt = 0

Again, in contrast, Löh/Hinze open data types enable us to write

        open bar :: S -> Bool
        bar S1 = False
        bar S2 = True

So, both features are truly orthogonal and, in fact, they are
synergetic!  More precisely, an alternative syntax for Löh/Hinze open
types are overlapping type families.  So, we might define S
alternatively as

        data family S :: *
        data instance S = S1
        data instance S = S2

Then, one might hope we can allow overlapping indexed type families,
such as the instances for T (Int, a) and T (a, Int) above, and implement
them by a combination of the implementation of indexed data types that I
already added to GHC and Löh/Hinze's method for open data types.

NB: Curiously, the application of open data types that AFAIK got Andres
and Ralf into open data types, namely spine-view SYB
<http://www.iai.uni-bonn.de/~loeh/SYB1.html>, can already be implemented
with *non-overlapping* indexed type families if I am not mistaken.

> one more question what i still plan to ask him is what is the
> difference between GADTs and type families

GADTS:
      * Closed definition
      * Local type-refinement in case alternatives

Data families:
      * Open definitions (much like classes are open, you can always add
        more instances)
      * Type constraints due to indexes is propagated globally

In other words, the relationship between GADTs and data families is not
unlike that between toplevel function definitions (closed) and class
methods (open).  Moreover, you can perfectly well have indexed newtypes.
(They are already implemented and quite interesting as Haskell
guarantees that newtypes are unlifted.)

In fact, there is nothing essential preventing us from having indexed
families of GADTs - well, maybe except the occasional exploding head ;)
For example, you might define

        data family T a :: *
        data instance T [a] where
          IList :: Int -> T [Int]
          BList :: Bool -> T [Bool]
        data instance T (Maybe a) where
          IMaybe :: Int -> T (Maybe Int)
          BMaybe :: Bool -> T (Maybe Bool)

(This definition is not supposed to make much sense, it just illustrates
the idea of an indexed GADT.)

However, I haven't fully implemented indexed GADT families yet, as I
want to finish other functionality first.  So, maybe there are problems
that I haven't stumbled over yet.

Manuel