TypeFamilies vs. FunctionalDependencies & type-level recursion

Thu Jun 16 02:48:07 CEST 2011

On Tue, Jun 14, 2011 at 9:56 PM,
<dm-list-haskell-prime at scs.stanford.edu> wrote:
> Maybe "functional dependencies" is a poor name.  I'm not going to
> argue it's a good one, just that the extension is quite useful.
> Despite what the name suggests, it is most useful to think of
> "| a b -> c d" as meaning "any method that uses types a and b does not
> need to use types c and d for the compiler to determine a unique
> instance".  Once you start thinking of fundeps that way (and always
> keep in mind that contexts play no role in instance selection), then I
> think it gets a bit easier to reason about fundeps.

I wasn't really posting in this thread because I'm confused about how
fundeps actually behave in GHC. Rather, I care about what functional
dependencies mean, and what they should do, not just what they do
actually do in one implementation or another.

> I don't think so.  Because fundeps (despite the name) are mostly about
> instance selection, incoherent fundeps won't violate safety.  Your
> program may incoherently chose between methods in multiple instances
> that should be the same instance, but at least each individual method
> is type safe.  With type families, you can actually undermine type
> safety, and there's no cheating way to fix this, which is why I think
> TypeFamilies could by very dangerous when combined with dynamic
> loading.

I know that the actual, current implementation won't violate type
safety. But there are reasonable expectations for how *functional
dependencies* might work that would cause problems. Here's an example.

    class C a b | a -> b

    instance C Int b

    foo :: forall a b c d. (C a c, C b d, a ~ b) => a -> b -> c -> d
    foo _ _ x = x

    coerce :: forall a b. a -> b
    coerce = foo (5 :: Int) (5 :: Int)

This currently gets complaints about not being able to deduce c ~ d.
However, actually reading things as genuine functional dependencies,
there's nothing wrong with the logic:

    a ~ b
    c ~ F a  -- for some F
    d ~ F b  -- for the same F
    c ~ d

The only thing stopping this is that fundeps in GHC don't actually
introduce the information that they in theory represent. But this in
turn means that they don't fulfill certain usages of type families.

This is not inherent to fundeps. Fundeps could interact with local
constraints more like type families. And in fact, if fundeps ever
become sugar for type families (which is at least possible), then they
will (I think) work exactly this way, and the above instance would
need to be rejected to avoid unsoundness.

> Though I realize you are unlikely ever to like lifting the coverage
> condition, let me at least leave you with a better example of why
> Undecidable instances are useful.  Suppose you want to define an
> instance of MonadState for another monad like MaybeT.  You would need
> to write code like this:
>
>        instance (MonadState s m) => MonadState s (MaybeT m) where
>            get = lift get
>            put = lift . put
>
> This code fails the coverage condition, because class argument
> (MaybeT m) does not contain the type variable s.  Yet, unlike the
> compiler, we humans can look at the type context when reasoning about
> instance selection.

Yes, I'm familiar with the mtl examples. I had forgotten that they
actually need the coverage condition to be lifted, but they are
similar to the type cast case in that they're (possibly) uncheckably
all right.

I consider the fact that you need undecidable instances to be an
advertisement for type families, though.

> We know that even though our get method is
> returning an s--meaning really "forall s. s", since s is mentioned
> nowhere else--there is really only one s that will satisfy the
> MonadState s m requirement in the context.  Perhaps more importantly,
> we know that in any event, if the code compiles, the s we get is the
> one that the surrounding code (calling get or put) expects.

No, the s in the instance does not mean (forall s. s). What it means
is that forall s there is an instance for s, where the quantification
is outside the instance. The difference is significant. If we move to
explicit dictionary construction and arbitrary rank types, it'd look
like:

    instance (MonadState s m) => MonadState s (T m)
       ==>
    forall s m. (MSDict s m -> MSDict s (T m))

And having access to

    forall s m. MSDict s (T m)

is not at all the same as having access to

    forall m. MSDict (forall s. s) (T m)

Similarly, (forall a b. CDict a b) is not the same as (forall a. CDict
a (forall b. b)). And:

    instance C a b

corresponds to the former, not the latter. Viewing C as a relation,
the former expresses that C relates every particular type to every
other type. The latter says that C relates every type to the single
type (forall b. b). And in the absence of any other instances, the
latter is a functional relation, and the former is not.

> Now if, in addition to lifting the coverage condition, you add
> OverlappingInstances, you can do something even better--you can write
> one single recursive definition of MonadState that works for all
> MonadTrans types (along with a base case for StateT).  This is far
> preferable to the N^2 boilerplate functions currently required by N
> monad transformers:
>
>        instance (Monad m) => MonadState s (StateT s m) where
>            get = StateT $ \s -> return (s, s)
>
>        instance (Monad (t m), MonadTrans t, MonadState s m) =>
>            MonadState s (t m) where
>                get = lift get
>                put = lift . put
>
> This is not something you can do with type families.  So while
> UndecidableInstances and OverlappingInstances aren't very elegant, the
> fact that they can reduce source code size from O(N^2) to O(N) is a
> good indication that there is some unmet need in the base language.

This will probably come as no surprise, but I'm not too big on instances like:

    instance C (f a)

either. I tend to view type classes as being justified by induction on
the structure of types. So instances of the form 'F <t>' are justified
by F being a constructor of *. But 'f a' is not in that manner. We are
similarly not able to define:

    data T = C1 Int | C2 Char Int

    foo :: T -> T
    foo (f a) = f (a + 1)

And this wasn't addressed to me, but:

> If, instead of FunctionalDependencies, the extension were called
> ChooseInstancesWithoutKnowingAllTypeVariables, would you still have
> this objection?

No. In that case I would object that there are ad-hoc, unprincipled
rules for resolving instances in GHC, and there should instead be some
theoretical grounding to guide them. :)

-- Dan