Deriving Data for poly-kinded datatypes

Ryan Scott ryan.gl.scott at gmail.com
Fri Feb 24 16:13:48 UTC 2017


Yikes, I was afraid of reaching for something like DataIfStar. But I
don't have any other ideas, so let's play along with that example.

Frustratingly, I can't even write the class definition of DataIfStar.
If I do, GHC yells at me like so:

    • Expected a type, but ‘k’ has kind ‘t’
    • In the first argument of ‘Data’, namely ‘k’
      In the type signature:
        dataIf :: (t ~ Type) => proxy k -> (Data k => r) -> r -> r
      In the class declaration for ‘DataIfStar’

In other words, GHC isn't smart enough to see that in the (Data k =>
r) argument, k's kind t should be the same as *. This is pretty
annoying, because this overlapping instances trick only works due to
the fact that their instance heads distinguish between kinds. In other
words, if you were to change the definition of DataIfStar to this:

    class Typeable k => DataIfStar (k :: *) where
      dataIf :: proxy k -> (Data k => r) -> r -> r

Then although the class declaration would now typecheck, the two
instances below it would be flagged as duplicate instances because
they now have the same instance head! Ugh.

Somehow, I need to be able to defer an equality constraint at the type
level, but just writing (t ~ Type) => ... doesn't quite work,
unfortunately.

Ryan S.

On Fri, Feb 24, 2017 at 12:14 AM, Edward Kmett <ekmett at gmail.com> wrote:
> That is a right mess.
>
> I've now stepped a bit outside of what I think is a practical
> recommendation, but let's just keep playing for the fun of it.
>
> This is going to be hard to do without access to GHC at the moment but here
> goes:
>
> If phantom was of kind * then dataCast1 would need the (Data phantom), so we
> need some sort of way to supply it.
>
> Looks like we'd need some 'Data if *'. We don't have exponentials in the
> type checker though. They are admissable, just not a thing GHC does today.
> We could fake it with some nastiness perhaps:
>
> class (Typeable t, Typeable k) => DataIfStar (k :: t) where
>   dataIf :: (t ~ Type) => proxy k -> (Data k => r) -> r
>
> You might be able to make a couple of overlapping instances, as much as it
> pains me to consider. (Here be dragons, I haven't thought this through.
>
> instance Data k => DataIfStar k where -- overlapping
>   dataIf _ r = r
>
> instance DataIfStar k where -- overlapped
>   dataIf _ _ = undefined
>
> Then after you check k ~ Type in the instance you could invoke dataIf using
> the knowledge that k ~ Type to pull the Data phantom instance into scope.
> You'd get a type signature like:
>
> dataCast1T :: forall k c t (phantom :: k).
>                   (Typeable t, DataIfStar phantom)
>                => (forall d. Data d => c (t d))
>                -> Maybe (c (T phantom))
>
> which would mean the Data instance for T phantom would get a constraint
> like:
>
> instance DataIfStar phantom => Data (T phantom)
>
> Does that make sense?
>
> -Edward
>
>
>
>
> On Thu, Feb 23, 2017 at 10:30 PM, Ryan Scott <ryan.gl.scott at gmail.com>
> wrote:
>>
>> > Supplying the default shouldn't lock our data instance to the form T a.
>> > If for some reason adding this default would break the instance We can make
>> > a more interesting default that does something like look at the kind of the
>> > argument first to determine if it is kind * before proceeding after casting
>> > to ensure the kinds match.
>>
>> Interesting. Do you happen to know how to write this "more interesting
>> default"? I've tried various things, but sadly I can't escape past the
>> typechecker. My attempt that made it the furthest was this:
>>
>>     {-# LANGUAGE GADTs #-}
>>     {-# LANGUAGE PolyKinds #-}
>>     {-# LANGUAGE RankNTypes #-}
>>     {-# LANGUAGE ScopedTypeVariables #-}
>>     {-# LANGUAGE TypeInType #-}
>>     {-# LANGUAGE TypeOperators #-}
>>     module DataCast where
>>
>>     import Data.Data
>>     import Data.Kind (Type)
>>
>>     data T (phantom :: k) = T
>>
>>     dataCast1T :: forall k c t (phantom :: k).
>>                   (Typeable k, Typeable t, Typeable phantom)
>>                => (forall d. Data d => c (t d))
>>                -> Maybe (c (T phantom))
>>     dataCast1T f = case eqT :: Maybe (k :~: Type) of
>>                      Nothing   -> Nothing
>>                      Just Refl -> gcast1 f
>>
>> This would work were it not for the constraints involved:
>>
>>     Could not deduce (Data phantom) arising from a use of ‘f’
>>
>> It seems that applying f is forcing the type parameter to T (phantom)
>> to be a Data instance, which obviously can't happen if (phantom :: k).
>> I don't know a way around this, though, as I'm not aware of a way to
>> "defer" a class constraint (unlike equality constraints, which can be
>> deferred via Typeable).
>>
>> Ryan S.
>>
>> On Thu, Feb 23, 2017 at 5:49 PM, Edward Kmett <ekmett at gmail.com> wrote:
>> > Some thoughts on the topic: admittedly, probably not very useful.
>> >
>> > A couple of obvious statements:
>> >
>> >
>> >
>> > 1)
>> >
>> > gcast1 itself operationally makes sense regardless of the kind of the
>> > argument you're skipping past.
>> >
>> > gcast1 :: forall c t t' a. (Typeable t, Typeable t') => c (t a) -> Maybe
>> > (c
>> > (t' a))
>> >
>> > This is an operation we can define pretty easily, however we want. It
>> > works
>> > more or less by definition. It was included in the original paper.
>> >
>> >
>> >
>> > 2)
>> >
>> > dataCast1 on the other hand is an member of the class, and it needs that
>> > 'Data d' constraint or it can't do its job.
>> >
>> > dataCast1 :: (Data a, Typeable t) => (forall d. Data d => c (t d)) ->
>> > Maybe
>> > (c a)
>> >
>> > Attempting to weaken the Data d constraint there doesn't work, because
>> > then
>> > dataCast1 wouldn't be able to do its job. Without it you can't write
>> > ext1Q
>> > in terms of dataCast1, as noted in the but about 'bogusDataCast' in the
>> > original paper: Scrap More Boilerplate: Reflection, Zips and Generalized
>> > Casts
>> >
>> >
>> >
>> >
>> > Since then we obviously picked up polymorphic kinds, which muddled the
>> > story
>> > about when a data instance is for a type that looks like `T d`.
>> >
>> > On the other hand, the user will be the one calling this method, and
>> > they'll
>> > have to take Data instances for d and use them to generate c (t d).
>> > Knowledge of if argument is of kind * is purely local, though. If t is
>> > Typeable and d is Data, then (t d), d is kind *, t :: * -> *. So to call
>> > dataCast1, the user has to already know the argument is kind *, and that
>> > the
>> > type t that they are trying to use (maybe not the T in the data instance
>> > itself) is of kind * -> *.
>> >
>> > There is no contradiction in trying to call this on a data type with the
>> > wrong kind for it to succeed. T is not necessarily t. It is our job to
>> > check
>> > if it is.
>> >
>> > Supplying the default shouldn't lock our data instance to the form T a.
>> > If
>> > for some reason adding this default would break the instance We can make
>> > a
>> > more interesting default that does something like look at the kind of
>> > the
>> > argument first to determine if it is kind * before proceeding after
>> > casting
>> > to ensure the kinds match.
>> >
>> > tl;dr yes, it would seem it would be worth fixing the instances of Data
>> > produced to supply dataCast1 when the kind of the argument is
>> > polymorphic.
>> > Otherwise turning on PolyKinds in a package will simply break dataCast1
>> > for
>> > basically all of its data types.
>> >
>> > -Edward
>> >
>> > P.S. Ultimately, in a perfect world we'd be able to unify dataCast1 and
>> > dataCast2 with some tricky base case for kind k1 = * and induction over
>> > k ->
>> > k1. Off hand, I don't see how to do it. I played for a while with trying
>> > to
>> > write a higher kinded Data to support this, but my scribbles didn't
>> > cohere
>> > into usable code. gfoldl pretty much locks you into kind *. I even tried
>> > playing with profunctors and powers here to no real avail. I do have
>> > some
>> > old Data1 code in an syb-extras package that I use to write a few
>> > 'impossible' Data instances, but that seems to be solving a different,
>> > if
>> > related, problem, and doesn't scale up to arbitrary kinds. Ideally
>> > there'd
>> > be a plausible Data that makes sense for other kinds k, and everything
>> > could
>> > become dataCast1, w/ dataCast2, and the missing higher versions just an
>> > iterated application of it. I just don't see that there is a way to turn
>> > it
>> > into code.
>> >
>> >
>> > On Thu, Feb 23, 2017 at 2:51 PM, Ryan Scott <ryan.gl.scott at gmail.com>
>> > wrote:
>> >>
>> >> Hi Pedro,
>> >>
>> >> I'm quite confused by a peculiarity of deriving Data (more info in
>> >> Trac #13327 [1]). In particular, if you write this:
>> >>
>> >>     data T phantom = T
>> >>       deriving Data
>> >>
>> >> Then the derived Data instance is NOT this:
>> >>
>> >>     instance Typeable phantom => Data (T phantom) where
>> >>       ...
>> >>
>> >> But instead, it's this:
>> >>
>> >>     instance Data phantom => Data (T phantom) where
>> >>       ...
>> >>       dataCast1 f = gcast1 f
>> >>
>> >> The gcast1 part is why it requires the stronger (Data phantom)
>> >> context, as you noted in Trac #4028 [2].
>> >>
>> >> What confuses me, however, is that is apparently does not carry over
>> >> to poly-kinded datatypes. For instance, if you write this:
>> >>
>> >>     data T (phantom :: k) = T
>> >>       deriving Data
>> >>
>> >> Then you do NOT get this instance:
>> >>
>> >>     instance Data (phantom :: *) => Data (T phantom) where
>> >>       ...
>> >>       dataCast1 f = gcast1 f
>> >>
>> >> But instead, you get this instance!
>> >>
>> >>     instance (Typeable k, Typeable (phantom :: k)) => Data (T phantom)
>> >> where
>> >>       ...
>> >>       -- No implementation for dataCast1
>> >>
>> >> This is quite surprising to me. I'm not knowledgeable enough about
>> >> Data to know for sure if this is an oversight, expected behavior, or
>> >> something else, so I was hoping you (or someone else highly
>> >> knowledgeable about SYB-style generic programming) could help me out
>> >> here.
>> >>
>> >> In particular:
>> >>
>> >> 1. Does emitting "dataCast1 f = gcast1 f" for datatypes of kind (k ->
>> >> *) make sense? Or does it only make sense for types of kind (* -> *)?
>> >> 2. Is there an alternate way to define dataCast1 that doesn't require
>> >> the stronger Data context, but instead only requires the more general
>> >> Typeable context?
>> >>
>> >> Ryan S.
>> >> -----
>> >> [1] https://ghc.haskell.org/trac/ghc/ticket/13327
>> >> [2] https://ghc.haskell.org/trac/ghc/ticket/4028#comment:5
>> >> _______________________________________________
>> >> ghc-devs mailing list
>> >> ghc-devs at haskell.org
>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> >
>> >
>
>


More information about the ghc-devs mailing list