Resources on how to implement (Haskell 98) kind checking?
Richard Eisenberg
lists at richarde.dev
Wed Oct 27 19:15:02 UTC 2021
> On Oct 27, 2021, at 9:36 AM, Benjamin Redelings <benjamin.redelings at gmail.com> wrote:
>>
>> This won't work.
>>
>> class C a where
>> meth :: a b -> b Int
>>
>> You have to know the kind of local b to learn the kind of class-variable a. So you have to do it all at once.
> I was doing it all at once -- but I wasn't sure how to export the information about 'b' from the procedure. (After you record the kinds of the typecons like C, I believe the different typecons in the recursive group become separable.)
>
> I presume (in retrospect) that GHC modifies the declaration to record the kind of b, and then re-walks the declaration to substitute kind variables later?
>
That would be smart, but it's not what GHC does. Instead, GHC first says (a :: kappa), for a fresh unification variable kappa. Then, GHC determines (b :: Type -> Type) and thus unifies kappa := (Type -> Type) -> Type. GHC then *throws away* the information about b. (See the `_ <- ...` at https://gitlab.haskell.org/ghc/ghc/-/blob/638f65482ca5265c268aa97abfcc14cdc27e46ba/compiler/GHC/Tc/Gen/HsType.hs#L388) Having now determined conclusively what the kind of `a` is, GHC will later re-check the kind of the type of meth, recording it for good.
> But if that is a difficulty on the right road, I will just go for it.
>
If zonking is a difficulty, there's probably something wrong somewhere. Lengthy, yes, but not difficult. If you're using Data.Data, I imagine you could implement zonking with just the right use of Data.Generics.Schemes.everywhere.
> Oh, I forgot to add, would it make sense to put some of the information I discovered about implementing kind checking into the wiki somewhere?
>
This is an excellent idea, but I'll counterpropose that you write this in a Note. Indeed, Note [Kind checking for type and class decls] in GHC.Tc.TyCl is probably meant to be the Note you're looking for, but it's too short. Feel free to flesh it out with all of these details. Then, make sure that the functions that implement the details refer to the Note (and vice versa). This will help it to stay up-to-date -- much more so than putting it in the wiki.
> It might also help to reference the relevant papers (mostly the PolyKinds paper), and maybe also to mention papers like the THIH paper that don't actually implement kind checking.
>
Yes, refer to papers. But please do use their proper names (years and conference names are also helpful). It actually took some inference for me to figure out what the "PolyKinds paper" was in your emails. (The paper that proposes the extension that became PolyKinds is "Giving Haskell a Promotion", TLDI'12, and the paper that describes its extension to work more generally is "System FC with Explicit Kind Equality", ICFP'13. I thought you were referring to one of these -- not to "Kind Inference for Datatypes".)
Thanks!
Richard
> -BenRI
>
> On 10/15/21 1:37 PM, Richard Eisenberg wrote:
>>
>>
>>> On Oct 14, 2021, at 11:59 AM, Benjamin Redelings <benjamin.redelings at gmail.com <mailto:benjamin.redelings at gmail.com>> wrote:
>>>
>>> I asked about this on Haskell-Cafe, and was recommended to ask here instead. Any help is much appreciated!
>>>
>>
>> I saw your post over there, but haven't had time to respond.... but this retelling of the story makes it easier to respond, so I'll do so here.
>>
>>> * The PolyKinds paper was the most helpful thing I've found, but it doesn't cover type classes. I'm also not sure that all implementers can follow algorithm descriptions that are laid out as inference rules, but maybe that could be fixed with a few hints about how to run the rules in reverse. Also, in practice I think an implementer would want to follow GHC in specifying the initial kind of a data type as k1 -> k2 -> ... kn -> *.
>>>
>>
>> What is unique about type classes? It seems like you're worried about locally quantified type variables in method types, but (as far as kind inference is concerned) those are very much like existential variables in data constructors. So perhaps take the bit about existential variables from the PolyKinds part of that paper and combine it with the Haskell98 part.
>>
>> It's true that many implementors may find the notation in that paper to be a barrier, but you just have to read the rules clockwise, starting from the bottom left and ending on the bottom right. :)
>>>
>>> 2. The following question (which I have maybe kind of answered now, but could use more advice on) is an example of what I am hoping such documentation would explain:
>>>
>>>
>>>> Q: How do you handle type variables that are present in class methods, but are not type class parameters? If there are multiple types/classes in a single recursive group, the kind of such type variables might not be fully resolved until a later type-or-class is processed. Is there a recommended approach?
>>>>
>>>> I can see two ways to proceed:
>>>>
>>>> i) First determine the kinds of all the data types, classes, and type synonyms. Then perform a second pass over each type or class to determine the kinds of type variables (in class methods) that are not type class parameters.
>>
>> This won't work.
>>
>> class C a where
>> meth :: a b -> b Int
>>
>> You have to know the kind of local b to learn the kind of class-variable a. So you have to do it all at once.
>>
>>>>
>>>> ii) Alternatively, record the kind of each type variable as it is encountered -- even though such kinds may contain unification kind variables. After visiting all types-or-classes in the recursive group, replace any kind variables with their definition, or with a * if there is no definition.
>>>>
>>>> I've currently implement approach i), which requires doing kind inference on class methods twice.
>>>>
>>> Further investigation revealed that GHC takes yet another approach (I think):
>>>
>>> iii) Represent kinds with modifiable variables. Substitution can be implemented by modifying kind variables in-place. This is (I think) called "zonking" in the GHC sources.
>>
>> I don't really see the difference between (ii) and (iii). Maybe (ii) records the kinds in a table somewhere, while (iii) records them "in" the kind variables themselves, but that's not so different, I think.
>>
>>>
>>> This solves a small mystery for me, since I previously thought that zonking was just replacing remaining kind variables with '*'. And indeed this seems to be an example of zonking, but not what zonking is (I think).
>>
>> We can imagine that, instead of mutation, we build a substitution mapping unification variables to types (or kinds). This would be stored just as a simple mapping or dictionary structure. No mutation. As we learn about a unification variable, we just add to the mapping. If we did this, zonking would be the act of applying the substitution, replacing known unification variables with their values. It just so happens that GHC builds a mapping by using mutable cells in memory, but that's just an implementation detail: zonking is still just applying the substitution.
>>
>> Zonking does not replace anything with *. Well, functions that have "zonk" in their name may do this. But it is not really logically part of the zonking operation. If you like, you can pretend that, after zonking a program, we take a separate pass replacing any yet-unfilled kind-level unification variables with *. Sometimes, this is called "zapping" in GHC, I believe.
>>
>>>
>>> Zonking looks painful to implement, but approach (i) might require multiple passes over types to update the kind of type variables, which might be worse...
>>
>> Zonking is a bit laborious to implement, but not painful. Laborious, because it requires a full pass over the AST. Not painful, because all it's trying to do is replace type/kind variables with substitutions: each individual piece of the puzzle is quite simple.
>>
>> I hope this is helpful!
>> Richard
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20211027/932cb454/attachment.html>
More information about the ghc-devs
mailing list