Broken Data.Data instances

Alan & Kim Zimmerman alan.zimm at gmail.com
Wed Aug 13 06:50:25 UTC 2014


And I dipped my toes into the phabricator water, and uploaded a diff to
https://phabricator.haskell.org/D153

I left the lines long for now, so that it is clear that I simply added
parameters to existing type signatures.


On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
wrote:

> Status update
>
> I have worked through a proof of concept update to the GHC AST whereby the
> type is provided as a parameter to each data type. This was basically a
> mechanical process of changing type signatures, and required very little
> actual code changes, being only to initialise the placeholder types.
>
> The enabling types are
>
>
>     type PostTcType = Type        -- Used for slots in the abstract syntax
>                     -- where we want to keep slot for a type
>                     -- to be added by the type checker...but
>                     -- [before typechecking it's just bogus]
>     type PreTcType = ()             -- used before typechecking
>
>
>     class PlaceHolderType a where
>       placeHolderType :: a
>
>     instance PlaceHolderType PostTcType where
>
>       placeHolderType  = panic "Evaluated the place holder for a
> PostTcType"
>
>     instance PlaceHolderType PreTcType where
>       placeHolderType = ()
>
> These are used to replace all instances of PostTcType in the hsSyn types.
>
> The change was applied against HEAD as of last friday, and can be found
> here
>
> https://github.com/alanz/ghc/tree/wip/landmine-param
> https://github.com/alanz/haddock/tree/wip/landmine-param
>
> They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I
> have not tried to validate that yet, have no reason to expect failure.
>
>
> Can I please get some feedback as to whether this is a worthwhile change?
>
> It is the first step to getting a generic traversal safe AST
>
> Regards
>   Alan
>
>
> On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com
> > wrote:
>
>> FYI I edited the paste at http://lpaste.net/108262 to show the problem
>>
>>
>> On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman <
>> alan.zimm at gmail.com> wrote:
>>
>>> I already tried that, the syntax does not seem to allow it.
>>>
>>> I suspect some higher form of sorcery will be required, as alluded to
>>> here
>>> http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
>>>
>>> Alan
>>>
>>>
>>> On Mon, Jul 28, 2014 at 4:55 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>>>
>>>>  Dear Alan,
>>>>
>>>>
>>>>
>>>> I would think you would want to constrain the result, i.e.
>>>>
>>>>
>>>>
>>>> type family (Data (PostTcType a)) => PostTcType a where …
>>>>
>>>>
>>>>
>>>> The Data-instance of ‘a’ doesn’t give you much if you have a
>>>> ‘PostTcType a’.
>>>>
>>>>
>>>>
>>>> Your point about SYB-recognition of WrongPhase is, of course, a good
>>>> one ;)
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Philip
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>>>> *Sent:* maandag 28 juli 2014 14:10
>>>> *To:* Holzenspies, P.K.F. (EWI)
>>>> *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs at haskell.org
>>>>
>>>> *Subject:* Re: Broken Data.Data instances
>>>>
>>>>
>>>>
>>>> Philip
>>>>
>>>> I think the main reason for the WrongPhase thing is to have something
>>>> that explicitly has a Data and Typeable instance, to allow generic (SYB)
>>>> traversal. If we can get by without this so much the better.
>>>>
>>>> On a related note, is there any way to constrain the 'a' in
>>>>
>>>> type family PostTcType a where
>>>>   PostTcType Id    = TcType
>>>>   PostTcType other = WrongPhaseTyp
>>>>
>>>>   to have an instance of Data?
>>>>
>>>> I am experimenting with traversals over my earlier paste, and got stuck
>>>> here (which is the reason the Show instances were commentet out in the
>>>> original).
>>>>
>>>> Alan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jul 28, 2014 at 12:30 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>>>>
>>>> Sorry about that… I’m having it out with my terminal server and the
>>>> server seems to be winning. Here’s another go:
>>>>
>>>>
>>>>
>>>> I always read the () as “there’s nothing meaningful to stick in here,
>>>> but I have to stick in something” so I don’t necessarily want the
>>>> WrongPhase-thing. There is very old commentary stating it would be lovely
>>>> if someone could expose the PostTcType as a parameter of the AST-types, but
>>>> that there are so many types and constructors, that it’s a boring chore to
>>>> do. Actually, I was hoping haRe would come up to speed to be able to do
>>>> this. That being said, I think Simon’s idea to turn PostTcType into a
>>>> type-family is a better way altogether; it also documents intent, i.e. ()
>>>> may not say so much, but PostTcType RdrName says quite a lot.
>>>>
>>>>
>>>>
>>>> Simon commented that a lot of the internal structures aren’t trees, but
>>>> cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just
>>>> and Nothing, which again refer to the TyCon for Maybe. I was wondering
>>>> whether it would be possible to make stateful lenses for this. Of course,
>>>> for specific cases, we could do this, but I wonder if it is also possible
>>>> to have lenses remember the things they visited and not visit them twice.
>>>> Any ideas on this, Edward?
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Philip
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>>>>
>>>> *Sent:* maandag 28 juli 2014 11:14
>>>>
>>>> *To:* Simon Peyton Jones
>>>> *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
>>>>
>>>>
>>>> *Subject:* Re: Broken Data.Data instances
>>>>
>>>>
>>>>
>>>> I have made a conceptual example of this here http://lpaste.net/108262
>>>>
>>>> Alan
>>>>
>>>>
>>>>
>>>> On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman <
>>>> alan.zimm at gmail.com> wrote:
>>>>
>>>> What about creating a specific type with a single constructor for the
>>>> "not relevant to this phase" type to be used instead of () above? That
>>>> would also clearly document what was going on.
>>>>
>>>> Alan
>>>>
>>>>
>>>>
>>>> On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones <
>>>> simonpj at microsoft.com> wrote:
>>>>
>>>> I've had to mangle a bunch of hand-written Data instances and push out
>>>> patches to a dozen packages that used to be built this way before I
>>>> convinced the authors to switch to safer versions of Data. Using virtual
>>>> smart constructors like we do now in containers and Text where needed can
>>>> be used to preserve internal invariants, etc.
>>>>
>>>>
>>>>
>>>> If the “hand grenades” are the PostTcTypes, etc, then I can explain why
>>>> they are there.
>>>>
>>>>
>>>>
>>>> There simply is no sensible type you can put before the type checker
>>>> runs.  For example one of the constructors  in HsExpr is
>>>>
>>>>   | HsMultiIf   PostTcType [LGRHS id (LHsExpr id)]
>>>>
>>>> After type checking we know what type the thing has, but before we have
>>>> no clue.
>>>>
>>>>
>>>>
>>>> We could get around this by saying
>>>>
>>>>             type PostTcType = Maybe TcType
>>>>
>>>> but that would mean that every post-typechecking consumer would need a
>>>> redundant pattern-match on a Just that would always succeed.
>>>>
>>>>
>>>>
>>>> It’s nothing deeper than that.  Adding Maybes everywhere would be
>>>> possible, just clunky.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> However we now have type functions, and HsExpr is parameterised by an
>>>> ‘id’ parameter, which changes from RdrName (after parsing) to Name (after
>>>> renaming) to Id (after typechecking).  So we could do this:
>>>>
>>>>   | HsMultiIf   (PostTcType id) [LGRHS id (LHsExpr id)]
>>>>
>>>> and define PostTcType as a closed type family thus
>>>>
>>>>
>>>>
>>>>      type family PostTcType a where
>>>>
>>>>           PostTcType Id = TcType
>>>>
>>>>           PostTcType other = ()
>>>>
>>>>
>>>>
>>>> That would be better than filling it with bottoms.  But it might not
>>>> help with generic programming, because there’d be a component whose type
>>>> wasn’t fixed.  I have no idea how generics and type functions interact.
>>>>
>>>>
>>>>
>>>> Simon
>>>>
>>>>
>>>>
>>>> *From:* Edward Kmett [mailto:ekmett at gmail.com]
>>>> *Sent:* 27 July 2014 18:27
>>>> *To:* p.k.f.holzenspies at utwente.nl
>>>> *Cc:* alan.zimm at gmail.com; Simon Peyton Jones; ghc-devs
>>>>
>>>>
>>>> *Subject:* Re: Broken Data.Data instances
>>>>
>>>>
>>>>
>>>> Philip, Alan,
>>>>
>>>>
>>>>
>>>> If you need a hand, I'm happy to pitch in guidance.
>>>>
>>>>
>>>>
>>>> I've had to mangle a bunch of hand-written Data instances and push out
>>>> patches to a dozen packages that used to be built this way before I
>>>> convinced the authors to switch to safer versions of Data. Using virtual
>>>> smart constructors like we do now in containers and Text where needed can
>>>> be used to preserve internal invariants, etc.
>>>>
>>>>
>>>>
>>>> This works far better for users of the API than just randomly throwing
>>>> them a live hand grenade. As I recall, these little grenades in generic
>>>> programming over the GHC API have been a constant source of pain for
>>>> libraries like haddock.
>>>>
>>>>
>>>>
>>>> Simon,
>>>>
>>>>
>>>>
>>>> It seems to me that regarding circular data structures, nothing
>>>> prevents you from walking a circular data structure with Data.Data. You can
>>>> generate a new one productively that looks just like the old with the
>>>> contents swapped out, it is indistinguishable to an observer if the fixed
>>>> point is lost, and a clever observer can use observable sharing to get it
>>>> back, supposing that they are allowed to try.
>>>>
>>>>
>>>>
>>>> Alternately, we could use the 'virtual constructor' trick there to
>>>> break the cycle and reintroduce it, but I'm less enthusiastic about that
>>>> idea, even if it is simpler in many ways.
>>>>
>>>>
>>>>
>>>> -Edward
>>>>
>>>>
>>>>
>>>> On Sun, Jul 27, 2014 at 10:17 AM, <p.k.f.holzenspies at utwente.nl> wrote:
>>>>
>>>>  Alan,
>>>>
>>>> In that case, let's have a short feedback-loop between the two of us.
>>>> It seems many of these files (Name.lhs, for example) are really stable
>>>> through the repo-history. It would be nice to have one bigger refactoring
>>>> all in one go (some of the code could use a polish, a lot of code seems
>>>> removable).
>>>>
>>>> Regards,
>>>> Philip
>>>>   ------------------------------
>>>>
>>>> *Van:* Alan & Kim Zimmerman [alan.zimm at gmail.com]
>>>> *Verzonden:* vrijdag 25 juli 2014 13:44
>>>> *Aan:* Simon Peyton Jones
>>>> *CC:* Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org
>>>> *Onderwerp:* Re: Broken Data.Data instances
>>>>
>>>> By the way, I would be happy to attempt this task, if the concept is
>>>> viable.
>>>>
>>>>
>>>>
>>>> On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman <
>>>> alan.zimm at gmail.com> wrote:
>>>>
>>>>    While we are talking about fixing traversals, how about getting rid
>>>> of the phase specific panic initialisers for placeHolderType,
>>>> placeHolderKind and friends?
>>>>
>>>> In order to safely traverse with SYB, the following needs to be
>>>> inserted into all the SYB schemes (see
>>>>
>>>> https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/GhcUtils.hs
>>>> )
>>>>
>>>> -- Check the Typeable items
>>>> checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
>>>> checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ`
>>>> fixity `SYB.extQ` nameSet) x
>>>>   where nameSet     = const (stage `elem` [SYB.Parser,SYB.TypeChecker])
>>>> :: GHC.NameSet       -> Bool
>>>>         postTcType  = const (stage < SYB.TypeChecker                  )
>>>> :: GHC.PostTcType    -> Bool
>>>>         fixity      = const (stage < SYB.Renamer                      )
>>>> :: GHC.Fixity        -> Bool
>>>>
>>>> And in addition HsCmdTop and ParStmtBlock are initialised with explicit
>>>> 'undefined values.
>>>>
>>>> Perhaps use an initialiser that can have its panic turned off when
>>>> called via the GHC API?
>>>>
>>>> Regards
>>>>
>>>>   Alan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones <
>>>> simonpj at microsoft.com> wrote:
>>>>
>>>>    So... does anyone object to me changing these "broken" instances
>>>> with the ones given by DeriveDataTypeable?
>>>>
>>>> That’s fine with me provided (a) the default behaviour is not immediate
>>>> divergence (which it might well be), and (b) the pitfalls are documented.
>>>>
>>>>
>>>>
>>>> Simon
>>>>
>>>>
>>>>
>>>> *From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies at utwente.nl]
>>>>
>>>> *Sent:* 24 July 2014 18:42
>>>> *To:* Simon Peyton Jones
>>>> *Cc:* ghc-devs at haskell.org
>>>> *Subject:* Re: Broken Data.Data instances
>>>>
>>>>
>>>>
>>>> Dear Simon, et al,
>>>>
>>>> These are very good points to make for people writing such traversals
>>>> and queries. I would be more than happy to write a page on the pitfalls
>>>> etc. on the wiki, but in my experience so far, exploring the innards of GHC
>>>> is tremendously helped by trying small things out and showing (bits of) the
>>>> intermediate structures. For me, personally, this has always been hindered
>>>> by the absence of good instances of Data and/or Show (not having to bring
>>>> DynFlags and not just visualising with the pretty printer are very helpful).
>>>>
>>>> So... does anyone object to me changing these "broken" instances with
>>>> the ones given by DeriveDataTypeable?
>>>>
>>>> Also, many of these internal data structures could be provided with
>>>> useful lenses to improve such traversals further. Anyone ever go at that?
>>>> Would be people be interested?
>>>>
>>>> Regards,
>>>> Philip
>>>>
>>>>     *Simon Peyton Jones* <simonpj at microsoft.com>
>>>>
>>>> 24 Jul 2014 18:22
>>>>
>>>> GHC’s data structures are often mutually recursive. e.g.
>>>>
>>>> ·        The TyCon for Maybe contains the DataCon for Just
>>>>
>>>> ·        The DataCon For just contains Just’s type
>>>>
>>>> ·        Just’s type contains the TyCon for Maybe
>>>>
>>>>
>>>>
>>>> So any attempt to recursively walk over all these structures, as you
>>>> would a tree, will fail.
>>>>
>>>>
>>>>
>>>> Also there’s a lot of sharing.  For example, every occurrence of ‘map’
>>>> is a Var, and inside that Var is map’s type, its strictness, its rewrite
>>>> RULE, etc etc.  In walking over a term you may not want to walk over all
>>>> that stuff at every occurrence of map.
>>>>
>>>>
>>>>
>>>> Maybe that’s it; I’m not certain since I did not write the Data
>>>> instances for any of GHC’s types
>>>>
>>>>
>>>>
>>>> Simon
>>>>
>>>>
>>>>
>>>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org
>>>> <ghc-devs-bounces at haskell.org>] *On Behalf Of *
>>>> p.k.f.holzenspies at utwente.nl
>>>> *Sent:* 24 July 2014 16:42
>>>> *To:* ghc-devs at haskell.org
>>>> *Subject:* Broken Data.Data instances
>>>>
>>>>
>>>>
>>>> Dear GHC-ers,
>>>>
>>>>
>>>>
>>>> Is there a reason for explicitly broken Data.Data instances? Case in
>>>> point:
>>>>
>>>>
>>>>
>>>> > instance Data Var where
>>>>
>>>> >   -- don't traverse?
>>>>
>>>> >   toConstr _   = abstractConstr "Var"
>>>>
>>>> >   gunfold _ _  = error "gunfold"
>>>>
>>>> >   dataTypeOf _ = mkNoRepType "Var"
>>>>
>>>>
>>>>
>>>> I understand (vaguely) arguments about abstract data types, but this
>>>> also excludes convenient queries that can, e.g. extract all types from a
>>>> CoreExpr. I had hoped to do stuff like this:
>>>>
>>>>
>>>>
>>>> > collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
>>>>
>>>> > collect = everything mplus $ mkQ mzero return
>>>>
>>>> >
>>>>
>>>> > allTypes :: CoreExpr -> [Type]
>>>>
>>>> > allTypes = collect
>>>>
>>>>
>>>>
>>>> Especially when still exploring (parts of) the GHC API, being able to
>>>> extract things in this fashion is very helpful. SYB’s “everything” being
>>>> broken by these instances, not so much.
>>>>
>>>>
>>>>
>>>> Would a patch “fixing” these instances be acceptable?
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Philip
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ghc-devs mailing list
>>>> ghc-devs at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ghc-devs mailing list
>>>> ghc-devs at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140813/18e56d16/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 1247 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140813/18e56d16/attachment-0001.jpg>


More information about the ghc-devs mailing list