Broken Data.Data instances
Alan & Kim Zimmerman
alan.zimm at gmail.com
Fri Aug 15 15:36:15 UTC 2014
Did you look at https://phabricator.haskell.org/D157? It superseded
https://phabricator.haskell.org/D153
On Fri, Aug 15, 2014 at 5:17 PM, Simon Peyton Jones <simonpj at microsoft.com>
wrote:
> Eek. Glancing at this I see that every single data type has an extra
> type parameter. To me this feels like a sledgehammer to crack a nut. What
> is wrong with the type-function approach?
>
>
>
> Simon
>
>
>
> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
> *Sent:* 13 August 2014 07:50
> *To:* Philip K.F. Hölzenspies
>
> *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs at haskell.org
> *Subject:* Re: Broken Data.Data instances
>
>
>
> And I dipped my toes into the phabricator water, and uploaded a diff to
> https://phabricator.haskell.org/D153
>
> I left the lines long for now, so that it is clear that I simply added
> parameters to existing type signatures.
>
>
>
> On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman <
> alan.zimm at gmail.com> wrote:
>
> Status update
>
> I have worked through a proof of concept update to the GHC AST whereby the
> type is provided as a parameter to each data type. This was basically a
> mechanical process of changing type signatures, and required very little
> actual code changes, being only to initialise the placeholder types.
>
> The enabling types are
>
> type PostTcType = Type -- Used for slots in the abstract
> syntax
> -- where we want to keep slot for a type
> -- to be added by the type checker...but
> -- [before typechecking it's just bogus]
>
> type PreTcType = () -- used before typechecking
>
>
> class PlaceHolderType a where
> placeHolderType :: a
>
> instance PlaceHolderType PostTcType where
>
>
> placeHolderType = panic "Evaluated the place holder for a
> PostTcType"
>
> instance PlaceHolderType PreTcType where
> placeHolderType = ()
>
> These are used to replace all instances of PostTcType in the hsSyn types.
>
> The change was applied against HEAD as of last friday, and can be found
> here
>
> https://github.com/alanz/ghc/tree/wip/landmine-param
> https://github.com/alanz/haddock/tree/wip/landmine-param
>
> They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I
> have not tried to validate that yet, have no reason to expect failure.
>
> Can I please get some feedback as to whether this is a worthwhile
> change?
>
>
> It is the first step to getting a generic traversal safe AST
>
> Regards
>
> Alan
>
>
>
> On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
> wrote:
>
> FYI I edited the paste at http://lpaste.net/108262 to show the problem
>
>
>
> On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
> wrote:
>
> I already tried that, the syntax does not seem to allow it.
>
> I suspect some higher form of sorcery will be required, as alluded to here
> http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
>
> Alan
>
>
>
> On Mon, Jul 28, 2014 at 4:55 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>
> Dear Alan,
>
>
>
> I would think you would want to constrain the result, i.e.
>
>
>
> type family (Data (PostTcType a)) => PostTcType a where …
>
>
>
> The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType
> a’.
>
>
>
> Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
>
>
>
> Regards,
>
> Philip
>
>
>
>
>
>
>
> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
> *Sent:* maandag 28 juli 2014 14:10
> *To:* Holzenspies, P.K.F. (EWI)
> *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs at haskell.org
>
>
> *Subject:* Re: Broken Data.Data instances
>
>
>
> Philip
>
> I think the main reason for the WrongPhase thing is to have something that
> explicitly has a Data and Typeable instance, to allow generic (SYB)
> traversal. If we can get by without this so much the better.
>
> On a related note, is there any way to constrain the 'a' in
>
> type family PostTcType a where
> PostTcType Id = TcType
> PostTcType other = WrongPhaseTyp
>
> to have an instance of Data?
>
> I am experimenting with traversals over my earlier paste, and got stuck
> here (which is the reason the Show instances were commentet out in the
> original).
>
> Alan
>
>
>
>
>
> On Mon, Jul 28, 2014 at 12:30 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>
> Sorry about that… I’m having it out with my terminal server and the server
> seems to be winning. Here’s another go:
>
>
>
> I always read the () as “there’s nothing meaningful to stick in here, but
> I have to stick in something” so I don’t necessarily want the
> WrongPhase-thing. There is very old commentary stating it would be lovely
> if someone could expose the PostTcType as a parameter of the AST-types, but
> that there are so many types and constructors, that it’s a boring chore to
> do. Actually, I was hoping haRe would come up to speed to be able to do
> this. That being said, I think Simon’s idea to turn PostTcType into a
> type-family is a better way altogether; it also documents intent, i.e. ()
> may not say so much, but PostTcType RdrName says quite a lot.
>
>
>
> Simon commented that a lot of the internal structures aren’t trees, but
> cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just
> and Nothing, which again refer to the TyCon for Maybe. I was wondering
> whether it would be possible to make stateful lenses for this. Of course,
> for specific cases, we could do this, but I wonder if it is also possible
> to have lenses remember the things they visited and not visit them twice.
> Any ideas on this, Edward?
>
>
>
> Regards,
>
> Philip
>
>
>
>
>
>
>
>
>
>
>
> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>
> *Sent:* maandag 28 juli 2014 11:14
>
> *To:* Simon Peyton Jones
> *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
>
>
> *Subject:* Re: Broken Data.Data instances
>
>
>
> I have made a conceptual example of this here http://lpaste.net/108262
>
> Alan
>
>
>
> On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
> wrote:
>
> What about creating a specific type with a single constructor for the "not
> relevant to this phase" type to be used instead of () above? That would
> also clearly document what was going on.
>
> Alan
>
>
>
> On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones <simonpj at microsoft.com>
> wrote:
>
> I've had to mangle a bunch of hand-written Data instances and push out
> patches to a dozen packages that used to be built this way before I
> convinced the authors to switch to safer versions of Data. Using virtual
> smart constructors like we do now in containers and Text where needed can
> be used to preserve internal invariants, etc.
>
>
>
> If the “hand grenades” are the PostTcTypes, etc, then I can explain why
> they are there.
>
>
>
> There simply is no sensible type you can put before the type checker
> runs. For example one of the constructors in HsExpr is
>
> | HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
>
> After type checking we know what type the thing has, but before we have no
> clue.
>
>
>
> We could get around this by saying
>
> type PostTcType = Maybe TcType
>
> but that would mean that every post-typechecking consumer would need a
> redundant pattern-match on a Just that would always succeed.
>
>
>
> It’s nothing deeper than that. Adding Maybes everywhere would be
> possible, just clunky.
>
>
>
>
>
> However we now have type functions, and HsExpr is parameterised by an ‘id’
> parameter, which changes from RdrName (after parsing) to Name (after
> renaming) to Id (after typechecking). So we could do this:
>
> | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
>
> and define PostTcType as a closed type family thus
>
>
>
> type family PostTcType a where
>
> PostTcType Id = TcType
>
> PostTcType other = ()
>
>
>
> That would be better than filling it with bottoms. But it might not help
> with generic programming, because there’d be a component whose type wasn’t
> fixed. I have no idea how generics and type functions interact.
>
>
>
> Simon
>
>
>
> *From:* Edward Kmett [mailto:ekmett at gmail.com]
> *Sent:* 27 July 2014 18:27
> *To:* p.k.f.holzenspies at utwente.nl
> *Cc:* alan.zimm at gmail.com; Simon Peyton Jones; ghc-devs
>
>
> *Subject:* Re: Broken Data.Data instances
>
>
>
> Philip, Alan,
>
>
>
> If you need a hand, I'm happy to pitch in guidance.
>
>
>
> I've had to mangle a bunch of hand-written Data instances and push out
> patches to a dozen packages that used to be built this way before I
> convinced the authors to switch to safer versions of Data. Using virtual
> smart constructors like we do now in containers and Text where needed can
> be used to preserve internal invariants, etc.
>
>
>
> This works far better for users of the API than just randomly throwing
> them a live hand grenade. As I recall, these little grenades in generic
> programming over the GHC API have been a constant source of pain for
> libraries like haddock.
>
>
>
> Simon,
>
>
>
> It seems to me that regarding circular data structures, nothing prevents
> you from walking a circular data structure with Data.Data. You can generate
> a new one productively that looks just like the old with the contents
> swapped out, it is indistinguishable to an observer if the fixed point is
> lost, and a clever observer can use observable sharing to get it back,
> supposing that they are allowed to try.
>
>
>
> Alternately, we could use the 'virtual constructor' trick there to break
> the cycle and reintroduce it, but I'm less enthusiastic about that idea,
> even if it is simpler in many ways.
>
>
>
> -Edward
>
>
>
> On Sun, Jul 27, 2014 at 10:17 AM, <p.k.f.holzenspies at utwente.nl> wrote:
>
> Alan,
>
> In that case, let's have a short feedback-loop between the two of us. It
> seems many of these files (Name.lhs, for example) are really stable through
> the repo-history. It would be nice to have one bigger refactoring all in
> one go (some of the code could use a polish, a lot of code seems removable).
>
> Regards,
> Philip
> ------------------------------
>
> *Van:* Alan & Kim Zimmerman [alan.zimm at gmail.com]
> *Verzonden:* vrijdag 25 juli 2014 13:44
> *Aan:* Simon Peyton Jones
> *CC:* Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org
> *Onderwerp:* Re: Broken Data.Data instances
>
> By the way, I would be happy to attempt this task, if the concept is
> viable.
>
>
>
> On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman <
> alan.zimm at gmail.com> wrote:
>
> While we are talking about fixing traversals, how about getting rid of
> the phase specific panic initialisers for placeHolderType, placeHolderKind
> and friends?
>
> In order to safely traverse with SYB, the following needs to be inserted
> into all the SYB schemes (see
>
> https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/GhcUtils.hs
> )
>
> -- Check the Typeable items
> checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
> checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ`
> fixity `SYB.extQ` nameSet) x
> where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) ::
> GHC.NameSet -> Bool
> postTcType = const (stage < SYB.TypeChecker ) ::
> GHC.PostTcType -> Bool
> fixity = const (stage < SYB.Renamer ) ::
> GHC.Fixity -> Bool
>
> And in addition HsCmdTop and ParStmtBlock are initialised with explicit
> 'undefined values.
>
> Perhaps use an initialiser that can have its panic turned off when called
> via the GHC API?
>
> Regards
>
> Alan
>
>
>
>
>
> On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones <
> simonpj at microsoft.com> wrote:
>
> So... does anyone object to me changing these "broken" instances with
> the ones given by DeriveDataTypeable?
>
> That’s fine with me provided (a) the default behaviour is not immediate
> divergence (which it might well be), and (b) the pitfalls are documented.
>
>
>
> Simon
>
>
>
> *From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies at utwente.nl]
> *Sent:* 24 July 2014 18:42
> *To:* Simon Peyton Jones
> *Cc:* ghc-devs at haskell.org
> *Subject:* Re: Broken Data.Data instances
>
>
>
> Dear Simon, et al,
>
> These are very good points to make for people writing such traversals and
> queries. I would be more than happy to write a page on the pitfalls etc. on
> the wiki, but in my experience so far, exploring the innards of GHC is
> tremendously helped by trying small things out and showing (bits of) the
> intermediate structures. For me, personally, this has always been hindered
> by the absence of good instances of Data and/or Show (not having to bring
> DynFlags and not just visualising with the pretty printer are very helpful).
>
> So... does anyone object to me changing these "broken" instances with the
> ones given by DeriveDataTypeable?
>
> Also, many of these internal data structures could be provided with useful
> lenses to improve such traversals further. Anyone ever go at that? Would be
> people be interested?
>
> Regards,
> Philip
>
> *Simon Peyton Jones* <simonpj at microsoft.com>
>
> 24 Jul 2014 18:22
>
> GHC’s data structures are often mutually recursive. e.g.
>
> · The TyCon for Maybe contains the DataCon for Just
>
> · The DataCon For just contains Just’s type
>
> · Just’s type contains the TyCon for Maybe
>
>
>
> So any attempt to recursively walk over all these structures, as you would
> a tree, will fail.
>
>
>
> Also there’s a lot of sharing. For example, every occurrence of ‘map’ is
> a Var, and inside that Var is map’s type, its strictness, its rewrite RULE,
> etc etc. In walking over a term you may not want to walk over all that
> stuff at every occurrence of map.
>
>
>
> Maybe that’s it; I’m not certain since I did not write the Data instances
> for any of GHC’s types
>
>
>
> Simon
>
>
>
> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org
> <ghc-devs-bounces at haskell.org>] *On Behalf Of *
> p.k.f.holzenspies at utwente.nl
> *Sent:* 24 July 2014 16:42
> *To:* ghc-devs at haskell.org
> *Subject:* Broken Data.Data instances
>
>
>
> Dear GHC-ers,
>
>
>
> Is there a reason for explicitly broken Data.Data instances? Case in point:
>
>
>
> > instance Data Var where
>
> > -- don't traverse?
>
> > toConstr _ = abstractConstr "Var"
>
> > gunfold _ _ = error "gunfold"
>
> > dataTypeOf _ = mkNoRepType "Var"
>
>
>
> I understand (vaguely) arguments about abstract data types, but this also
> excludes convenient queries that can, e.g. extract all types from a
> CoreExpr. I had hoped to do stuff like this:
>
>
>
> > collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
>
> > collect = everything mplus $ mkQ mzero return
>
> >
>
> > allTypes :: CoreExpr -> [Type]
>
> > allTypes = collect
>
>
>
> Especially when still exploring (parts of) the GHC API, being able to
> extract things in this fashion is very helpful. SYB’s “everything” being
> broken by these instances, not so much.
>
>
>
> Would a patch “fixing” these instances be acceptable?
>
>
>
> Regards,
>
> Philip
>
>
>
>
>
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>
>
>
>
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140815/bc21baf3/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 1247 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140815/bc21baf3/attachment-0001.jpg>
More information about the ghc-devs
mailing list