Broken Data.Data instances

Alan & Kim Zimmerman alan.zimm at gmail.com
Mon Jul 28 15:45:48 UTC 2014


FYI I edited the paste at http://lpaste.net/108262 to show the problem


On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
wrote:

> I already tried that, the syntax does not seem to allow it.
>
> I suspect some higher form of sorcery will be required, as alluded to here
> http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
>
> Alan
>
>
> On Mon, Jul 28, 2014 at 4:55 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>
>>  Dear Alan,
>>
>>
>>
>> I would think you would want to constrain the result, i.e.
>>
>>
>>
>> type family (Data (PostTcType a)) => PostTcType a where …
>>
>>
>>
>> The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType
>> a’.
>>
>>
>>
>> Your point about SYB-recognition of WrongPhase is, of course, a good one
>> ;)
>>
>>
>>
>> Regards,
>>
>> Philip
>>
>>
>>
>>
>>
>>
>>
>> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>> *Sent:* maandag 28 juli 2014 14:10
>> *To:* Holzenspies, P.K.F. (EWI)
>> *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs at haskell.org
>>
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> Philip
>>
>> I think the main reason for the WrongPhase thing is to have something
>> that explicitly has a Data and Typeable instance, to allow generic (SYB)
>> traversal. If we can get by without this so much the better.
>>
>> On a related note, is there any way to constrain the 'a' in
>>
>> type family PostTcType a where
>>   PostTcType Id    = TcType
>>   PostTcType other = WrongPhaseTyp
>>
>>   to have an instance of Data?
>>
>> I am experimenting with traversals over my earlier paste, and got stuck
>> here (which is the reason the Show instances were commentet out in the
>> original).
>>
>> Alan
>>
>>
>>
>>
>>
>> On Mon, Jul 28, 2014 at 12:30 PM, <p.k.f.holzenspies at utwente.nl> wrote:
>>
>> Sorry about that… I’m having it out with my terminal server and the
>> server seems to be winning. Here’s another go:
>>
>>
>>
>> I always read the () as “there’s nothing meaningful to stick in here, but
>> I have to stick in something” so I don’t necessarily want the
>> WrongPhase-thing. There is very old commentary stating it would be lovely
>> if someone could expose the PostTcType as a parameter of the AST-types, but
>> that there are so many types and constructors, that it’s a boring chore to
>> do. Actually, I was hoping haRe would come up to speed to be able to do
>> this. That being said, I think Simon’s idea to turn PostTcType into a
>> type-family is a better way altogether; it also documents intent, i.e. ()
>> may not say so much, but PostTcType RdrName says quite a lot.
>>
>>
>>
>> Simon commented that a lot of the internal structures aren’t trees, but
>> cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just
>> and Nothing, which again refer to the TyCon for Maybe. I was wondering
>> whether it would be possible to make stateful lenses for this. Of course,
>> for specific cases, we could do this, but I wonder if it is also possible
>> to have lenses remember the things they visited and not visit them twice.
>> Any ideas on this, Edward?
>>
>>
>>
>> Regards,
>>
>> Philip
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>>
>> *Sent:* maandag 28 juli 2014 11:14
>>
>> *To:* Simon Peyton Jones
>> *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
>>
>>
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> I have made a conceptual example of this here http://lpaste.net/108262
>>
>> Alan
>>
>>
>>
>> On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman <
>> alan.zimm at gmail.com> wrote:
>>
>> What about creating a specific type with a single constructor for the
>> "not relevant to this phase" type to be used instead of () above? That
>> would also clearly document what was going on.
>>
>> Alan
>>
>>
>>
>> On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones <
>> simonpj at microsoft.com> wrote:
>>
>> I've had to mangle a bunch of hand-written Data instances and push out
>> patches to a dozen packages that used to be built this way before I
>> convinced the authors to switch to safer versions of Data. Using virtual
>> smart constructors like we do now in containers and Text where needed can
>> be used to preserve internal invariants, etc.
>>
>>
>>
>> If the “hand grenades” are the PostTcTypes, etc, then I can explain why
>> they are there.
>>
>>
>>
>> There simply is no sensible type you can put before the type checker
>> runs.  For example one of the constructors  in HsExpr is
>>
>>   | HsMultiIf   PostTcType [LGRHS id (LHsExpr id)]
>>
>> After type checking we know what type the thing has, but before we have
>> no clue.
>>
>>
>>
>> We could get around this by saying
>>
>>             type PostTcType = Maybe TcType
>>
>> but that would mean that every post-typechecking consumer would need a
>> redundant pattern-match on a Just that would always succeed.
>>
>>
>>
>> It’s nothing deeper than that.  Adding Maybes everywhere would be
>> possible, just clunky.
>>
>>
>>
>>
>>
>> However we now have type functions, and HsExpr is parameterised by an
>> ‘id’ parameter, which changes from RdrName (after parsing) to Name (after
>> renaming) to Id (after typechecking).  So we could do this:
>>
>>   | HsMultiIf   (PostTcType id) [LGRHS id (LHsExpr id)]
>>
>> and define PostTcType as a closed type family thus
>>
>>
>>
>>      type family PostTcType a where
>>
>>           PostTcType Id = TcType
>>
>>           PostTcType other = ()
>>
>>
>>
>> That would be better than filling it with bottoms.  But it might not help
>> with generic programming, because there’d be a component whose type wasn’t
>> fixed.  I have no idea how generics and type functions interact.
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* Edward Kmett [mailto:ekmett at gmail.com]
>> *Sent:* 27 July 2014 18:27
>> *To:* p.k.f.holzenspies at utwente.nl
>> *Cc:* alan.zimm at gmail.com; Simon Peyton Jones; ghc-devs
>>
>>
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> Philip, Alan,
>>
>>
>>
>> If you need a hand, I'm happy to pitch in guidance.
>>
>>
>>
>> I've had to mangle a bunch of hand-written Data instances and push out
>> patches to a dozen packages that used to be built this way before I
>> convinced the authors to switch to safer versions of Data. Using virtual
>> smart constructors like we do now in containers and Text where needed can
>> be used to preserve internal invariants, etc.
>>
>>
>>
>> This works far better for users of the API than just randomly throwing
>> them a live hand grenade. As I recall, these little grenades in generic
>> programming over the GHC API have been a constant source of pain for
>> libraries like haddock.
>>
>>
>>
>> Simon,
>>
>>
>>
>> It seems to me that regarding circular data structures, nothing prevents
>> you from walking a circular data structure with Data.Data. You can generate
>> a new one productively that looks just like the old with the contents
>> swapped out, it is indistinguishable to an observer if the fixed point is
>> lost, and a clever observer can use observable sharing to get it back,
>> supposing that they are allowed to try.
>>
>>
>>
>> Alternately, we could use the 'virtual constructor' trick there to break
>> the cycle and reintroduce it, but I'm less enthusiastic about that idea,
>> even if it is simpler in many ways.
>>
>>
>>
>> -Edward
>>
>>
>>
>> On Sun, Jul 27, 2014 at 10:17 AM, <p.k.f.holzenspies at utwente.nl> wrote:
>>
>>  Alan,
>>
>> In that case, let's have a short feedback-loop between the two of us. It
>> seems many of these files (Name.lhs, for example) are really stable through
>> the repo-history. It would be nice to have one bigger refactoring all in
>> one go (some of the code could use a polish, a lot of code seems removable).
>>
>> Regards,
>> Philip
>>   ------------------------------
>>
>> *Van:* Alan & Kim Zimmerman [alan.zimm at gmail.com]
>> *Verzonden:* vrijdag 25 juli 2014 13:44
>> *Aan:* Simon Peyton Jones
>> *CC:* Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org
>> *Onderwerp:* Re: Broken Data.Data instances
>>
>> By the way, I would be happy to attempt this task, if the concept is
>> viable.
>>
>>
>>
>> On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman <
>> alan.zimm at gmail.com> wrote:
>>
>>    While we are talking about fixing traversals, how about getting rid
>> of the phase specific panic initialisers for placeHolderType,
>> placeHolderKind and friends?
>>
>> In order to safely traverse with SYB, the following needs to be inserted
>> into all the SYB schemes (see
>>
>> https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/GhcUtils.hs
>> )
>>
>> -- Check the Typeable items
>> checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
>> checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ`
>> fixity `SYB.extQ` nameSet) x
>>   where nameSet     = const (stage `elem` [SYB.Parser,SYB.TypeChecker])
>> :: GHC.NameSet       -> Bool
>>         postTcType  = const (stage < SYB.TypeChecker                  )
>> :: GHC.PostTcType    -> Bool
>>         fixity      = const (stage < SYB.Renamer                      )
>> :: GHC.Fixity        -> Bool
>>
>> And in addition HsCmdTop and ParStmtBlock are initialised with explicit
>> 'undefined values.
>>
>> Perhaps use an initialiser that can have its panic turned off when called
>> via the GHC API?
>>
>> Regards
>>
>>   Alan
>>
>>
>>
>>
>>
>> On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones <
>> simonpj at microsoft.com> wrote:
>>
>>    So... does anyone object to me changing these "broken" instances with
>> the ones given by DeriveDataTypeable?
>>
>> That’s fine with me provided (a) the default behaviour is not immediate
>> divergence (which it might well be), and (b) the pitfalls are documented.
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies at utwente.nl]
>> *Sent:* 24 July 2014 18:42
>> *To:* Simon Peyton Jones
>> *Cc:* ghc-devs at haskell.org
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> Dear Simon, et al,
>>
>> These are very good points to make for people writing such traversals and
>> queries. I would be more than happy to write a page on the pitfalls etc. on
>> the wiki, but in my experience so far, exploring the innards of GHC is
>> tremendously helped by trying small things out and showing (bits of) the
>> intermediate structures. For me, personally, this has always been hindered
>> by the absence of good instances of Data and/or Show (not having to bring
>> DynFlags and not just visualising with the pretty printer are very helpful).
>>
>> So... does anyone object to me changing these "broken" instances with the
>> ones given by DeriveDataTypeable?
>>
>> Also, many of these internal data structures could be provided with
>> useful lenses to improve such traversals further. Anyone ever go at that?
>> Would be people be interested?
>>
>> Regards,
>> Philip
>>
>>     *Simon Peyton Jones* <simonpj at microsoft.com>
>>
>> 24 Jul 2014 18:22
>>
>> GHC’s data structures are often mutually recursive. e.g.
>>
>> ·        The TyCon for Maybe contains the DataCon for Just
>>
>> ·        The DataCon For just contains Just’s type
>>
>> ·        Just’s type contains the TyCon for Maybe
>>
>>
>>
>> So any attempt to recursively walk over all these structures, as you
>> would a tree, will fail.
>>
>>
>>
>> Also there’s a lot of sharing.  For example, every occurrence of ‘map’ is
>> a Var, and inside that Var is map’s type, its strictness, its rewrite RULE,
>> etc etc.  In walking over a term you may not want to walk over all that
>> stuff at every occurrence of map.
>>
>>
>>
>> Maybe that’s it; I’m not certain since I did not write the Data instances
>> for any of GHC’s types
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org
>> <ghc-devs-bounces at haskell.org>] *On Behalf Of *
>> p.k.f.holzenspies at utwente.nl
>> *Sent:* 24 July 2014 16:42
>> *To:* ghc-devs at haskell.org
>> *Subject:* Broken Data.Data instances
>>
>>
>>
>> Dear GHC-ers,
>>
>>
>>
>> Is there a reason for explicitly broken Data.Data instances? Case in
>> point:
>>
>>
>>
>> > instance Data Var where
>>
>> >   -- don't traverse?
>>
>> >   toConstr _   = abstractConstr "Var"
>>
>> >   gunfold _ _  = error "gunfold"
>>
>> >   dataTypeOf _ = mkNoRepType "Var"
>>
>>
>>
>> I understand (vaguely) arguments about abstract data types, but this also
>> excludes convenient queries that can, e.g. extract all types from a
>> CoreExpr. I had hoped to do stuff like this:
>>
>>
>>
>> > collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
>>
>> > collect = everything mplus $ mkQ mzero return
>>
>> >
>>
>> > allTypes :: CoreExpr -> [Type]
>>
>> > allTypes = collect
>>
>>
>>
>> Especially when still exploring (parts of) the GHC API, being able to
>> extract things in this fashion is very helpful. SYB’s “everything” being
>> broken by these instances, not so much.
>>
>>
>>
>> Would a patch “fixing” these instances be acceptable?
>>
>>
>>
>> Regards,
>>
>> Philip
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140728/dec3e1c7/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 1247 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140728/dec3e1c7/attachment-0001.jpg>


More information about the ghc-devs mailing list