Broken Data.Data instances

Alan & Kim Zimmerman alan.zimm at gmail.com
Mon Jul 28 09:14:23 UTC 2014


I have made a conceptual example of this here http://lpaste.net/108262

Alan


On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
wrote:

> What about creating a specific type with a single constructor for the "not
> relevant to this phase" type to be used instead of () above? That would
> also clearly document what was going on.
>
> Alan
>
>
> On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones <simonpj at microsoft.com
> > wrote:
>
>>   I've had to mangle a bunch of hand-written Data instances and push out
>> patches to a dozen packages that used to be built this way before I
>> convinced the authors to switch to safer versions of Data. Using virtual
>> smart constructors like we do now in containers and Text where needed can
>> be used to preserve internal invariants, etc.
>>
>>
>>
>> If the “hand grenades” are the PostTcTypes, etc, then I can explain why
>> they are there.
>>
>>
>>
>> There simply is no sensible type you can put before the type checker
>> runs.  For example one of the constructors  in HsExpr is
>>
>>   | HsMultiIf   PostTcType [LGRHS id (LHsExpr id)]
>>
>> After type checking we know what type the thing has, but before we have
>> no clue.
>>
>>
>>
>> We could get around this by saying
>>
>>             type PostTcType = Maybe TcType
>>
>> but that would mean that every post-typechecking consumer would need a
>> redundant pattern-match on a Just that would always succeed.
>>
>>
>>
>> It’s nothing deeper than that.  Adding Maybes everywhere would be
>> possible, just clunky.
>>
>>
>>
>>
>>
>> However we now have type functions, and HsExpr is parameterised by an
>> ‘id’ parameter, which changes from RdrName (after parsing) to Name (after
>> renaming) to Id (after typechecking).  So we could do this:
>>
>>   | HsMultiIf   (PostTcType id) [LGRHS id (LHsExpr id)]
>>
>> and define PostTcType as a closed type family thus
>>
>>
>>
>>      type family PostTcType a where
>>
>>           PostTcType Id = TcType
>>
>>           PostTcType other = ()
>>
>>
>>
>> That would be better than filling it with bottoms.  But it might not help
>> with generic programming, because there’d be a component whose type wasn’t
>> fixed.  I have no idea how generics and type functions interact.
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* Edward Kmett [mailto:ekmett at gmail.com]
>> *Sent:* 27 July 2014 18:27
>> *To:* p.k.f.holzenspies at utwente.nl
>> *Cc:* alan.zimm at gmail.com; Simon Peyton Jones; ghc-devs
>>
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> Philip, Alan,
>>
>>
>>
>> If you need a hand, I'm happy to pitch in guidance.
>>
>>
>>
>> I've had to mangle a bunch of hand-written Data instances and push out
>> patches to a dozen packages that used to be built this way before I
>> convinced the authors to switch to safer versions of Data. Using virtual
>> smart constructors like we do now in containers and Text where needed can
>> be used to preserve internal invariants, etc.
>>
>>
>>
>> This works far better for users of the API than just randomly throwing
>> them a live hand grenade. As I recall, these little grenades in generic
>> programming over the GHC API have been a constant source of pain for
>> libraries like haddock.
>>
>>
>>
>> Simon,
>>
>>
>>
>> It seems to me that regarding circular data structures, nothing prevents
>> you from walking a circular data structure with Data.Data. You can generate
>> a new one productively that looks just like the old with the contents
>> swapped out, it is indistinguishable to an observer if the fixed point is
>> lost, and a clever observer can use observable sharing to get it back,
>> supposing that they are allowed to try.
>>
>>
>>
>> Alternately, we could use the 'virtual constructor' trick there to break
>> the cycle and reintroduce it, but I'm less enthusiastic about that idea,
>> even if it is simpler in many ways.
>>
>>
>>
>> -Edward
>>
>>
>>
>> On Sun, Jul 27, 2014 at 10:17 AM, <p.k.f.holzenspies at utwente.nl> wrote:
>>
>>   Alan,
>>
>> In that case, let's have a short feedback-loop between the two of us. It
>> seems many of these files (Name.lhs, for example) are really stable through
>> the repo-history. It would be nice to have one bigger refactoring all in
>> one go (some of the code could use a polish, a lot of code seems removable).
>>
>> Regards,
>> Philip
>>   ------------------------------
>>
>> *Van:* Alan & Kim Zimmerman [alan.zimm at gmail.com]
>> *Verzonden:* vrijdag 25 juli 2014 13:44
>> *Aan:* Simon Peyton Jones
>> *CC:* Holzenspies, P.K.F. (EWI); ghc-devs at haskell.org
>> *Onderwerp:* Re: Broken Data.Data instances
>>
>> By the way, I would be happy to attempt this task, if the concept is
>> viable.
>>
>>
>>
>> On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman <
>> alan.zimm at gmail.com> wrote:
>>
>>     While we are talking about fixing traversals, how about getting rid
>> of the phase specific panic initialisers for placeHolderType,
>> placeHolderKind and friends?
>>
>> In order to safely traverse with SYB, the following needs to be inserted
>> into all the SYB schemes (see
>>
>> https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/GhcUtils.hs
>> )
>>
>> -- Check the Typeable items
>> checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
>> checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ`
>> fixity `SYB.extQ` nameSet) x
>>   where nameSet     = const (stage `elem` [SYB.Parser,SYB.TypeChecker])
>> :: GHC.NameSet       -> Bool
>>         postTcType  = const (stage < SYB.TypeChecker                  )
>> :: GHC.PostTcType    -> Bool
>>         fixity      = const (stage < SYB.Renamer                      )
>> :: GHC.Fixity        -> Bool
>>
>> And in addition HsCmdTop and ParStmtBlock are initialised with explicit
>> 'undefined values.
>>
>> Perhaps use an initialiser that can have its panic turned off when called
>> via the GHC API?
>>
>> Regards
>>
>>   Alan
>>
>>
>>
>>
>>
>> On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones <
>> simonpj at microsoft.com> wrote:
>>
>>    So... does anyone object to me changing these "broken" instances with
>> the ones given by DeriveDataTypeable?
>>
>> That’s fine with me provided (a) the default behaviour is not immediate
>> divergence (which it might well be), and (b) the pitfalls are documented.
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies at utwente.nl]
>> *Sent:* 24 July 2014 18:42
>> *To:* Simon Peyton Jones
>> *Cc:* ghc-devs at haskell.org
>> *Subject:* Re: Broken Data.Data instances
>>
>>
>>
>> Dear Simon, et al,
>>
>> These are very good points to make for people writing such traversals and
>> queries. I would be more than happy to write a page on the pitfalls etc. on
>> the wiki, but in my experience so far, exploring the innards of GHC is
>> tremendously helped by trying small things out and showing (bits of) the
>> intermediate structures. For me, personally, this has always been hindered
>> by the absence of good instances of Data and/or Show (not having to bring
>> DynFlags and not just visualising with the pretty printer are very helpful).
>>
>> So... does anyone object to me changing these "broken" instances with the
>> ones given by DeriveDataTypeable?
>>
>> Also, many of these internal data structures could be provided with
>> useful lenses to improve such traversals further. Anyone ever go at that?
>> Would be people be interested?
>>
>> Regards,
>> Philip
>>
>>      *Simon Peyton Jones* <simonpj at microsoft.com>
>>
>> 24 Jul 2014 18:22
>>
>> GHC’s data structures are often mutually recursive. e.g.
>>
>> ·        The TyCon for Maybe contains the DataCon for Just
>>
>> ·        The DataCon For just contains Just’s type
>>
>> ·        Just’s type contains the TyCon for Maybe
>>
>>
>>
>> So any attempt to recursively walk over all these structures, as you
>> would a tree, will fail.
>>
>>
>>
>> Also there’s a lot of sharing.  For example, every occurrence of ‘map’ is
>> a Var, and inside that Var is map’s type, its strictness, its rewrite RULE,
>> etc etc.  In walking over a term you may not want to walk over all that
>> stuff at every occurrence of map.
>>
>>
>>
>> Maybe that’s it; I’m not certain since I did not write the Data instances
>> for any of GHC’s types
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org
>> <ghc-devs-bounces at haskell.org>] *On Behalf Of *
>> p.k.f.holzenspies at utwente.nl
>> *Sent:* 24 July 2014 16:42
>> *To:* ghc-devs at haskell.org
>> *Subject:* Broken Data.Data instances
>>
>>
>>
>> Dear GHC-ers,
>>
>>
>>
>> Is there a reason for explicitly broken Data.Data instances? Case in
>> point:
>>
>>
>>
>> > instance Data Var where
>>
>> >   -- don't traverse?
>>
>> >   toConstr _   = abstractConstr "Var"
>>
>> >   gunfold _ _  = error "gunfold"
>>
>> >   dataTypeOf _ = mkNoRepType "Var"
>>
>>
>>
>> I understand (vaguely) arguments about abstract data types, but this also
>> excludes convenient queries that can, e.g. extract all types from a
>> CoreExpr. I had hoped to do stuff like this:
>>
>>
>>
>> > collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
>>
>> > collect = everything mplus $ mkQ mzero return
>>
>> >
>>
>> > allTypes :: CoreExpr -> [Type]
>>
>> > allTypes = collect
>>
>>
>>
>> Especially when still exploring (parts of) the GHC API, being able to
>> extract things in this fashion is very helpful. SYB’s “everything” being
>> broken by these instances, not so much.
>>
>>
>>
>> Would a patch “fixing” these instances be acceptable?
>>
>>
>>
>> Regards,
>>
>> Philip
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140728/68053901/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 1247 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140728/68053901/attachment-0001.jpg>


More information about the ghc-devs mailing list