Template Haskell determinism

Edward Z. Yang ezyang at mit.edu
Sat Jul 2 00:16:45 UTC 2016


Oh drat, that's right, local names don't get given a package
key / package id, and externally visible local names aren't given a
deterministic name until we tidy (which is too late to help
Template Haskell.)  So I suppose there is not much we can do here.

Edward

Excerpts from Michael Sloan's message of 2016-06-29 13:41:13 -0400:
> No, NameU and NameL both lack package key / package id.
> 
> -Michael
> 
> On Wed, Jun 29, 2016 at 7:34 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
> > No, nameBase is not the right thing to use here; you also need the
> > unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
> > in GHC 7.8 and before).  If you have that information, then
> > GHC establishes an invariant that if two names compare stably equal,
> > then the uniques associated with them are the same.
> >
> > Edward
> >
> > Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:
> >> Hey, sorry for not getting back to this sooner!
> >>
> >> Perhaps I should have added the following to my list of goals in contention:
> >>
> >> (3) (==) shouldn't yield True for Names that have different unique ids.
> >>
> >> We can only have stable comparisons if goal (3) isn't met, and two
> >> different unique Names would be considered to be equivalent based on the
> >> nameBase.  This is because Ord is a total order, not a partial order.  As
> >> described in my prior email, PartialOrd could be added, but it'd be
> >> inconvenient to use with existing Ord based containers.
> >>
> >> -Michael
> >>
> >> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
> >>
> >> > I must admit, I am a bit confused by this discussion.
> >> >
> >> > It is true that every Name is associated with a Unique.  But you don't
> >> > need the Unique to equality/ordering tests; the names also contain
> >> > enough (stable) information for stable comparisons of that sort.  So
> >> > why don't we expose that instead of the Unique?
> >> >
> >> > Edward
> >> >
> >> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
> >> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
> >> > simonpj at microsoft.com>
> >> > > wrote:
> >> > >
> >> > > > If names get different ordering keys when reified from different
> >> > modules
> >> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> >> > end up
> >> > > > with an unpleasant circumstance where these do not compare as equal
> >> > > >
> >> > > >
> >> > > >
> >> > > > The I believe that global, top level names (NameG) are not subject to
> >> > this
> >> > > > ordering stuff, so I don’t think this problem can occur.
> >> > > >
> >> > >
> >> > > True, top level names are NameG.  The reified Info for a top level Dec
> >> > may
> >> > > include NameU, though.  For example, the type variables in 'Maybe' are
> >> > > NameU:
> >> > >
> >> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
> >> > >      lift (show nf))
> >> > >
> >> > > The resulting expression is something like "NameU 822083586"
> >> > >
> >> > > > This is a breaking change and it doesn't fix the problem that
> >> > NameFlavour
> >> > > > is
> >> > > >
> >> > > > not abstract and leaks the Uniques. It would break at least:
> >> > > >
> >> > > >
> >> > > >
> >> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
> >> > > > don’t.  What use are these packages making of it?
> >> > > >
> >> > >
> >> > > It's being leaked in the public inteface via Ord.  The Eq instance is
> >> > fine,
> >> > > because these are Uniques, so the results should be consistent.
> >> > >
> >> > > There are two goals in contention here:
> >> > >
> >> > > 1) Having some ordering on Names so that they can be used in Map or Set
> >> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
> >> > > really handle these well.  In that case, the ordering would be based on
> >> > > everything but the NameU int, but 'Eq' would still follow it
> >> > >
> >> > > A few ideas for different approaches to resolving this:
> >> > >
> >> > > 1) Document it.  Less appealing than fixing it in the API, but still
> >> > would
> >> > > be good.
> >> > >
> >> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
> >> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
> >> > newtype
> >> > > (current behavior).  A trickyness of this approach is that you'd need
> >> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
> >> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
> >> > > this would break a lot of code that was already deterministic.
> >> > >
> >> > > 3) Some approaches like this ordering key, but I'm not sure how it will
> >> > > help when comparing NameUs from different modules?
> >> > >
> >> > > > S
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of
> >> > *Michael
> >> > > > Sloan
> >> > > > *Sent:* 02 June 2016 02:07
> >> > > > *To:* Bartosz Nitka <niteria at gmail.com>
> >> > > > *Cc:* ghc-devs Devs <ghc-devs at haskell.org>
> >> > > > *Subject:* Re: Template Haskell determinism
> >> > > >
> >> > > >
> >> > > >
> >> > > > +1 to solving this.  Not sure about the approach, but assuming the
> >> > > > following concerns are addressed, I'm (+1) on it too:
> >> > > >
> >> > > >
> >> > > >
> >> > > > This solution is clever!  However, I think there is some difficulty to
> >> > > > determining this ordering key.  Namely, what happens when I construct
> >> > the
> >> > > > (Set Name) using results from multiple reifies?
> >> > > >
> >> > > >
> >> > > >
> >> > > > One solution is to have the ordering key be a consecutive supply that's
> >> > > > initialized on a per-module basis.  There is still an issue there,
> >> > though,
> >> > > > which is that you might store one of these names in a global IORef
> >> > that's
> >> > > > used by a later TH splice.  Or, similarly, serialize the names to a
> >> > file
> >> > > > and later load them.  At least in those cases you need to use 'runIO'
> >> > to
> >> > > > break determinism.
> >> > > >
> >> > > >
> >> > > >
> >> > > > If names get different ordering keys when reified from different
> >> > modules
> >> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
> >> > end up
> >> > > > with an unpleasant circumstance where these do not compare as equal.
> >> > How
> >> > > > about having the Eq instance ignore the ordering key?  I think that
> >> > mostly
> >> > > > resolves this concern.  This implies that the Ord instance should also
> >> > > > yield EQ and ignore the ordering key, when the unique key matches.
> >> > > >
> >> > > >
> >> > > >
> >> > > > One issue with this is that switching the order of reify could
> >> > > > unexpectedly vary the behavior.
> >> > > >
> >> > > >
> >> > > >
> >> > > > Does the map in TcGblEnv imply that a reify from a later module will
> >> > get
> >> > > > the same ordering key?  So does this mean that the keys used in a given
> >> > > > reify depend on which things have already been reified?  In that case,
> >> > then
> >> > > > this is also an issue with your solution.  Now, it's not a big problem
> >> > at
> >> > > > all, just surprising to the user.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > If the internal API for Name does change, may as well address
> >> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
> >> > > > suggested solution of having both the traditional package identifier
> >> > and
> >> > > > package keys in 'Name'.
> >> > > >
> >> > > >
> >> > > >
> >> > > > -Michael
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <niteria at gmail.com>
> >> > wrote:
> >> > > >
> >> > > > Template Haskell with its ability to do arbitrary IO is
> >> > non-deterministic
> >> > > > by
> >> > > >
> >> > > > design. You could for example embed the current date in a file. There
> >> > is
> >> > > >
> >> > > > however one kind of non-deterministic behavior that you can trigger
> >> > > >
> >> > > > accidentally. It has to do with how Names are reified. If you take a
> >> > look
> >> > > > at
> >> > > >
> >> > > > the definition of reifyName you can see that it puts the assigned
> >> > Unique
> >> > > > in a
> >> > > >
> >> > > > NameU:
> >> > > >
> >> > > >
> >> > > >
> >> > > >   reifyName :: NamedThing n => n -> TH.Name
> >> > > >
> >> > > >   reifyName thing
> >> > > >
> >> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
> >> > > >
> >> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
> >> > name))
> >> > > >
> >> > > >     ...
> >> > > >
> >> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
> >> > > > meaning
> >> > > >
> >> > > > that it ends up comparing the Uniques. The relative ordering of
> >> > Uniques is
> >> > > > not
> >> > > >
> >> > > > guaranteed to be stable across recompilations [1], so this can lead to
> >> > > >
> >> > > > ABI-incompatible binaries.
> >> > > >
> >> > > >
> >> > > >
> >> > > > This isn't an abstract problem and it actually happens in practice. The
> >> > > >
> >> > > > microlens package keeps Names in a Set and later turns that set into a
> >> > > > list.
> >> > > >
> >> > > > The results have different orders of TyVars resulting in different ABI
> >> > > > hashes
> >> > > >
> >> > > > and can potentially be optimized differently.
> >> > > >
> >> > > >
> >> > > >
> >> > > > I believe it's worth to handle this case in a deterministic way and I
> >> > have
> >> > > > a
> >> > > >
> >> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
> >> > with
> >> > > > an
> >> > > >
> >> > > > ordering key. To be more concrete:
> >> > > >
> >> > > >
> >> > > >
> >> > > > -   | NameU !Int
> >> > > >
> >> > > > +   | NameU !Int !Int
> >> > > >
> >> > > >
> >> > > >
> >> > > > This way the Ord instance can use a stable key and the problem reduces
> >> > to
> >> > > >
> >> > > > ensuring the keys are stable. To generate stable keys we can use the
> >> > fact
> >> > > > that
> >> > > >
> >> > > > reify traverses the expressions in the same order every time and
> >> > > > sequentially
> >> > > >
> >> > > > allocate new keys based on traversal order. The way I have it
> >> > implemented
> >> > > > now
> >> > > >
> >> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
> >> > > >
> >> > > >
> >> > > >
> >> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
> >> > > >
> >> > > >
> >> > > >
> >> > > > Then the reifyName and qNewName do the necessary bookkeeping and
> >> > translate
> >> > > > the
> >> > > >
> >> > > > Uniques on the fly.
> >> > > >
> >> > > >
> >> > > >
> >> > > > This is a breaking change and it doesn't fix the problem that
> >> > NameFlavour
> >> > > > is
> >> > > >
> >> > > > not abstract and leaks the Uniques. It would break at least:
> >> > > >
> >> > > >
> >> > > >
> >> > > > - singletons
> >> > > >
> >> > > > - th-lift
> >> > > >
> >> > > > - haskell-src-meta
> >> > > >
> >> > > > - shakespeare
> >> > > >
> >> > > > - distributed-closure
> >> > > >
> >> > > >
> >> > > >
> >> > > > I'd like to get feedback if this is an acceptable solution and if the
> >> > > > problem
> >> > > >
> >> > > > is worth solving.
> >> > > >
> >> > > >
> >> > > >
> >> > > > Cheers,
> >> > > >
> >> > > > Bartosz
> >> > > >
> >> > > >
> >> > > >
> >> > > > [1]
> >> > > >
> >> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
> >> > > >
> >> > > >
> >> > > > _______________________________________________
> >> > > > ghc-devs mailing list
> >> > > > ghc-devs at haskell.org
> >> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >> > > > <
> >> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
> >> > >
> >> > > >
> >> > > >
> >> > > >
> >> >


More information about the ghc-devs mailing list