Template Haskell determinism

Michael Sloan mgsloan at gmail.com
Wed Jun 29 17:41:13 UTC 2016


No, NameU and NameL both lack package key / package id.

-Michael

On Wed, Jun 29, 2016 at 7:34 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
> No, nameBase is not the right thing to use here; you also need the
> unit ID (in GHC 8.0 parlance; package key in GHC 7.10; package id
> in GHC 7.8 and before).  If you have that information, then
> GHC establishes an invariant that if two names compare stably equal,
> then the uniques associated with them are the same.
>
> Edward
>
> Excerpts from Michael Sloan's message of 2016-06-10 17:16:44 -0400:
>> Hey, sorry for not getting back to this sooner!
>>
>> Perhaps I should have added the following to my list of goals in contention:
>>
>> (3) (==) shouldn't yield True for Names that have different unique ids.
>>
>> We can only have stable comparisons if goal (3) isn't met, and two
>> different unique Names would be considered to be equivalent based on the
>> nameBase.  This is because Ord is a total order, not a partial order.  As
>> described in my prior email, PartialOrd could be added, but it'd be
>> inconvenient to use with existing Ord based containers.
>>
>> -Michael
>>
>> On Sun, Jun 5, 2016 at 10:15 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
>>
>> > I must admit, I am a bit confused by this discussion.
>> >
>> > It is true that every Name is associated with a Unique.  But you don't
>> > need the Unique to equality/ordering tests; the names also contain
>> > enough (stable) information for stable comparisons of that sort.  So
>> > why don't we expose that instead of the Unique?
>> >
>> > Edward
>> >
>> > Excerpts from Michael Sloan's message of 2016-06-04 18:44:03 -0700:
>> > > On Thu, Jun 2, 2016 at 4:12 AM, Simon Peyton Jones <
>> > simonpj at microsoft.com>
>> > > wrote:
>> > >
>> > > > If names get different ordering keys when reified from different
>> > modules
>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>> > end up
>> > > > with an unpleasant circumstance where these do not compare as equal
>> > > >
>> > > >
>> > > >
>> > > > The I believe that global, top level names (NameG) are not subject to
>> > this
>> > > > ordering stuff, so I don’t think this problem can occur.
>> > > >
>> > >
>> > > True, top level names are NameG.  The reified Info for a top level Dec
>> > may
>> > > include NameU, though.  For example, the type variables in 'Maybe' are
>> > > NameU:
>> > >
>> > > $(do TyConI (DataD _ _ [KindedTV (Name _ nf) _] _ _ _) <- reify ''Maybe
>> > >      lift (show nf))
>> > >
>> > > The resulting expression is something like "NameU 822083586"
>> > >
>> > > > This is a breaking change and it doesn't fix the problem that
>> > NameFlavour
>> > > > is
>> > > >
>> > > > not abstract and leaks the Uniques. It would break at least:
>> > > >
>> > > >
>> > > >
>> > > > But why is NameU exposed to clients?   GHC needs to know, but clients
>> > > > don’t.  What use are these packages making of it?
>> > > >
>> > >
>> > > It's being leaked in the public inteface via Ord.  The Eq instance is
>> > fine,
>> > > because these are Uniques, so the results should be consistent.
>> > >
>> > > There are two goals in contention here:
>> > >
>> > > 1) Having some ordering on Names so that they can be used in Map or Set
>> > > 2) Having law-abiding Eq / Ord instances.  We'd need a 'PartialOrd' to
>> > > really handle these well.  In that case, the ordering would be based on
>> > > everything but the NameU int, but 'Eq' would still follow it
>> > >
>> > > A few ideas for different approaches to resolving this:
>> > >
>> > > 1) Document it.  Less appealing than fixing it in the API, but still
>> > would
>> > > be good.
>> > >
>> > > 2) Remove the 'Ord' instance, and force the user to pick 'NamePartialOrd'
>> > > newtype (partial ord on the non-unique info), or 'UnstableNameOrd'
>> > newtype
>> > > (current behavior).  A trickyness of this approach is that you'd need
>> > > containers that can handle (PartialOrd k, Eq k) keys.  In lots of cases
>> > > people are using the 'Ord' instance with 'Name's that are not 'NameU', so
>> > > this would break a lot of code that was already deterministic.
>> > >
>> > > 3) Some approaches like this ordering key, but I'm not sure how it will
>> > > help when comparing NameUs from different modules?
>> > >
>> > > > S
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > *From:* ghc-devs [mailto:ghc-devs-bounces at haskell.org] *On Behalf Of
>> > *Michael
>> > > > Sloan
>> > > > *Sent:* 02 June 2016 02:07
>> > > > *To:* Bartosz Nitka <niteria at gmail.com>
>> > > > *Cc:* ghc-devs Devs <ghc-devs at haskell.org>
>> > > > *Subject:* Re: Template Haskell determinism
>> > > >
>> > > >
>> > > >
>> > > > +1 to solving this.  Not sure about the approach, but assuming the
>> > > > following concerns are addressed, I'm (+1) on it too:
>> > > >
>> > > >
>> > > >
>> > > > This solution is clever!  However, I think there is some difficulty to
>> > > > determining this ordering key.  Namely, what happens when I construct
>> > the
>> > > > (Set Name) using results from multiple reifies?
>> > > >
>> > > >
>> > > >
>> > > > One solution is to have the ordering key be a consecutive supply that's
>> > > > initialized on a per-module basis.  There is still an issue there,
>> > though,
>> > > > which is that you might store one of these names in a global IORef
>> > that's
>> > > > used by a later TH splice.  Or, similarly, serialize the names to a
>> > file
>> > > > and later load them.  At least in those cases you need to use 'runIO'
>> > to
>> > > > break determinism.
>> > > >
>> > > >
>> > > >
>> > > > If names get different ordering keys when reified from different
>> > modules
>> > > > (seems like they'd have to, particularly given ghc's "-j"), then we
>> > end up
>> > > > with an unpleasant circumstance where these do not compare as equal.
>> > How
>> > > > about having the Eq instance ignore the ordering key?  I think that
>> > mostly
>> > > > resolves this concern.  This implies that the Ord instance should also
>> > > > yield EQ and ignore the ordering key, when the unique key matches.
>> > > >
>> > > >
>> > > >
>> > > > One issue with this is that switching the order of reify could
>> > > > unexpectedly vary the behavior.
>> > > >
>> > > >
>> > > >
>> > > > Does the map in TcGblEnv imply that a reify from a later module will
>> > get
>> > > > the same ordering key?  So does this mean that the keys used in a given
>> > > > reify depend on which things have already been reified?  In that case,
>> > then
>> > > > this is also an issue with your solution.  Now, it's not a big problem
>> > at
>> > > > all, just surprising to the user.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > If the internal API for Name does change, may as well address
>> > > > https://ghc.haskell.org/trac/ghc/ticket/10311 too.  I agree with SPJ's
>> > > > suggested solution of having both the traditional package identifier
>> > and
>> > > > package keys in 'Name'.
>> > > >
>> > > >
>> > > >
>> > > > -Michael
>> > > >
>> > > >
>> > > >
>> > > > On Tue, May 31, 2016 at 6:54 AM, Bartosz Nitka <niteria at gmail.com>
>> > wrote:
>> > > >
>> > > > Template Haskell with its ability to do arbitrary IO is
>> > non-deterministic
>> > > > by
>> > > >
>> > > > design. You could for example embed the current date in a file. There
>> > is
>> > > >
>> > > > however one kind of non-deterministic behavior that you can trigger
>> > > >
>> > > > accidentally. It has to do with how Names are reified. If you take a
>> > look
>> > > > at
>> > > >
>> > > > the definition of reifyName you can see that it puts the assigned
>> > Unique
>> > > > in a
>> > > >
>> > > > NameU:
>> > > >
>> > > >
>> > > >
>> > > >   reifyName :: NamedThing n => n -> TH.Name
>> > > >
>> > > >   reifyName thing
>> > > >
>> > > >     | isExternalName name = mk_varg pkg_str mod_str occ_str
>> > > >
>> > > >     | otherwise           = TH.mkNameU occ_str (getKey (getUnique
>> > name))
>> > > >
>> > > >     ...
>> > > >
>> > > > NameFlavour which NameU is a constructor of has a default Ord instance,
>> > > > meaning
>> > > >
>> > > > that it ends up comparing the Uniques. The relative ordering of
>> > Uniques is
>> > > > not
>> > > >
>> > > > guaranteed to be stable across recompilations [1], so this can lead to
>> > > >
>> > > > ABI-incompatible binaries.
>> > > >
>> > > >
>> > > >
>> > > > This isn't an abstract problem and it actually happens in practice. The
>> > > >
>> > > > microlens package keeps Names in a Set and later turns that set into a
>> > > > list.
>> > > >
>> > > > The results have different orders of TyVars resulting in different ABI
>> > > > hashes
>> > > >
>> > > > and can potentially be optimized differently.
>> > > >
>> > > >
>> > > >
>> > > > I believe it's worth to handle this case in a deterministic way and I
>> > have
>> > > > a
>> > > >
>> > > > solution in mind. The idea is to extend NameU (and potentially NameL)
>> > with
>> > > > an
>> > > >
>> > > > ordering key. To be more concrete:
>> > > >
>> > > >
>> > > >
>> > > > -   | NameU !Int
>> > > >
>> > > > +   | NameU !Int !Int
>> > > >
>> > > >
>> > > >
>> > > > This way the Ord instance can use a stable key and the problem reduces
>> > to
>> > > >
>> > > > ensuring the keys are stable. To generate stable keys we can use the
>> > fact
>> > > > that
>> > > >
>> > > > reify traverses the expressions in the same order every time and
>> > > > sequentially
>> > > >
>> > > > allocate new keys based on traversal order. The way I have it
>> > implemented
>> > > > now
>> > > >
>> > > > is to add a new field in TcGblEnv which maps Uniques to allocated keys:
>> > > >
>> > > >
>> > > >
>> > > > +        tcg_th_names :: TcRef (UniqFM Int, Int),
>> > > >
>> > > >
>> > > >
>> > > > Then the reifyName and qNewName do the necessary bookkeeping and
>> > translate
>> > > > the
>> > > >
>> > > > Uniques on the fly.
>> > > >
>> > > >
>> > > >
>> > > > This is a breaking change and it doesn't fix the problem that
>> > NameFlavour
>> > > > is
>> > > >
>> > > > not abstract and leaks the Uniques. It would break at least:
>> > > >
>> > > >
>> > > >
>> > > > - singletons
>> > > >
>> > > > - th-lift
>> > > >
>> > > > - haskell-src-meta
>> > > >
>> > > > - shakespeare
>> > > >
>> > > > - distributed-closure
>> > > >
>> > > >
>> > > >
>> > > > I'd like to get feedback if this is an acceptable solution and if the
>> > > > problem
>> > > >
>> > > > is worth solving.
>> > > >
>> > > >
>> > > >
>> > > > Cheers,
>> > > >
>> > > > Bartosz
>> > > >
>> > > >
>> > > >
>> > > > [1]
>> > > >
>> > https://ghc.haskell.org/trac/ghc/wiki/DeterministicBuilds#NondeterministicUniques
>> > > >
>> > > >
>> > > > _______________________________________________
>> > > > ghc-devs mailing list
>> > > > ghc-devs at haskell.org
>> > > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>> > > > <
>> > https://na01.safelinks.protection.outlook.com/?url=http%3a%2f%2fmail.haskell.org%2fcgi-bin%2fmailman%2flistinfo%2fghc-devs&data=01%7c01%7csimonpj%40064d.mgd.microsoft.com%7c1a4a84c9341546403e1508d38a8246ee%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=mjEDuk%2fuRsDLg0q63zaIBeh5e2IyfKnKjcEcRLDvERE%3d
>> > >
>> > > >
>> > > >
>> > > >
>> >


More information about the ghc-devs mailing list