Again: Uniques in GHC
p.k.f.holzenspies at utwente.nl
Thu Oct 9 11:39:10 UTC 2014
Dear Simon, et al,
I've created the wiki-page about the Unique-patch [1].
Should it be linked to from the KeyDataTypes [2]?
Regards,
Philip
[1] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Unique
[2] https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/KeyDataTypes
________________________________
From: Simon Peyton Jones <simonpj at microsoft.com>
Sent: 07 October 2014 23:23
To: Holzenspies, P.K.F. (EWI); carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC
One of the things I'm finding difficult about this Phab stuff is that I get presented with lots of code without enough supporting text saying
* What problem is this patch trying to solve?
* What is the user-visible design (for language features)?
* What are the main ideas in the implementation?
The place we usually put such design documents is on the GHC Trac Wiki. Email is ok for discussion, but the wiki is FAR better for stating clearly the current state of play. Philip, might you make such a page for this unique stuff?
To answer some of your specific questions (please include the answers in the wiki page in some form):
* Uniques are never put in .hi files (as far as I know). They do not survive a single invocation of GHC.
* However, with ghc --make or ghci, uniques do survive for the entire invocation of GHC. For example, in ghc --make, uniques assigned when compiling module A should not clash with those for module B.
* Yes, TyCons and DataCons must have separate uniques. We often form sets of Names, which contain both TyCons and DataCons. Let's not mess with this.
* Having unique-supply-splitting as a pure function is so deeply embedded in GHC that I could not hazard a guess as to how difficult it would be to IO-ify it. Moreover, I would regret doing so because it would force sequentiality where none is needed.
* Template Haskell is a completely independent Haskell library. It does not import GHC. If uniques were in their own package, then TH and GHC could share them. Ditto Hoopl.
* You say that Uniques are serialised as Word32. I'm not sure why they are serialised at all!
* Enforcing determinacy everywhere is a heavy burden. Instead, I suppose that you could run a pass at the end to give everything a more determinate name. TidyPgm does this for the name strings, so it would probably be easy to do so for the uniques too.
Simon
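To illustrate the point above about unique-supply splitting being a pure function, here is a minimal sketch. It is a toy model and not GHC's actual UniqSupply: the names Supply, rootSupply, split, uniqueOf, label, leaves and allDistinct are all hypothetical, and the supply is simply an infinite, lazily built tree of Ints. It shows how two subtrees can be labelled with distinct numbers without threading any state from one to the other.

```haskell
import Data.List (nub)

-- Toy supply: an infinite, lazily built binary tree of Ints.
-- All names are hypothetical; this is not GHC's UniqSupply.
data Supply = Supply Int Supply Supply

-- Number the nodes like a binary heap (root 1, children 2n and 2n+1),
-- so every node of the tree carries a different Int.
rootSupply :: Supply
rootSupply = go 1
  where go n = Supply n (go (2 * n)) (go (2 * n + 1))

-- The unique carried by this supply.
uniqueOf :: Supply -> Int
uniqueOf (Supply n _ _) = n

-- Pure split: two independent supplies, no IO and no state.
split :: Supply -> (Supply, Supply)
split (Supply _ l r) = (l, r)

data Tree a = Leaf a | Node (Tree a) (Tree a)

-- Label every leaf. The two recursive calls do not depend on each
-- other, so no evaluation order is forced between the subtrees.
label :: Supply -> Tree a -> Tree (Int, a)
label s (Leaf x)   = Leaf (uniqueOf s, x)
label s (Node l r) = Node (label sl l) (label sr r)
  where (sl, sr) = split s

leaves :: Tree a -> [a]
leaves (Leaf x)   = [x]
leaves (Node l r) = leaves l ++ leaves r

allDistinct :: Eq a => [a] -> Bool
allDistinct xs = length (nub xs) == length xs
```

For example, allDistinct (map fst (leaves (label rootSupply t))) holds for any small tree t. The heap numbering overflows Int once the tree of splits gets deeper than the word size allows, which is one reason a real implementation cannot be this naive.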
________________________________
From: ghc-devs [ghc-devs-bounces at haskell.org] on behalf of p.k.f.holzenspies at utwente.nl [p.k.f.holzenspies at utwente.nl]
Sent: 07 October 2014 22:03
To: carter.schonwald at gmail.com
Cc: ghc-devs at haskell.org
Subject: RE: Again: Uniques in GHC
Dear Carter, Simon, et al,
(CC'd SPJ on this explicitly, because I *think* he'll be most knowledgeable on some of the constraints that need to be guaranteed for Uniques)
I agree, but a few parameters need to become clear first. To that end, I've created a Phabricator revision that we can use as a basis for discussion:
https://phabricator.haskell.org/D323
Here are my open issues:
- There were ad hoc domains of Uniques being created everywhere in the compiler (i.e. characters chosen to classify the generated Uniques). I have gathered them all up and given them names as constructors in Unique.UniqueDomain. Some of these names are arbitrary, because I don't know precisely what they're for. I generally went for the module name as a starting point. I did, however, make a point of giving different invocations of mkSplitUniqSupply et al. different constructors (e.g. HscMainA through HscMainC). This is to prevent the high potential for conflicts (see comments in uniqueDomainChar). If people who are more knowledgeable about the use of Uniques in these modules (e.g. HscMain, ByteCodeGen, etc.) can say that the uniques coming from these different invocations can never conflict, the number of UniqueDomains can maybe be reduced.
- Some UniqueDomains only have a handful of instances and seem a bit wasteful.
- Uniques were represented by a custom-boxed Int#, but serialised as Word32. Most modern machines see Int# as a 64-bit thing. Aren't we worried about the potential for undetected overlap/conflict there?
- What is the scope in which a Unique must be Unique? I.e. what if independently compiled modules have overlapping Uniques (for different Ids) in their hi-files? Also, do TyCons and DataCons really need to have guaranteed different Uniques? Shouldn't the parser/renamer figure out what goes where and raise errors on domain violations?
- There seem to be related-but-different Unique implementations in Template Haskell and Hoopl. Why is this?
- How critical is it to let mkUnique (and mkSplitUniqSupply) be pure functions? If they can be IO, we could greatly simplify the management of (un)generated Uniques in each UniqueDomain and quite possibly make the move to a threaded GHC easier (for what that's worth). Also, this may help solve the non-determinism issues.
- Missing haddocks, failing lints (lines too long) and a lot of cosmetics will be addressed once the above points have become a bit clearer. I'm more than happy to document a lot of the answers to the above in Unique and/or the commentary.
Regards,
Philip
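The concern about a 64-bit Int# being serialised as Word32 can be made concrete with a small sketch. Everything in it is an assumption for illustration only: the domain names are borrowed from the message above, but the characters assigned to them, the 8-bit-tag-over-24-bit-counter layout, and the functions domainChar, mkUnique, serialise and collides are hypothetical and are not taken from the patch. Int64 is used instead of Int so that the behaviour does not depend on the platform.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (ord)
import Data.Int (Int64)
import Data.Word (Word32)

-- Hypothetical domains; the real list lives in the patch.
data UniqueDomain = HscMainA | HscMainB | ByteCodeGen
  deriving (Eq, Show)

-- Hypothetical tag characters, one per domain.
domainChar :: UniqueDomain -> Char
domainChar HscMainA    = 'a'
domainChar HscMainB    = 'b'
domainChar ByteCodeGen = 'c'

-- Assumed layout: tag character in bits 24..31, counter in bits 0..23.
mkUnique :: UniqueDomain -> Int64 -> Int64
mkUnique d n = (tag `shiftL` 24) .|. (n .&. 0xFFFFFF)
  where tag = fromIntegral (ord (domainChar d))

-- Narrowing to 32 bits silently drops the upper half of the word.
serialise :: Int64 -> Word32
serialise = fromIntegral

-- Two different in-memory values that serialise to the same Word32.
collides :: Int64 -> Int64 -> Bool
collides a b = a /= b && serialise a == serialise b
```

With this layout, values built by mkUnique fit in 32 bits and serialise without loss, but any value that acquires bits above bit 31 by some other route does not: collides u (u + (1 `shiftL` 32)) is True for every u. Separately, the masked counter wraps around, so mkUnique d 0 equals mkUnique d 0x1000000.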
________________________________
From: Carter Schonwald <carter.schonwald at gmail.com>
Sent: 07 October 2014 21:30
To: Holzenspies, P.K.F. (EWI)
Cc: Austin Seipp; ghc-devs at haskell.org
Subject: Re: Again: Uniques in GHC
In some respects, having fully deterministic builds is a very important goal: a lot of tooling (e.g. for caching builds of libraries) works much better if you have that property :)
On Tue, Oct 7, 2014 at 12:45 PM, <p.k.f.holzenspies at utwente.nl> wrote:
________________________________________
From: mad.one at gmail.com on behalf of Austin Seipp <austin at well-typed.com>
So I assume your change would mean 'ghc -j' would not work for 32bit.
I still consider this a big limitation, one which is only due to an
implementation detail. But we need to confirm this will actually fix
any bottlenecks first, before getting to that point.
Yes, that's what I'm saying.
Let me just add that what I'm proposing by no means prohibits or hinders making 32-bit GHC versions parallel later on; it just doesn't solve the problem. It depends on the extent to which the "fully deterministic behaviour" bug is considered a priority (there was something about parts of the hi-files being non-deterministic across different executions of GHC; I don't recall the details).
Anyhow, the work I'm doing now exposes a few things about Uniques that confuse me a little and that could have been bugs (that maybe never acted up). Extended e-mail to follow later on.
Ph.