[GHC] #4012: Compilation results are not deterministic
GHC
ghc-devs at haskell.org
Wed Dec 2 08:56:48 UTC 2015
#4012: Compilation results are not deterministic
-------------------------------------+-------------------------------------
Reporter: kili | Owner: niteria
Type: bug | Status: patch
Priority: high | Milestone: 8.0.1
Component: Compiler | Version: 6.12.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Other | Test Case:
Blocked By: | Blocking:
Related Tickets: #10424 | Differential Rev(s): Phab:D910,
| Phab:D1073, Phab:D1133, Phab:D1192,
| Phab:D1268, Phab:D1360, Phab:D1373,
| Phab:D1396, Phab:D1457, Phab:D1468,
Wiki Page: | Phab:D1487, Phab:D1504, Phab:D1508
-------------------------------------+-------------------------------------
Comment (by niteria):
@nomeata asked me to post an update, so here it goes.
> What remains to be done, are the remaining problems solved (and just
need doing)?
So far I've only seen a recurring theme of non-deterministic uniques
affecting the order of constructs in the code, like abstracted variables
in a lambda. I believe the other problems have been solved before I
started to work on this.
This core of the problem is the `Ord` instance for `Unique`. It's used
for: unique maps, computing SCCs, computing free variables, normalizing by
sorting and type comparison.
There are many instances of that, I'm sure I haven't discovered all of
them yet, but conceptually the problem is rather simple and it's just a
matter of putting the work.
There's progress here, right now (I have some local patches) all 60 of the
packages that I picked from stackage and still build with GHC HEAD rebuild
deterministically. What I mean by that is that if you remove just the .o
files, leaving existing .hi files, and start the build again you end up
with the same result.
I've recently shifted my focus from external packages, to GHC's tests
since they offer nice small pieces of code. Here my method is a bit
different, I run the tests with unique allocation reversed and compare the
interface files. I've recently gone from 156 differing .hi files down to
26.
All of these metrics are flawed in their own way and can fluctuate, so
take them with a grain of salt.
> Are there issues without a known solution so far?
* Performance - so far I've spent a lot of time ensuring the performance
is within the same ranges as before, forcing me to improve the free
variable computation for example. In principle I could replace all the
unique maps with deterministic unique maps and get a big chunk of non-
determinism eliminated. The problem is that deterministic unique maps have
a slower union operation and have to have extra memory to keep order. I've
tried that and the cost was about 10% slower compilation times on `text`
and `aeson`. So a lot of work is in just not creating regressions.
* Ensuring that it stays fixed - so far this relies on tests which might
be brittle + comments on places that I had to fix. It's a bit
unsatisfying, I'd prefer a solution enforced by the compiler, like
`Unique` not having an `Ord` instance.
> The focus is now on reproducible interface files, what would it take to
have reproducible binaries?
I haven't looked much into that, but my gut feeling is that it's exactly
the same problem, but you need to care about the passes after the core as
well. I suspect you can get determinism today, if you don't do incremental
builds and your build system builds single threaded. The external factor
that affects the allocation of uniques is presence/absence of interface
files.
> Are there areas of work where you could need some extra help?
So far my bottleneck has been making my patches mergeable in the sense of
clean code and performance. Since the fixes stack on top of each other,
because many files are a manifestation of more than one bug I think the
easiest way to collaborate would be if I shared my unfinished patches and
someone else ensured it's still performant enough and well documented.
I've recently created #11148, which I believe is a bit more than a
determinism bug, I've tried to look into it, but it was over my head, so
that might be a more exciting way to pitch in.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4012#comment:135>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list