[GHC] #4012: Compilation results are not deterministic

GHC ghc-devs at haskell.org
Wed Dec 2 08:56:48 UTC 2015


#4012: Compilation results are not deterministic
-------------------------------------+-------------------------------------
        Reporter:  kili              |                Owner:  niteria
            Type:  bug               |               Status:  patch
        Priority:  high              |            Milestone:  8.0.1
       Component:  Compiler          |              Version:  6.12.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Other             |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #10424            |  Differential Rev(s):  Phab:D910,
                                     |  Phab:D1073, Phab:D1133, Phab:D1192,
                                     |  Phab:D1268, Phab:D1360, Phab:D1373,
                                     |  Phab:D1396, Phab:D1457, Phab:D1468,
       Wiki Page:                    |  Phab:D1487, Phab:D1504, Phab:D1508
-------------------------------------+-------------------------------------

Comment (by niteria):

 @nomeata asked me to post an update, so here it goes.

 > What remains to be done, are the remaining problems solved (and just
 need doing)?

 So far I've only seen a recurring theme of non-deterministic uniques
 affecting the order of constructs in the code, like abstracted variables
 in a lambda. I believe the other problems have been solved before I
 started to work on this.

 This core of the problem is the `Ord` instance for `Unique`. It's used
 for: unique maps, computing SCCs, computing free variables, normalizing by
 sorting and type comparison.

 There are many instances of that, I'm sure I haven't discovered all of
 them yet, but conceptually the problem is rather simple and it's just a
 matter of putting the work.

 There's progress here, right now (I have some local patches) all 60 of the
 packages that I picked from stackage and still build with GHC HEAD rebuild
 deterministically. What I mean by that is that if you remove just the .o
 files, leaving existing .hi files, and start the build again you end up
 with the same result.

 I've recently shifted my focus from external packages, to GHC's tests
 since they offer nice small pieces of code. Here my method is a bit
 different, I run the tests with unique allocation reversed and compare the
 interface files. I've recently gone from 156 differing .hi files down to
 26.

 All of these metrics are flawed in their own way and can fluctuate, so
 take them with a grain of salt.

 > Are there issues without a known solution so far?

 * Performance - so far I've spent a lot of time ensuring the performance
 is within the same ranges as before, forcing me to improve the free
 variable computation for example. In principle I could replace all the
 unique maps with deterministic unique maps and get a big chunk of non-
 determinism eliminated. The problem is that deterministic unique maps have
 a slower union operation and have to have extra memory to keep order. I've
 tried that and the cost was about 10% slower compilation times on `text`
 and `aeson`. So a lot of work is in just not creating regressions.

 * Ensuring that it stays fixed - so far this relies on tests which might
 be brittle + comments on places that I had to fix. It's a bit
 unsatisfying, I'd prefer a solution enforced by the compiler, like
 `Unique` not having an `Ord` instance.

 > The focus is now on reproducible interface files, what would it take to
 have reproducible binaries?

 I haven't looked much into that, but my gut feeling is that it's exactly
 the same problem, but you need to care about the passes after the core as
 well. I suspect you can get determinism today, if you don't do incremental
 builds and your build system builds single threaded. The external factor
 that affects the allocation of uniques is presence/absence of interface
 files.

 > Are there areas of work where you could need some extra help?

 So far my bottleneck has been making my patches mergeable in the sense of
 clean code and performance. Since the fixes stack on top of each other,
 because many files are a manifestation of more than one bug I think the
 easiest way to collaborate would be if I shared my unfinished patches and
 someone else ensured it's still performant enough and well documented.

 I've recently created #11148, which I believe is a bit more than a
 determinism bug, I've tried to look into it, but it was over my head, so
 that might be a more exciting way to pitch in.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4012#comment:135>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list