Making compilation results deterministic (#4012)
Simon Marlow
marlowsd at gmail.com
Thu Sep 17 12:37:55 UTC 2015
On 16/09/2015 12:18, Simon Peyton Jones wrote:
>
> | My understanding is that currently, if you build a Haskell project
> | from clean sources with ghc --make, then wipe all the .o/.hi files,
> | and rebuild again from clean, with all the same flags and environment,
> | you are unlikely to end up with identical binaries for either the .o
> | or .hi files
>
> Is that right Bartosz? If that's the goal, then can we please say that explicitly on the wiki page?
>
> Let's hypothesise that it's the goal. Then I don't understand why, in a single-threaded GHC, you would get non-det results. Presumably not from lazy-interface-file-loading. After all the same things would be loaded in the same order, no?

Bartosz is going to write a wiki page and answer your earlier questions,
but I'll try to go into a bit more detail about this point. We don't
fully understand why we get non-deterministic results from a
single-threaded GHC with a clean build. However, some things change
from run to run and might influence compilation: for example, the
contents of directories on disk can change, and the names of temporary
files will change. We know for sure that having an old .hi file from a
previous compilation causes uniques to change, because GHC reads the .hi
file and assigns some uniques (yet this should clearly not affect the
compilation results). Perhaps we could fix these things, but such fixes
would be fragile, and furthermore we want to handle multithreaded
compilation and --make, both of which make it much harder.

So we concluded that it was probably futile to aim for
fully-deterministic compilation by making the uniques the same every
time, and instead we should try to make compilation deterministic in the
face of non-deterministic uniques. This also turns out to be really
hard, hence Bartosz' long email about the problems and the things he tried.

We don't currently have a good way to reproduce the problem from a
completely clean build. However, it's easy to reproduce by doing two
builds and leaving the .hi files from the first build in place while
removing the .o files. You could also reproduce it easily by
randomizing, in some way, the order in which uniques are generated.
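
As a toy illustration of why unique order matters (this is not GHC
code; Unique, Name, run1, run2 and both functions are hypothetical
stand-ins), the sketch below models two runs that assign different
uniques to the same names. Anything that emits names in unique order
differs between the runs, while ordering by the names themselves does
not:

```haskell
import Data.List (sort, sortOn)

-- Hypothetical stand-ins for illustration; not GHC's real types.
type Unique = Int
type Name   = String

-- The same three names, with uniques assigned in a different order,
-- as might happen when an old .hi file is read first.
run1, run2 :: [(Unique, Name)]
run1 = [(1, "foo"), (2, "bar"), (3, "baz")]
run2 = [(3, "foo"), (1, "bar"), (2, "baz")]

-- Output order follows the unique values, so it varies between runs.
namesByUnique :: [(Unique, Name)] -> [Name]
namesByUnique = map snd . sortOn fst

-- Output order depends only on the names, so it is stable across runs.
namesStable :: [(Unique, Name)] -> [Name]
namesStable = sort . map snd

main :: IO ()
main = do
  print (namesByUnique run1 == namesByUnique run2)  -- False
  print (namesStable run1 == namesStable run2)      -- True
```

Sorting by name is only one possible stable key, chosen here for
brevity; it says nothing about which approach GHC should take.
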
Cheers
Simon