Making compilation results deterministic (#4012)

Simon Marlow marlowsd at gmail.com
Thu Sep 17 12:37:55 UTC 2015


On 16/09/2015 12:18, Simon Peyton Jones wrote:
>
> |  My understanding is that currently, if you build a Haskell project
> |  from clean sources with ghc --make, then wipe all the .o/.hi files,
> |  and rebuild again from clean, with all the same flags and environment,
> |  you are unlikely to end up with identical binaries for either the .o
> |  or .hi files
>
> Is that right Bartosz?  If that's the goal, then can we please say that explicitly on the wiki page?
>
> Let's hypothesise that it's the goal. Then I don't understand why, in a single-threaded GHC, you would get non-det results.  Presumably not from lazy-interface-file-loading.  After all the same things would be loaded in the same order, no?

Bartosz is going to write a wiki page and answer your earlier questions,
but I'll try to go into a bit more detail about this point.  We don't
fully understand why we get non-deterministic results from a
single-threaded GHC with a clean build.  However, there are things that
change from run to run and might influence compilation: for example, the
contents of directories on disk can change, and the names of temporary
files will change.  We know for sure that having an old .hi file from a
previous compilation causes uniques to change, because GHC reads the .hi
file and assigns some uniques (although this clearly should not affect
the compilation results).  Perhaps we could fix these things, but that
approach is fragile, and furthermore we want to handle multithreaded
compilation and --make, both of which make it much harder.

So we concluded that it was probably futile to aim for 
fully-deterministic compilation by making the uniques the same every 
time, and instead we should try to make compilation deterministic in the 
face of non-deterministic uniques.  This also turns out to be really 
hard, hence Bartosz' long email about the problems and the things he tried.
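
To make the distinction concrete, here is a minimal sketch of what
"deterministic in the face of non-deterministic uniques" means.  The
types and function names are hypothetical stand-ins, not GHC's own, and
sorting by name is just one illustrative choice of stable key:

```haskell
import Data.List (sort, sortOn)

-- Hypothetical stand-ins for illustration; these are not GHC's types.
type Unique = Int
type Name   = String

-- Simulates traversing an environment keyed by unique:
-- the output order follows whatever uniques were assigned.
byUnique :: [(Unique, Name)] -> [Name]
byUnique = map snd . sortOn fst

-- Orders by a key that is stable across runs (here, the name itself),
-- so the uniques no longer influence the output.
byName :: [(Unique, Name)] -> [Name]
byName = sort . map snd

-- Two runs that assign different uniques to the same three names.
run1, run2 :: [(Unique, Name)]
run1 = [(1, "foo"), (2, "bar"), (3, "baz")]
run2 = [(7, "baz"), (8, "foo"), (9, "bar")]

main :: IO ()
main = do
  print (byUnique run1 == byUnique run2)  -- False: order leaks the uniques
  print (byName run1 == byName run2)      -- True: order is independent of them
```

The first traversal is the kind of place where unique order leaks into
the output; the second stays the same however the uniques are assigned.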

We don't currently have a good way to reproduce the problem from a
completely clean build.  However, it's easy to reproduce by doing two
builds and leaving the .hi files from the first build in place while
removing the .o files.  You could also reproduce it easily by
randomizing the order in which uniques are generated in some way.
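
As a rough illustration of that last idea, the sketch below runs a toy
"pass" under two different unique supplies and compares the outputs.
Everything here is an assumption made for the example: the names are
hypothetical, the supply is a plain list rather than GHC's unique
supply, and a reversed supply stands in for a randomized one so that
the example stays repeatable:

```haskell
import Data.List (sort, sortOn)

type Unique = Int

-- Hand out uniques to names from a supply (hypothetical helper).
assign :: [Unique] -> [String] -> [(Unique, String)]
assign supply names = zip supply names

-- A toy pass whose output order depends on the uniques.
leaky :: [(Unique, String)] -> [String]
leaky = map snd . sortOn fst

-- A toy pass whose output order ignores the uniques.
stable :: [(Unique, String)] -> [String]
stable = sort . map snd

-- Run a pass under an ascending and a descending supply and
-- report whether both runs produce the same output.
deterministicUnder :: ([(Unique, String)] -> [String]) -> [String] -> Bool
deterministicUnder pass names =
  pass (assign [1 ..] names) == pass (assign [n, n - 1 ..] names)
  where
    n = length names

main :: IO ()
main = do
  print (deterministicUnder leaky  ["a", "b", "c"])  -- False
  print (deterministicUnder stable ["a", "b", "c"])  -- True
```

A check of this shape only shows that a pass survives one particular
perturbation of the uniques; it does not prove the pass is deterministic
in general.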

Cheers
Simon

