Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

Simon Marlow marlowsd at gmail.com
Tue Apr 10 13:03:19 CEST 2012


On 02/04/2012 07:37, Mikhail Glushenkov wrote:
> Hi all,
>
> [Hoping it's not too late.]
>
> During my work on parallelising 'ghc --make' [1] I encountered a
> stumbling block: running 'ghc --make' can often be much faster than
> using separate compile ('ghc -c') and link stages, which means that
> any parallel build tool built on top of 'ghc -c' will be significantly
> handicapped [2]. As far as I understand, this is mainly due to the
> effects of interface file caching - 'ghc --make' only needs to parse
> and load them once. One potential improvement (suggested by Duncan
> Coutts [3]) is to produce whole-package interface files and load them
> in using mmap().
>
> Questions:
>
> Would implementing this optimisation be a worthwhile/realistic GSoC project?
> What are other potential ways to bring 'ghc -c' performance up to par
> with 'ghc --make'?

My guess is that this won't have a significant impact on ghc -c compile 
times.

The advantage of squashing the .hi files for a package together is that 
they could share a string table, which would save a bit of space and 
time, but I think the time saved is small compared to the cost of 
deserialising and typechecking the declarations from the interface, 
which still has to be done.  In fact it might make things worse, if the 
string table for the whole base package is larger than the individual 
tables that would be read from .hi files.  I don't think mmap() will buy 
very much over the current scheme of just reading the file into a ByteArray.
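
To make the string-table trade-off concrete, here is a toy sketch in
Python. It is not GHC's actual interface format: the module contents
below are invented, and "table size" is simply the total length of the
distinct strings. It only illustrates why one shared table can be
smaller overall yet larger than what a single compile reads today.

```python
# Toy model: each "interface" is the set of identifier strings it mentions.
# Module contents are invented for illustration only.
interfaces = {
    "Data.List":  {"map", "filter", "foldr", "Bool", "Int"},
    "Data.Maybe": {"Maybe", "fromMaybe", "Bool", "map"},
    "Data.Char":  {"Char", "isDigit", "Bool", "Int"},
}

def table_bytes(strings):
    """Size of a string table: total length of its distinct strings."""
    return sum(len(s) for s in set(strings))

# Separate .hi files: every file carries its own table, so shared
# strings such as "Bool" are stored once per file.
per_module = {m: table_bytes(s) for m, s in interfaces.items()}
separate_total = sum(per_module.values())

# One package-wide file: a single table holding each string once.
shared_total = table_bytes(set().union(*interfaces.values()))

# A compile that imports only Data.Maybe reads just that module's table
# in the separate scheme, but the whole shared table in the merged one.
one_import_separate = per_module["Data.Maybe"]
one_import_shared = shared_total

print(separate_total, shared_total, one_import_separate, one_import_shared)
```

With these invented sets the shared table is smaller than the three
separate tables combined (46 vs 60 characters), but a compile that
imports only one module would read 46 characters instead of 21. Which
effect wins in practice depends on real import patterns, so it would
need measuring.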

Of course this is all just (educated) guesswork without actual 
measurements, and I could be wrong...

Perhaps there are ways to optimise the reading of interface files.  A 
good first step would be to do some profiling and see where the hotspots 
are.

Cheers,
	Simon


