GHC development asks too much of the host system
Ben Gamari
ben at smart-cactus.org
Tue Jul 19 19:10:35 UTC 2022
Hécate <hecate at glitchbra.in> writes:
> Hello ghc-devs,
>
> I hadn't made significant contributions to the GHC code base in a while,
> until a few days ago, where I discovered that my computer wasn't able to
> sustain running the test suite, nor handle HLS well.
>
> Whether it is my OS automatically killing the process due to oom-killer
> or just the fact that I don't have a war machine, I find it too bad and
> I'm frankly discouraged.
Do you know which process was being killed? There is one testsuite test
that I know of which does have quite a considerable memory footprint
(T16992) due to its nature; otherwise I would expect a reasonably recent
machine to pass the testsuite without much trouble. It's particularly
concerning if this is a new regression; is this the first time you have
observed this particular failure?
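One way to find out, assuming a Linux host where the kernel OOM killer (rather than a userspace daemon) did the killing, is to look for the kill record in the kernel log. A minimal sketch follows; the function name `oom_victims` is made up for illustration, and the filter simply matches the "Killed process" wording the kernel uses in these messages:

```shell
#!/bin/sh
# Print the kernel log lines that record an OOM kill.
# Reads log text on stdin so that it can be fed from dmesg or journalctl.
# (oom_victims is a hypothetical helper name, not an existing tool.)
oom_victims() {
    grep -i 'killed process'
}

# Usage on a Linux host (reading the kernel log may require root):
#   dmesg | oom_victims
#   journalctl -k | oom_victims
```

Each matching line carries the PID and, in parentheses, the name of the process that was killed, which should tell you whether it was `ghc`, a test binary, or something else entirely.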
> This is not the first time such feedback emerges, as the documentation
> task force for the base library was unable to properly onboard some
> people from third-world countries who do not have access to hardware
> we'd consider "standard" in western Europe or some parts of North
> America. Or at least "standard" until even my standard stuff didn't cut
> it anymore.
>
> So yeah, I'll stay around but I'm afraid I'm going to have to focus on
> projects for which the feedback loop is not on the scale of hours, as
> this is a hobby project.
>
> Hope this will open some eyes.
>
Hi Hécate,
I would reiterate that the more specific feedback you can offer, the
better.
To share some of my own experience: I have access to a variety of hardware,
some of which is quite powerful. However, I find that I end up doing
much of my development on my laptop which, while certainly not a slouch
(being a Ryzen 4750U), is also not a monster. In particular, while a
fresh build takes nearly twice as long on my laptop as on some of the
other hardware I have, I nevertheless find ways to make it worthwhile
(due to the ease of iteration compared to ssh). If you routinely have
multi-hour iteration times then something isn't right.
In particular, I think there are a few tricks which make life far
easier:
* Be careful about doing things that would incur significant amounts
  of rebuilding. This includes:
   * After modifying, e.g., `compiler/ghc.cabal.in` (say, to add a new
     module to GHC), modify `compiler/ghc.cabal` manually instead of
     rerunning `configure`.
   * Be careful about pulling/rebasing. I generally pick a base commit to
     build off of and rebase sparingly: having to stop what I'm doing to
     wait for a full rebuild is an easy way to lose momentum.
   * Avoid switching branches; I generally have a GHC tree per on-going
     project.
* Take advantage of Hadrian's `--freeze1` flag.
* Use `hadrian/ghci` to typecheck changes.
* Use the stage1 compiler instead of stage2 to smoke-test changes when
  possible (specifically, using the script generated by Hadrian's
  `_build/ghc-stage1` target).
* Use the right build flavour for the task at hand: if I don't need a
  performant compiler and am confident that I can get by without
  thorough testsuite validation, I use `quick`. Otherwise, plan ahead
  for what you need (e.g. `default+assertions+debug_info` or
  `validate`).
* Run the fraction of the testsuite that is relevant to your change.
  Hadrian's `--test-way` and `--only` flags are your friends.
* Take advantage of CI. At the moment we have a fair amount of CI
  capacity. If you think that your change is close to working, you can
  open an MR and start a build locally. If it fails, iterate on just the
  failing testcases locally.
* Task-level parallelism. Admittedly, this is harder when you are
  working as a hobby, but I often have two or three projects on-going
  at a time. While one tree is building I try to make progress on
  another.
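As a rough sketch of how the flags above fit together on the command line: the wrapper function, the helper names, and the `RUN` variable below are purely illustrative and not part of Hadrian. The `--freeze1`, `--only`, and `--test-way` flags, the `quick` flavour, and the `_build/ghc-stage1` target are the ones named above; the `--flavour=`, `-j`, and `test` spellings are assumed from Hadrian's usual command line and are worth checking against its documentation. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Dry-run sketch: prints each Hadrian command instead of running it.
# Set RUN=1 from the root of a GHC checkout to execute for real.
# (hadrian, build_quick, etc. are hypothetical helper names.)
hadrian() {
    if [ "${RUN:-0}" = 1 ]; then
        ./hadrian/build "$@"
    else
        echo "hadrian/build $*"
    fi
}

# Fast, unoptimised build for day-to-day iteration.
build_quick() { hadrian --flavour=quick -j; }

# Same, but do not rebuild the stage1 compiler even if its sources changed.
build_frozen() { hadrian --flavour=quick --freeze1 -j; }

# Generate the in-tree wrapper script for the stage1 compiler.
build_stage1_script() { hadrian --flavour=quick _build/ghc-stage1; }

# Run only the named tests, in a single way (here: "normal").
# $1 is a space-separated list of test names.
run_tests() { hadrian test --flavour=quick --only="$1" --test-way=normal; }

# Example session:
#   build_frozen
#   run_tests "T16992"
```

Keeping the same flavour across the build and test invocations matters here, since changing flavour in an existing build tree is itself one of the things that triggers a lot of rebuilding.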
I don't use HLS so I may be insulated from some of the pain in this
regard. However, I do know that Matt is a regular user and he
disables most plugins.
I would also say that, sadly, GHC is comparable to other similarly-sized
compilers in its build time: A build of LLVM (not even clang) takes ~50
minutes on my 8-core desktop; impressively, rustc takes ~7 minutes
although it is a considerably smaller compiler (being just a front-end).
By contrast, GHC takes around 20 minutes. I know that this doesn't
make the cost any easier to bear and I would love to bring this number
down, but ultimately there are only so many hours in the day.
I think one underexplored approach to addressing the build-time problem
is to look not at the full-build time but rather look for common tasks
where we could *avoid* doing a full build (e.g. updating documentation,
typechecking `base`, running a "good enough" subset of the testsuite)
and find ways to make those workflows more efficient.
Cheers,
- Ben