GHC development asks too much of the host system

Ben Gamari ben at smart-cactus.org
Tue Jul 19 22:03:13 UTC 2022


Artem Pelenitsyn <a.pelenitsyn at gmail.com> writes:

> Hey everyone! I'm not a frequent contributor but I observed similar
> challenges as Hécate. I noticed a couple of points.
>
> ### HLS and other editor integrations
>
> I've never tried HLS for GHC development but it absolutely chokes on
> Cabal for me (a $2K laptop), so I'm not surprised it's having trouble
> with GHC. I've never tried to dig into it, but I heard before that
> disabling plugins is a good start.
>
> Ghcid (after the introduction of ghc-in-ghci) was interesting but still on
> the slow side.
>
> I once tried to generate ETAGS and use them from Emacs (with plain
> haskell-mode): this was quite nice. Like Moritz, I didn't use much above
> syntax coloring, but ETAGS allowed jumping to definitions, which is
> important. Maintaining tags wasn't fun, on the other hand.
>
> In all fairness, I think that's an issue with HLS more than with GHC.
>
> ### Build Times
>
> I have been using a dedicated server for this, but this still was
> painful at times (even git clone takes a non-negligible amount of time,
> and I never got used to git worktree because of a hoop you have to
> jump over, which I already forgot but I know it can be looked up in
> Andreas Herrmann's presentation on developing GHC). I'm surprised no
> one seems to try to challenge the status quo.
>
IMHO, `git worktree` is indispensable. Not only does it make cloning
cheaper but it makes it trivial to share commits between work trees,
which is incredibly helpful when cleaning up branch history,
backporting, and other common tasks. I just wish it also worked
transparently for submodules.
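
As a rough illustration of the commit sharing described above, here is a
minimal sketch. It assumes a throwaway repository in a temporary
directory; the directory names (`ghc`, `ghc-backport`), the branch name
(`backport`), and the file contents are all hypothetical stand-ins, not
a real GHC checkout.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for a GHC checkout (hypothetical paths).
tmp=$(mktemp -d)
git init -q "$tmp/ghc"
cd "$tmp/ghc"
git config user.name "Example"
git config user.email "example@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Add a second working tree on a new branch; no second clone is needed.
git worktree add -b backport "$tmp/ghc-backport" >/dev/null 2>&1

# Commit in the second tree...
cd "$tmp/ghc-backport"
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix something"

# ...and pick that commit up from the first tree straight away: both
# trees share one object store and one set of refs, so no fetch or push
# is involved.
cd "$tmp/ghc"
git cherry-pick backport >/dev/null

# Show both working trees registered against the same repository.
git worktree list
```

The temporary directory is left in place so the result can be inspected;
remove it with `rm -rf "$tmp"` when done.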

> Hadrian is a Shake application. How is Cloud Shake doing? In the era
> of Nix and Bazel you start assuming niceties like remote caching. It'd
> be great to improve on this front as it just feels very wrong
> rebuilding master again and again on every contributor's computer.
> Especially after so much effort put into GHC modularity, which, I
> believe, should make it easier to cache.

Sadly using Cloud Shake in Hadrian ran into some rather fundamental
difficulties:

 * GHC has native dependencies (namely, the native toolchain, ncurses,
   gmp, and possibly libdw and libnuma). If everyone were to use, e.g.,
   ghc.nix this could be largely mitigated, but this isn't the world in
   which we live.

 * GHC is a bootstrapped compiler. Consequently, most changes will
   invalidate well over half of the build cache (that is, everything
   built by stage 1). This significantly limits the benefit that one
   could gain from Cloud Shake, especially since in the typical
   development workflow the stage 1 build (which is where most of the
   caching benefit would be seen) is a rather small cost (IIRC it takes
   around 5 minutes to make it to the stage 2 build on my machine).

   One might think that we could simply "freeze" the stage 1 compiler,
   but in general this is not safe. For instance, it would break subtly
   on any change to known keys, the interface file format, the ABI, or
   primop definitions.
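
To make the invalidation argument concrete, here is a toy model. It is
not Shake's actual keying scheme; it simply assumes that a cache key is
a checksum over the identity of the compiler doing the build plus the
source being built. The `cache_key` helper and all strings are
hypothetical.

```shell
#!/bin/sh
set -eu

# Toy cache key: a checksum over the identity of the compiler doing the
# build plus the contents of the source being built.
cache_key() { printf '%s\n%s\n' "$1" "$2" | cksum | cut -d' ' -f1; }

bootstrap_id="bootstrap-ghc"   # the fixed, already-installed compiler
lib_src="module Lib where"     # a library source that never changes

# The stage 1 compiler is built by the bootstrap compiler from the
# compiler sources, so its identity changes whenever those sources do.
stage1_before=$(cache_key "$bootstrap_id" "compiler sources v1")
stage1_after=$(cache_key "$bootstrap_id" "compiler sources v2")

# The library is built by the stage 1 compiler, so its key depends on
# the stage 1 identity even though the library source is untouched.
lib_before=$(cache_key "$stage1_before" "$lib_src")
lib_after=$(cache_key "$stage1_after" "$lib_src")

echo "stage1 key: $stage1_before -> $stage1_after"
echo "lib key:    $lib_before -> $lib_after"
```

Under this model, any edit to the compiler sources changes the key of
every artifact built by stage 1, which is why a shared cache would help
mostly with the (already cheap) stage 1 build itself.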

>
> It's sad that GHC still needs ./boot && ./configure: this can preclude any
> remote caching technology that I can imagine. At one point it seemed like
> configure could go into Hadrian, but it didn't really happen.
>
I don't see us moving away from `configure` (or something like it) as
long as GHC has native dependencies. Having a clear separation between
"configuration" and "building" is very much necessary to maintain sanity.

Cheers,

- Ben