Implementing a compilation server

Matthew Pickering matthewtpickering at gmail.com
Tue Oct 11 11:01:16 UTC 2022


Sorry Facundo, I didn't realise this reply had remaining questions in it.

On Mon, May 30, 2022 at 3:35 AM Domínguez, Facundo
<facundo.dominguez at tweag.io> wrote:
>
> Thanks Matthew for your pointers.
>
> Since originally posting, I managed to simplify the problem by terminating the compilation server at the end of a build, which lets me assume that the code doesn't change during the lifetime of the server.
>
> Now, I'm observing that sometimes different compilation requests place the same package databases at different paths using the -package-db flags. From the point of view of GHC, it is as if the package databases had been moved from one location to another. In newer requests, GHC still looks for the interface files at the old locations, and fails when it doesn't find them.

I think you mean that your build system is putting the package
database at different paths. If you modify the package database
arguments at all you need to call `setSessionDynFlags` to update the
state of the package databases again. You might also need to clear the
finder cache if by "package database" you also mean the location of
the interface files.
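
A minimal sketch of that idea, assuming the server keeps the package
database paths from the previous request and only refreshes its session
state when they differ. The names `packageDbPaths` and
`refreshIfDbsChanged` are hypothetical, and the `refresh` action is a
placeholder for whatever the server does to update the session (for
example calling `setSessionDynFlags` and, if needed, clearing the finder
cache). It uses only `base`, not the GHC API itself.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Collect the paths given via @-package-db PATH@, in order.
-- Assumption: each flag and its argument arrive as separate words;
-- the @-package-db=PATH@ spelling is not handled in this sketch.
packageDbPaths :: [String] -> [FilePath]
packageDbPaths ("-package-db" : path : rest) = path : packageDbPaths rest
packageDbPaths (_ : rest) = packageDbPaths rest
packageDbPaths [] = []

-- | Run @refresh@ only when the package database paths differ from the
-- ones recorded for the previous request. Returns True if it ran.
refreshIfDbsChanged :: IORef [FilePath] -> IO () -> [String] -> IO Bool
refreshIfDbsChanged lastDbs refresh args = do
  let dbs = packageDbPaths args
  previous <- readIORef lastDbs
  if dbs == previous
    then pure False
    else do
      refresh                 -- placeholder: update the GHC session here
      writeIORef lastDbs dbs  -- remember the paths for the next request
      pure True

main :: IO ()
main = do
  lastDbs <- newIORef []
  let refresh = putStrLn "refreshing session state"
  r1 <- refreshIfDbsChanged lastDbs refresh ["-package-db", "/tmp/db1", "-O2"]
  r2 <- refreshIfDbsChanged lastDbs refresh ["-package-db", "/tmp/db1"]
  r3 <- refreshIfDbsChanged lastDbs refresh ["-package-db", "/tmp/db2"]
  print (r1, r2, r3)
```

The paths in `main` are made up. The second request names the same
database as the first, so it skips the refresh; the third names a
different path and triggers it again.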

>
> Another difference between requests is that, even for the same package database, different interface files are present, depending on what the module under compilation imports transitively. This sometimes causes failures, but not always; I still need to pin down the exact circumstances. The error manifests as an attempt to load a missing interface file that is apparently not transitively needed.

I don't think I can comment on this without more information. It
sounds like you have a missing dependency in your build graph, so the
right .hi files are not present? I would need a reproducer to comment
properly.

>
> If I understand correctly, all the packages pointed to by -package-id and -package-db end up in the EPS. And this means that we can't expect to update the locations of the interface files without discarding and repopulating the EPS, correct? I'm thinking of this as approximately as costly as restarting the compilation server.

The EPS does not contain "packages" but interface files from packages
and the interface files are only loaded if they are needed. Once the
interface file is loaded into the EPS I would not expect it to matter
where the file came from on disk, as now GHC can just read it from
memory. Was there a specific bug you had here?

>
> I can reasonably ensure that package databases aren't moved around between compilation requests. But from the standpoint of the build system, it would require some compromises to demand that all of the interface files of a package be available even when not all of them are transitively imported. Can we hope to have GHC cope with this dynamic membership of modules to Haskell packages during the build? Is this an ability that 8.10.7 already has?

I don't think you need to engineer this requirement. GHC should work
fine if you only have the transitive interface files. This is how
Hadrian now works, and also how multi-component support works in GHC
9.4. I can't comment so much on 8.10.7 as it was a long time ago!

Matt

>
> Thanks,
> Facundo
>
> On Thu, May 5, 2022 at 5:13 AM Matthew Pickering <matthewtpickering at gmail.com> wrote:
>>
>> Hi Facundo
>>
>> Some pointers...
>>
>> 1. Only put things in the EPS if they are not going to change
>> throughout the whole compilation
>> 2. Treat everything which can change as a home package
>> 2a. I suppose you have performed your own dependency analysis, so
>> build your own `ModGraph` and start looking from `load'`; you might
>> just want to call `upsweep_mod`/`compileOne'` directly yourself.
>> 2b. I suppose you are NOT targeting 9.4.1 yet, but that will make
>> things easier as you can use support for multiple home packages,
>> otherwise you will get into severe difficulties if you load a package
>> you later want to compile into the EPS. The only thing you can do here
>> is restart the compilation session I think.
>> 3. To my knowledge, there is no issue using different -this-unit-id in
>> the same session. Not sure what errors you have seen.
>> 4. You need to use --make mode rather than -c (oneshot) because
>> oneshot mode loads all interfaces into the EPS (see point 1)
>>
>> ghcide is the closest existing program to the kind of compilation
>> server you have in mind, so you can look at how it uses the GHC API.
>>
>> Cheers,
>>
>> Matt
>>
>>
>> On Thu, May 5, 2022 at 1:06 AM Domínguez, Facundo
>> <facundo.dominguez at tweag.io> wrote:
>> >
>> > Dear ghc devs,
>> >
>> > I'm using the ghc API to write a compilation server (a.k.a. persistent worker). The idea is to serve requests to compile individual modules. In this fashion, we can compile modules with different compilation flags and yet pay only once for the startup overheads of the compiler.
>> >
>> > One challenge of this approach is to reuse as much as possible from the ghc API session/environment from one compilation request to the next, so we save the trouble of reconstructing it each time. This message is to ask for advice on how to better accomplish this reuse.
>> >
>> > I tried reusing the whole environment for multiple requests, but I'm conjecturing that this might cause trouble when the requests require building modules with different values of -this-unit-id. Another problem that stems from this is that recompiling a module which defines a type class instance fails because it encounters in the environment the type class instance from the previous compilation.
>> >
>> > My work-in-progress implementation is here [1]. There appear to be multiple ways to compile a module in the API; so far I have been trying DriverPipeline.compileFile.
>> >
>> > My best lead right now is to look for inspiration in how GHCi implements the load command, but this does a sort of --make compilation, whereas I'm going for the one-shot style here.
>> >
>> > Thanks in advance,
>> > Facundo
>> >
>> > [1] https://github.com/tweag/rules_haskell/blob/16ba422457ea4daa5dbf40d46327ebcb20588e97/tools/haskell_module_worker/src/Compile.hs#L188

