GSOC Idea: Bytecode serialization and/or Fat Interface files

Fri Mar 12 22:20:22 UTC 2021

I believe Josh was already working on (2) some time ago? Cc'ing him
on this thread.

I'm personally in favor of (2), since it's also super useful for
prototyping whole-program GHC backends, where one can just read all
the CgGuts from the .hi files, and get all codegen-related Core for
free.

Cheers,
Cheng

On Fri, Mar 12, 2021 at 10:32 PM Zubin Duggal <zubin.duggal at gmail.com> wrote:
>
> Hi all,
>
> This is following up on this recent discussion on the list concerning fat
> interface files: https://mail.haskell.org/pipermail/ghc-devs/2020-October/019324.html
>
> Now that we have been accepted as a GSOC organisation, I think
> it would be a good project idea for a sufficiently motivated and
> advanced student. This is a call for mentors (and students as
> well!) who would be interested in this project.
>
> The problem is the following:
>
> Haskell Language Server (and ghci with `-fno-code`) have very
> fast startup times for codebases which don't make use of Template
> Haskell, and thus don't require any code-gen to typecheck. This
> is because they can simply read the cached iface files generated by a
> previous compile and don't need to re-invoke the typechecker.
>
> However, as soon as TH is involved, we are forced to retypecheck and
> compile files, since it is not possible to restart the code-gen process
> starting with only an iface file. I can think of two ways to address this
> problem:
>
> 1. Allow bytecode to be serialized
>
> 2. Serialize desugared Core into iface files (fat interfaces), so that
> (byte)code-gen can be restarted from this point and doesn't need
> to re-run the compiler frontend
>
> (1) might be challenging, but offers a few more advantages over (2),
> in that we can reduce the work done to load TH-heavy codebases to just
> a load of the cached bytecode objects from disk, and could make the
> load process (and times) for these codebases directly comparable to
> their TH-free cousins.
>
> It would also make ghci startup a lot faster with a warm cache of
> bytecode objects, bringing ghci startup times in line with those of
> `-fno-code`.
>
> However (2) might be much easier to achieve and offers many
> of the same advantages, in that we would not need to re-run
> the compiler frontend or core-to-core optimisation phases.
> There is also already a (slightly bitrotted) implementation
> of (2) thanks to the work of Edward Yang.
>
> If any of this sounds exciting to you as a student or a mentor, please
> get in touch.
>
> In particular, I think (2) is a feasible project that can be completed
> with minimal mentoring effort. However, I'm only vaguely familiar with
> the details of the byte code generator, so if (1) is a direction we want
> to pursue, we would need a mentor familiar with the details of this part
> of GHC.
>
> Cheers,
> Zubin
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs