GSOC Idea: Bytecode serialization and/or Fat Interface files

Moritz Angermann moritz.angermann at gmail.com
Sat Mar 13 02:50:20 UTC 2021


I'd be happy to mentor anyone on either of these. The CI part is going to
be grueling, demotivating work with very long pauses in between, which is
why I didn't propose it yet.

I agree with John: I'm a bit skeptical about a student being able to
help with or pull anything off given the current state of things, with
multiple parties already actively involved, without being relegated to
a spectator's position.

On Sat, Mar 13, 2021 at 9:34 AM John Ericson <john.ericson at obsidian.systems>
wrote:

> Yes, see
> https://gitlab.haskell.org/ghc/ghc/-/wikis/Plan-for-increased-parallelism-and-more-detailed-intermediate-output
> where we (Obsidian) and IOHK have been planning together.
>
> I must say, I am a bit skeptical about a GSOC being able to take this on
> successfully. I thought Fendor did a great job with multiple home units,
> for example, but we still have to finish merging all his work! The driver
> is perhaps the biggest cesspool of technical debt in GHC, and it will take
> a while to untangle, let alone implement new features.
>
> I forget what the rules are for more incremental or multifaceted projects,
> but I would prefer an approach of trying to untangle things with no
> singular large goal. Or maybe we can involve a student in efforts to
> improve CI, attacking the root cause of why it's so hard to land things in
> the first place.
>
> John
> On 3/12/21 7:11 PM, Moritz Angermann wrote:
>
> Yes, there are also John's resumable compilation ideas, and the current
> performance work Obsidian Systems is doing.
>
> On Sat, 13 Mar 2021 at 6:21 AM, Cheng Shao <cheng.shao at tweag.io> wrote:
>
>> I believe Josh has already been working on 2 some time ago? cc'ing him
>> to this thread.
>>
>> I'm personally in favor of 2 since it's also super useful for
>> prototyping whole-program ghc backends, where one can just read all
>> the CgGuts from the .hi files, and get all codegen-related Core for
>> free.
>>
>> Cheers,
>> Cheng
>>
>> On Fri, Mar 12, 2021 at 10:32 PM Zubin Duggal <zubin.duggal at gmail.com>
>> wrote:
>> >
>> > Hi all,
>> >
>> > This is following up on this recent discussion on the list concerning
>> > fat interface files:
>> > https://mail.haskell.org/pipermail/ghc-devs/2020-October/019324.html
>> >
>> > Now that we have been accepted as a GSOC organisation, I think
>> > it would be a good project idea for a sufficiently motivated and
>> > advanced student. This is a call for mentors (and students as
>> > well!) who would be interested in this project.
>> >
>> > The problem is the following:
>> >
>> > Haskell Language Server (and ghci with `-fno-code`) have very
>> > fast startup times for codebases which don't make use of Template
>> > Haskell, and thus don't require any code-gen to typecheck. This
>> > is because they can simply read the cached iface files generated by a
>> > previous compile and don't need to re-invoke the typechecker.
>> >
>> > However, as soon as TH is involved, we are forced to retypecheck and
>> > compile files, since it is not possible to restart the code-gen process
>> > starting with only an iface file. I can think of two ways to address this
>> > problem:
>> >
>> > 1. Allow bytecode to be serialized
>> >
>> > 2. Serialize desugared Core into iface files (fat interfaces), so that
>> > (byte)code-gen can be restarted from this point and doesn't need
>> > to re-run the compiler frontend
>> >
>> > (1) might be challenging, but offers a few more advantages over (2),
>> > in that we can reduce the work done to load TH-heavy codebases to just
>> > a load of the cached bytecode objects from disk, and could make the
>> > load process (and times) for these codebases directly comparable to
>> > their TH-free cousins.
>> >
>> > It would also make ghci startup a lot faster with a warm cache of
>> > bytecode objects, bringing ghci startup times in line with those of
>> > `-fno-code`.
>> >
>> > However (2) might be much easier to achieve and offers many
>> > of the same advantages, in that we would not need to re-run
>> > the compiler frontend or core-to-core optimisation phases.
>> > There is also already a (slightly bitrotted) implementation
>> > of (2) thanks to the work of Edward Yang.
>> >
>> > If any of this sounds exciting to you as a student or a mentor, please
>> > get in touch.
>> >
>> > In particular, I think (2) is a feasible project that can be completed
>> > with minimal mentoring effort. However, I'm only vaguely familiar with
>> > the details of the byte code generator, so if (1) is a direction we want
>> > to pursue, we would need a mentor familiar with the details of this part
>> > of GHC.
>> >
>> > Cheers,
>> > Zubin
>> > _______________________________________________
>> > ghc-devs mailing list
>> > ghc-devs at haskell.org
>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs