Mentor for a JVM backend for GHC
Carter Schonwald
carter.schonwald at gmail.com
Mon May 2 15:48:27 UTC 2016
Woah this is some prodigious work!
1) would initially targeting only newer jvm variants make the work targets
for the summer easier ?
2) it's worth talking with ghc folks on #ghc and the gsoc admins at
#haskell-gsoc on freenode IRC about gsoc fit and possible mentors.
3) it sounds like the work so far is worth sharing on the r/Haskell sub
Reddit. If you want. I do recommend IRC for talking with some of the
active ghc and gsoc folk
On Monday, May 2, 2016, Rahul Muttineni <rahulmutt at gmail.com> wrote:
> Hi GHC Developers,
>
> I've started working on a JVM backend for GHC [1] and I'd love to work on
> it as my Summer of Haskell project.
>
> Currently, the build system is setup using a mix of Shake (for the RTS
> build) and Stack (for the main compiler build) and I ensure that most
> commits build successfully. I have ported the core part of the scheduler
> and ported over the fundamental types (Capability, StgTSO, Task,
> StgClosure, etc.) taking advantage of OOP in the implementation when I
> could.
>
> Additionally, I performed a non-trivial refactor of the hs-java package
> adding support for inner classes and fields which was very cumbersome to do
> in the original package. On the frontend, I have tapped into the STG code
> from the GHC 7.10.3 library and setup a CodeGen monad for generating JVM
> bytecode. The main task of generating the actual bytecode, porting the more
> critical parts of the RTS, and adding support for the threaded RTS remain.
>
> The strategy for compilation is as follows:
> - Intercept the STG code in the GHC pipeline
> - Convert from STG->JVM bytecode [2] in a similar manner as STG->Cmm
> preserving semantics as best as possible [3]
> - Port the GHC RTS (normal & threaded) to Java [4]
> - Put all the generated class files + RTS into a single jar to be run
> directly by the JVM.
>
> My objectives for the project during the summer are:
> - To implement the compilation strategy mentioned above
> - Implement the Java FFI for foreign imports. [5]
> - Implement the most important [6] PrimOps that GHC supports.
> - Port the base package replacing the C FFI imports with equivalent Java
> FFI imports. [7]
>
> A little bit about myself: I spent a lot of time studying functional
> language implementation by reading SPJ's famous book and reading research
> papers on related topics last summer as self-study.
>
> I took a break and resumed a couple months ago where I spent a lot of time
> plowing through the STG->Cmm code generator as well as the RTS and going
> back and forth between them to get a clear understanding of how everything
> works.
>
> Moreover, I compiled simple Haskell programs and observed the STG, Cmm,
> and assembly output (by decompiling the final executable with objdump) to
> understand bits of the code generator where the source code wasn't that
> clear.
>
> I also spent a great deal of time studying the JVM internals, reading the
> JVM spec, looking for any new features that could facilitate a high
> performance implementation [8].
>
> It would be great if someone with an understanding of nuances of the RTS
> and code generator could mentor me for this project. It has been a blast so
> far learning all the prerequisites and contemplating the design. I'd be
> very excited to take this on as a summer project.
>
> Also, given that I have hardly 5 days remaining, does anyone have
> suggestions on how I can structure the proposal without getting into too
> many details? There are still some parts of the design I haven't figured
> out, but I know I could find some solution when I get to it during the
> porting process.
>
> Thanks,
> Rahul Muttineni
>
> [1] http://github.com/rahulmutt/ghcvm
>
> [2] I intend to organically derive an IR at a later stage to allow for
> some optimizations by looking at the final working implementation without
> an IR and looking for patterns of repeated sequences of bytecode and
> assigning each sequence its own instruction in the IR.
>
> [3] Obviously, the lack of control of memory layouts (besides allocating
> off the JVM heap using DirectByteBuffers) and lack of general tail calls
> makes it tough to match the semantics of Cmm, but there are many solutions
> around it, as can be found in the few papers on translating STG to Java/JVM
> bytecode.
>
> [4] This is the GHC RTS without GC and profiling since the JVM has great
> support for those already. Also, lots of care must be taken to ensure that
> the lock semantics stays in tact during the port.
>
> [5] foreign exports will be dealt at a later stage, but I am taking care
> of naming the closures nicely so that in the future you don't have to type
> long names like the labels GHC compiles to call a Haskell function in Java.
>
> [6] Basically all the PrimOps that would be required to provide plumbing
> for the Prelude functions that can compile beginner-level programs found in
> books such as Learn You a Haskell for Great Good.
>
> [7] I know that it's a lot more complicated than just replacing FFI calls.
> I'd have to change around a lot of the code in base as well.
>
> [8] I found that the new "invokedynamic" instruction as well as the
> MethodHandle API (something like function pointers) that were introduced in
> JDK 7 could fit the bill. But as of now, I want to get a baseline
> implementation that is compatible with Java 5 so I will not be utilizing
> these newer features.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20160502/a1895c5f/attachment.html>
More information about the ghc-devs
mailing list