Mentor for a JVM backend for GHC

Fri May 6 11:40:35 UTC 2016

Hi Rahul,

Exicting to see other folks interested into doing this!

I did started an exprimental project on this at the end of last year, sadly
I did not had much time to work on it and it was closed sources, I just
opened it, so maybe we can share/reuse stuff?

Please take a look at this repository: https://github.com/aloiscochard/kuna

The most useful part is probaly the java assembly, I was not so happy of
hs-java so I rewrote an assembler from scratch.

To be honest I have not yet looked in your code, but I'll definitely will.

Let's keep in touch!

Cheers

On 2 May 2016 at 17:26, Rahul Muttineni <rahulmutt at gmail.com> wrote:

> Hi GHC Developers,
>
> I've started working on a JVM backend for GHC [1] and I'd love to work on
> it as my Summer of Haskell project.
>
> Currently, the build system is setup using a mix of Shake (for the RTS
> build) and Stack (for the main compiler build) and I ensure that most
> commits build successfully. I have ported the core part of the scheduler
> and ported over the fundamental types (Capability, StgTSO, Task,
> StgClosure, etc.) taking advantage of OOP in the implementation when I
> could.
>
> Additionally, I performed a non-trivial refactor of the hs-java package
> adding support for inner classes and fields which was very cumbersome to do
> in the original package. On the frontend, I have tapped into the STG code
> from the GHC 7.10.3 library and setup a CodeGen monad for generating JVM
> bytecode. The main task of generating the actual bytecode, porting the more
> critical parts of the RTS, and adding support for the threaded RTS remain.
>
> The strategy for compilation is as follows:
> - Intercept the STG code in the GHC pipeline
> - Convert from STG->JVM bytecode [2] in a similar manner as STG->Cmm
> preserving semantics as best as possible [3]
> - Port the GHC RTS (normal & threaded) to Java [4]
> - Put all the generated class files + RTS into a single jar to be run
> directly by the JVM.
>
> My objectives for the project during the summer are:
> - To implement the compilation strategy mentioned above
> - Implement the Java FFI for foreign imports. [5]
> - Implement the most important [6] PrimOps that GHC supports.
> - Port the base package replacing the C FFI imports with equivalent Java
> FFI imports. [7]
>
> A little bit about myself: I spent a lot of time studying functional
> language implementation by reading SPJ's famous book and reading research
> papers on related topics last summer as self-study.
>
> I took a break and resumed a couple months ago where I spent a lot of time
> plowing through the STG->Cmm code generator as well as the RTS and going
> back and forth between them to get a clear understanding of how everything
> works.
>
> Moreover, I compiled simple Haskell programs and observed the STG, Cmm,
> and assembly output (by decompiling the final executable with objdump) to
> understand bits of the code generator where the source code wasn't that
> clear.
>
> I also spent a great deal of time studying the JVM internals, reading the
> JVM spec, looking for any new features that could facilitate a high
> performance implementation [8].
>
> It would be great if someone with an understanding of nuances of the RTS
> and code generator could mentor me for this project. It has been a blast so
> far learning all the prerequisites and contemplating the design. I'd be
> very excited to take this on as a summer project.
>
> Also, given that I have hardly 5 days remaining, does anyone have
> suggestions on how I can structure the proposal without getting into too
> many details? There are still some parts of the design I haven't figured
> out, but I know I could find some solution when I get to it during the
> porting process.
>
> Thanks,
> Rahul Muttineni
>
> [1] http://github.com/rahulmutt/ghcvm
>
> [2] I intend to organically derive an IR at a later stage to allow for
> some optimizations by looking at the final working implementation without
> an IR and looking for patterns of repeated sequences of bytecode and
> assigning each sequence its own instruction in the IR.
>
> [3] Obviously, the lack of control of memory layouts (besides allocating
> off the JVM heap using DirectByteBuffers) and lack of general tail calls
> makes it tough to match the semantics of Cmm, but there are many solutions
> around it, as can be found in the few papers on translating STG to Java/JVM
> bytecode.
>
> [4] This is the GHC RTS without GC and profiling since the JVM has great
> support for those already. Also, lots of care must be taken to ensure that
> the lock semantics stays in tact during the port.
>
> [5] foreign exports will be dealt at a later stage, but I am taking care
> of naming the closures nicely so that in the future you don't have to type
> long names like the labels GHC compiles to call a Haskell function in Java.
>
> [6] Basically all the PrimOps that would be required to provide plumbing
> for the Prelude functions that can compile beginner-level programs found in
> books such as Learn You a Haskell for Great Good.
>
> [7] I know that it's a lot more complicated than just replacing FFI calls.
> I'd have to change around a lot of the code in base as well.
>
> [8] I found that the new "invokedynamic" instruction as well as the
> MethodHandle API (something like function pointers) that were introduced in
> JDK 7 could fit the bill. But as of now, I want to get a baseline
> implementation that is compatible with Java 5 so I will not be utilizing
> these newer features.
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>

-- 
*Λ\ois*
http://twitter.com/aloiscochard
http://github.com/aloiscochard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20160506/7f111dd6/attachment.html>