Mentor for a JVM backend for GHC

Mon May 2 17:55:36 UTC 2016

@Carter:
1) Actually, that would be harder. I've given the implementation without
the new features a lot of thought and it's simpler to get done. I've only
briefly thought about how the newer JVM features could be used, but I
figure that as I port more of GHC's RTS and code generator, I'll know
enough of the nuances to judge where to apply these new features
effectively.

2) Thanks for the suggestion. I don't use IRC that often, but if I can
reach more people there, I'll give it a try.

3) Right now it doesn't do anything that is visible to the user so I'll
hold off on that until I've ported a tiny but usable subset of the code
generator. My target is to get the basic Java FFI and I/O working so that
there's something new and interesting to play with.

@Edward: Thanks a lot! That would be great.

@Karel: Yeah it is, but the reward of being able to run Haskell anywhere is
definitely worth it! I'm aware of the LambdaVM project. I didn't mention it
because I wanted to focus on the present. I sent Brian Alliet an email
asking for guidance on architectural decisions a while back, but have
received no response.

I'd be interested in having a copy of the LambdaVM source, thanks. There
are a couple of places in the design I'm iffy on, so it'd be nice to have a
source of inspiration.

Frege is a great project in its own right, but GHC Haskell has become the
standard, and I find many of its extensions useful for eliminating more
classes of bugs at compile time.

Thanks guys,
Rahul Muttineni

On Mon, May 2, 2016 at 9:59 PM, Karel Gardas <karel.gardas at centrum.cz>
wrote:

>
> Rahul,
>
> a lot of work in front of you. There was a similar project in the past,
> LambdaVM was its name, author Brian Alliet -- unfortunately neither Google
> nor web.archive.org can turn up a working site for it. What I can offer you
> are some of the latest sources of this project. It's based on GHC 6.7, so
> really an old base, but perhaps something would be usable even for your
> own work and even after all those years? Let me know if you are interested
> and I'll pack that and upload somewhere -- well, assuming you have not heard
> about it, since otherwise I would expect it to be noted somewhere in your
> email. Just let me know if you are interested...
>
> Anyway, I'm keeping my fingers crossed for this work of yours! I'm
> personally very interested in seeing Haskell hosted on the JVM and
> interoperable with Java...(yes, I know about Frege...)
>
> Cheers,
> Karel
>
>
> On 05/ 2/16 05:26 PM, Rahul Muttineni wrote:
>
>> Hi GHC Developers,
>>
>> I've started working on a JVM backend for GHC [1] and I'd love to work
>> on it as my Summer of Haskell project.
>>
>> Currently, the build system is set up using a mix of Shake (for the RTS
>> build) and Stack (for the main compiler build) and I ensure that most
>> commits build successfully. I have ported the core part of the scheduler
>> and ported over the fundamental types (Capability, StgTSO, Task,
>> StgClosure, etc.) taking advantage of OOP in the implementation when I
>> could.
>>
>> Additionally, I performed a non-trivial refactor of the hs-java package
>> adding support for inner classes and fields which was very cumbersome to
>> do in the original package. On the frontend, I have tapped into the STG
>> code from the GHC 7.10.3 library and set up a CodeGen monad for
>> generating JVM bytecode. The main tasks of generating the actual
>> bytecode, porting the more critical parts of the RTS, and adding support
>> for the threaded RTS remain.
>>
>> The strategy for compilation is as follows:
>> - Intercept the STG code in the GHC pipeline
>> - Convert from STG->JVM bytecode [2] in a similar manner as STG->Cmm
>> preserving semantics as best as possible [3]
>> - Port the GHC RTS (normal & threaded) to Java [4]
>> - Put all the generated class files + RTS into a single jar to be run
>> directly by the JVM.
>>
>> My objectives for the project during the summer are:
>> - To implement the compilation strategy mentioned above
>> - Implement the Java FFI for foreign imports. [5]
>> - Implement the most important [6] PrimOps that GHC supports.
>> - Port the base package replacing the C FFI imports with equivalent Java
>> FFI imports. [7]
>>
>> A little bit about myself: I spent a lot of time studying functional
>> language implementation by reading SPJ's famous book and reading
>> research papers on related topics last summer as self-study.
>>
>> I took a break and resumed a couple months ago where I spent a lot of
>> time plowing through the STG->Cmm code generator as well as the RTS and
>> going back and forth between them to get a clear understanding of how
>> everything works.
>>
>> Moreover, I compiled simple Haskell programs and observed the STG, Cmm,
>> and assembly output (by decompiling the final executable with objdump)
>> to understand bits of the code generator where the source code wasn't
>> that clear.
>>
>> I also spent a great deal of time studying the JVM internals, reading
>> the JVM spec, looking for any new features that could facilitate a high
>> performance implementation [8].
>>
>> It would be great if someone with an understanding of nuances of the RTS
>> and code generator could mentor me for this project. It has been a blast
>> so far learning all the prerequisites and contemplating the design. I'd
>> be very excited to take this on as a summer project.
>>
>> Also, given that I have hardly 5 days remaining, does anyone have
>> suggestions on how I can structure the proposal without getting into too
>> many details? There are still some parts of the design I haven't figured
>> out, but I know I could find some solution when I get to it during the
>> porting process.
>>
>> Thanks,
>> Rahul Muttineni
>>
>> [1] http://github.com/rahulmutt/ghcvm
>>
>> [2] I intend to organically derive an IR at a later stage to allow for
>> some optimizations by looking at the final working implementation
>> without an IR and looking for patterns of repeated sequences of bytecode
>> and assigning each sequence its own instruction in the IR.
>>
>> [3] Obviously, the lack of control over memory layouts (besides allocating
>> off the JVM heap using DirectByteBuffers) and the lack of general tail
>> calls make it tough to match the semantics of Cmm, but there are many
>> ways around this, as can be found in the few papers on translating
>> STG to Java/JVM bytecode.
>>
>> [4] This is the GHC RTS without GC and profiling since the JVM has great
>> support for those already. Also, lots of care must be taken to ensure
>> that the lock semantics stay intact during the port.
>>
>> [5] Foreign exports will be dealt with at a later stage, but I am taking
>> care to name the closures nicely so that in the future you don't have to
>> type long names like the labels GHC generates to call a Haskell function
>> from Java.
>>
>> [6] Basically all the PrimOps that would be required to provide plumbing
>> for the Prelude functions that can compile beginner-level programs found
>> in books such as Learn You a Haskell for Great Good.
>>
>> [7] I know that it's a lot more complicated than just replacing FFI
>> calls. I'd have to change around a lot of the code in base as well.
>>
>> [8] I found that the new "invokedynamic" instruction as well as the
>> MethodHandle API (something like function pointers) that were introduced
>> in JDK 7 could fit the bill. But as of now, I want to get a baseline
>> implementation that is compatible with Java 5 so I will not be utilizing
>> these newer features.
>>
>
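
Footnote [3] in the quoted proposal mentions the JVM's lack of general tail
calls. One well-known workaround is a trampoline: each "call" returns a
description of the next step, and a loop runs the steps so the stack does
not grow. The sketch below only illustrates that idea and is not taken from
the ghcvm sources; the names Step, done, more, run and sumTo are all
hypothetical. It uses Java 8 lambdas for brevity, whereas a Java 5
compatible version would spell the thunks out as anonymous classes.

```java
import java.util.function.Supplier;

// Hypothetical names throughout; an illustration, not ghcvm's actual design.
// A Step is either a finished result or a thunk that yields the next Step.
abstract class Step<T> {
    abstract boolean isDone();
    abstract T result();
    abstract Step<T> next();

    static <T> Step<T> done(final T value) {
        return new Step<T>() {
            boolean isDone() { return true; }
            T result() { return value; }
            Step<T> next() { throw new IllegalStateException("already done"); }
        };
    }

    static <T> Step<T> more(final Supplier<Step<T>> thunk) {
        return new Step<T>() {
            boolean isDone() { return false; }
            T result() { throw new IllegalStateException("not done yet"); }
            Step<T> next() { return thunk.get(); }
        };
    }

    // The trampoline: a loop replaces the chain of tail calls,
    // so the JVM stack depth stays constant.
    static <T> T run(Step<T> step) {
        while (!step.isDone()) {
            step = step.next();
        }
        return step.result();
    }
}

class TrampolineDemo {
    // Tail-recursive sum of 1..n; each "call" returns a Step
    // instead of invoking itself directly.
    static Step<Long> sumTo(final long n, final long acc) {
        if (n == 0) {
            return Step.done(acc);
        }
        return Step.more(() -> sumTo(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // One million bounced calls, run at constant stack depth.
        System.out.println(Step.run(sumTo(1_000_000L, 0L)));
    }
}
```

The cost is one small heap allocation per bounced call, which is part of
why the published STG-to-JVM translations tend to combine this with other
techniques rather than trampolining every call.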

-- 
Rahul Muttineni