<html><head></head><body lang="en-GB" style="background-color: rgb(255, 255, 255); line-height: initial;"><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">Hi Alois,</div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><br></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">I just checked out Kuna, and it looks like a great project. For others, the link to the repo is https://github.com/aloiscochard/kuna. I think I'll go with it, since not having to implement StackMapTables will save me a lot of time.</div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><br></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">I'm interested in your approach. Can you explain more, especially the stack-safe bytecode part? I also noticed the last commit was in December. Is there any particular reason you stopped the project?</div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><br></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);">I chose STG over Core because Core has an embedded language of coercions, which complicates the code generation (or maybe they can simply be ignored?), and the terms are not in administrative normal form, which requires more effort to manage. But a thought did cross my mind that certain Core optimizations might actually need to be turned off <span style="font-size: initial; line-height: initial; text-align: initial;">because the resulting STG might not translate to the most efficient JVM bytecode. Again, all these issues can be looked at once a performance baseline has been established.</span></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-size: initial; line-height: initial; text-align: initial;"><br></span></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-size: initial; line-height: initial; text-align: initial;">Thanks, </span></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-size: initial; line-height: initial; text-align: initial;">Rahul Muttineni</span></div><div style="width: 100%; font-size: initial; font-family: Calibri, 'Slate Pro', sans-serif, sans-serif; color: rgb(31, 73, 125); text-align: initial; background-color: rgb(255, 255, 255);"><span style="font-size: initial; line-height: initial; text-align: initial;"><br></span></div>
<table width="100%" style="background-color:white;border-spacing:0px;"> <tbody><tr><td colspan="2" style="font-size: initial; text-align: initial; background-color: rgb(255, 255, 255);"> <div style="border-style: solid none none; border-top-color: rgb(181, 196, 223); border-top-width: 1pt; padding: 3pt 0in 0in; font-family: Tahoma, 'BB Alpha Sans', 'Slate Pro'; font-size: 10pt;">  <div><b>From: </b>Alois Cochard</div><div><b>Sent: </b>Monday 9 May 2016 10:21 PM</div><div><b>To: </b>Edward Kmett</div><div><b>Cc: </b>ghc-devs@haskell.org</div><div><b>Subject: </b>Re: Mentor for a JVM backend for GHC</div></div></td></tr></tbody></table><div style="border-style: solid none none; border-top-color: rgb(186, 188, 209); border-top-width: 1pt; font-size: initial; text-align: initial; background-color: rgb(255, 255, 255);"></div><br><div id="_originalContent" style=""><p dir="ltr">Totally agree, Cmm is too late. </p>

<p dir="ltr">My aim in Kuna was to start the transformation from Core, targeting stack-safe JVM bytecode without using Graal or the like. </p>

<p dir="ltr">Yes, I am quite an optimistic person ;-) but I believe it's the path to take. I'm not interested in a Frege-like approach for various reasons, performance being one of them. </p>
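<p dir="ltr">For readers wondering what "stack-safe" could mean in practice: the JVM has no general tail calls, so one common technique is a trampoline, where each step returns the next step to a driver loop instead of calling it directly. The following is only a minimal sketch of that general idea, not a description of what Kuna actually does; the names Bounce, Trampoline and sumTo are hypothetical, and the static result slot is a simplification that is not thread-safe.</p>
<pre>
```java
// Hypothetical names throughout; this is not code from Kuna or GHC.
interface Bounce {
    // Returns the next step, or null when the computation has finished.
    Bounce step();
}

final class Trampoline {
    // Slot where the final step leaves its answer (simplification, not thread-safe).
    static long result;

    // Runs steps in a loop so the Java stack depth stays constant.
    static long run(Bounce first) {
        Bounce current = first;
        while (current != null) {
            current = current.step();
        }
        return result;
    }

    // Tail-recursive sum n + (n-1) + ... + 1, expressed as bounces instead of direct calls.
    static Bounce sumTo(long n, long acc) {
        if (n == 0) {
            return () -> { result = acc; return null; };
        }
        return () -> sumTo(n - 1, acc + n);
    }

    public static void main(String[] args) {
        // A million nested direct calls would risk StackOverflowError; the loop does not.
        System.out.println(run(sumTo(1_000_000L, 0L)));
    }
}
```
</pre>
<p dir="ltr">The trade-off is one small allocation per step, which is part of why performance comes up whenever this technique is discussed.</p>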

<div class="gmail_quote">On 7 May 2016 6:19 pm, "Edward Kmett" <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">By the time it has made it down to Cmm there are a lot of assumptions<br>

about layout in memory -- everything is assumed to be a flat object<br>

made out of 32-bit or 64-bit slots. These assumptions aren't really<br>

suitable for the JVM.<br>

<br>

-Edward<br>

<br>

On Sat, May 7, 2016 at 11:32 AM, Thomas Jakway <<a href="mailto:tjakway@nyu.edu">tjakway@nyu.edu</a>> wrote:<br>

> This is a strange coincidence.  I'm definitely no expert GHC hacker but I<br>

> started (highly preliminary) work on a JVM backend for GHC a few weeks ago.<br>

> It's here: <a href="https://github.com/tjakway/ghcjvm/tree/jvm/compiler/jvmGen/Jvm" rel="noreferrer" target="_blank">https://github.com/tjakway/ghcjvm/tree/jvm/compiler/jvmGen/Jvm</a><br>

> (The memory runtime is here: <a href="https://github.com/tjakway/lljvm" rel="noreferrer" target="_blank">https://github.com/tjakway/lljvm</a>)<br>

><br>

> I'm very new to this so pardon my ignorance, but I don't understand what the<br>

> benefit is of intercepting STG code and translating that to bytecode vs.<br>

> translating Cmm to bytecode (or Jasmin assembly, as I'd prefer)?  It seems<br>

> like Cmm is designed for backends and the obvious choice.  Or have I got<br>

> this really mixed up?<br>

><br>

> I hope this isn't out of line considering my overall lack of experience but<br>

> I think I can give some advice:<br>

><br>

> - Read the JVM 7 spec cover to cover.<br>

> - I highly suggest outputting Jasmin assembly instead of raw bytecode.  The<br>

> classfile format is complicated and you will have to essentially rewrite<br>

> Jasmin in Haskell if you don't want to reuse it.  Jasmin is also the de<br>

> facto standard assembler and much more thoroughly tested than any homegrown<br>

> solution we might make.<br>

> - Read the LLVM code generator.  This project is more like the LLVM backend<br>

> than the native code generator.<br>

> - Don't go for speed.  The approach that I've begun is to emulate a C stack<br>

> and memory system the RTS can run on top of<br>

> (<a href="https://github.com/tjakway/lljvm/blob/master/src/main/java/lljvm/runtime/Memory.java" rel="noreferrer" target="_blank">https://github.com/tjakway/lljvm/blob/master/src/main/java/lljvm/runtime/Memory.java</a>).<br>

> This will make getting something working much faster and also solves the<br>

> problem of how to deal with memcpy/memset/memmove on the JVM.  This will of<br>

> course be very slow (I think) and is not a permanent solution.  Can't do<br>

> everything at once.  Any other approach will probably require rewriting the<br>

> entire RTS from the beginning.<br>

> - I don't think Frege is especially useful to this project, though I'd love to<br>

> be proven wrong.  Frege's compilation model is completely different from<br>

> GHC's: they compile Haskell to Java and then send that to javac.  Porting<br>

> GHC to the JVM is really more like writing a Cmm to JVM compiler.<br>

><br>

><br>

> I've heard of the LambdaVM project but couldn't find the actual code<br>

> anywhere.  The site where it was hosted appears to be offline.  I'd<br>

> certainly like to look at it if anyone knows where to find it.<br>

><br>

> Information on Jasmin:<br>

> <a href="http://web.mit.edu/javadev/packages/jasmin/doc/" rel="noreferrer" target="_blank">http://web.mit.edu/javadev/packages/jasmin/doc/</a><br>

> <a href="http://web.mit.edu/javadev/packages/jasmin/doc/instructions.html" rel="noreferrer" target="_blank">http://web.mit.edu/javadev/packages/jasmin/doc/instructions.html</a><br>

> <a href="http://web.mit.edu/javadev/packages/jasmin/doc/about.html" rel="noreferrer" target="_blank">http://web.mit.edu/javadev/packages/jasmin/doc/about.html</a><br>

><br>

> Once you've tried manually dealing with constant pools you'll appreciate<br>

> Jonathan Meyer's work!<br>

><br>

> I forked davidar's extended version of Jasmin.  The differences versus the<br>

> original Jasmin are detailed here.  Some nice additions:<br>

><br>

> - supports invokedynamic<br>

> - supports .annotation, .inner, .attribute, .deprecated directives<br>

> - better handling of the ldc_w instruction<br>

> - multi-line fields<br>

> - .debug directives<br>

> - signatures for local variables<br>

> - .bytecode directive to specify bytecode version<br>

> - (most importantly, I think) support for the StackMap attribute.  If we<br>

> eventually want to use new JVM instructions like invokedynamic, we need<br>

> stack map frames or the JVM will reject our bytecode.  JVM 7 has options to<br>

> bypass this, but they're a hack, they're deprecated, and I believe stack map<br>

> frames will not be optional going forward.  Alternatively we can stick with<br>

> older bytecode versions indefinitely and not use the new features.<br>

><br>

> (Just to be clear, I forked it in case it was deleted.  I didn't write those<br>

> features, the credit belongs to him).<br>

><br>

> I think the biggest risk is taking too much on at once.  Any one of these<br>

> subtasks, writing a bytecode assembler, porting the RTS, etc. could consume<br>

> the whole summer if you're not careful.<br>

><br>

> I'd love to help out with this project!<br>

><br>

> Sincerely,<br>

> Thomas Jakway<br>

><br>

> -------<br>

><br>

> Woops, after scrolling back through the emails it looks like someone sent<br>

> out the LambdaVM source.  I'll have to take a look at that.<br>

><br>

><br>

><br>

> On 05/02/2016 11:26 AM, Rahul Muttineni wrote:<br>

><br>

> Hi GHC Developers,<br>

><br>

> I've started working on a JVM backend for GHC [1] and I'd love to work on it<br>

> as my Summer of Haskell project.<br>

><br>

> Currently, the build system is set up using a mix of Shake (for the RTS<br>

> build) and Stack (for the main compiler build) and I ensure that most<br>

> commits build successfully. I have ported the core part of the scheduler and<br>

> ported over the fundamental types (Capability, StgTSO, Task, StgClosure,<br>

> etc.) taking advantage of OOP in the implementation when I could.<br>

><br>

> Additionally, I performed a non-trivial refactor of the hs-java package<br>

> adding support for inner classes and fields which was very cumbersome to do<br>

> in the original package. On the frontend, I have tapped into the STG code<br>

> from the GHC 7.10.3 library and set up a CodeGen monad for generating JVM<br>

> bytecode. The main tasks of generating the actual bytecode, porting the more<br>

> critical parts of the RTS, and adding support for the threaded RTS remain.<br>

><br>

> The strategy for compilation is as follows:<br>

> - Intercept the STG code in the GHC pipeline<br>

> - Convert from STG->JVM bytecode [2] in a similar manner as STG->Cmm<br>

> preserving semantics as best as possible [3]<br>

> - Port the GHC RTS (normal & threaded) to Java [4]<br>

> - Put all the generated class files + RTS into a single jar to be run<br>

> directly by the JVM.<br>

><br>

> My objectives for the project during the summer are:<br>

> - To implement the compilation strategy mentioned above<br>

> - Implement the Java FFI for foreign imports. [5]<br>

> - Implement the most important [6] PrimOps that GHC supports.<br>

> - Port the base package replacing the C FFI imports with equivalent Java FFI<br>

> imports. [7]<br>

><br>

> A little bit about myself: I spent a lot of time studying functional<br>

> language implementation by reading SPJ's famous book and reading research<br>

> papers on related topics last summer as self-study.<br>

><br>

> I took a break and resumed a couple months ago where I spent a lot of time<br>

> plowing through the STG->Cmm code generator as well as the RTS and going<br>

> back and forth between them to get a clear understanding of how everything<br>

> works.<br>

><br>

> Moreover, I compiled simple Haskell programs and observed the STG, Cmm, and<br>

> assembly output (by decompiling the final executable with objdump) to<br>

> understand bits of the code generator where the source code wasn't that<br>

> clear.<br>

><br>

> I also spent a great deal of time studying the JVM internals, reading the<br>

> JVM spec, looking for any new features that could facilitate a high<br>

> performance implementation [8].<br>

><br>

> It would be great if someone with an understanding of nuances of the RTS and<br>

> code generator could mentor me for this project. It has been a blast so far<br>

> learning all the prerequisites and contemplating the design. I'd be very<br>

> excited to take this on as a summer project.<br>

><br>

> Also, given that I have hardly 5 days remaining, does anyone have<br>

> suggestions on how I can structure the proposal without getting into too<br>

> many details? There are still some parts of the design I haven't figured<br>

> out, but I know I could find some solution when I get to it during the<br>

> porting process.<br>

><br>

> Thanks,<br>

> Rahul Muttineni<br>

><br>

> [1] <a href="http://github.com/rahulmutt/ghcvm" rel="noreferrer" target="_blank">http://github.com/rahulmutt/ghcvm</a><br>

><br>

> [2] I intend to organically derive an IR at a later stage to allow for some<br>

> optimizations by looking at the final working implementation without an IR<br>

> and looking for patterns of repeated sequences of bytecode and assigning<br>

> each sequence its own instruction in the IR.<br>

><br>

> [3] Obviously, the lack of control of memory layouts (besides allocating off<br>

> the JVM heap using DirectByteBuffers) and lack of general tail calls makes<br>

> it tough to match the semantics of Cmm, but there are many solutions around<br>

> it, as can be found in the few papers on translating STG to Java/JVM<br>

> bytecode.<br>

><br>

> [4] This is the GHC RTS without GC and profiling since the JVM has great<br>

> support for those already. Also, lots of care must be taken to ensure that<br>

> the lock semantics stay intact during the port.<br>

><br>

> [5] Foreign exports will be dealt with at a later stage, but I am taking care of<br>

> naming the closures nicely so that in the future you don't have to type long<br>

> names like the labels GHC compiles to call a Haskell function in Java.<br>

><br>

> [6] Basically all the PrimOps that would be required to provide plumbing for<br>

> the Prelude functions that can compile beginner-level programs found in<br>

> books such as Learn You a Haskell for Great Good.<br>

><br>

> [7] I know that it's a lot more complicated than just replacing FFI calls.<br>

> I'd have to change around a lot of the code in base as well.<br>

><br>

> [8] I found that the new "invokedynamic" instruction as well as the<br>

> MethodHandle API (something like function pointers) that were introduced in<br>

> JDK 7 could fit the bill. But as of now, I want to get a baseline<br>

> implementation that is compatible with Java 5 so I will not be utilizing<br>

> these newer features.<br>
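<br>
As a rough illustration of the "something like function pointers" remark in [8], here is a minimal sketch of looking up a static method and calling it through a MethodHandle. It uses only the standard java.lang.invoke API from JDK 7; the class HandleDemo and the method addOne are hypothetical names for illustration and are not part of ghcvm.<br>
<pre>
```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; not part of ghcvm.
final class HandleDemo {
    // An ordinary static method that we will call indirectly.
    static long addOne(long x) {
        return x + 1;
    }

    // Looks up addOne and returns a handle to it, roughly a typed function pointer.
    static MethodHandle addOneHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
            HandleDemo.class, "addOne", MethodType.methodType(long.class, long.class));
    }

    // Calls through the handle; invokeExact requires the call-site types to match exactly.
    static long callThrough(long x) {
        try {
            return (long) addOneHandle().invokeExact(x);
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable, so wrap it for callers.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThrough(41L));
    }
}
```
</pre>
A real code generator would presumably cache the handle rather than look it up on every call; the lookup is repeated here only to keep the sketch short.<br>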

><br>

><br>

><br>

> _______________________________________________<br>

> ghc-devs mailing list<br>

> <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>

> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>

><br>

><br>

><br>


</blockquote></div>

<br><!--end of _originalContent --></div></body></html>