[Haskell-cafe] Well typed OS

Mon Oct 8 07:08:40 UTC 2018

On Mon, Oct 8, 2018 at 8:53 AM Joachim Durchholz <jo at durchholz.org> wrote:

> On 08.10.2018 at 01:34, Vanessa McHale wrote:
> > The problem with an IR is that some languages would inevitably suffer -
> > LLVM in particular was designed as a backend for a C compiler, and so it
> > is not necessarily well-suited for lazy languages, immutable languages,
> > etc. (not to mention self-modifying assembly and other such pathological
> > beasts...)
> Actually LLVM is built for being adaptable to different kinds of
> languages. It does have a bias towards C-style languages, but you can
> adapt what doesn't fit your needs *and still keep the rest*.
>
> The following was true a few years ago:
>
> When I asked, the LLVM IR was intentionally not specified to be reusable
> across languages, so that different compiler toolchains could adapt the
> IR to whatever their language or backend infrastructure needed.
>
> Garbage collection is one area where you have to do a lot of work. There
> are some primitive instructions that support it, but the semantics is
> vague and doesn't cover all kinds of write barriers. You'll have to roll
> your own IR extensions - or maybe I didn't understand the primitives
> well enough to see how much they cover.
> Anyway, LLVM does not come with a GC implementation.
> OTOH, it does not prevent you from doing a GC. In particular, you're
> free to avoid C-style pointers, so you have the full range of GC
> algorithms available.
>
> Laziness? No problem. If you do tagless/spineless, you'll code the
> evaluation machine anyway. Just add an IR instruction that calls the
> interpreter.
>

I'm far from an expert in this area, but isn't that "interpreter" a simple yet
slow approach to codegen? My understanding is that when you use, say, a
global variable as a register for your evaluation machine, it is slower
than if you somehow pin a real hardware register for that purpose. I think
this is what a "registerized" GHC build means.
In LLVM you can't use, say, RSP the way you want; it is doomed to be the
"stack pointer register", even if you don't use a stack at all.

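To make the global-versus-register point concrete, here is a minimal sketch
in C. It is my own illustration, not GHC's runtime; the names g_stack, g_sp,
run_global and run_local are made up. Both functions sum 1..n on a tiny
explicit stack. The first keeps the machine's stack pointer in a global
variable, the second in a local that the compiler is free to hold in a
hardware register.

```c
#include <assert.h>

#define STACK_SIZE 1024

/* Hypothetical evaluation-machine state kept in globals. Other
   translation units can see these, so the compiler generally has to
   keep them up to date in memory. */
long g_stack[STACK_SIZE];
long *g_sp = g_stack;

/* Sum 1..n with the stack pointer living in a global variable. */
long run_global(long n) {
    assert(n >= 0 && n <= STACK_SIZE);
    for (long i = 1; i <= n; i++) *g_sp++ = i;   /* push */
    long acc = 0;
    while (g_sp != g_stack) acc += *--g_sp;      /* pop */
    return acc;
}

/* Same machine, but the stack pointer is a local variable, which the
   register allocator may keep in a hardware register throughout. */
long run_local(long n) {
    long stack[STACK_SIZE];
    long *sp = stack;
    assert(n >= 0 && n <= STACK_SIZE);
    for (long i = 1; i <= n; i++) *sp++ = i;     /* push */
    long acc = 0;
    while (sp != stack) acc += *--sp;            /* pop */
    return acc;
}
```

Both variants compute the same result. In a case this small an optimizing
compiler may well erase the difference; the sketch only shows where the
state lives, not a measured slowdown.
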
As I read in some blog, you can influence LLVM codegen slightly by adding
calling conventions, but the real solution would be another algorithm for
instruction selection. No one has implemented that yet, AFAIK.
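
My understanding of the calling-convention trick, which may be incomplete, is
that each code block becomes a function taking the machine's virtual
registers as arguments and tail-calling the next block. A convention that
passes all arguments in hardware registers then keeps that state out of
memory. Below is a rough C sketch of the shape only; block_push, block_sum
and run_args are hypothetical names, and plain C guarantees neither the
register passing nor the tail calls.

```c
#include <assert.h>

#define STACK_SIZE 1024

/* Virtual registers (sp, base, acc, ...) travel as arguments from one
   code block to the next instead of living in globals. */
static long block_sum(long *sp, long *base, long acc);

/* Block 1: push i..n onto the stack, then jump to the summing block. */
static long block_push(long *sp, long *base, long i, long n) {
    if (i > n) return block_sum(sp, base, 0);
    *sp = i;
    return block_push(sp + 1, base, i + 1, n);    /* tail call */
}

/* Block 2: pop everything, accumulating the sum in the acc register. */
static long block_sum(long *sp, long *base, long acc) {
    if (sp == base) return acc;
    return block_sum(sp - 1, base, acc + sp[-1]); /* tail call */
}

/* Entry point: set up the stack and enter the first block. */
long run_args(long n) {
    long stack[STACK_SIZE];
    assert(n >= 0 && n <= STACK_SIZE);
    return block_push(stack, stack, 1, n);
}
```

For example, run_args(100) yields 5050. Without guaranteed tail calls the
recursion depth grows with n, which is why the sketch caps n at STACK_SIZE.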

> Immutability? No problem - actually not a problem anywhere. Immutability
> happens at the language level, at the IR level it is pretty irrelevant
> because compilers try to replace object copying by in-place modification
> wherever possible, anyway.
>
> Self-modifying assembly? No IR really supports that. Mostly it's
> backends that generate self-modifying code from IR instructions for
> specific targets.
>
> TL;DR: For its generality, LLVM IR is better suited to languages with
> specific needs in the backend than anything else that I have seen (which
> means C runtimes, various VM proofs of concept which don't really count,
> and JVM - in particular I don't know how .net compares).
>
> Regards,
> Jo