LLVM back end

Thu Dec 21 08:27:27 EST 2006

On 21-Dec-06, at 1:26 PM, Michael T. Richter wrote:

> Global Register Variables -- I'm not sure I understand the point  
> here.  LLVM is a... well, a virtual machine.  It's not a real  
> target.  The LLVM code is then either simulated in a VM (as per,  
> say, Java) or it is further compiled to a native representation.   
> (Or it is JITted, etc.)  If it's run in a VM, there is no register  
> anythings.  [...]

Reading between the lines, I take it that you want to port GHC to  
LLVM in its entirety; LLVM-ghc would not support -fvia-C or -fasm  
code generation but leave everything to LLVM only. If you want to  
keep any of the other code generation routes, then the global  
register variables used by GHC are part of the ABI and have to be  
specified explicitly.

>  It sounds like an optimiser that does just that could be written  
> pretty simply.

Well, it wouldn't be just another optimiser, it would require support  
from the code generators, I guess...

On the other hand... if we're not trying to mix LLVM code with  
traditional GHC code anyway, then who says those variables should be  
global? They should probably be parameters to every function, and  
then let's hope that the calling convention used for tail-calls puts  
them in registers:

void %foo(sbyte* %stackPointer, sbyte* %stackLimit,
          sbyte* %heapPointer, sbyte* %heapLimit) {
entry:
         tail call void %bar( sbyte* %stackPointer, sbyte* %stackLimit,
                              sbyte* %heapPointer, sbyte* %heapLimit )
         ret void
}
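
For comparison, here is roughly the same idea as a C sketch. The names foo and bar and the four parameters come from the example above; the 16-byte allocation, the assert standing in for a heap check, and the return value are assumptions added only to make the sketch observable. Note that C does not guarantee that the call to bar is compiled as a jump, nor that the parameters end up in registers; that depends on the compiler and the calling convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical continuation: reports how many bytes of heap are left. */
ptrdiff_t bar(char *stackPointer, char *stackLimit,
              char *heapPointer, char *heapLimit)
{
    (void)stackPointer;
    (void)stackLimit;
    return heapLimit - heapPointer;
}

/* The "virtual registers" are ordinary parameters. foo bumps the heap
   pointer and passes all four values on to bar in tail position. */
ptrdiff_t foo(char *stackPointer, char *stackLimit,
              char *heapPointer, char *heapLimit)
{
    /* Stand-in for a heap check; a real one would call the GC instead. */
    assert(heapLimit - heapPointer >= 16);
    heapPointer += 16;  /* bump-allocate a 16-byte object (assumed size) */
    return bar(stackPointer, stackLimit, heapPointer, heapLimit);
}
```

Calling foo with a 64-byte heap area leaves 48 bytes, which is what bar reports back.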

> The ability to put data next to code -- I'm not exactly sure what  
> you mean by this.  Do you mean some kind of inlined data like this  
> kind of pseudo-assembler?
>
>     move r1,mem-whatever
>     jmp foo
>
>     mem-data-inline db 1, 2, 3, 4, 5, 6, 7, 8
>
> :foo
>     move r2,mem-data-inline
>     ...

Well, basically, a heap object will start with a pointer to foo;  
sometimes we will want to tail-call foo, and sometimes we will want  
to access things at (foo-4) etc, i.e. the things at mem-data-inline.  
We want both things to be blindingly fast, i.e. we don't want to  
dereference another pointer.
Unless we want to link the LLVM-compiled code with -fasm and -fvia-C- 
compiled code, all we need is *any* mechanism to quickly access a  
block of data and a block of code from the same pointer.
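
Portable C cannot place data directly in front of a function's code, so the following sketch only shows the fallback layout, where the heap object points at a table and the table holds the code pointer. All names here (InfoTable, HeapObject, foo_entry, foo_info, payload_words, enter) are hypothetical. Entering the object costs one more load than the adjacent layout described above, which is exactly the extra dereference we would like to avoid.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*EntryCode)(void);

/* Hypothetical info table: some metadata plus the entry code pointer. */
typedef struct {
    int       payload_words;  /* stands in for the data at mem-data-inline */
    EntryCode entry;          /* stands in for the code at foo */
} InfoTable;

/* A heap object starts with a pointer to its info table. */
typedef struct {
    const InfoTable *info;
} HeapObject;

int foo_entry(void) { return 42; }

const InfoTable foo_info = { 3, foo_entry };

/* Reading the data: one load for the header, one for the field. */
int payload_words(const HeapObject *obj)
{
    assert(obj != NULL && obj->info != NULL);
    return obj->info->payload_words;
}

/* Entering the code: header load, then the extra load of the code
   pointer that the adjacent layout would not need. */
int enter(const HeapObject *obj)
{
    assert(obj != NULL && obj->info != NULL);
    return obj->info->entry();
}
```

With the data placed immediately before the code, the header pointer itself would be the jump target and the data would sit at a fixed negative offset from it.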

On 21-Dec-06, at 9:14 AM, Simon Peyton-Jones wrote:

> How about
> * concurrency (the ability to have zillions of little stacks,
>         with stack overflow checks and growing the stack on overflow)?
> * exception handling (the ability to crawl over the stack
>         looking for exception catch frames)?
> * garbage collection (the ability to find pointers in the stack)

The alternative to using LLVM's support for all those things is to  
keep using GHC's run-time system and a stack we manage ourselves.  
Then we don't need to care about LLVM's support for those yet  
(although it might still be a good idea later, to give the optimiser  
more opportunities to help us).
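
To make "a stack we manage ourselves" a bit more concrete, here is a minimal C sketch of a heap-allocated stack with an explicit overflow check that grows on overflow. This is not GHC's run-time system; the Stack type, the function names, and the doubling policy are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    long  *base;  /* start of the allocation */
    size_t size;  /* capacity in words */
    size_t sp;    /* words in use; this sketch grows upward */
} Stack;

/* Returns 1 on success, 0 if the allocation failed. */
int stack_init(Stack *s, size_t words)
{
    assert(words > 0);
    s->base = malloc(words * sizeof *s->base);
    s->size = words;
    s->sp = 0;
    return s->base != NULL;
}

/* Pushes one word, doubling the stack if it is full.
   Returns 1 on success, 0 if growing failed. */
int stack_push(Stack *s, long word)
{
    if (s->sp == s->size) {  /* the overflow check */
        size_t new_size = s->size * 2;
        long *grown = realloc(s->base, new_size * sizeof *grown);
        if (grown == NULL)
            return 0;
        s->base = grown;
        s->size = new_size;
    }
    s->base[s->sp++] = word;
    return 1;
}

long stack_pop(Stack *s)
{
    assert(s->sp > 0);
    return s->base[--s->sp];
}
```

Because growing may move the stack, anything that points into it has to be an offset or be fixed up afterwards. Since each stack is just a small heap allocation, having very many of them is cheap.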

Cheers,

Wolfgang