Questions concerning the LLVM backend - i.e. 'proc point splitting'

Fri Nov 20 23:00:30 UTC 2015

Simon Peyton Jones <simonpj at microsoft.com> writes:

> David
>
> All this stuff is terribly paged-out for me. But I'd love someone to
> pay attention to some of this back-end stuff, so if you'd like to work
> there, I'd like to help you.
>
> David Terei was also thinking of getting rid of proc point splitting
> for LLVM (see attached email and the notes attached to it)
>
> Simon Marlow also was thinking about this; see his attached email.
>
> The main *reason* for establishing a proc-point is so that, when
> generating C code (which we don't do any more) you can take its
> address. E.g.
>
> foo() { ... Push &bar onto the stack (effectively a return address)
>   Jump to thingumpy }
>
> bar() { ... }
>
> Really bar is part of foo, but we have to make it a separate top-level
> thing so that we can take the address of bar and push it on the stack.
>
> The trouble is that if there is a label inside bar that foo wants to
> jump to (i.e. without pushing &bar etc) then we have to make that
> label too into a proc-point, so that both foo and bar can see it.
> Urgh.
>
> In LLVM this probably is not true; presumably you can take the address
> of any label?
>
This is true. While labels themselves have function-local scope in LLVM,
there is an expression, `blockaddress`, which allows you to take the
address of a label in another function [1]. That being said, in reading
through the documentation it's not at all clear to me that it would be
safe to jump to such an address. In fact, given that the instruction
that this expression is supposed to be used with, `indirectbr`, can only
be used for local blocks, I suspect it is not. More information about
this feature can be found here [2].
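
To make the "local blocks only" restriction concrete, here is a rough
sketch using the GNU C "labels as values" extension that [2] describes.
The function name and the arithmetic are made up for illustration, and
this is C accepted by GCC and Clang rather than anything GHC emits. The
point is only that the label address is taken and jumped to within one
function:

```c
#include <assert.h>

/* Sum the integers 1..n using a label address and an indirect jump.
   Relies on the GNU "labels as values" extension (GCC, Clang);
   sum_to is a hypothetical name used only for this sketch. */
int sum_to(int n)
{
    void *target;      /* holds the address of a label in THIS function */
    int acc = 0;

    assert(n >= 0);

    target = &&loop;   /* take the address of a local label */
    goto *target;      /* indirect jump; control stays inside sum_to */

loop:
    if (n == 0) {
        target = &&done;
    } else {
        acc += n;
        n -= 1;
    }
    goto *target;      /* either back to loop or on to done */

done:
    return acc;
}
```

Nothing here hands a label address to another function and jumps to it
from there, which is the case I suspect is unsafe.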

The jump issue aside, I don't know how you would deal with
tables-next-to-code. The prefix data support that is currently available
in LLVM [3] is attached to functions, and I unfortunately don't see that
changing any time soon.

Ultimately it seems that trying to refer to labels defined in other
functions is using LLVM against the way it was intended. One alternative
would be to teach llvmGen to merge mutually recursive top-level functions
into a single LLVM function during code generation. Otherwise I'm afraid
you are stuck with either the status quo or attempting to improve on
LLVM's own cross-procedure optimization ability.
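
As a very rough sketch of what merging might look like, here is the
foo/bar/thingumpy example from above written as a single C function. The
block ids, the explicit stack and the arithmetic are all hypothetical;
llvmGen would of course emit LLVM IR rather than C, and the switch merely
stands in for branches between basic blocks of one function:

```c
#include <assert.h>

/* Hypothetical block ids standing in for Cmm labels. */
enum block { FOO, BAR, THINGUMPY, DONE };

#define STACK_MAX 8

/* foo, bar and thingumpy merged into one function. A "return address"
   is just a block id, so no block has to become a top-level function
   merely to have its address taken. */
int run_merged(int x)
{
    enum block stack[STACK_MAX];
    int sp = 0;
    enum block pc = FOO;

    for (;;) {
        switch (pc) {
        case FOO:
            assert(sp < STACK_MAX);
            stack[sp++] = BAR;   /* push "&bar" as the return address */
            pc = THINGUMPY;      /* jump to thingumpy */
            break;
        case THINGUMPY:
            x = x * 2;           /* placeholder work */
            assert(sp > 0);
            pc = stack[--sp];    /* return: pop the address, jump to it */
            break;
        case BAR:
            x = x + 1;           /* placeholder work */
            pc = DONE;
            break;
        case DONE:
            return x;
        }
    }
}
```

With these placeholder bodies, run_merged(5) goes foo, thingumpy, bar and
returns 11. Since every block lives in the same function, any of them can
branch to any other directly.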

That being said, it sounds as though eliminating proc-point splitting
would make for quite a nice project in the native code generator.

Cheers,

- Ben

[1] http://llvm.org/docs/LangRef.html#addresses-of-basic-blocks
[2] http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html
[3] http://llvm.org/docs/LangRef.html#prefix-data