LLVM and dynamic linking

Fri Dec 27 20:41:10 UTC 2013

great work! :)

On Fri, Dec 27, 2013 at 3:21 PM, Ben Gamari <bgamari.foss at gmail.com> wrote:

> Simon Marlow <marlowsd at gmail.com> writes:
>
> > This sounds right to me.  Did you submit a patch?
> >
> > Note that dynamic linking with LLVM is likely to produce significantly
> > worse code than with the NCG right now, because the LLVM back end uses
> > dynamic references even for symbols in the same package, whereas the NCG
> > back-end uses direct static references for these.
> >
> Today with the help of Edward Yang I examined the code produced by the
> LLVM backend in light of this statement. I was surprised to find that
> LLVM's code appears to be no worse than the NCG with respect to
> intra-package references.
>
> My test case can be found here[2] and can be built with the included
> `build.sh` script. The test consists of two modules built into a shared
> library. One module, `LibTest`, exports a few simple members while the
> other module (`LibTest2`) defines members that consume them. Care is
> taken to ensure the members are not inlined.
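
One way the "not inlined" requirement might be met is with `NOINLINE` pragmas. The sketch below is an assumption about the approach, not the contents of the linked repository: it folds the `LibTest` and `LibTest2` definitions quoted further down into a single `Main` module so that it compiles on its own, and the `main` driver is hypothetical.

```haskell
module Main (main) where

-- Stand-ins for the members exported by LibTest.
helloWorld :: String
helloWorld = "Hello World!"
{-# NOINLINE helloWorld #-}  -- assumption: keeps the closure a separate symbol

infoRef :: Int -> Int
infoRef n = n + 1
{-# NOINLINE infoRef #-}

-- Stand-ins for the consumers defined in LibTest2.
testHelloWorld :: IO String
testHelloWorld = return helloWorld
{-# NOINLINE testHelloWorld #-}

testInfoRef :: IO Int
testInfoRef = return (infoRef 2)
{-# NOINLINE testInfoRef #-}

-- Hypothetical driver so the sketch can be run directly.
main :: IO ()
main = do
  testHelloWorld >>= putStrLn
  testInfoRef >>= print
```

In the real test the two halves would live in separate modules of one package, which is what makes the references intra-package rather than intra-module.
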
>
> The tests were done on x86_64 running LLVM 3.4 and GHC HEAD with the
> patches[1] I referred to in my last message. Please let me know if I've
> missed something.
>
>
>
> # Evaluation
>
> ## First example ##
>
> The first member is a simple `String` (defined in `LibTest`),
>
>     helloWorld :: String
>     helloWorld = "Hello World!"
>
> The use-site is quite straightforward,
>
>     testHelloWorld :: IO String
>     testHelloWorld = return helloWorld
>
> With `-O1` the code looks reasonable in both cases. Most importantly,
> both backends use IP-relative addressing to find the symbol.
>
> ### LLVM ###
>
>     0000000000000ef8 <rKw_info>:
>          ef8:   48 8b 45 00             mov    0x0(%rbp),%rax
>          efc:   48 8d 1d cd 11 20 00    lea    0x2011cd(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>          f03:   ff e0                   jmpq   *%rax
>
>     0000000000000f28 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>          f28:   eb ce                   jmp    ef8 <rKw_info>
>          f2a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
>
> ### NCG ###
>
>     0000000000000d58 <rH1_info>:
>      d58:       48 8d 1d 71 13 20 00    lea    0x201371(%rip),%rbx        # 2020d0 <libtestzm0zi1zi0zi0_LibTest_helloWorld_closure>
>      d5f:       ff 65 00                jmpq   *0x0(%rbp)
>
>     0000000000000d88 <libtestzm0zi1zi0zi0_LibTest2_testHelloWorld_info>:
>      d88:       eb ce                   jmp    d58 <rH1_info>
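
The IP-relative targets in the two listings above can be checked by hand: the operand address is the address of the next instruction (the `lea` address plus its 7-byte encoding) plus the 32-bit displacement. A small sketch of that arithmetic follows; the helper name `ripTarget` is made up for illustration, and the numbers are copied from the disassembly.

```haskell
module Main (main) where

import Numeric (showHex)

-- | Target of a RIP-relative operand: address of the instruction,
-- plus its encoded length, plus the signed displacement.
ripTarget :: Integer -> Integer -> Integer -> Integer
ripTarget insnAddr insnLen disp = insnAddr + insnLen + disp

main :: IO ()
main = do
  -- LLVM: lea 0x2011cd(%rip),%rbx at 0xefc, 7 bytes long
  putStrLn (showHex (ripTarget 0xefc 7 0x2011cd) "")
  -- NCG: lea 0x201371(%rip),%rbx at 0xd58, 7 bytes long
  putStrLn (showHex (ripTarget 0xd58 7 0x201371) "")
```

Both lines print `2020d0`, the address objdump annotates as `libtestzm0zi1zi0zi0_LibTest_helloWorld_closure`, so both backends resolve the closure with a single direct `lea` and no GOT load.
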
>
>
> With `-O0` the code is substantially longer but the relocation behavior
> is still correct, as one would expect.
>
> Looking at the definition of `helloWorld`[3] itself, it becomes clear that
> the LLVM backend is more likely than the NCG to go through the PLT rather
> than the GOT. In general, `stg_*` primitives are called through the PLT.
> As far as I can tell, both of these call mechanisms incur two memory
> accesses. However, a call through the PLT consists of two JMPs, whereas
> a call through the GOT consists of only one. Is this a cause for
> concern? Could these two jumps interfere with branch prediction?
>
> In general the LLVM backend produces a few more instructions than the
> NCG, although this doesn't appear to be related to the handling of
> relocations. One example is the inexplicable (to me) `mov` at the
> beginning of LLVM's `rKw_info`.
>
>
> ## Second example ##
>
> The second example demonstrates an actual call,
>
>     -- Definition (in LibTest)
>     infoRef :: Int -> Int
>     infoRef n = n + 1
>
>     -- Call site
>     testInfoRef :: IO Int
>     testInfoRef = return (infoRef 2)
>
> With `-O1` this produces the following code,
>
> ### LLVM ###
>
>     0000000000000fb0 <rLy_info>:
>          fb0:   48 8b 45 00             mov    0x0(%rbp),%rax
>          fb4:   48 8d 1d a5 10 20 00    lea    0x2010a5(%rip),%rbx        # 202060 <rLx_closure>
>          fbb:   ff e0                   jmpq   *%rax
>
>     0000000000000fe0 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>          fe0:   eb ce                   jmp    fb0 <rLy_info>
>
> ### NCG ###
>
>     0000000000000e10 <rI3_info>:
>      e10:       48 8d 1d 51 12 20 00    lea    0x201251(%rip),%rbx        # 202068 <rI2_closure>
>      e17:       ff 65 00                jmpq   *0x0(%rbp)
>
>     0000000000000e40 <libtestzm0zi1zi0zi0_LibTest2_testInfoRef_info>:
>      e40:       eb ce                   jmp    e10 <rI3_info>
>
> Again, LLVM is a bit more verbose but seems to handle
> intra-package calls efficiently.
>
>
>
> [1] https://github.com/bgamari/ghc/commits/llvm-dynamic
> [2] https://github.com/bgamari/ghc-linking-tests/tree/master/ghc-test
> [3] `helloWorld` definitions:
>
> LLVM:
>
>     00000000000010a8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>         10a8:   50                      push   %rax
>         10a9:   4c 8d 75 f0             lea    -0x10(%rbp),%r14
>         10ad:   4d 39 fe                cmp    %r15,%r14
>         10b0:   73 07                   jae    10b9 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x11>
>         10b2:   49 8b 45 f0             mov    -0x10(%r13),%rax
>         10b6:   5a                      pop    %rdx
>         10b7:   ff e0                   jmpq   *%rax
>         10b9:   4c 89 ef                mov    %r13,%rdi
>         10bc:   48 89 de                mov    %rbx,%rsi
>         10bf:   e8 0c fd ff ff          callq  dd0 <newCAF@plt>
>         10c4:   48 85 c0                test   %rax,%rax
>         10c7:   74 22                   je     10eb <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x43>
>         10c9:   48 8b 0d 18 0f 20 00    mov    0x200f18(%rip),%rcx        # 201fe8 <_DYNAMIC+0x228>
>         10d0:   48 89 4d f0             mov    %rcx,-0x10(%rbp)
>         10d4:   48 89 45 f8             mov    %rax,-0x8(%rbp)
>         10d8:   48 8d 05 21 00 00 00    lea    0x21(%rip),%rax        # 1100 <cJC_str>
>         10df:   4c 89 f5                mov    %r14,%rbp
>         10e2:   49 89 c6                mov    %rax,%r14
>         10e5:   58                      pop    %rax
>         10e6:   e9 b5 fc ff ff          jmpq   da0 <ghczmprim_GHCziCString_unpackCStringzh_info@plt>
>         10eb:   48 8b 03                mov    (%rbx),%rax
>         10ee:   5a                      pop    %rdx
>         10ef:   ff e0                   jmpq   *%rax
>
>
> NCG:
>
>     0000000000000ef8 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info>:
>      ef8:       48 8d 45 f0             lea    -0x10(%rbp),%rax
>      efc:       4c 39 f8                cmp    %r15,%rax
>      eff:       72 3f                   jb     f40 <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x48>
>      f01:       4c 89 ef                mov    %r13,%rdi
>      f04:       48 89 de                mov    %rbx,%rsi
>      f07:       48 83 ec 08             sub    $0x8,%rsp
>      f0b:       b8 00 00 00 00          mov    $0x0,%eax
>      f10:       e8 1b fd ff ff          callq  c30 <newCAF@plt>
>      f15:       48 83 c4 08             add    $0x8,%rsp
>      f19:       48 85 c0                test   %rax,%rax
>      f1c:       74 20                   je     f3e <libtestzm0zi1zi0zi0_LibTest_helloWorld_info+0x46>
>      f1e:       48 8b 1d cb 10 20 00    mov    0x2010cb(%rip),%rbx        # 201ff0 <_DYNAMIC+0x238>
>      f25:       48 89 5d f0             mov    %rbx,-0x10(%rbp)
>      f29:       48 89 45 f8             mov    %rax,-0x8(%rbp)
>      f2d:       4c 8d 35 1c 00 00 00    lea    0x1c(%rip),%r14        # f50 <cGG_str>
>      f34:       48 83 c5 f0             add    $0xfffffffffffffff0,%rbp
>      f38:       ff 25 7a 10 20 00       jmpq   *0x20107a(%rip)        # 201fb8 <_DYNAMIC+0x200>
>      f3e:       ff 23                   jmpq   *(%rbx)
>      f40:       41 ff 65 f0             jmpq   *-0x10(%r13)
>