[GHC] #7602: Threaded RTS performing badly on recent OS X (10.8?)
GHC
cvs-ghc at haskell.org
Thu Jan 17 22:17:24 CET 2013
#7602: Threaded RTS performing badly on recent OS X (10.8?)
---------------------------------+------------------------------------------
Reporter: simonmar | Owner:
Type: bug | Status: new
Priority: normal | Milestone: _|_
Component: Runtime System | Version: 7.6.1
Keywords: | Os: Unknown/Multiple
Architecture: Unknown/Multiple | Failure: None/Unknown
Difficulty: Unknown | Testcase:
Blockedby: | Blocking:
Related: |
---------------------------------+------------------------------------------
Comment(by thoughtpolice):
I don't think so, or at least it doesn't in my trivial case:
{{{
#include <stdio.h>
#include <stdlib.h>
__thread int foo;
int main(int ac, char* av[]) {
if (ac < 2) foo = 10;
else foo = atoi(av[1]);
printf("foo = %d\n", foo);
return 0;
}
}}}
On Mac OS X 10.8, with Clang 3.2, I can compile this with no special
options. Disassembling, we see:
{{{
$ lldb ./a.out
(lldb) disassemble -m -n main
a.out`main at tls.c:6
5
6 int main(int ac, char* av[]) {
7 if (ac < 2) foo = 10;
a.out[0x100000eb0]: pushq %rbp
a.out[0x100000eb1]: movq %rsp, %rbp
a.out[0x100000eb4]: subq $48, %rsp
a.out[0x100000eb8]: movl $0, -4(%rbp)
a.out[0x100000ebf]: movl %edi, -8(%rbp)
a.out[0x100000ec2]: movq %rsi, -16(%rbp)
a.out`main + 22 at tls.c:7
6 int main(int ac, char* av[]) {
7 if (ac < 2) foo = 10;
8 else foo = atoi(av[1]);
a.out[0x100000ec6]: cmpl $2, -8(%rbp)
a.out[0x100000ecd]: jge 0x100000ee7 ; main + 55 at tls.c:8
a.out[0x100000ed3]: leaq 326(%rip), %rdi ; foo
a.out[0x100000eda]: callq *(%rdi)
a.out[0x100000edc]: movl $10, (%rax)
a.out[0x100000ee2]: jmpq 0x100000f05 ; main + 85 at tls.c:8
a.out`main + 55 at tls.c:8
7 if (ac < 2) foo = 10;
8 else foo = atoi(av[1]);
9
a.out[0x100000ee7]: movq -16(%rbp), %rax
a.out[0x100000eeb]: movq 8(%rax), %rdi
a.out[0x100000eef]: callq 0x100000f36 ; symbol stub for: atoi
a.out[0x100000ef4]: leaq 293(%rip), %rdi ; foo
a.out[0x100000efb]: movl %eax, -20(%rbp)
a.out[0x100000efe]: callq *(%rdi)
a.out[0x100000f00]: movl -20(%rbp), %ecx
a.out[0x100000f03]: movl %ecx, (%rax)
a.out[0x100000f05]: leaq 92(%rip), %rdi ; "foo = %d\n"
a.out`main + 92 at tls.c:10
9
10 printf("foo = %d\n", foo);
11
a.out[0x100000f0c]: movq %rdi, -32(%rbp)
a.out[0x100000f10]: leaq 265(%rip), %rdi ; foo
a.out[0x100000f17]: callq *(%rdi)
a.out[0x100000f19]: movl (%rax), %esi
a.out[0x100000f1b]: movq -32(%rbp), %rdi
a.out[0x100000f1f]: movb $0, %al
a.out[0x100000f21]: callq 0x100000f3c ; symbol stub for: printf
a.out[0x100000f26]: movl $0, %esi
a.out`main + 123 at tls.c:12
11
12 return 0;
13 }
a.out[0x100000f2b]: movl %eax, -36(%rbp)
a.out[0x100000f2e]: movl %esi, %eax
a.out[0x100000f30]: addq $48, %rsp
a.out[0x100000f34]: popq %rbp
a.out[0x100000f35]: ret
(lldb) ^D
}}}
In the original post, David says that we basically get code like:
{{{
call getThreadLocalVar
movq (%rdi),%rdi #deref the key which is an index into the tls memory
jmp <dynamic_linker_stub>
movq %gs:0x00000060(,%rdi,8),%rax #pthread_getspecific body
ret
}}}
where the biggest penalty is the jump into dyld to do linking for the
stub. This code does still exist in the latest implementation of Apple's
libc:
http://www.opensource.apple.com/source/Libc/Libc-594.9.1/pthreads/pthread_machdep.h
(Look at the __OPTIMIZE__ implementation.)
However, Clang on OS X seems to avoid this path entirely? In the
disassembly above, every access to ```foo``` is a ```leaq``` followed by
an indirect ```callq *(%rdi)```, and the only symbol stubs are for atoi
and printf. I'm not sure why the offsets of ```leaq``` for ```foo``` seem
to decrease for every access...
I attempted to look through the LLVM source code for specific notes about
this, but the new TLS support is of course deeply ingrained in the new
release, so it's hard to point to any one change responsible for this
behavior.
I'll investigate this more over the next few days and look at disassembly
outputs; we should be able to see pretty quickly whether this buys us
anything at all.
We don't use TLS for x86, only register variables, correct? If so, then
this still leaves 32-bit OS X users up a creek a bit, but Apple and the
community are largely moving away from 32-bit anyway, it seems.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7602#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler