PSA: GHC can now be built with Clang
aseipp at pobox.com
Fri Jun 28 11:51:03 CEST 2013
Unfortunately the two machines differ quite a bit in their hardware.
The OS X machine has 4GB of RAM and an 8-core i7. The Linux machine
has 16GB of RAM, but only a 4-core i5. They also run at different
clock speeds.
I'll get GCC 4.8 on my OS X machine so I can force a build with it and
compare, but that'll take a while.
Also, as I said, the Linux/Clang build technically has this very same
bug too. GCTDecl.h needs to be modified slightly to fix it, because it
currently *always* falls back to pthread_getspecific/setspecific with
an LLVM-based compiler; on Linux, we can just use __thread instead.
There, compared to a Linux/GCC build, the difference is approximately
30% on -threaded applications like gc_bench. So the slowdown ratios
*seem* relatively consistent.
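
For anyone not familiar with the two TLS styles being compared, here
is a minimal sketch of the difference. To be clear, this is NOT the
contents of GCTDecl.h: task_state, slow_get/slow_set, fast_get/fast_set
and tls_demo are hypothetical names made up for illustration. It
assumes a compiler that accepts the __thread extension (GCC, or Clang
on Linux) and linking with -pthread.

```c
/* Illustrative only: hypothetical names, not the code in GCTDecl.h. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct { int id; } task_state;  /* stand-in for per-thread state */

/* Slow path: every access goes through a pthread key lookup. */
static pthread_key_t slow_key;
static pthread_once_t slow_once = PTHREAD_ONCE_INIT;

static void slow_init(void) {
    int rc = pthread_key_create(&slow_key, NULL);
    assert(rc == 0);
    (void)rc;  /* keep the compiler quiet when NDEBUG is defined */
}

static void slow_set(task_state *s) {
    pthread_once(&slow_once, slow_init);
    pthread_setspecific(slow_key, s);
}

static task_state *slow_get(void) {
    pthread_once(&slow_once, slow_init);
    return pthread_getspecific(slow_key);
}

/* Fast path: a compiler-supported thread-local variable, so an access
   is an ordinary load/store relative to the thread's TLS block. */
static __thread task_state *fast_slot;

static void fast_set(task_state *s) { fast_slot = s; }
static task_state *fast_get(void) { return fast_slot; }

/* Each worker stores its own state both ways and reads it back. */
static void *worker(void *arg) {
    task_state *mine = arg;
    slow_set(mine);
    fast_set(mine);
    return (slow_get() == mine && fast_get() == mine) ? mine : NULL;
}

/* Runs two threads; returns 1 if each thread saw only its own state. */
int tls_demo(void) {
    task_state a = {1}, b = {2};
    pthread_t ta, tb;
    void *ra = NULL, *rb = NULL;
    if (pthread_create(&ta, NULL, worker, &a) != 0) return 0;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, &ra);
    pthread_join(tb, &rb);
    return ra == &a && rb == &b;
}
```

Both paths give each thread its own slot; the difference is only in
the cost of an access, which matters when the slot is read constantly,
as in the -threaded case discussed above.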
Anyway, I'll get around to some full nofib runs today if possible, but
the OS X machine will be a little sluggish. I have to do some other
stuff today for this, anyway (like actually getting my second patch to
Clang accepted today, hopefully.)
Interested parties who'd like to see this change, as it stands, can
look at my diff here (it's the 'clang-fast-tls' branch on my GHC
On Fri, Jun 28, 2013 at 4:37 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> On 26/06/13 04:13, Austin Seipp wrote:
>> Thanks Manuel!
>> I have an update on this work (I am also CC'ing glasgow-haskell-users,
>> as I forgot last time.) The TL;DR is this:
>> * HEAD will correctly work with Clang 3.4svn on both Linux and OS X.
>> * I have a small, 6-line patch to Clang to fix the build failure in
>> primitive (Clang was too eager to stringify something.) Once this fix
>> is integrated into Clang (hopefully very soon,) it will be possible to
>> build GHC entirely including all stage2 libraries without any patches.
>> The patch is here: http://llvm.org/bugs/show_bug.cgi?id=16371 - I am
>> hoping this will also make it into Xcode 5.
>> * I still have to eliminate some warnings throughout the build, which
>> will require fiddling and a bit of refactoring. The testsuite still
>> probably won't run cleanly on Linux, at least, until this is done I'm
>> afraid (but then again I haven't tried...)
>> As for the infamous ticket #7602, the large performance regression on
>> Mac OS X, I have some numbers finally between my fast-TLS and slow-TLS
>> ./gc_bench.slow-tls 19 500000 5 22 +RTS -H180m -N7 -RTS 395.57s user
>> 173.18s system 138% cpu 6:50.71 total
>> ./gc_bench.fast-tls 19 500000 5 22 +RTS -H180m -N7 -RTS 322.98s user
>> 132.37s system 132% cpu 5:44.65 total
>> Now, this probably looks totally awful from a scalability POV. And,
>> well, yeah, it is. But I am almost 100% certain there is something
>> extremely screwy going on with my machine here. I base this on the
>> fact that during gc_bench, kernel_task was eating up about 600% of my
>> CPU consistently, giving user threads no time to run. I've noticed
>> this with other applications that were totally unrelated too (close
>> tweetbot -> 800% CPU usage,) so I guess it's time to learn DTrace. Or
>> turn it on and off again or something. Ugh.
>> Anyway, if you look at the user times, you get a nice 30% speedup
>> which is about what we expect!
> 30% better than before is good, but we need some absolute figures. Can you
> validate that against the performance on Linux, or against the performance
> you get when the RTS is compiled with gcc? If it's hard to get a direct
> comparison on equivalent hardware, you could compare the slowdown with
> -threaded on Linux and OS X.
>> On a related note, due to the source code structure at the moment,
>> Linux/Clang hilariously suffers from this same bug. That's because
>> while Clang on Linux supports extremely fast TLS via __thread (like
>> GCC,) the RTS falls back to pthread_getspecific/setspecific. I haven't
>> fixed this yet. It'll happen after I fix #7602 and get it merged in.
>> On my Linux machine, gc_bench also sees a consistent 30% speedup
>> between these two approaches, so I think this is a relatively accurate
>> measurement. Well, as accurate as I can be without running nofib just
>> yet. So if you're just dying to have GHC HEAD built with Clang HEAD on
>> Linux because you've got reasons, you should probably hold on.
>> I also may have a similar, better approach to fixing #7602 that is not
>> entirely as evil and sneaky as crashing the WebKit party. I'll follow
>> up on this soon when I have more info in a separate thread to confer
>> with Simon. With nofib results. I hope.
>> But anyway, 7.8 will be shaping up quite nicely - in particular in the
>> Mac OS X department, I hope. Please feel free to pester me with
>> questions or if you attempt something and it doesn't work.
>> On Tue, Jun 25, 2013 at 7:34 PM, Manuel M T Chakravarty
>> <chak at cse.unsw.edu.au> wrote:
>>> Thank you very much for taking care of all these clang issues — that is
>>> very helpful!
>>> Austin Seipp <aseipp at pobox.com>:
>>>> Hi all,
>>>> As of commit 5dc74f it should now be possible to build a working
>>>> stage1 and stage2 compiler with (an extremely recent) Clang. With some
>>>> You can just do:
>>>> $ CC=/path/to/clang ./configure --with-gcc=/path/to/clang
>>>> $ make
>>>> I have done this work on Linux. I don't expect much difficulty on Mac
>>>> OS X, but it needs testing. Ditto with Windows, although Clang/mingw
>>>> is considered experimental.
>>>> The current caveats are:
>>>> * The testsuite will probably fail everywhere, because of some
>>>> warnings that happen during the linking phase when you invoke the
>>>> built compiler. So the testsuite runner will probably be unhappy.
>>>> Clang is very noisy about unused options, unlike GCC. That needs to be
>>>> fixed somewhere in DriverPipeline I'd guess, but with some
>>>> * Some of the stage2 libraries don't build due to a Clang bug. These
>>>> are vector/primitive/dph so far.
>>>> * There is no buildbot or anything to cover it.
>>>> You will need a very recent Clang. Due to this bug (preventing
>>>> primitive etc from building,) you'll preferably want to use an SVN
>>>> checkout from about 6 hours ago at latest:
>>>> Hilariously, this bug was tripped on primitive's Data.Primitive.Types
>>>> module due to some CPP weirdness. But even with a proper bugfix and no
>>>> segfault, it still fails to correctly parse this same module with the
>>>> same CPP declarations. I'm fairly certain this is another bug in
>>>> Clang, but I might be wrong. I'm trying to isolate it. Unfortunately
>>>> Clang/LLVM 3.3 was just released and it won't see bugfix releases. But
>>>> it will *probably* work if we just get rid of the CPP tomfoolery in
>>>> primitive. I'll be testing it in the next few days to see if we can
>>>> get 3.3 supported. (I'm sort of kicking myself in the head for not
>>>> doing this a week or two ago...)
>>>> Anyway, there are some rough edges but it should be in shape for 7.8 I
>>>> hope. It should be especially welcome for Mac users. (I'm also hoping
>>>> modern Macs could even go all-clang-all-the-time if my fix for #7602
>>>> can go in soon...)
>>>> PS. If you use ASSERT, I just fixed a lot of instances of using that
>>>> macro, involving whitespace between the name and arguments (commit
>>>> d8ee2b.) Clang does not forgive you for this. Should I note this
>>>> anywhere for the future in the wiki or something?
>>>> Austin - PGP: 4096R/0x91384671
Austin - PGP: 4096R/0x91384671