scooter.phd at gmail.com
Wed Feb 17 16:15:56 EST 2010
On Wed, Feb 17, 2010 at 6:19 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> On 17/02/10 07:37, Isaac Dupree wrote:
>> On 02/16/10 20:13, Roman Leshchinskiy wrote:
>>> On 15/02/2010, at 04:58, Don Stewart wrote:
>>>> Do we have the blessing of the DPH team, wrt. tight, numeric inner loops?
>>> FWIW, I don't think we even use -fvia-C when benchmarking. In general,
>>> -fvia-C is a dead end wrt numeric performance because gcc just doesn't
>>> optimise well enough. So even if we generated code that gcc could
>>> optimise properly (which we don't atm), we still would be way behind
>>> highly optimising compilers like Intel's or Sun's. IMO, the LLVM
>>> backend is the way to go here.
>> LLVM and GCC are open-source projects that are improving over time... is
>> there any particular reason we expect GCC to have poor numeric
>> performance forever?
> The problem is not the quality of the code generator in gcc vs. LLVM;
> indeed gcc is generally regarded as generating better code than LLVM right
> now, although LLVM is improving.
Depends a lot on the benchmark. The FreeBSD kernel dev crowd (one of whom
works for me) have seen performance improvements of 10-20% using LLVM
and clang over gcc. It also depends heavily on which optimization passes you
have LLVM invoke -- bear in mind that LLVM is a compiler optimization
infrastructure first and foremost.
> The reason that using gcc is worse than LLVM for us is that when GHC uses
> gcc as a backend it generates C, whereas the LLVM backend generates code
> directly from GHC's internal C-- representation. Compiling via C is a
> tricky business that ultimately leads to not being able to generate as good
> code as you want(*). It would be entirely possible to hook into gcc's
> backend directly from GHC as an alternative to the LLVM backend, though LLVM
> is really intended to be used this way and has a more polished API.
Let's be a bit more specific: you can directly generate the LLVM
intermediate representation (IR) and pass it off to the LLVM optimization
passes. With gcc, you generate C code that is then processed through gcc's
command-line interface. I wouldn't suggest hooking into gcc's anything; LLVM
is much better suited to being used this way.
> Even so, LLVM doesn't let us generate exactly the code we'd like: we can't
> use GHC's tables-next-to-code optimisation. Measurements done by David Terei
> who built the LLVM backend apparently show that this doesn't matter much
> (~3% slower IIRC), though I'm still surprised that all those extra
> indirections don't have more of an effect; I think we need to investigate
> this more closely. It's important because if the LLVM backend is to be a
> compile-time option, we have to either drop tables-next-to-code, or wait
> until LLVM supports generating code in that style.
This sounds like an impedance mismatch between GHC's concept of IR and
LLVM's. There's also the danger of prematurely optimizing for LLVM as if it
were a native backend, rather than separating GHC- and Haskell-specific
optimizations from LLVM's. Thinking you can do your own register allocation
better than LLVM is a common mistake I've seen, for example. Just don't --
value tracing through SSA is what LLVM is spectacularly good at. Use as many
temporary variables as you like; LLVM will eventually eliminate them.
All that said, if LLVM were to replace GHC's native backends, then one could
shift focus to engineering away the IR impedance mismatch.
[disclaimer: grain of salt speculation, haven't read the code]
Tables-next-to-code has an obvious cache-friendliness property, BTW.
Generally, there's going to be some instruction prefetch into the cache.
This is likely why it's faster. Otherwise, you have to warm up the data
cache, since LLVM spills the tables into the target's constant pool.
> (*) Though the main reason for this is the need to keep accurate GC
> information; if you're prepared to forego that (as in JHC) then you can
> generate much more optimisable C code.
> [now I rehash why to remove -fvia-C anyway. Feel free to ignore me.]
>> ...However, we think the native-code backends (and perhaps LLVM) will be
>> good enough within the next few years to catch up with registerized -fvia-C.
> I should point out that for most Haskell programs, the NCG is already as
> fast as (in some cases faster than) via C. The benchmarks showing a difference
> are all of the small tight loop kind - which are important to some people, I
> don't dispute that, but I expect that most people wouldn't notice the difference.
NCGs should be faster than plain old C. Trying to produce optimized C is a
fool's errand, and I'm starting to agree with dropping that approach. My worry
was that the C backend would be dropped in its entirety, which would also be a
fool's errand.
More information about the Glasgow-haskell-users mailing list