LLVM and dynamic linking

Austin Seipp austin at well-typed.com
Wed Jan 8 08:01:20 UTC 2014


Personally I'd be in favor of that to keep it easy, but there hasn't
really been any poll about what to do. For the most part it tends to
work fine, but I think it's the wrong thing to do in any case.

IMO the truly 'correct' thing to do is not to rely on the system LLVM
at all, but on a version specifically tested with and distributed with
GHC. This can be a private binary only we use. We already do this with
MinGW on Windows, actually, because in practice relying on versions
'in the wild' is somewhat troublesome. In our case, we really just
need the bitcode compiler and optimizer, which are pretty small
pieces.
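
To make the 'private binary' idea a bit more concrete, here is a rough
sketch (in Python, purely for illustration) of how a wrapper might point
GHC at a bundled optimizer and bitcode compiler instead of whatever is on
the PATH. The directory layout and the helper name bundled_llvm_flags are
assumptions of mine, not anything GHC does today; -pgmlo and -pgmlc are
the existing GHC flags for selecting the LLVM optimiser and compiler.

```python
from pathlib import Path


def bundled_llvm_flags(bundled_dir):
    """Return GHC flags that point the LLVM backend at a private toolchain.

    bundled_dir is a hypothetical directory shipped alongside GHC that
    holds the two LLVM tools the backend needs: `opt` and `llc`.
    """
    tools = {}
    for name in ("opt", "llc"):
        path = Path(bundled_dir) / name
        if not path.is_file():
            # Deliberately no fallback to a system-wide install.
            raise FileNotFoundError(f"bundled LLVM tool missing: {path}")
        tools[name] = str(path)
    # -pgmlo selects the LLVM optimiser, -pgmlc the LLVM compiler.
    return ["-fllvm", "-pgmlo", tools["opt"], "-pgmlc", tools["llc"]]
```

Failing loudly when a tool is missing, rather than silently falling back
to the system copy, is the point of the exercise: the combination that
runs is always the one that was tested.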

Relying on a moving target like the system install, or some random XYZ
install from SVN (or some derivative forked toolchain!), is problematic
for developers, and users invariably want to try new combinations,
which can break in subtle or odd ways.

I think it's more sensible and straightforward - for the vast majority
of users and use-cases - for us to pick a version that is tested,
reliably works, and optimizes code well, and ship that. Then users just
know '-fasm is faster for compiling, -fllvm will optimize better for
some code.' That's all they really need to know.
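
As a sketch of what 'pick a tested version' could look like in practice,
the snippet below (Python again, illustrative only) checks a tool's
reported version against a fixed allow-list. The set of versions is
hypothetical and simply mirrors George's suggestion of 3.3 and perhaps
3.4; the function names are mine, and the parser assumes the tool prints
a line containing "LLVM version X.Y", as `opt --version` does.

```python
import re

# Hypothetical allow-list, mirroring the "3.3 and perhaps 3.4" suggestion
# from the thread; not an official list of supported versions.
TESTED_LLVM_VERSIONS = {(3, 3), (3, 4)}


def parse_llvm_version(version_output):
    """Extract (major, minor) from text such as `opt --version` prints."""
    match = re.search(r"LLVM version (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no LLVM version found in output")
    return int(match.group(1)), int(match.group(2))


def is_tested(version_output, tested=TESTED_LLVM_VERSIONS):
    """True if the reported LLVM version is in the tested set."""
    return parse_llvm_version(version_output) in tested
```

A check like this only tells you which release a binary claims to be; it
cannot detect distro patches, which is another reason to ship our own.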

If LLVM is to be considered 'stable' for Tier 1 GHC platforms, I'm
sympathetic to Aaron's argument, and I'd say it should be held to the
same standards as the NCG. That means it should be considered a
reliable option and we should vet it to reasonable standards, even if
it's a bit more work.

It's just really hard to do that right now. But I don't think shipping
our own LLVM would be difficult to implement; it just has some sticky
bits about how to do it.

We can of course upgrade it over time - but I think trying to hit
moving targets in the wild is a bad long-term solution.



On Tue, Jan 7, 2014 at 3:07 PM, George Colpitts
<george.colpitts at gmail.com> wrote:
> wrt
>
> We support a wide range of LLVM versions
>
> Why can't we stop doing that and only support one or two, e.g. GHC 7.8 would
> only support llvm 3.3 and perhaps 3.4?
>
>
>
>
>
> On Tue, Jan 7, 2014 at 4:54 PM, Austin Seipp <austin at well-typed.com> wrote:
>>
>> Hi all,
>>
>> Apologies for the late reply.
>>
>> First off, one thing to note wrt GMP: GMP is an LGPL library which we
>> link against. Technically, we need to allow relinking to be compliant
>> and free of the LGPL for our own executables, but this should be
>> reasonably possible - on systems where there is a system-wide GMP
>> installed, we use that copy (this occurs mostly on OSX and Linux.) And
>> so do executables compiled by GHC. Even when GHC uses static linking
>> or dynamic linking for haskell code in this case, it will still always
>> dynamically link to libgmp - meaning replacing the shared object
>> should be possible. This is just the way modern Linux/OSX systems
>> distribute system-wide C libraries, as you expect.
>>
>> In the case where we don't have this, we build our own copy of libgmp
>> inside the source tree and use that instead. That said there are other
>> reasons why we might want to be free of GMP entirely, but that's
>> neither here nor there. In any case, the issue is pretty orthogonal to
>> LLVM, dynamic haskell linking, etc - on a Linux system, you should
>> reasonably be able to swap out a `libgmp.so` for another modified
>> copy[1], and your Haskell programs should be compliant in this
>> regard.[2]
>>
>> Now, as for LLVM.
>>
>> For one, LLVM actually is a 'relatively' cheap backend to have around.
>> I say LLVM is 'relatively' cheap because All External Dependencies
>> Have A Cost. The code is reasonably small, and in any case GHC still
>> does most of the heavy lifting - the LLVM backend and native code
>> generator share a very large amount of code. We don't really duplicate
>> optimizations ourselves, for example, and some optimizations we do
>> perform on our IR can't be done by LLVM anyway (it doesn't have enough
>> information.)
>>
>> But LLVM has some very notable costs for GHC developers:
>>
>>   * It's slower to compile with, because it tries to re-optimize the
>> code we give it, but it mostly accomplishes nothing beyond advanced
>> optimizations like vectorization/scalar evolution.
>>   * We support a wide range of LLVM versions (a nightmare IMO) which
>> means pinning down specific versions and supporting them all is rather
>> difficult. Combine that with e.g. distro maintainers who may patch
>> bugs themselves, and the things you're depending on in the wild (or
>> what users might report bugs with) aren't as solid or well understood.
>>   * LLVM is extremely large, extremely complex, and the people who
>> can sensibly work on both GHC and LLVM are few and far between. So
>> fixing these issues is time consuming, difficult, and mostly tedious
>> grunt work.
>>
>> All this basically sums up to the fact that dealing with LLVM comes
>> with complications all of its own that make it a different kind of
>> beast to handle.
>>
>> So, the LLVM backend definitely needs some love. All of these things
>> are solvable (and I have some ideas for solving most of them), but
>> none of them will quite come for free. But I think there are some real
>> improvements that can be made here, which would make LLVM much more
>> smoothly supported for GHC itself. If you'd like to help it'd be
>> really appreciated - I'd like to see LLVM have more love put forth,
>> but it's a lot of work of course!
>>
>> (Finally, in reference to the last point: I am in the obvious
>> minority, but I am favorable to having the native code generator
>> around, even if it's a bit old and crufty these days - at least it's
>> small, fast and simple enough to be grokked and hacked on, and I don't
>> think it fragments development all that much. By comparison, LLVM is a
>> mammoth beast of incredible size with a sizeable entry barrier IMO. I
>> think there's merit to having both a simple, 'obviously working'
>> option in addition to the heavy duty one.)
>>
>> [1] Relevant tool: http://nixos.org/patchelf.html
>> [2] Of course, IANAL, but there you go.
>>
>> On Wed, Jan 1, 2014 at 9:03 PM, Aaron Friel <aaron at frieltek.com> wrote:
>> > Because I think it’s going to be an organizational issue and a
>> > duplication of effort if GHC is built one way but the future
>> > direction of LLVM is another.
>> >
>> > Imagine if GCC started developing a new engine and it didn’t work
>> > with one of the biggest, most regular consumers of GCC. Say, the
>> > Linux kernel, or itself. At first, the situation is optimistic - if
>> > this engine doesn’t work for the project that has the smartest,
>> > brightest GCC hackers potentially looking at it, then it should fix
>> > itself soon enough. Suppose the situation lingers though, and
>> > continues for months without fix. The new GCC backend starts to
>> > become the default, and the community around GCC advocates for
>> > end-users to use it to optimize code for their projects and it even
>> > becomes the default for some platforms, such as ARM.
>> >
>> > What I’ve described is analogous to the GHC situation - and the
>> > result is that GHC isn’t self-hosting on some platforms and the
>> > inertia that used to be behind the LLVM backend seems to have
>> > stagnated. Whereas LLVM used to be the “new hotness”, I’ve noticed
>> > that issues like Trac #7787 no longer have a lot of eyes on them and
>> > externally it seems like GHC has accepted a bifurcated approach for
>> > development.
>> >
>> > I dramatize the situation above, but there’s some truth to it. The
>> > LLVM backend needs some care and attention and if the majority of
>> > GHC devs can’t build GHC with LLVM, then that means the smartest,
>> > brightest GHC hackers won’t have their attention turned toward
>> > fixing those problems. If a patch to GHC-HEAD broke compilation for
>> > every backend, it would be fixed in short order. If a new version of
>> > GCC did not work with GHC, I can imagine it would be only hours
>> > before the first patches came in resolving the issue. On OS X
>> > Mavericks, an incompatibility with GHC has led to a swift reaction
>> > and strong support for resolving platform issues. The attention to
>> > the LLVM backend is visibly smaller, but I don’t know enough about
>> > the people working on GHC to know if it is actually smaller.
>> >
>> > The way I am trying to change this is by making it easier for people
>> > to start using GHC (by putting images on Docker.io) and, in the
>> > process, learning about GHC’s build process and trying to make
>> > things work for my own projects. The Docker image allows anyone with
>> > a Linux kernel to build and play with GHC HEAD. The information
>> > about building GHC yourself is difficult to approach and I found it
>> > hard to get started, and I want to improve that too, so I’m learning
>> > and asking questions.
>> >
>> > From: Carter Schonwald
>> > Sent: Wednesday, January 1, 2014 5:54 PM
>>
>> > To: Aaron Friel
>> > Cc: ghc-devs at haskell.org
>> >
>> > 7.8 should have working dylib support on the llvm backend. (i
>> > believe some of the relevant patches are in head already, though Ben
>> > Gamari can opine on that)
>> >
>> > why do you want ghc to be built with llvm? (i know i've tried myself
>> > in the past, and it should be doable with 7.8 using 7.8 soon too)
>> >
>> >
>> > On Wed, Jan 1, 2014 at 5:38 PM, Aaron Friel <aaron at frieltek.com> wrote:
>> >>
>> >> Replying to include the email list. You’re right, the llvm backend
>> >> and the gmp licensing issues are orthogonal - or should be. The
>> >> problem is I get build errors when trying to build GHC with LLVM
>> >> and dynamic libraries.
>> >>
>> >> The result is that I get a few different choices when producing a
>> >> platform image for development, with some uncomfortable tradeoffs:
>> >>
>> >> 1. LLVM-built GHC, dynamic libs - doesn’t build.
>> >> 2. LLVM-built GHC, static libs - potential licensing oddities with
>> >>    me shipping a statically linked ghc binary that is now gpled. I
>> >>    am not a lawyer, but the situation makes me uncomfortable.
>> >> 3. GCC/ASM-built GHC, dynamic libs - this is the *standard* for
>> >>    most platforms shipping ghc binaries, but it means that one of
>> >>    the biggest and most critical users of the LLVM backend is
>> >>    neglecting it. It also bifurcates development resources for GHC.
>> >>    Optimization work is duplicated and already devs are getting
>> >>    into the uncomfortable position of suggesting to users that they
>> >>    should trust GHC to build your programs in a particular way, but
>> >>    not itself.
>> >> 4. GCC/ASM-built GHC, static libs - worst of all possible worlds.
>> >>
>> >>
>> >> Because of this, the libgmp and llvm-backend issues aren’t entirely
>> >> orthogonal. Trac ticket #7885 is exactly the issue I get when
>> >> trying to compile #1.
>> >>
>> >> From: Carter Schonwald
>> >> Sent: Monday, December 30, 2013 1:05 PM
>>
>> >> To: Aaron Friel
>> >>
>> >> Good question but you forgot to email the mailing list too :-)
>> >>
>> >> Using llvm has nothing to do with Gmp. Use the native code gen (it's
>> >> simpler) and integer-simple.
>> >>
>> >> That said, standard ghc dylinks to a system copy of Gmp anyways (I
>> >> think). Building ghc as a Dylib is orthogonal.
>> >>
>> >> -Carter
>> >>
>> >> On Dec 30, 2013, at 1:58 PM, Aaron Friel <aaron at frieltek.com> wrote:
>> >>
>> >> Excellent research - I’m curious if this is the right thread to inquire
>> >> about the status of trying to link GHC itself dynamically.
>> >>
>> >> I’ve been attempting to do so with various LLVM versions (3.2, 3.3,
>> >> 3.4) using snapshot builds of GHC (within the past week) from git,
>> >> and I hit ticket #7885
>> >> [https://ghc.haskell.org/trac/ghc/ticket/7885] every time (even the
>> >> exact same error message).
>> >>
>> >> I’m interested in dynamically linking GHC with LLVM to avoid the
>> >> entanglement with libgmp’s license.
>> >>
>> >> If this is the wrong thread or if I should reply instead to the
>> >> trac item, please let me know.
>> >
>> >
>> > _______________________________________________
>> > ghc-devs mailing list
>> > ghc-devs at haskell.org
>> > http://www.haskell.org/mailman/listinfo/ghc-devs
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> Austin Seipp, Haskell Consultant
>> Well-Typed LLP, http://www.well-typed.com/
>
>



-- 
Regards,

Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

