Stop holding hadrian back with backwards compatibility

Thu Feb 11 09:30:22 UTC 2021

Hi, just leaving my two cents; feel free to ignore.

> I almost suggested that this had to be the reason for the back-compat
design

You're right that compatibility is the reason, though not backwards compat
between Hadrian and Make, but compat between RTS versions.
I could be wrong, but my understanding is that the current design in Make
is just an artifact of getting something that works on all OSes without
much pain, and it has proven to be suboptimal in a very important use case
(slight detour time):

You have to choose which RTS to use at compile time, which is quite bad,
because it means that you can't swap between two RTS flavors with the same
ABI. It also presents a problem when building: you want your compiler at
the end of stage1 to use your new RTS, not the one of the stage0 compiler.

You can't have multiple versions of the RTS in one library, but if you have
the full name as a dependency the dynamic loader will happily load multiple
copies.

To solve this issue, the design was made not to declare the RTS as a
dependency of any Haskell library, i.e. there's no DT_NEEDED entry for it
on ELF operating systems.  This means that before you load a Haskell-produced
dynamic library on Linux you need to LD_PRELOAD an RTS. It's clunky, but it
works: it allows you to switch between the debug and non-debug RTS at
initialization time.
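
To make the LD_PRELOAD step concrete, here is a minimal sketch of how a
launcher could pick an RTS before starting a program that loads a Haskell
shared library. The function name and the library paths in the usage note
are hypothetical, chosen only for illustration; the one fact relied on is
that the ELF dynamic loader reads LD_PRELOAD as a list separated by colons
or spaces.

```python
import os


def preload_env(rts_library, base_env=None):
    """Return a copy of the environment with rts_library first in LD_PRELOAD.

    rts_library is whichever RTS flavour the caller wants (hypothetical path).
    Any existing LD_PRELOAD entries are kept after it.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The new entry goes first so it is loaded before any other preloads.
    env["LD_PRELOAD"] = rts_library if not existing else rts_library + ":" + existing
    return env
```

Something like subprocess.run(["./host-program"], env=preload_env(path))
would then start the host with the debug or the non-debug RTS depending
only on which path is passed in, without relinking the Haskell library.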

On Windows, this problem was punted, because everything is statically
linked.  But the problem exists that you can have multiple DLLs with
different RTS and ABIs.  This is fine as long as the DLLs have no
dependencies on each other. Once they do... you have a big problem.  This
is one of the primary blockers of shared library support on Windows.

I don't know what wacky solution macOS uses, so I can't comment there.

Now back to the original question about version 1.0: this has nothing to do
with Make at all. The Make-based system only implemented the scheme that
was wanted. It's not as if any Make design issues forced this scheme. Now,
over the years, assumptions that the RTS is always version 1.0 could have
crept into the build system, but I don't believe this was by design,
just convenience. Right now, the design only requires you to know the GHC
version, which is available in all makefiles.  Knowing the RTS version
would be difficult, but the point is that in a proper design you don't need
to know the version.

Almost half a decade ago a plan was made to replace this scheme with one
that would work on all OSes and would allow us to solve these issues. The
design was made and debated here
https://gitlab.haskell.org/ghc/ghc/-/issues/10352

The actual solution isn't as simple as just adding the RTS version to the
library name, or adding it only to the build system. In fact, this would be
the wrong approach, as it makes it impossible to observe backwards
compatibility between GHC releases.
I.e. without backwards compatibility, you'd need to have GHC 9.0.1
installed to run GHC 9.0.1 programs; you couldn't run them using the GHC
9.2.x RTS if the version changed.

Typically, ELF-based platforms solve this by a combination of SONAME and
symbol versioning.  Windows solves this by either SxS Assembly versioning
or mingw-style SONAME.

All of which require you to have the same filename for the libraries, but
use a different path to disambiguate:

lib/ghc-${ver}/rts-1.0/libHSrts-ghc${ver}.so

lib/ghc-${ver}/rts-1.0/thr/libHSrts-ghc${ver}.so

lib/ghc-${ver}/rts-1.0/debug/libHSrts-ghc${ver}.so

lib/ghc-${ver}/rts-1.0/l/libHSrts-ghc${ver}.so

lib/ghc-${ver}/rts-1.0/thr_l/libHSrts-ghc${ver}.so

for each RTS with the same ABI. Profiling libs, for instance, have a
different ABI and can't use this scheme.
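
As a rough illustration of the layout above, the following sketch maps an
RTS flavour to its library path. The function name, the "vanilla" label
for the default flavour, and the lookup table are my own assumptions made
for the example; only the directory names and the file name pattern come
from the listing above.

```python
# Flavour -> subdirectory, taken from the listing above.
# "vanilla" is a label invented here for the RTS with no subdirectory.
FLAVOUR_DIRS = {
    "vanilla": "",
    "thr": "thr",
    "debug": "debug",
    "l": "l",
    "thr_l": "thr_l",
}


def rts_library_path(ghc_version, flavour="vanilla", rts_version="1.0"):
    """Build the path of the RTS shared library for one flavour.

    The file name is the same for every flavour; only the directory
    differs, which is what lets same-ABI flavours be swapped.
    """
    if flavour not in FLAVOUR_DIRS:
        raise ValueError("unknown or ABI-incompatible RTS flavour: " + flavour)
    parts = [
        "lib",
        "ghc-" + ghc_version,
        "rts-" + rts_version,
        FLAVOUR_DIRS[flavour],
        "libHSrts-ghc" + ghc_version + ".so",
    ]
    # Drop the empty subdirectory used by the default flavour.
    return "/".join(p for p in parts if p)
```

Because every flavour resolves to the same file name, a consumer only
needs to change the search path to switch between them, which is the
property the SONAME-style schemes rely on.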

So what has taken so long to implement this? Well, time. As it turns out,
getting this scheme to work required a lot of foundational work in GHC
(particularly on Windows, where the dynamic linking design wasn't optimal,
but both GHC and the dynamic linker are happy now).

On Linux it took a while to get SONAME support in cabal
https://github.com/haskell/cabal/issues/4052 so we don't have to hack
around in the build system.

But anyway this is why the current scheme exists, and why just adding an
rts version isn't really sufficient, especially if the name propagates to
the shared lib.

TL;DR:

If we are going to change the build system, we should do it properly.

The current scheme exists because GHC does not observe any mechanism to
support multiple runtimes with the same ABI and does not really have a
backwards compatibility story.

Kind Regards,

Tamar

On Wed, Feb 10, 2021 at 11:00 PM Richard Eisenberg <rae at richarde.dev> wrote:

>
>
> On Feb 10, 2021, at 8:50 AM, Simon Peyton Jones <simonpj at microsoft.com>
> wrote:
>
> build with hadrian, and then continue using make with the artifacts
> (partially) built by Hadrian
>
>
> I almost suggested that this had to be the reason for the back-compat
> design, but I assumed I had to be wrong. I also agree this is a non-goal;
> I'm quite happy to be forced to pick one or the other and stick with that
> choice until blasting away all build products.
>
> Richard
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>