What makes C++ FFI painful?
Ben Gamari
ben at smart-cactus.org
Thu Oct 31 16:24:35 UTC 2024
Hécate via ghc-devs <ghc-devs at haskell.org> writes:
> Hi devs,
>
> Pardon me for the naïve question. I know that C++ FFI is *hard* in
> Haskell. From the perspective of an end-user I have heard that `text`'s
> new UTF-8 validation caused problems when released, as it relied on C++
> code, but I never had any problems with it personally. From the GHC
> perspective, what makes C++ a difficult language to interface with?
>
There are a few reasons for this. First, there are considerations due to
the language itself:
* C++ has an object system which our binding generators do not
currently make any attempt to capture. Consequently, developing
bindings to C++ libraries written in an object-oriented style can
require a fair amount of work.
* The prevalence of C++'s template system means that one may need to
generate one or more C++ snippets to instantiate an interface before
one can even begin thinking about binding to the interface. This can
be particularly tricky in libraries where you may want polymorphism
in the Haskell binding to be reflected in the C++ instantiation.
* Dealing with C++'s exception system requires great care when
developing bindings.
* The language itself is otherwise vast in scope, with numerous
features that don't play well with others. Thankfully, C++ library
authors who intend for their work to be bound by others usually
restrict themselves to easily-bound features in their outward-facing
interfaces.
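To make the first three points concrete, here is a minimal sketch of the kind of shim one typically ends up writing by hand. Everything in it is hypothetical: `Accumulator` stands in for a templated C++ class, and the `acc_i64_*` names are made up for illustration. It assumes the usual approach of exposing the object as an opaque pointer, instantiating the template at one concrete type, and catching every exception before it can cross the C boundary.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <stdexcept>

// Hypothetical library code: a class template with a method that may throw.
template <typename T>
class Accumulator {
public:
    void add(T x) {
        if (x < 0) throw std::invalid_argument("negative input");
        total_ += x;
    }
    T total() const { return total_; }
private:
    T total_ = 0;
};

// The template has to be instantiated at a concrete type before anything
// can be bound; every further type needs its own set of wrappers.
using AccI64 = Accumulator<std::int64_t>;

// C-linkage shim: the object becomes an opaque pointer and no exception
// is allowed to escape into the caller.
extern "C" {

// Returns a null pointer if allocation fails.
void *acc_i64_new() noexcept {
    return new (std::nothrow) AccI64();
}

void acc_i64_free(void *p) noexcept {
    delete static_cast<AccI64 *>(p);
}

// Returns 0 on success and 1 if the C++ side threw.
int acc_i64_add(void *p, std::int64_t x) noexcept {
    assert(p != nullptr);
    try {
        static_cast<AccI64 *>(p)->add(x);
        return 0;
    } catch (...) {
        return 1;
    }
}

std::int64_t acc_i64_total(const void *p) noexcept {
    assert(p != nullptr);
    return static_cast<const AccI64 *>(p)->total();
}

} // extern "C"
```

The Haskell side would then bind the four `acc_i64_*` symbols with ordinary `foreign import ccall` declarations and turn the error code back into whatever error type the binding prefers. Supporting a second element type means repeating the whole shim, which is where the template problem starts to hurt.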
Perhaps more significantly, there are also a variety of practical
considerations:
* There are three implementations of the C++ standard library in common
use today (libstdc++, libc++, and MSVC's). Determining which library
should be used on a particular platform, and subsequently *how* to
compile/link against it, is quite non-trivial (e.g. see #20010). The
`system-cxx-std-lib` meta-package introduced in GHC 9.4 was aimed
at improving this situation for Haskell packages by making GHC
responsible for determining this configuration in a way that
users can easily depend upon.
* Even once you have compiled your program, linking against the C++
standard library introduces a dynamic library dependency on most
platforms. This is problematic as the availability and installation
path of the C++ standard library are not nearly as consistent as those
of the C standard library. This is particularly problematic on Windows,
where dynamic linking is already fraught and non-MSVC runtime
dependencies are not widely available (e.g. [1]).
* Code generated by C++ compilers tends to use less-common relocation
types, uncovering new and exciting ways for GHC's RTS linker to crash
(e.g. see #21618).
* To make matters worse, C++11 introduced an ABI breakage to optimise
the representation of `std::string`. In the past this was a source of
link failures when differently-compiled objects ended up in the same
library/executable. Thankfully most people have moved to the new
ABI at this point, so it's rare to see this manifest in the wild
except when linking against ancient proprietary binaries.
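When debugging this sort of thing it can help to ask the compiler which standard library and which `std::string` ABI a given object was actually built against. The following is a rough sketch that relies on the identification macros each implementation defines (`_LIBCPP_VERSION`, `__GLIBCXX__`, `_MSVC_STL_VERSION`) and on libstdc++'s `_GLIBCXX_USE_CXX11_ABI`. The two function names are hypothetical.

```cpp
#include <string>  // any standard header pulls in the implementation's macros

// Name of the C++ standard library this translation unit was compiled against.
inline const char *cxx_stdlib_name() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#elif defined(_MSVC_STL_VERSION)
    return "MSVC STL";
#else
    return "unknown";
#endif
}

// 1 if libstdc++'s newer std::string ABI is in use, 0 for the old ABI,
// and -1 when the question does not apply (not libstdc++, or a libstdc++
// too old to define the macro).
inline int libstdcxx_new_string_abi() {
#if defined(__GLIBCXX__) && defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI ? 1 : 0;
#else
    return -1;
#endif
}
```

Two objects that report different values from `libstdcxx_new_string_abi()` and pass `std::string` across their interface are the classic recipe for the link failures described above.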
All in all, the difficulty is just death by a thousand cuts, often due
to platform-dependent toolchain issues, many of them entirely
independent of Haskell.
Does this help?
- Ben
[1] https://github.com/haskell/ghcup-hs/issues/745