What makes C++ FFI painful?

Ben Gamari ben at smart-cactus.org
Thu Oct 31 16:24:35 UTC 2024


Hécate via ghc-devs <ghc-devs at haskell.org> writes:

> Hi devs,
>
> Pardon me for the naïve question. I know that C++ FFI is *hard* in 
> Haskell. From the perspective of an end-user I have heard that `text`'s 
> new UTF-8 validation caused problems when released, as it relied on C++ 
> code, but I never had any problems with it personally. From the GHC  
> perspective, what makes C++ a difficult language to interface with?
>
There are a few reasons for this. First, there are considerations due to
the language itself:

 * C++ has an object system which our binding generators do not
   currently make any attempt to capture. Consequently, developing
   bindings to C++ libraries written in an object-oriented style can
   require a fair amount of work.

 * The prevalence of C++'s template system means that one may need to
   generate one or more C++ snippets to instantiate an interface before
   one can even begin thinking about binding to the interface. This can
   be particularly tricky in libraries where you may want polymorphism
   in the Haskell binding to be reflected in the C++ instantiation.

 * Dealing with C++'s exception system requires great care when
   developing bindings.

 * The language itself is otherwise vast in scope, with numerous
   features that don't play well with others. Thankfully, C++ library
   authors who intend for their work to be bound by others often
   restrict themselves to easily-bound features in their outward-facing
   interfaces.
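
To make the first three points concrete, here is a minimal sketch of the
kind of hand-written C shim that such bindings tend to end up with. All
names in it (`Accumulator`, `acc_double_*`, `acc_status`) are
hypothetical and not taken from any real library. The template is pinned
to one concrete type, the object is hidden behind an opaque pointer, and
exceptions are turned into status codes so that none can unwind across
the C boundary.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

namespace {

// Stand-in for a templated, object-oriented C++ API (hypothetical).
template <typename T>
class Accumulator {
public:
    void push(T x) { items_.push_back(x); }
    // std::vector::at throws std::out_of_range on a bad index.
    T at(std::size_t i) const { return items_.at(i); }
private:
    std::vector<T> items_;
};

// The template must be instantiated at a concrete type before anything
// can be exported; each further element type needs its own wrappers.
using DoubleAcc = Accumulator<double>;

}  // namespace

// Status codes handed to the caller in place of C++ exceptions.
enum acc_status { ACC_OK = 0, ACC_OUT_OF_RANGE = 1, ACC_NO_MEMORY = 2, ACC_UNKNOWN = 3 };

namespace {

// Run `body`, translating any exception into a status code.
template <typename F>
int guarded(F &&body) noexcept {
    try {
        body();
        return ACC_OK;
    } catch (const std::out_of_range &) {
        return ACC_OUT_OF_RANGE;
    } catch (const std::bad_alloc &) {
        return ACC_NO_MEMORY;
    } catch (...) {
        return ACC_UNKNOWN;
    }
}

}  // namespace

extern "C" {

// The object is exposed only as an opaque pointer; returns NULL on failure.
void *acc_double_new(void) noexcept {
    return new (std::nothrow) DoubleAcc();
}

void acc_double_free(void *h) noexcept {
    delete static_cast<DoubleAcc *>(h);
}

int acc_double_push(void *h, double x) noexcept {
    return guarded([&] { static_cast<DoubleAcc *>(h)->push(x); });
}

// Writes element `i` to `*out` on success; leaves `*out` untouched otherwise.
int acc_double_at(const void *h, std::size_t i, double *out) noexcept {
    return guarded([&] { *out = static_cast<const DoubleAcc *>(h)->at(i); });
}

}  // extern "C"
```

The Haskell side would then import the `acc_double_*` functions with
ordinary `foreign import ccall` declarations. Supporting another element
type means repeating the wrapper set for that instantiation, which is
where the template point above starts to hurt.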

Perhaps more significantly, there are also a variety of practical
considerations:

 * There are three implementations of the C++ standard library in common
   use today (libstdc++, libc++, MSVC). Determining which library should be
   used on a particular platform, and subsequently *how* to compile/link
   against it, is quite non-trivial (e.g. see #20010). The
   `system-cxx-std-lib` meta-package introduced in GHC 9.4 was aimed
   at improving this situation for Haskell packages by making GHC
   responsible for determining this configuration and exposing it as
   a package that users can simply depend upon.

 * Even once you have compiled your program, linking against the C++
   standard library introduces a dynamic library dependency on most
   platforms. This is problematic as the availability and installation
   path of the C++ standard library are not nearly as consistent as
   those of the C standard library. The issue is particularly acute on
   Windows, where dynamic linking is already fraught and non-MSVC
   runtime dependencies are not widely available (e.g. [1]).

 * Code generated by C++ compilers tends to use less-common relocation
   types, uncovering new and exciting ways for GHC's RTS linker to crash
   (e.g. see #21618).

 * To make matters worse, C++11 introduced an ABI breakage to optimise
   the representation of `std::string`. In the past this was a source of
   failures when differently-compiled objects were linked into the same
   library/executable. Thankfully most people have moved to the new
   ABI at this point, so it's rare to see this manifest in the wild
   except when linking against ancient proprietary binaries.
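
As a rough illustration of how much of this is decided at compile time,
the sketch below reports which standard library a translation unit was
built against and, for libstdc++, which `std::string` ABI it selected.
The function names are hypothetical; the macros are the ones the
respective libraries define themselves (`_LIBCPP_VERSION`,
`__GLIBCXX__`, `_MSVC_STL_VERSION`, and libstdc++'s
`_GLIBCXX_USE_CXX11_ABI`).

```cpp
#include <string>  // any standard header pulls in the identification macros

// Name of the C++ standard library this translation unit was compiled against.
const char *cxx_stdlib_name() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#elif defined(_MSVC_STL_VERSION)
    return "MSVC STL";
#else
    return "unknown";
#endif
}

// 1 for the new libstdc++ std::string ABI, 0 for the old one,
// -1 when the macro is not available (e.g. other standard libraries).
int libstdcxx_cxx11_abi() {
#if defined(__GLIBCXX__) && defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI ? 1 : 0;
#else
    return -1;
#endif
}
```

Two objects for which these functions would disagree are exactly the
kind that tend to fail when linked into the same library/executable.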

All in all, the difficulty is just death by a thousand cuts, often due
to platform-dependent toolchain issues, many of them entirely
independent of Haskell.

Does this help?

- Ben


[1] https://github.com/haskell/ghcup-hs/issues/745