Statically linking a small piece of C into every GHC generated binary

Johan Tibell johan.tibell at
Tue Jul 19 18:02:38 CEST 2011


I'm trying to add support for the POPCNT instruction, which exists on
some modern CPUs (e.g. Nehalem). The idea is to add a popCnt# primop
which would generate a POPCNT instruction when compiling with
-msse4.2. If the user didn't specify -msse4.2, the primop should
fall back to some other implementation of population count. A good
fallback, in terms of both speed and memory usage, is this
lookup-table based function:

#include <stdint.h>

static char popcount_table_8[256] = {
  /*0*/ 0,
  /*1*/ 1,
  /*2*/ 1,
  /*3*/ 2,
  /*4*/ 1,
  /*5*/ 2,
  /*6*/ 2,
  /*7*/ 3,
  /*8*/ 1,
  /*9*/ 2,
  /*10*/ 2,
  /*11*/ 3,
  /* ... entries 12 through 255 are omitted here ... */
};

/* Table-driven popcount, with 8-bit tables */
/* 6 ops plus 4 casts and 4 lookups, 0 long immediates, 4 stages */
inline uint32_t
popcount(uint32_t x)
{
    /* One table lookup per byte of the 32-bit word. */
    return popcount_table_8[(uint8_t)x] +
       popcount_table_8[(uint8_t)(x >> 8)] +
       popcount_table_8[(uint8_t)(x >> 16)] +
       popcount_table_8[(uint8_t)(x >> 24)];
}

(GCC and LLVM use the same fallback method.)
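
Since the table above is cut short, here is a minimal self-contained
sketch of the same table-driven approach. It builds all 256 entries
with a well-known preprocessor trick instead of listing them by hand.
The helper macros B2/B4/B6 and the names popcount_table and
popcount_naive are illustrative assumptions, not part of the original
code.

```c
#include <stdint.h>

/* Each macro expands to the bit counts of 4, 16 and 64 consecutive
   byte values, offset by n (the bits already counted). */
#define B2(n) (n), (n) + 1, (n) + 1, (n) + 2
#define B4(n) B2(n), B2((n) + 1), B2((n) + 1), B2((n) + 2)
#define B6(n) B4(n), B4((n) + 1), B4((n) + 1), B4((n) + 2)

/* Entry i holds the number of set bits in the byte value i. */
static const unsigned char popcount_table_8[256] = {
    B6(0), B6(1), B6(1), B6(2)
};

/* Table-driven popcount: one lookup per byte of the 32-bit word. */
static uint32_t popcount_table(uint32_t x)
{
    return popcount_table_8[(uint8_t)x] +
           popcount_table_8[(uint8_t)(x >> 8)] +
           popcount_table_8[(uint8_t)(x >> 16)] +
           popcount_table_8[(uint8_t)(x >> 24)];
}

/* Reference implementation used only for checking the table:
   clear the lowest set bit until nothing is left. */
static uint32_t popcount_naive(uint32_t x)
{
    uint32_t n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}
```

Comparing popcount_table against popcount_naive over a range of
inputs is a cheap way to confirm the table was built correctly.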

It's important that the fallback is as good as it gets so that the
user of the primop doesn't have to implement their own fallback (which
is very complicated as the user would have to detect whether -msse4.2
is used or not!). This precludes non-table based solutions (as they're
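
For context, a non-table based population count typically means a
bit-parallel routine along the lines of the sketch below. This is the
widely known SWAR formulation, shown only for comparison; it is not
code from this post, and the name popcount_swar is a hypothetical one.

```c
#include <stdint.h>

/* Bit-parallel (SWAR) popcount: sums bits in progressively wider
   fields without using a lookup table. */
static uint32_t popcount_swar(uint32_t x)
{
    /* Count bits in each 2-bit field. */
    x = x - ((x >> 1) & 0x55555555u);
    /* Sum adjacent 2-bit counts into 4-bit fields. */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    /* Sum adjacent 4-bit counts into 8-bit fields. */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    /* Add the four byte counts; the total lands in the top byte. */
    return (x * 0x01010101u) >> 24;
}
```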

I've implemented the primop but have run into some difficulty: to use the
above fallback I need the code to be statically linked into every
binary. I'm not quite sure how to achieve that. GCC manages by having
the above function definition in libgcc, which is always statically
linked. I think LLVM uses a small statically linked compiler run-time
library for the same purpose.
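
Seen from the C side, this arrangement is what makes
__builtin_popcount usable everywhere. The sketch below assumes GCC or
Clang; the wrapper names popcount_builtin and popcount_portable are
hypothetical. Depending on the target flags, the compiler either emits
a POPCNT instruction or falls back to its own software implementation,
which may be inlined or may be a call into its run-time support
library.

```c
#include <stdint.h>

/* Portable reference: clear the lowest set bit until none remain. */
static uint32_t popcount_portable(uint32_t x)
{
    uint32_t n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Let the compiler choose between the hardware instruction and its
   own fallback; other compilers get the portable loop. */
static uint32_t popcount_builtin(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (uint32_t)__builtin_popcount(x);
#else
    return popcount_portable(x);
#endif
}
```

Compiling this once with and once without POPCNT enabled (for example
with GCC's -mpopcnt) and comparing the generated assembly shows the
two code paths. The caller never has to know which one was chosen,
which is the property the primop is after.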

How would one go about having a small C library linked into every
Haskell binary? If we go ahead and implement more of these modern
instructions we're likely to need more fallbacks (so this isn't needed
by just POPCNT).


More information about the Glasgow-haskell-users mailing list