Proposal: provide cas and barriers symbols even without -threaded

Ryan Newton rrnewton at
Fri Jul 19 19:02:52 CEST 2013

Yes, I'd absolutely rather not suffer C call overhead for these functions
(or the CAS functions).  But isn't that how it's done currently for the
casMutVar# primop?

To avoid the overhead, is it necessary to make each primop in-line rather
than out-of-line, or just to get rid of the "ccall"?

Another reason it would be good to package these with GHC is that I'm
having trouble building robust libraries of foreign primops that work under
all "ways" (e.g. GHCi).  For example, this bug:

If I write .cmm code that depends on RTS functionality like
stg_MUT_VAR_CLEAN_info, then it seems to work fine when in compiled mode
(with/without threading, profiling), but I get link errors from GHCi where
these symbols aren't defined.

I've got a draft of the relevant primops here:

Which includes:

   - variants of CAS for MutableArray# and MutableByteArray#
   - fetch-and-add for MutableByteArray#
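
To make the intended semantics of those two operations concrete, here is a
minimal sketch in portable C11.  It is not the code from the draft primops or
from the RTS: the helper names cas_word and fetch_add_word are hypothetical,
and the use of <stdatomic.h> is an assumption made only for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper (not an RTS symbol): compare-and-swap on one word.
 * Returns the value observed in the slot.  The swap took effect if and
 * only if the returned value equals `expected`. */
static uintptr_t cas_word(_Atomic uintptr_t *slot,
                          uintptr_t expected,
                          uintptr_t desired)
{
    /* On failure, atomic_compare_exchange_strong overwrites `expected`
     * with the value it found; on success `expected` is left unchanged,
     * which is also the old value.  Either way it holds the observed value. */
    atomic_compare_exchange_strong(slot, &expected, desired);
    return expected;
}

/* Hypothetical helper (not an RTS symbol): atomic fetch-and-add.
 * Adds `delta` to the slot and returns the value held before the add. */
static uintptr_t fetch_add_word(_Atomic uintptr_t *slot, uintptr_t delta)
{
    return atomic_fetch_add(slot, delta);
}
```

A caller compares the result of cas_word against the value it expected in
order to tell whether the swap succeeded, and retries if it did not.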

Also, there are some tweaks to support the new "ticketed" interface for
safer CAS:

I started adding some of these primops to GHC proper (still as
out-of-line), but not all of them.  I had gone with the foreign primop
route instead...


P.S. Where is the write barrier primop?  I don't see it listed in

On Fri, Jul 19, 2013 at 11:41 AM, Carter Schonwald <
carter.schonwald at> wrote:

> I guess I should find the time to finish the CAS primop work I volunteered
> to do then. I'll look into it in a few days.
> On Friday, July 19, 2013, Simon Marlow wrote:
>> On 18/07/13 14:17, Ryan Newton wrote:
>>> The "atomic-primops" library depends on symbols such as
>>> store_load_barrier and "cas", which are defined in SMP.h.  Thus the
>>> result is that if the program is linked WITHOUT "-threaded", the user
>>> gets a linker error about undefined symbols.
>>> The specific place it's used is in the 'foreign "C"' bits of this .cmm
>>> code:
>>> 87e63b21b2a6c375e93c30b98c28c1d04f88781c/AtomicPrimops/cbits/primops.cmm
>>> I'm trying to explore hacks that will enable me to pull in those
>>> functions during compile time, without duplicating a whole bunch of code
>>> from the RTS.  But it's a fragile business.
>>> It seems to me that some of these routines have general utility.  In
>>> future versions of GHC, could we consider linking in those routines
>>> irrespective of "-threaded"?
>> We should make the non-THREADED versions EXTERN_INLINE too, so that there
>> will be (empty) functions to call in rts/Inlines.c.  Want to submit a patch?
>> A better solution would be to make them into primops.  You don't really
>> want to be calling out to a C function to implement a memory barrier. We
>> have this for write_barrier(), but none of the others so far.  Of course
>> that's a larger change.
>> Cheers,
>>         Simon
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at