FFI calls: is it possible to allocate a small memory block on a
rtvd at mac.com
Thu Apr 22 16:25:22 EDT 2010
Thank you, Simon
I have identified a number of problems and have created patches for a
couple of them. Ticket #4004 has been raised in Trac, and I hope that
someone will take a look and put them into the repository if the patches look
Things I did:
* added inlining for a few functions
* changed multiplications and divisions in include/Cmm.h to bit shifts
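
As a rough illustration of the second change (this is not the actual
content of include/Cmm.h), here is a minimal C sketch. It assumes 8-byte
machine words, and the names WORD_SIZE, WORD_SHIFT and the four helper
functions are made up for this example.

```c
#include <stddef.h>

/* Assumption for this sketch: 8-byte machine words, i.e. 2^3 bytes. */
#define WORD_SIZE  8u
#define WORD_SHIFT 3u

/* Straightforward versions using division and multiplication. */
static inline size_t bytes_to_words_div(size_t bytes) { return bytes / WORD_SIZE; }
static inline size_t words_to_bytes_mul(size_t words) { return words * WORD_SIZE; }

/* Equivalent versions using shifts. They give the same results because
   the operands are unsigned and WORD_SIZE is a power of two. */
static inline size_t bytes_to_words_shift(size_t bytes) { return bytes >> WORD_SHIFT; }
static inline size_t words_to_bytes_shift(size_t words) { return words << WORD_SHIFT; }
```

An optimizing C compiler usually performs this rewrite on its own when
the divisor is a constant; the sketch only shows the equivalence that
the hand-written shifts rely on.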
Things that can be done:
* optimizations in the threaded RTS. Locking is used frequently, and
each lock operation on a normal mutex in "POSIX threads" costs about 20
nanoseconds on my computer.
* moving some computations from Cmm code to Haskell. This requires
passing information about the word size and similar details to Haskell
code, but the benefit is that some computations can be performed
statically as they depend primarily on the data type we allocate space
* fixes/improvements for the Cmm compiler. It already contains some code
that replaces divisions and multiplications by 2^n with bit shifts,
but for some reason it does not work. Also, divisions can be replaced by
multiplications combined with bit shifts in the general case.
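
To make the last point concrete, here is a small C sketch of division
by a known constant done as a multiplication followed by a shift. It is
only an illustration under my own assumptions (32-bit unsigned operands,
divisors 3 and 10), not code from the Cmm compiler; the function names
div3 and div10 are hypothetical.

```c
#include <stdint.h>

/* x / 3 for any 32-bit unsigned x.
   0xAAAAAAAB is ceil(2^33 / 3), so multiplying in 64 bits and shifting
   right by 33 yields the floor of x / 3. */
static inline uint32_t div3(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* x / 10 for any 32-bit unsigned x, using ceil(2^35 / 10) = 0xCCCCCCCD. */
static inline uint32_t div10(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

The multiplier and shift have to be chosen per divisor and per operand
width, which is why this is normally left to the compiler.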
Also, while looking at this thing I've got a number of questions. One of
them is this:
What is the meaning of "pinned_object_block" in rts/sm/Storage.h and why
is it shared between TSOs? It looks like "allocatePinned" has to take
SM_MUTEX every time it is called (in the threaded RTS) because other
threads can be accessing the block. More than that, this block of memory
is assigned to the nursery of one of the TSOs. Why should it be shared
with the rest of the world then, instead of being local to the TSO?
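
To show the trade-off I am asking about, here is a toy C sketch of the
two designs. It is not the RTS code: block_t, bump, alloc_shared and
alloc_local are hypothetical names, the block size is arbitrary, and a
real allocator would fetch a fresh block instead of returning NULL.

```c
#include <pthread.h>
#include <stddef.h>

#define BLOCK_SIZE 4096u

/* A trivial bump-allocated block. */
typedef struct {
    unsigned char bytes[BLOCK_SIZE];
    size_t used;
} block_t;

/* Hand out the next n bytes of the block, or NULL if it is full. */
static void *bump(block_t *b, size_t n) {
    if (n > BLOCK_SIZE - b->used) return NULL;
    void *p = b->bytes + b->used;
    b->used += n;
    return p;
}

/* Variant 1: one block shared by all threads, so every call must lock. */
static block_t shared_block;
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

static void *alloc_shared(size_t n) {
    pthread_mutex_lock(&shared_lock);
    void *p = bump(&shared_block, n);
    pthread_mutex_unlock(&shared_lock);
    return p;
}

/* Variant 2: one block per thread (C11 thread-local storage), no lock. */
static _Thread_local block_t local_block;

static void *alloc_local(size_t n) {
    return bump(&local_block, n);
}
```

In the second variant the lock/unlock pair disappears from the
allocation path, at the price of keeping one partially filled block per
thread.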
On a side note, is London HUG still active? The website seems to be
With kind regards,
> Adding an INLINE pragma is the right thing for alloca and similar functions.
> alloca is a small overloaded wrapper around allocaBytesAligned, and
> without the INLINE pragma the body of allocaBytesAligned gets inlined
> into alloca itself, making it too big to be inlined at the call site
> (you can work around it with e.g. -funfolding-use-threshold=100). This
> is really a case of manual worker/wrapper: we want to tell GHC that
> alloca is a wrapper, and the way to do that is with INLINE. Ideally GHC
> would manage this itself - there's a lot of scope for doing some general
> code splitting, I don't think anyone has explored that yet.
More information about the Glasgow-haskell-users