FFI calls: is it possible to allocate a small memory block on a stack?
Denys Rtveliashvili
rtvd at mac.com
Thu Apr 22 23:39:21 EDT 2010
Hi Simon,
> Thanks - I already did this for alloca/malloc, I'll add the others from
> your patch.
Thank you.
> We go to quite a lot of trouble to avoid locking in the common cases and
> fast paths - most of our data structures are CPU-local. Where in
> particular have you encountered locking that could be reduced?
> The pinned_object_block is CPU-local, usually no locking is required.
> Only when the block is full do we have to get a new block from the block
> allocator, and that requires a lock, but it's a rare case.
OK, the code I have checked out from the repository contains this in
"rts/sm/Storage.h":
extern bdescr * pinned_object_block;
And in "rts/sm/Storage.c":
bdescr *pinned_object_block;
My C might be rusty, but I see no way for pinned_object_block to be
CPU-local: it is declared as an ordinary global. If it really is
CPU-local, what makes it so?
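To illustrate the C semantics behind this question, here is a minimal
sketch of the difference between a plain file-scope variable and a
per-thread one. It is not taken from the GHC sources: the names
shared_slot, local_slot, worker and run_worker are hypothetical, and it
assumes a C11 compiler with POSIX threads. The RTS may well achieve
locality in some other way (for example, by keeping the pointer in a
per-CPU structure), so this only shows why the declaration above, read
on its own, looks shared.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Plain file-scope variable: one copy, visible to every thread. */
static int shared_slot;

/* C11 thread-local variable: each thread gets its own copy. */
static _Thread_local int local_slot;

/* Runs in a second thread and writes to both variables. */
static void *worker(void *arg)
{
    (void)arg;
    shared_slot = 42;   /* changes the single shared copy */
    local_slot  = 42;   /* changes only this thread's copy */
    return NULL;
}

/* Hypothetical helper: start the worker and wait for it to finish. */
static void run_worker(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
}
```

After run_worker() returns, the calling thread sees the write to
shared_slot but still sees its own untouched local_slot.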
As for locking, here is one example:
    StgPtr
    allocatePinned( lnat n )
    {
        StgPtr p;
        bdescr *bd = pinned_object_block;

        // If the request is for a large object, then allocate()
        // will give us a pinned object anyway.
        if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
            p = allocate(n);
            Bdescr(p)->flags |= BF_PINNED;
            return p;
        }

        ACQUIRE_SM_LOCK; // [RTVD: here we acquire the lock]

        TICK_ALLOC_HEAP_NOCTR(n);
        CCS_ALLOC(CCCS,n);

        // If we don't have a block of pinned objects yet, or the current
        // one isn't large enough to hold the new object, allocate a new one.
        if (bd == NULL || (bd->free + n) > (bd->start + BLOCK_SIZE_W)) {
            pinned_object_block = bd = allocBlock();
            dbl_link_onto(bd, &g0s0->large_objects);
            g0s0->n_large_blocks++;
            bd->gen_no = 0;
            bd->step   = g0s0;
            bd->flags  = BF_PINNED | BF_LARGE;
            bd->free   = bd->start;
            alloc_blocks++;
        }

        p = bd->free;
        bd->free += n;
        RELEASE_SM_LOCK; // [RTVD: here we release the lock]
        return p;
    }
Of course, TICK_ALLOC_HEAP_NOCTR and CCS_ALLOC may require
synchronization if they use shared state (which is, again,
probably unnecessary). However, if no profiling is going on and
"pinned_object_block" is TSO-local, isn't it possible to remove
the locking from this code completely? The only case where locking
would be necessary is when a fresh block has to be allocated, and
that can be done within the "allocBlock" function (or, more
precisely, by using "allocBlock_lock").
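A minimal sketch of that idea, assuming each executing thread owns its
own pinned-block pointer. Nothing here is the actual RTS code: block,
cap_state, alloc_block_locked, alloc_pinned, BLOCK_WORDS and
blocks_allocated are hypothetical stand-ins, and malloc plays the role
of the shared block allocator. The fast path only touches state owned
by the caller and takes no lock; the lock is held only while a fresh
block is fetched.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_WORDS 512           /* hypothetical block size, in words */

typedef size_t word;

/* Hypothetical stand-in for a block descriptor (not GHC's bdescr). */
typedef struct block {
    word *start;                  /* first word of the block */
    word *free;                   /* next unallocated word */
} block;

/* Hypothetical per-owner state: only its owning thread touches it. */
typedef struct {
    block *pinned_block;
} cap_state;

/* Shared state, protected by block_alloc_lock. */
static pthread_mutex_t block_alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long blocks_allocated = 0;

/* Slow path: fetch a fresh block from the shared allocator under a lock. */
static block *alloc_block_locked(void)
{
    pthread_mutex_lock(&block_alloc_lock);
    block *bd = malloc(sizeof *bd);
    word *mem = malloc(BLOCK_WORDS * sizeof(word));
    if (bd != NULL && mem != NULL) {
        bd->start = bd->free = mem;
        blocks_allocated++;
    } else {
        free(bd);
        free(mem);
        bd = NULL;
    }
    pthread_mutex_unlock(&block_alloc_lock);
    return bd;
}

/* Fast path: bump-allocate n words from the caller's own block, no lock. */
static word *alloc_pinned(cap_state *cap, size_t n)
{
    assert(n <= BLOCK_WORDS);
    block *bd = cap->pinned_block;

    /* No block yet, or not enough room left: take the slow path. */
    if (bd == NULL || (size_t)(bd->start + BLOCK_WORDS - bd->free) < n) {
        bd = alloc_block_locked();
        if (bd == NULL)
            return NULL;
        /* The old block is simply dropped in this sketch; a real runtime
           would have to keep it on a list for the garbage collector. */
        cap->pinned_block = bd;
    }

    word *p = bd->free;
    bd->free += n;
    return p;
}
```

Two consecutive small requests on the same cap_state are served from
the same block without touching the mutex; the lock is taken only when
a block is first needed or the current one runs out of room.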
The ACQUIRE_SM_LOCK/RELEASE_SM_LOCK pair is present in other places
too, but I have not yet analysed whether it is really necessary
there. For example, newCAF and newDynCAF are wrapped in it.
With kind regards,
Denys Rtveliashvili