Should we always inline newByteArray#?

Thu Mar 13 21:24:39 UTC 2014

On 13/03/14 20:39, Johan Tibell wrote:
> Hi all,
>
> After some refactoring of the StgCmmPrim, it's now possible to have both
> an inline and an out-of-line (in PrimOps.cmm) version of the same
> primop. Very soon (#8876) we'll have both an inline and an out-of-line
> version of newByteArray#. The inline version is used when the array size
> is statically known and commons up the allocation with the normal heap
> check.
>
> The reason to have both versions is that we don't want to increase code
> size too much (by inlining a primop whose implementation is large) unless
> we know that there's a benefit in doing so. However, the newByteArray#
> implementation is one function call (to allocate) followed by three
> stores (to the closure header). Perhaps that's small enough to always
> inline? It would save one function call for each call to newByteArray#.
>
> Anyone have any thoughts on whether always inlining would be a good idea?

It's a bad idea for large arrays (>= 3k), because when allocated via
allocate() these arrays get a block marked with BF_LARGE that doesn't
get copied during GC.

It might be a good idea for arrays less than this size (including the 
header).  It's a bad idea if the size isn't statically known, though.
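
As a rough illustration of the rule above, here is a minimal sketch in C. It is not GHC code: the function name, its parameters, and the 3 * 1024 cutoff (standing in for the "3k" figure) are all assumptions made for the example, and the real RTS threshold may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stands in for the "3k" large-object cutoff mentioned above.
   The actual RTS constant may have a different value. */
#define ASSUMED_LARGE_THRESHOLD_BYTES ((size_t)(3 * 1024))

/* Hypothetical helper (not part of GHC): decide whether a call to
   newByteArray# is a candidate for the inline version.
   - size_is_static: the array size is known at compile time
   - payload_bytes:  requested array size in bytes
   - header_bytes:   size of the closure header in bytes (supplied by caller) */
static bool should_inline_new_byte_array(bool size_is_static,
                                         size_t payload_bytes,
                                         size_t header_bytes)
{
    /* Unknown size: it may turn out to be large, so stay out-of-line. */
    if (!size_is_static) {
        return false;
    }
    /* Checking the payload first also rules out overflow in the sum below. */
    if (payload_bytes >= ASSUMED_LARGE_THRESHOLD_BYTES) {
        return false;
    }
    /* Inline only if the whole object, header included, is below the cutoff. */
    return payload_bytes + header_bytes < ASSUMED_LARGE_THRESHOLD_BYTES;
}
```

With these assumed numbers, a statically known 64-byte array with a 16-byte header would be inlined, while a 4096-byte array or one of unknown size would still go through the out-of-line path.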

Cheers,
Simon