Faster Array#/MutableArray# copies

Johan Tibell johan.tibell at
Mon Feb 28 18:29:56 CET 2011

On Mon, Feb 28, 2011 at 9:01 AM, Simon Marlow <marlowsd at> wrote:
> On 18/02/2011 19:42, Nathan Howell wrote:
>> On Fri, Feb 18, 2011 at 12:54 AM, Roman Leshchinskiy <rl at> wrote:
>>> Max Bolingbroke wrote:
>>>> On 18 February 2011 01:18, Johan Tibell <johan.tibell at> wrote:
>>>> It seems like a sufficient solution for your needs would be for us to
>>>> use the LTO support in LLVM to inline across module boundaries - in
>>>> particular to inline primop implementations into their call sites. LLVM
>>>> would then probably deal with unrolling small loops with statically
>>>> known bounds.
>>>
>>> Could we simply use this?
>>
>> Might be easier to implement a PrimOp inlining pass, and to run it
>> before LLVM's built-in MemCpyOptimization pass [0]. This wouldn't
>> generally be as good as LTO but would work without gold.
>>
>> [0]
>
> Ideally you'd want the heap check in the primop to be aggregated into the
> calling function's heap check, and the primop should allocate directly from
> the heap instead of calling out to the RTS allocate(). All this is a bit
> much to expect LLVM to do, but we could do it in the Glorious New Code
> Generator...
>
> For small arrays like this maybe we should have a new array type that leaves
> out all the card-marking stuff too (or just use tuples, as Roman suggested).
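
For concreteness, the loop being discussed above is essentially an
element-by-element copy over MutableArray#. A rough Haskell-level sketch
(not the actual primop implementation, and with made-up names) looks like
this:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    module CopySketch (copySmall) where

    import GHC.Exts

    -- Element-by-element copy between two MutableArray#s. The point of the
    -- discussion above is to get a loop like this inlined into its call
    -- sites so that LLVM can unroll it when the length is statically known.
    -- (copySmall is a made-up name; the real copy primops are implemented
    -- out of line, not in Haskell.)
    copySmall :: MutableArray# s a -> Int# -> MutableArray# s a -> Int#
              -> Int# -> State# s -> State# s
    copySmall src soff dst doff n s0 = go 0# s0
      where
        go i s
          | isTrue# (i >=# n) = s
          | otherwise =
              case readArray# src (soff +# i) s of
                (# s', x #) -> go (i +# 1#) (writeArray# dst (doff +# i) x s')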

I might try to use tuples directly. It would be very ugly, though, as I
would need a sum type of 32 different tuple sizes.
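
Roughly, the direct tuple encoding I have in mind is a sketch along these
lines (the names are made up, and only the first few arities are shown):

    -- Hypothetical sketch of the "sum of tuple sizes" encoding; only the
    -- first few constructors are shown, the real thing would need all 32,
    -- plus per-constructor cases at every use site.
    data SmallArray a
      = A1 a
      | A2 a a
      | A3 a a a
      | A4 a a a a
      -- ... and so on, up to a 32-field constructor

    indexSmall :: SmallArray a -> Int -> a
    indexSmall (A1 x0)     0 = x0
    indexSmall (A2 x0 _)   0 = x0
    indexSmall (A2 _ x1)   1 = x1
    indexSmall (A3 x0 _ _) 0 = x0
    indexSmall (A3 _ x1 _) 1 = x1
    indexSmall (A3 _ _ x2) 2 = x2
    -- ... every constructor/index combination has to be written out
    indexSmall _ _ = error "indexSmall: index out of bounds"

Even with only four sizes shown, the boilerplate adds up quickly.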

