optimizing StgPtr allocate (Capability *cap, W_ n)
Bulat Ziganshin
bulat.ziganshin at gmail.com
Tue Oct 14 22:22:37 UTC 2014
Hello Mikolaj,
Wednesday, October 15, 2014, 2:07:35 AM, you wrote:
i'm sorry, it's the same
> Hi Bulat!
> Is this exactly the same email you posted to
> glasgow-haskell-users@ previously (I'm asking to know
> if I need to reread)? It was already forwarded here by none
> other but SPJ. :)
> Welcome,
> Mikolaj
> On Wed, Oct 15, 2014 at 12:01 AM, Bulat Ziganshin
> <bulat.ziganshin at gmail.com> wrote:
>> Hello,
>>
>> i'm looking a the https://github.com/ghc/ghc/blob/23bb90460d7c963ee617d250fa0a33c6ac7bbc53/rts/sm/Storage.c#L680
>>
>> if i correctly understand, it's speed-critical routine?
>>
>> i think that it may be improved in this way:
>>
>> StgPtr allocate (Capability *cap, W_ n)
>> {
>> bdescr *bd;
>> StgPtr p;
>>
>> TICK_ALLOC_HEAP_NOCTR(WDS(n));
>> CCS_ALLOC(cap->r.rCCCS,n);
>>
>> /// here starts new improved code:
>>
>> bd = cap->r.rCurrentAlloc;
>> if (bd == NULL || bd->free + n > bd->end) {
>> if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
>> ....
>> }
>> if (bd->free + n <= bd->start + BLOCK_SIZE_W)
>> bd->end = min (bd->start + BLOCK_SIZE_W, bd->free + LARGE_OBJECT_THRESHOLD)
>> goto usual_alloc;
>> }
>> ....
>> }
>>
>> /// and here it stops
>>
>> usual_alloc:
>> p = bd->free;
>> bd->free += n;
>>
>> IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
>> return p;
>> }
>>
>>
>> i think it's obvious - we consolidate two if's on the crirical path
>> into the single one plus avoid one ADD by keeping highly-useful bd->end pointer
>>
>> further improvements may include removing bd==NULL check by
>> initializing bd->free=bd->end=NULL and moving entire "if" body
>> into separate slow_allocate() procedure marked "noinline" with
>> allocate() probably marked as forceinline:
>>
>> StgPtr allocate (Capability *cap, W_ n)
>> {
>> bdescr *bd;
>> StgPtr p;
>>
>> TICK_ALLOC_HEAP_NOCTR(WDS(n));
>> CCS_ALLOC(cap->r.rCCCS,n);
>>
>> bd = cap->r.rCurrentAlloc;
>> if (bd->free + n > bd->end)
>> return slow_allocate(cap,n);
>>
>> p = bd->free;
>> bd->free += n;
>>
>> IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
>> return p;
>> }
>>
>> this change will greatly simplify optimizer's work. according to my
>> experience current C++ compilers are weak on optimizing large
>> functions with complex execution paths and such transformations really
>> improve the generated code
>>
>> --
>> Best regards,
>> Bulat mailto:Bulat.Ziganshin at gmail.com
>>
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>>
--
Best regards,
Bulat mailto:Bulat.Ziganshin at gmail.com
More information about the ghc-devs
mailing list