optimizing StgPtr allocate (Capability *cap, W_ n)
Mikolaj Konarski
mikolaj at well-typed.com
Tue Oct 14 22:07:35 UTC 2014
Hi Bulat!
Is this exactly the same email you posted to
glasgow-haskell-users@ previously (I'm asking to know
if I need to reread)? It was already forwarded here by none
other but SPJ. :)
Welcome,
Mikolaj
On Wed, Oct 15, 2014 at 12:01 AM, Bulat Ziganshin
<bulat.ziganshin at gmail.com> wrote:
> Hello,
>
> i'm looking a the https://github.com/ghc/ghc/blob/23bb90460d7c963ee617d250fa0a33c6ac7bbc53/rts/sm/Storage.c#L680
>
> if i correctly understand, it's speed-critical routine?
>
> i think that it may be improved in this way:
>
> StgPtr allocate (Capability *cap, W_ n)
> {
> bdescr *bd;
> StgPtr p;
>
> TICK_ALLOC_HEAP_NOCTR(WDS(n));
> CCS_ALLOC(cap->r.rCCCS,n);
>
> /// here starts new improved code:
>
> bd = cap->r.rCurrentAlloc;
> if (bd == NULL || bd->free + n > bd->end) {
> if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
> ....
> }
> if (bd->free + n <= bd->start + BLOCK_SIZE_W)
> bd->end = min (bd->start + BLOCK_SIZE_W, bd->free + LARGE_OBJECT_THRESHOLD)
> goto usual_alloc;
> }
> ....
> }
>
> /// and here it stops
>
> usual_alloc:
> p = bd->free;
> bd->free += n;
>
> IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
> return p;
> }
>
>
> i think it's obvious - we consolidate two if's on the crirical path
> into the single one plus avoid one ADD by keeping highly-useful bd->end pointer
>
> further improvements may include removing bd==NULL check by
> initializing bd->free=bd->end=NULL and moving entire "if" body
> into separate slow_allocate() procedure marked "noinline" with
> allocate() probably marked as forceinline:
>
> StgPtr allocate (Capability *cap, W_ n)
> {
> bdescr *bd;
> StgPtr p;
>
> TICK_ALLOC_HEAP_NOCTR(WDS(n));
> CCS_ALLOC(cap->r.rCCCS,n);
>
> bd = cap->r.rCurrentAlloc;
> if (bd->free + n > bd->end)
> return slow_allocate(cap,n);
>
> p = bd->free;
> bd->free += n;
>
> IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
> return p;
> }
>
> this change will greatly simplify optimizer's work. according to my
> experience current C++ compilers are weak on optimizing large
> functions with complex execution paths and such transformations really
> improve the generated code
>
> --
> Best regards,
> Bulat mailto:Bulat.Ziganshin at gmail.com
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>
More information about the ghc-devs
mailing list