Performance of small allocations via prim ops

Carter Schonwald carter.schonwald at gmail.com
Fri Apr 7 12:41:20 UTC 2023


Great, fast experimentation!

I will admit I’m pleased that my dated intuition is still correct, but more
importantly, we now have more current data!

Thanks for the exploration and sharing what you found!

On Fri, Apr 7, 2023 at 7:35 AM Harendra Kumar <harendra.kumar at gmail.com>
wrote:

>
>
> On Fri, 7 Apr 2023 at 02:18, Carter Schonwald <carter.schonwald at gmail.com>
> wrote:
>
>> That sounds like a worthy experiment!
>>
>> I guess that would look like having an inline, macro’d-up path that
>> checks whether it can get the job done and falls back to the general code
>> otherwise?
>>
>> Last I checked, the overhead for this sort of C call was on the order of
>> 10 nanoseconds or less, which seems very unlikely to be a bottleneck, but
>> do you have any natural or artificial benchmark programs that would
>> showcase this?
>>
>
> I converted my example code into a loop and ran it a million times with a
> 1-byte array size (which becomes 8 bytes after alignment), so roughly 3
> words are allocated per array, including the header and length. It took 5 ms
> with the statically-known-size optimization, which inlines the allocation
> completely, and 10 ms with an unknown size (taken from a program argument),
> which makes a call to newByteArray#. That works out to roughly 5 ns of extra
> cost per allocation. It does not sound like a big deal.
>
> -harendra
>
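For anyone who wants to reproduce this, below is a rough sketch (not Harendra's
actual code) of the kind of loop described above: a million allocations of a tiny
MutableByteArray#, once with a literal size so GHC can compile the allocation
inline, and once with a size read from the command line so it goes through the
general newByteArray# path. The module layout, the loopStatic/loopDynamic names,
and the use of getCPUTime for timing are just illustrative choices; compile with
-O2, and a more careful benchmark would probably also write to the array to make
sure the allocation cannot be discarded.

{-# LANGUAGE MagicHash, UnboxedTuples, BangPatterns #-}
module Main (main) where

import GHC.Exts (Int (I#), newByteArray#)
import GHC.IO (IO (IO))
import System.CPUTime (getCPUTime)
import System.Environment (getArgs)

-- Literal 1-byte size: the constant reaches newByteArray# directly, so GHC
-- can emit the inline, statically sized heap allocation.
loopStatic :: Int -> IO ()
loopStatic 0  = pure ()
loopStatic !i = do
  IO (\s -> case newByteArray# 1# s of (# s', _ #) -> (# s', () #))
  loopStatic (i - 1)

-- Size only known at run time: this goes through the general (out-of-line)
-- newByteArray# path.
loopDynamic :: Int -> Int -> IO ()
loopDynamic 0 _            = pure ()
loopDynamic !i sz@(I# sz#) = do
  IO (\s -> case newByteArray# sz# s of (# s', _ #) -> (# s', () #))
  loopDynamic (i - 1) sz

main :: IO ()
main = do
  args <- getArgs
  let iters = 1000000 :: Int
      dynSize = case args of
        (a : _) -> read a
        _       -> 1

  t0 <- getCPUTime
  loopStatic iters
  t1 <- getCPUTime
  loopDynamic iters dynSize
  t2 <- getCPUTime

  -- getCPUTime is in picoseconds; 1e9 ps = 1 ms.
  putStrLn $ "static size:  " ++ show (fromIntegral (t1 - t0) / 1e9 :: Double) ++ " ms"
  putStrLn $ "dynamic size: " ++ show (fromIntegral (t2 - t1) / 1e9 :: Double) ++ " ms"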

