Performance of `count`ing a char in a ByteString

Vanessa McHale vamchale at gmail.com
Sat Feb 8 23:09:32 UTC 2020


clang is the default C compiler on Mac. It would be sensible to at least support clang as well as gcc

Cheers,
Vanessa McHale

> On Feb 8, 2020, at 4:34 PM, Georg Rudoy <0xd34df00d at gmail.com> wrote:
> 
> Hi there,
> 
> tl;dr: I've noticed ByteString's `count`'s performance can be improved
> by a factor between 2.5 and 8 by explicitly using certain SIMD
> extensions, so what's the policy on those?
> 
> Results of a small C benchmark (since the corresponding BS routine is
> implemented in C anyway) are available here (
> https://github.com/0xd34df00d/counting-chars ), as well as some sample
> code to give the idea of what's happening (which might be
> incorrect/suboptimal, my SSE/AVX is rusty). I'll also put a sample
> input file if I manage to get git-lfs going.
> 
> This change only really makes sense for large strings, but that's a
> somewhat common use case at least in my applications, so I wanted to
> gather some initial feedback, as well as ask a couple of questions and
> emphasize a couple of points.
> 
> 1. How are SIMD extensions generally controlled?
> 
> 2. Is depending on gcc OK? If so, we could leverage gcc's support of
> automatically generating a dispatching function that'll check cpuid
> for the supported extensions and call the right implementation.
> 
> 3. Does BS' allocator give any guarantees on the alignment? It'd be
> nice to get rid of that "prefix" part that makes sure the SIMD code
> churns through 64-byte-aligned data.
> 
> 4. Note that compiling with `-O3 -march=native` makes code effectively
> non-portable (at least, to CPUs lacking extensions the compiler
> decides to use), while the code with `-O2`, explicit SIMD versions and
> a dispatching function will get the best performance using the SIMD
> extensions available, degrading gracefully on older hardware. So,
> personally, when I look at that table with the results I compare the
> SSE4.2/AVX2 implementations against the `-O2` column.
> 
> 5. Thanks Gregory Collins for mentioning PCMPESTRM somewhere — that
> remark sparked the desire to do some intrinsics stuff.
> 
> -- 
>  Georg Rudoy
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries



More information about the Libraries mailing list