Proposal: change the Bits instance for Bool to align with other basic types and support branchless calculations

Sun Sep 28 21:08:53 UTC 2014

Hi all,

> -----Original message-----
> From: Edward Kmett <ekmett at gmail.com>
> Sent: 28 Sep 2014, 13:06
>
> In my quick tests, the branchless form only looked to be about 10% faster.

Maybe we should have a real benchmark. I tried to measure these numbers
myself and got widely varying results, from no change at all to 80% faster
(the latter was actually the lazy version of && versus the strict .&.,
because of the unboxing).

It would be really helpful if someone could create and publish a benchmark
comparing the implementations (I could not come up with a good one).

If we agree on the 10%, I would be -1 on the whole proposal. From the
performance point of view, the reason is that in half of the cases the
short-circuiting kicks in and we may get faster execution (and in the
worst case a 10% slowdown on the ands themselves, not on the rest of the
computation). From the language point of view, the reason is that
I would expect a .&. b to be short-circuiting on Bools.
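
To make the semantic difference concrete, here is a minimal sketch using
only base. The names strictAnd and forcesRight are hypothetical and not
part of any library. strictAnd only models the strict behaviour by going
through Int, and I make no claim that it compiles to branch-free code.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.Bits ((.&.))

-- Hypothetical strict "and" on Bool: both operands are always evaluated,
-- because (.&.) on Int is strict in both arguments.
strictAnd :: Bool -> Bool -> Bool
strictAnd a b = toEnum (fromEnum a .&. fromEnum b)

-- Hypothetical probe: returns True if `op False x` evaluates x.
-- The right operand is an `error` call, so forcing it raises ErrorCall.
forcesRight :: (Bool -> Bool -> Bool) -> IO Bool
forcesRight op = do
  r <- try (evaluate (False `op` error "right operand forced"))
         :: IO (Either ErrorCall Bool)
  pure (either (const True) (const False) r)
```

With these definitions, forcesRight (&&) reports that the right operand
is never touched, while forcesRight strictAnd reports that it is forced.
That skipped evaluation is the saving I refer to above.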

Cheers,
Milan