Proposal: change the Bits instance for Bool to align with other basic types and support branchless calculations

David Feuer david.feuer at gmail.com
Sun Sep 28 21:25:07 UTC 2014


On Sep 28, 2014 4:15 PM, "Carter Schonwald" <carter.schonwald at gmail.com>
wrote:
>
> Brandon, I think you're articulating a very good point here, in noting
that the current set of instances for Bits that *are* strict are precisely
so because their underlying representations are spine strict!
>
> From that perspective, it is definitely valid to argue that the
shortcircuiting behavior of bool is in fact correct.

Maybe, but maybe not so much. For example, it would make sense (to me, at
least) to write

instance Bits x => Bits [x] where
  xs .&. ys = zipWith (.&.) xs ys
  xs .|. ya = zipWith (.|.) xs ys
  ....

This is, I guess you'd say, partially spine-lazy, but it's not
short-circuiting.

> Its also  important to keep in mind that theres many examples where the
branchless version of bool is not a win

Yes, of course. As Edward Kmett indicated, there seems to be clear value in
offering both short-circuiting and branchless Bool operators in *some*
fashion, whether the branchless ones do or don't end up in Bits.

> We've not yet articulated an example where someone will be writing code
that is both generic AND branchless (indeed, I'm not certain if there is
actually a meaningful performance win in the fully polymorphic case when
its the desired semantics).

If you're arguing that Bool shouldn't be in Bits at all, I could understand
that viewpoint. There are a number of Bits members that reduce to useless
trivialities for Bool. But I would turn your argument around—where's the
code that's both generic and short-circuiting? What makes it valuable to
put Bool in Bits at all? The most valuable thing *I* see is to give it
branchless operators that match a common interface. Edward Kmett sees a
different benefit—but a different benefit requiring the whole class to be
refactored to become something else.

> Additionally, one entire space left unaddressed here is this: "why cant
we formulate an optimization algorithm to add to ghc that tries to estimate
when transformation into a (relatively) branchless formulation of the same
code"?

This is a *very* good question. I don't know enough about compilation to
really say, and there are surely some challenges, but I think it's an
important question. If y is already forced, or it's certain to be forced
later, then

case x of
  True -> case y of
            True -> e1
            False -> e2
  False -> e2

Is the same as if (x & y) then e1 else e2. The next question is whether e1
and e2 are such that the case selection can become a conditional move or
something. I have no idea what's involved in general. It also seems more
awkward to write

if y `seq` x && y then blah else bloop

than to write

if x .&. y then blah else bloop or even

if x `bland` y then blah else bloop --Yucky name, but whatever.

But to catch cases programmers might not recognize, I think your idea
sounds interesting.

> a style i've grown to favor is the following: make it easy to write
generic code thats always correct (wrt complexity), and make it sane to do
specialization/optimization by hand later (and/or teach the compiler.)

"Correct wrt complexity" is clearly something we disagree on here.

> Branchless code is immune to that problem, but if you dont have bad
branch prediction, the branching code will be faster / simpler than the
branch free  code.

Sometimes. The branchless code is sometimes still slightly faster,
depending on the situation, and of course it's also *smaller* in many cases.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140928/23714539/attachment.html>


More information about the Libraries mailing list