Re: Branchless implementation for literal case – is it worth it?
Gregory Collins
greg at gregorycollins.net
Mon Apr 20 16:10:02 UTC 2015
I've gotten substantial speedups (>30%) in the past (e.g. in the hashtables
package) by replacing search loops with branchless code, but the technique
works best in situations where you have a long run of straight-line integer
code for the CPU to chew on; for branchless to be a win you need to feed
enough instructions into the pipeline such that the savings of getting 3
instructions per cycle and avoiding paying for branch mispredictions
outweighs the savings you would get from not doing the work. I don't think
the use case you posted falls into that category.
On Sun, Apr 19, 2015 at 12:44 AM, Joachim Breitner <mail at joachim-breitner.de
> wrote:
> Dear devs,
>
> in https://ghc.haskell.org/trac/ghc/ticket/10124 bgamari suggested that
> code like
>
> f :: Int -> Bool
> f a = case a of
> 1 -> True
> 5 -> True
> 8 -> True
> 9 -> True
> 11 -> True
> 19 -> True
> _ -> False
> {-# NOINLINE f #-}
>
> should not compile to a series of conditional jumps, but rather a
> branchless code akin to
>
> let !p = a ==# 1 `orI#` a ==# 5 `orI#` a ==# 8 `orI#` a ==# 9
> `orI#` a ==# 11 `orI#` a ==# 19
> case p of
> 1 -> do_something
> 0 -> do_something_else
>
> Subsequently, I refactored the way we produce Cmm code from STG, opening
> the possibility to implement this optimization at that stage¹.
>
> But when I then added this implementation, I could not measure any
> runtime improvement, see
> https://ghc.haskell.org/trac/ghc/ticket/10124#comment:15
>
> So my question to the general audience: Is such branchless code really
> better than the current, branching code? Can someone provide us with an
> example that shows that it is better? Do I need to produce different
> branchless assembly?
>
> If it is not better, I can again refactor the switch generation and
> simplify it a bit again.
>
> Hmm, only now I see that rwbarton has replied there. Not sure why I
> missed that. Anyways, more voices are welcome :-)
>
> Greetings,
> Joachim
>
> ¹ This should not preclude an implementation on the Core level, which
> SPJ preferred. But I improved other aspects of the code generation as
> well, so this is worthwhile on its own.
>
> --
> Joachim “nomeata” Breitner
> mail at joachim-breitner.de • http://www.joachim-breitner.de/
> Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
> Debian Developer: nomeata at debian.org
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
--
Gregory Collins <greg at gregorycollins.net>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150420/7b6f6d04/attachment.html>
More information about the ghc-devs
mailing list