[GHC] #10137: Rewrite switch code generation
GHC
ghc-devs at haskell.org
Wed Mar 4 16:30:44 UTC 2015
#10137: Rewrite switch code generation
-------------------------------------+-------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.9
(CodeGen) | Operating System: Unknown/Multiple
Keywords: | Type of failure: None/Unknown
Architecture: | Blocked By:
Unknown/Multiple | Related Tickets:
Test Case: |
Blocking: |
Differential Revisions: |
-------------------------------------+-------------------------------------
Inspired by #10124 I looked at the code generation for enumeration and
integer types, and I think this can be improved in a few ways. My goals
are:
* Fancier code for integer types. Currently, the code for enumeration
types will emit jump tables for dense choices; there is no reason to treat
integer types differently.
* The ability to behave differently if some of the cases are equal, and,
as an extension of that,
* The possibility to create branchless code if multiple checks would go
to the same jump.
The current scheme is roughly:
* When we generate Cmm code for a STG case expression, we handle
enumeration types and literals separately.
* At this point, the decisions about what code to generate are made (jump
tables (but only for enumeration types) or if-then-else trees).
* The Common Block Optimization on Cmm happens later in the pipeline,
making it non-trivial to detect branches that do the same thing.
My plan is the following:
* In the STG→Cmm transformation, floats will be handled separately.
Matching against literals literals is fishy anyways, so my suggestion is
to simply generate a linear list of equality checks here – turning the
intended operation (equality test) into something else (comparisons in a
if-then-else tree) feels wrong to me for floats. And the rest would not
work well for floats, so I’d like to have them out of the way.
* The case of enumeration types will be reduced to word literals, and
treated the same from now on.
* For integer types, no code generation decisions is made at this point.
Instead, always a `CmmSwitch` statement is generated.
* In a separate Cmm transformation pass, which will run /after/ the
common block optimization, we visit all `CmmSwitches` and create proper
code for them.
I’d like to separate the algorithm that plans the code generation into a
function (possibly even module) of its own, so that the decisions can
easily by analyized and modified.
The algorithm has a few choices to make:
* If multiple values point to the same code, we can generate branchless
code (`y := x == 1 || x == 5 || x = 7; if (y==0) then goto l1 else goto
l2`).
* If there are whole ranges pointing to the same code, the above can also
use comparisons.
* If there are dense ranges (i.e. a range with more than half of the
possible values mapped to something), we want to generate jump tables from
them (still `CmmSwitch` values).
* Unless all options are handled by one of these possibilities, they need
to be combined using `if-then-else` trees.
The `CmmSwitch` constructor needs to change for that. It currently takes a
`[Maybe Label]` argument, which is not suitable for before that pass, when
its table may be sparse. I think an `IntMap Label` would work nicely.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10137>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list