compilation of pattern-matching?
Tyson Whitehead
twhitehead at gmail.com
Mon Mar 23 23:42:21 EDT 2009
On March 23, 2009 19:46:27 Claus Reinke wrote:
> My idea was that case branches correspond to conditional jumps
> (though the exact correspondence and optimization has been the
> subject of countless papers). If I loop through a very long list,
> most of the time the test for (:) will succeed, requiring no jump,
> while the test for [] will fail, requiring a jump to the alternative
> branch. So, if GHC's pattern-match compilation is naive, the
> reordering will introduce 2 jumps into the common case of the
> loop where none would be needed, right?
Module Test(test) where
test :: [a] -> Int
test (a:b:c) = 2
test (a:[]) = 1
test [] = 0
gives the following cmm (with GHC 6.10.1 and -O2)
Test_test_entry() {
...
chn:
if (Sp - 8 < SpLim) goto chp; // RTS stack check for space
R1 = R2;
I64[Sp - 8] = sgO_info; // Argument evaluation return address
Sp = Sp - 8;
if (R1 & 7 != 0) goto chs; // Is argument already evaluated?
jump I64[R1] (); // No, evaluate it
chp:
R1 = Test_test_closure; // RTS stack expansion (GC?)
jump stg_gc_fun ();
chs: jump sgO_info (); // Yes, go directly to return address
}
sgO_ret() {
...
chg:
_chh = R1 & 7; // Constructor tag is in lower ptr bits
if (_chh >= 2) goto chi; // Does the tag indicate (:)?
R1 = Test_lvl2_closure+1; // No, load up closure for 0 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
chi:
R1 = I64[R1 + 14]; // Yes, get the tail of (:)
I64[Sp + 0] = sgQ_info; // Tail evaluation return address
if (R1 & 7 != 0) goto chl; // Is tail already evaluated?
jump I64[R1] (); // No, evaluate it
chl: jump sgQ_info (); // Yes, go directly to return address
}
sgQ_ret() {
...
cha:
_chb = R1 & 7; // Constructor tag is in lower ptr bits
if (_chb >= 2) goto chc; // Does the tag indicate (:)?
R1 = Test_lvl1_closure+1; // No, load up closure for 1 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
chc:
R1 = Test_lvl_closure+1; // Yes, load up closure for 2 and return
Sp = Sp + 8;
jump (I64[Sp + 0]) ();
}
Thus the trip is more like (assuming the first two (:) are already evaluated)
test -> chs (WHNF check -- i.e., first (:) is already evaluated)
chs -> sgO
sg0 -> chi (constructor check -- i.e., not [])
chi -> chl (WHNF check -- i.e., second (:) is already evaluated)
chl -> sgQ
sgQ -> chc (constructor check -- i.e., not (a:[]))
chc -> return
Looking at the assembler, things are a bit better in that the the gotos that
immediately execute a jump are just replaced with a jump. For example, the
assembler for test gives (test -> chs -> sg0 is replaced with test -> sg0)
...
Test_test_info:
.Lchn:
leaq -8(%rbp),%rax // RTS stack check for return address
cmpq %r14,%rax
jb .Lchp
movq %rsi,%rbx
movq $sgO_info,-8(%rbp) // Argument evaluation return address
addq $-8,%rbp
testq $7,%rbx // Is argument already evaluated?
jne sgO_info // Yes, go directly to return address
jmp *(%rbx) // No, evaluate it
.Lchp:
movl $Test_test_closure,%ebx // RTS stack expansion (GC?)
jmp *-8(%r13)
...
sgO_info:
.Lchg:
movq %rbx,%rax // Constructor tag is in lower ptr bits
andq $7,%rax
cmpq $2,%rax // Does the tag indicate (:)?
jae .Lchi
movl $Test_lvl2_closure+1,%ebx// No, load up closure for 0 and return
addq $8,%rbp
jmp *(%rbp)
.Lchi:
movq 14(%rbx),%rbx // Yes, get the tail of (:)
movq $sgQ_info,(%rbp) // Tail evaluation return address
testq $7,%rbx // Is tail already evaluated?
jne sgQ_info // No, evaluate it
jmp *(%rbx) // Yes, go directly to return address
...
Thus you actually get
test -> sg0 (WHNF check -- i.e., first (:) is already evaluated)
sg0 -> chi (constructor check -- i.e., not [])
chi -> sgQ (WHNF check -- i.e., second (:) is already evaluated)
sgQ -> chc (constructor check -- i.e., not (a:[]))
chc -> return
I guess this is a long winded way of saying that the branches are being
ordered such that the fall though case is not the one that you put first,
which, if I recall correctly, is somewhat bad as the x86 branch predictor
guesses a forward branch that hasn't been seen before will fall through.
Perhaps they are being ordered by the constructor tag?
Cheers! -Tyson
PS: I reversed GHC's ordering of test, sgO, and sgQ for readability above.
The test -> sg0 and chi -> sgQ jumps actually go backwards, which is actually
what you want because, if I recall correctly, the x86 branch predictor guesses
a backwards branch it hasn't seen before will not fall through.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part.
Url : http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20090323/f3b0def7/attachment.bin
More information about the Glasgow-haskell-users
mailing list