[EXTERNAL] Unexpected duplicate join points in "Core" output?

Simon Peyton Jones simonpj at microsoft.com
Sat Nov 20 21:15:15 UTC 2021


There is absolutely no reason not to common-up those to join points.  But we can't common up some join points when we could if they were let's.  Consider

join j1 x = x+1
in case v of
      A -> f (join j2 x = x+1 in ...j2...)
      B -> ....j1...
      C -> ....j1...

Even though j2 is identical to j1's, we can't eliminate j2 in favour of j1 because then j1 wouldn't be a join point any more.

GHC.Core.Opt.CSE is conservative at the moment, and never CSE's *any* join point.  It would not be hard to make it clever enough to CSE join points, but no one has yet done it.

Do open a ticket!

Simon

PS: I am leaving Microsoft at the end of November 2021, at which point simonpj at microsoft.com will cease to work.  Use simon.peytonjones at gmail.com instead.  (For now, it just forwards to simonpj at microsoft.com.)

| -----Original Message-----
| From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of Viktor
| Dukhovni
| Sent: 20 November 2021 00:57
| To: ghc-devs at haskell.org
| Subject: [EXTERNAL] Unexpected duplicate join points in "Core" output?
| 
| The below "Core" output from "ghc -O2" (9.2/8.10) for the attached
| program shows seemingly rendundant join points:
| 
|       join {
|         exit :: State# RealWorld -> (# State# RealWorld, () #)
|         exit (ipv :: State# RealWorld) = jump $s$j ipv } in
| 
|       join {
|         exit1 :: State# RealWorld -> (# State# RealWorld, () #)
|         exit1 (ipv :: State# RealWorld) = jump $s$j ipv } in
| 
| that are identical in all but name.  These correspond to fallthrough to
| the "otherwise" case in:
| 
|            ...
|            | acc < q || (acc == q && d <= 5)
|              -> loop (ptr `plusPtr` 1) (acc * 10 + d)
|            | otherwise -> return Nothing
| 
| but it seems that the generated X86_64 code (also below) ultimately
| consolidates these into a single target... Is that why it is harmless
| to leave these duplicated in the generated "Core"?
| 
| [ Separately, in the generated machine code, it'd also be nice to avoid
|   comparing the same "q" with the accumulator twice.  A single load and
|   compare should I think be enough, as I'd expect the status flags to
|   persist across the jump the second test.
| 
|   This happens to not be performance critical in my case, because most
|   calls should satisfy the first test, but generally I think that 3-way
|   "a < b", "a == b", "a > b" branches ideally avoid comparing twice...
| ]
| 
| ======== Associated Core output
| 
|     -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
|     main2 :: Addr#
|     main2 = "12345678901234567890 junk"#
| 
|     -- RHS size: {terms: 129, types: 114, coercions: 0, joins: 6/8}
|     main1 :: State# RealWorld -> (# State# RealWorld, () #)
|     main1
|       = \ (eta :: State# RealWorld) ->
|           let {
|             end :: Addr#
|             end = plusAddr# main2 25# } in
|           join {
|             $s$j :: State# RealWorld -> (# State# RealWorld, () #)
|             $s$j _ = hPutStr2 stdout $fShowMaybe4 True eta } in
|           join {
|             exit :: State# RealWorld -> (# State# RealWorld, () #)
|             exit (ipv :: State# RealWorld) = jump $s$j ipv } in
|           join {
|             exit1 :: State# RealWorld -> (# State# RealWorld, () #)
|             exit1 (ipv :: State# RealWorld) = jump $s$j ipv } in
|           join {
|             exit2
|               :: Addr# -> Word# -> State# RealWorld -> (# State#
| RealWorld, () #)
|             exit2 (ww :: Addr#) (ww1 :: Word#) (ipv :: State#
| RealWorld)
|               = case eqAddr# ww main2 of {
|                   __DEFAULT ->
|                     hPutStr2
|                       stdout
|                       (++
|                          $fShowMaybe1
|                          (case $w$cshowsPrec3 11# (integerFromWord#
| ww1) [] of
|                           { (# ww3, ww4 #) ->
|                           : ww3 ww4
|                           }))
|                       True
|                       eta;
|                   1# -> jump $s$j ipv
|                 } } in
|           joinrec {
|             $wloop
|               :: Addr# -> Word# -> State# RealWorld -> (# State#
| RealWorld, () #)
|             $wloop (ww :: Addr#) (ww1 :: Word#) (w :: State# RealWorld)
|               = join {
|                   getDigit :: State# RealWorld -> (# State# RealWorld,
| () #)
|                   getDigit (eta1 :: State# RealWorld)
|                     = case eqAddr# ww end of {
|                         __DEFAULT ->
|                           case readWord8OffAddr# ww 0# eta1 of { (#
| ipv, ipv1 #) ->
|                           let {
|                             ipv2 :: Word#
|                             ipv2 = minusWord# (word8ToWord# ipv1) 48##
| } in
|                           case gtWord# ipv2 9## of {
|                             __DEFAULT ->
|                               case ltWord# ww1 1844674407370955161## of
| {
|                                 __DEFAULT ->
|                                   case ww1 of {
|                                     __DEFAULT -> jump exit ipv;
|                                     1844674407370955161## ->
|                                       case leWord# ipv2 5## of {
|                                         __DEFAULT -> jump exit1 ipv;
|                                         1# ->
|                                           jump $wloop
|                                             (plusAddr# ww 1#)
|                                             (plusWord#
| 18446744073709551610## ipv2)
|                                             ipv
|                                       }
|                                   };
|                                 1# ->
|                                   jump $wloop
|                                     (plusAddr# ww 1#) (plusWord#
| (timesWord# ww1 10##) ipv2) ipv
|                               };
|                             1# -> jump exit2 ww ww1 ipv
|                           }
|                           };
|                         1# -> jump exit2 ww ww1 eta1
|                       } } in
|                 jump getDigit w; } in
|           jump $wloop main2 0## realWorld#
| 
| ======== Executable disassembly
| 
| The jumps at "-1->" and "-2->" that correspond that "otherwise" have
| the same target.  The duplicate "load+cmp" with "q" is at "-3->" and "-
| 4->":
| 
|     0000000000408de8 <Main_main1_info>:
|       408de8:       48 8d 45 e8             lea    -0x18(%rbp),%rax
|       408dec:       4c 39 f8                cmp    %r15,%rax
|       408def:       0f 82 c8 00 00 00       jb     408ebd
| <Main_main1_info+0xd5>
|       408df5:       b8 79 dd 77 00          mov    $0x77dd79,%eax
|       408dfa:       31 db                   xor    %ebx,%ebx
|       408dfc:       b9 60 dd 77 00          mov    $0x77dd60,%ecx
|       408e01:       48 39 c1                cmp    %rax,%rcx
|       408e04:       74 66                   je     408e6c
| <Main_main1_info+0x84>
|       408e06:       0f b6 11                movzbl (%rcx),%edx
|       408e09:       48 83 c2 d0             add
| $0xffffffffffffffd0,%rdx
|       408e0d:       48 83 fa 09             cmp    $0x9,%rdx
|       408e11:       77 59                   ja     408e6c
| <Main_main1_info+0x84>
| -3->  408e13:       48 be 99 99 99 99 99    mov
| $0x1999999999999999,%rsi
|       408e1a:       99 99 19
|       408e1d:       48 39 f3                cmp    %rsi,%rbx
|       408e20:       73 0c                   jae    408e2e
| <Main_main1_info+0x46>
|       408e22:       48 6b db 0a             imul   $0xa,%rbx,%rbx
|       408e26:       48 01 d3                add    %rdx,%rbx
|       408e29:       48 ff c1                inc    %rcx
|       408e2c:       eb d3                   jmp    408e01
| <Main_main1_info+0x19>
| -4->  408e2e:       48 be 99 99 99 99 99    mov
| $0x1999999999999999,%rsi
|       408e35:       99 99 19
|       408e38:       48 39 f3                cmp    %rsi,%rbx
| -1->  408e3b:       75 49                   jne    408e86
| <Main_main1_info+0x9e>
|       408e3d:       48 83 fa 05             cmp    $0x5,%rdx
| -2->  408e41:       77 43                   ja     408e86
| <Main_main1_info+0x9e>
|       408e43:       48 8d 5a fa             lea    -0x6(%rdx),%rbx
|       408e47:       48 ff c1                inc    %rcx
|       408e4a:       eb b5                   jmp    408e01
| <Main_main1_info+0x19>
|       408e4c:       0f 1f 40 00             nopl   0x0(%rax)
|       408e50:       c2 00 00                retq   $0x0
| 
| --
|     Viktor.


More information about the ghc-devs mailing list