[commit: ghc] master: Rewrite Note [The polymorphism rule of join points] (862c59e)

Tue Jan 2 17:54:05 UTC 2018

Repository : ssh://git@git.haskell.org/ghc

On branch  : master
Link       : http://ghc.haskell.org/trac/ghc/changeset/862c59e7bf714e6059392ea401bb0a568c959725/ghc

>---------------------------------------------------------------

commit 862c59e7bf714e6059392ea401bb0a568c959725
Author: Joachim Breitner <mail at joachim-breitner.de>
Date:   Tue Jan 2 17:33:47 2018 +0100

    Rewrite Note [The polymorphism rule of join points]
    
    I found the reference to CPS unhelpful, but Simon gave me a good
    explanation in #14610 that I believe is going to be more enlightening
    for future readers.
    
    Differential Revision: https://phabricator.haskell.org/D4281


>---------------------------------------------------------------

862c59e7bf714e6059392ea401bb0a568c959725
 compiler/coreSyn/CoreSyn.hs | 69 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 19 deletions(-)

diff --git a/compiler/coreSyn/CoreSyn.hs b/compiler/coreSyn/CoreSyn.hs
index 9b9d20d..27a4c99 100644
--- a/compiler/coreSyn/CoreSyn.hs
+++ b/compiler/coreSyn/CoreSyn.hs
@@ -706,33 +706,64 @@ polymorphic in its return type. That is, if its type is
   forall a1 ... ak. t1 -> ... -> tn -> r
 
 where its join arity is k+n, none of the type parameters ai may occur free in r.
-The most direct explanation is that given
 
-  join j @a1 ... @ak x1 ... xn = e1 in e2
+In some way, this falls out of the fact that given
 
-our typing rules require `e1` and `e2` to have the same type. Therefore the type
-of `e1`---the return type of the join point---must be the same as the type of
-e2. Since the type variables aren't bound in `e2`, its type can't include them,
-and thus neither can the type of `e1`.
+  join
+     j @a1 ... @ak x1 ... xn = e1
+  in e2
+
+then all calls to `j` are in tail-call positions of `e`, and expressions in
+tail-call positions in `e` have the same type as `e`.
+Therefore the type of `e1` -- the return type of the join point -- must be the
+same as the type of e2.
+Since the type variables aren't bound in `e2`, its type can't include them, and
+thus neither can the type of `e1`.
+
+This unfortunately prevents the `go` in the following code from being a
+join-point:
+
+  iter :: forall a. Int -> (a -> a) -> a -> a
+  iter @a n f x = go @a n f x
+    where
+      go :: forall a. Int -> (a -> a) -> a -> a
+      go @a 0 _ x = x
+      go @a n f x = go @a (n-1) f (f x)
+
+In this case, a static argument transformation would fix that (see
+ticket #14620):
+
+  iter :: forall a. Int -> (a -> a) -> a -> a
+  iter @a n f x = go' @a n f x
+    where
+      go' :: Int -> (a -> a) -> a -> a
+      go' 0 _ x = x
+      go' n f x = go' (n-1) f (f x)
+
+In general, loopification could be employed to do that (see #14068.)
+
+Can we simply drop the requirement, and allow `go` to be a join-point? We
+could, and it would work. But we could not longer apply the case-of-join-point
+transformation universally. This transformation would do:
 
-There's a deeper explanation in terms of the sequent calculus in Section 5.3 of
-a previous paper:
+  case (join go @a n f x = case n of 0 -> x
+                                     n -> go @a (n-1) f (f x)
+        in go @Bool n neg True) of
+    True -> e1; False -> e2
 
-  Paul Downen, Luke Maurer, Zena Ariola, and Simon Peyton Jones. "Sequent
-  calculus as a compiler intermediate language." ICFP'16.
+ ===>
 
-  https://www.microsoft.com/en-us/research/wp-content/uploads/2016/04/sequent-calculus-icfp16.pdf
+  join go @a n f x = case n of 0 -> case x of True -> e1; False -> e2
+                          n -> go @a (n-1) f (f x)
+  in go @Bool n neg True
 
-The quick version: Consider the CPS term (the paper uses the sequent calculus,
-but we can translate readily):
+but that is ill-typed, as `x` is type `a`, not `Bool`.
 
-  \k -> join j @a1 ... @ak x1 ... xn = e1 k in e2 k
 
-Since `j` is a join point, it doesn't bind a continuation variable but reuses
-the variable `k` from the context. But the parameters `ai` are not in `k`'s
-scope, and `k`'s type determines the return type of `j`; thus the `ai`s don't
-appear in the return type of `j`. (Also, since `e1` and `e2` are passed the same
-continuation, they must have the same type; hence the direct explanation above.)
+This is also justifies why we do not consider the `e` in `e |> co` to be in
+tail position: A cast changes the type, but the type must be the same. But
+operationally, casts are vacuous, so this is a bit unfortunate! See #14610 for
+ideas how to fix this.
 
 ************************************************************************
 *                                                                      *