[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 5 commits: Account for local rules in specImports

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Tue Feb 28 20:15:06 UTC 2023



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
0c200ab7 by Simon Peyton Jones at 2023-02-28T11:10:31-05:00
Account for local rules in specImports

As #23024 showed, in GHC.Core.Opt.Specialise.specImports, we were
generating specialisations (a locally-define function) for imported
functions; and then generating specialisations for those
locally-defined functions.  The RULE for the latter should be
attached to the local Id, not put in the rules-for-imported-ids
set.

Fix is easy; similar to what happens in GHC.HsToCore.addExportFlagsAndRules

- - - - -
8b77f9bf by Sylvain Henry at 2023-02-28T11:11:21-05:00
JS: fix for overlap with copyMutableByteArray# (#23033)

The code wasn't taking into account some kind of overlap.

cgrun070 has been extended to test the missing case.

- - - - -
239202a2 by Sylvain Henry at 2023-02-28T11:12:03-05:00
Testsuite: replace some js_skip with req_cmm

req_cmm is more informative than js_skip

- - - - -
d8a31883 by Simon Peyton Jones at 2023-02-28T15:14:52-05:00
Take more care with unlifted bindings in the specialiser

As #22998 showed, we were floating an unlifted binding to top
level, which breaks a Core invariant.

The fix is easy, albeit a little bit conservative.  See
Note [Care with unlifted bindings] in GHC.Core.Opt.Specialise

- - - - -
fe3b7cfa by Simon Peyton Jones at 2023-02-28T15:14:53-05:00
Account for TYPE vs CONSTRAINT in mkSelCo

As #23018 showed, in mkRuntimeRepCo we need to account for coercions
between TYPE and COERCION.

See Note [mkRuntimeRepCo] in GHC.Core.Coercion.

- - - - -


28 changed files:

- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/Opt/Specialise.hs
- compiler/GHC/Core/Rules.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/HsToCore.hs
- compiler/GHC/StgToJS/Prim.hs
- rts/js/mem.js
- testsuite/driver/testlib.py
- testsuite/tests/cmm/should_compile/T21370/all.T
- testsuite/tests/cmm/should_compile/all.T
- testsuite/tests/cmm/should_run/all.T
- testsuite/tests/codeGen/should_compile/all.T
- testsuite/tests/codeGen/should_compile/cg010/all.T
- testsuite/tests/codeGen/should_run/CopySmallArray.hs
- testsuite/tests/codeGen/should_run/CopySmallArray.stdout
- testsuite/tests/codeGen/should_run/all.T
- testsuite/tests/codeGen/should_run/cgrun070.hs
- testsuite/tests/codeGen/should_run/cgrun070.stdout
- + testsuite/tests/simplCore/should_compile/T23024.hs
- + testsuite/tests/simplCore/should_compile/T23024a.hs
- testsuite/tests/simplCore/should_compile/all.T
- + testsuite/tests/simplCore/should_run/T22998.hs
- + testsuite/tests/simplCore/should_run/T22998.stdout
- testsuite/tests/simplCore/should_run/all.T
- + testsuite/tests/typecheck/should_compile/T23018.hs
- testsuite/tests/typecheck/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core.hs
=====================================
@@ -366,68 +366,32 @@ a Coercion, (sym c).
 
 Note [Core letrec invariant]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The right hand sides of all top-level and recursive @let at s
-/must/ be of lifted type (see "Type#type_classification" for
-the meaning of /lifted/ vs. /unlifted/).
+The Core letrec invariant:
 
-There is one exception to this rule, top-level @let at s are
-allowed to bind primitive string literals: see
-Note [Core top-level string literals].
+    The right hand sides of all
+      /top-level/ or /recursive/
+    bindings must be of lifted type
 
-Note [Core top-level string literals]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-As an exception to the usual rule that top-level binders must be lifted,
-we allow binding primitive string literals (of type Addr#) of type Addr# at the
-top level. This allows us to share string literals earlier in the pipeline and
-crucially allows other optimizations in the Core2Core pipeline to fire.
-Consider,
+    There is one exception to this rule, top-level @let at s are
+    allowed to bind primitive string literals: see
+    Note [Core top-level string literals].
 
-  f n = let a::Addr# = "foo"#
-        in \x -> blah
+See "Type#type_classification" in GHC.Core.Type
+for the meaning of "lifted" vs. "unlifted").
 
-In order to be able to inline `f`, we would like to float `a` to the top.
-Another option would be to inline `a`, but that would lead to duplicating string
-literals, which we want to avoid. See #8472.
-
-The solution is simply to allow top-level unlifted binders. We can't allow
-arbitrary unlifted expression at the top-level though, unlifted binders cannot
-be thunks, so we just allow string literals.
-
-We allow the top-level primitive string literals to be wrapped in Ticks
-in the same way they can be wrapped when nested in an expression.
-CoreToSTG currently discards Ticks around top-level primitive string literals.
-See #14779.
-
-Also see Note [Compilation plan for top-level string literals].
-
-Note [Compilation plan for top-level string literals]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Here is a summary on how top-level string literals are handled by various
-parts of the compilation pipeline.
-
-* In the source language, there is no way to bind a primitive string literal
-  at the top level.
-
-* In Core, we have a special rule that permits top-level Addr# bindings. See
-  Note [Core top-level string literals]. Core-to-core passes may introduce
-  new top-level string literals.
-
-* In STG, top-level string literals are explicitly represented in the syntax
-  tree.
-
-* A top-level string literal may end up exported from a module. In this case,
-  in the object file, the content of the exported literal is given a label with
-  the _bytes suffix.
+For the non-top-level, non-recursive case see Note [Core let-can-float invariant].
 
 Note [Core let-can-float invariant]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The let-can-float invariant:
 
-    The right hand side of a non-recursive 'Let'
-    /may/ be of unlifted type, but only if
+    The right hand side of a /non-top-level/, /non-recursive/ binding
+    may be of unlifted type, but only if
     the expression is ok-for-speculation
     or the 'Let' is for a join point.
 
+    (For top-level or recursive lets see Note [Core letrec invariant].)
+
 This means that the let can be floated around
 without difficulty. For example, this is OK:
 
@@ -466,6 +430,53 @@ we need to allow lots of things in the arguments of a call.
 
 TL;DR: we relaxed the let/app invariant to become the let-can-float invariant.
 
+Note [Core top-level string literals]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As an exception to the usual rule that top-level binders must be lifted,
+we allow binding primitive string literals (of type Addr#) of type Addr# at the
+top level. This allows us to share string literals earlier in the pipeline and
+crucially allows other optimizations in the Core2Core pipeline to fire.
+Consider,
+
+  f n = let a::Addr# = "foo"#
+        in \x -> blah
+
+In order to be able to inline `f`, we would like to float `a` to the top.
+Another option would be to inline `a`, but that would lead to duplicating string
+literals, which we want to avoid. See #8472.
+
+The solution is simply to allow top-level unlifted binders. We can't allow
+arbitrary unlifted expression at the top-level though, unlifted binders cannot
+be thunks, so we just allow string literals.
+
+We allow the top-level primitive string literals to be wrapped in Ticks
+in the same way they can be wrapped when nested in an expression.
+CoreToSTG currently discards Ticks around top-level primitive string literals.
+See #14779.
+
+Also see Note [Compilation plan for top-level string literals].
+
+Note [Compilation plan for top-level string literals]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Here is a summary on how top-level string literals are handled by various
+parts of the compilation pipeline.
+
+* In the source language, there is no way to bind a primitive string literal
+  at the top level.
+
+* In Core, we have a special rule that permits top-level Addr# bindings. See
+  Note [Core top-level string literals]. Core-to-core passes may introduce
+  new top-level string literals.
+
+  See GHC.Core.Utils.exprIsTopLevelBindable, and exprIsTickedString
+
+* In STG, top-level string literals are explicitly represented in the syntax
+  tree.
+
+* A top-level string literal may end up exported from a module. In this case,
+  in the object file, the content of the exported literal is given a label with
+  the _bytes suffix.
+
 Note [NON-BOTTOM-DICTS invariant]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 It is a global invariant (not checkable by Lint) that


=====================================
compiler/GHC/Core/Coercion.hs
=====================================
@@ -612,14 +612,40 @@ eqTyConRole tc
   | otherwise
   = pprPanic "eqTyConRole: unknown tycon" (ppr tc)
 
--- | Given a coercion @co1 :: (a :: TYPE r1) ~ (b :: TYPE r2)@,
---   (or CONSTRAINT instead of TYPE)
--- produce a coercion @rep_co :: r1 ~ r2 at .
+-- | Given a coercion `co :: (t1 :: TYPE r1) ~ (t2 :: TYPE r2)`
+-- produce a coercion `rep_co :: r1 ~ r2`
+-- But actually it is possible that
+--     co :: (t1 :: CONSTRAINT r1) ~ (t2 :: CONSTRAINT r2)
+-- or  co :: (t1 :: TYPE r1)       ~ (t2 :: CONSTRAINT r2)
+-- or  co :: (t1 :: CONSTRAINT r1) ~ (t2 :: TYPE r2)
+-- See Note [mkRuntimeRepCo]
 mkRuntimeRepCo :: HasDebugCallStack => Coercion -> Coercion
 mkRuntimeRepCo co
-  = mkSelCo (SelTyCon 0 Nominal) kind_co
+  = assert (isTYPEorCONSTRAINT k1 && isTYPEorCONSTRAINT k2) $
+    mkSelCo (SelTyCon 0 Nominal) kind_co
   where
     kind_co = mkKindCo co  -- kind_co :: TYPE r1 ~ TYPE r2
+    Pair k1 k2 = coercionKind kind_co
+
+{- Note [mkRuntimeRepCo]
+~~~~~~~~~~~~~~~~~~~~~~~~
+Given
+   class C a where { op :: Maybe a }
+we will get an axiom
+   axC a :: (C a :: CONSTRAINT r1) ~ (Maybe a :: TYPE r2)
+(See Note [Type and Constraint are not apart] in GHC.Builtin.Types.Prim.)
+
+Then we may call mkRuntimeRepCo on (axC ty), and that will return
+   mkSelCo (SelTyCon 0 Nominal) (Kind (axC ty)) :: r1 ~ r2
+
+So mkSelCo needs to be happy with decomposing a coercion of kind
+   CONSTRAINT r1 ~ TYPE r2
+
+Hence the use of `tyConIsTYPEorCONSTRAINT` in the assertion `good_call`
+in `mkSelCo`. See #23018 for a concrete example.  (In this context it's
+important that TYPE and CONSTRAINT have the same arity and kind, not
+merely that they are not-apart; otherwise SelCo would not make sense.)
+-}
 
 isReflCoVar_maybe :: Var -> Maybe Coercion
 -- If cv :: t~t then isReflCoVar_maybe cv = Just (Refl t)
@@ -1173,7 +1199,8 @@ mkSelCo_maybe cs co
        , Just (tc2, tys2) <- splitTyConApp_maybe ty2
        , let { len1 = length tys1
              ; len2 = length tys2 }
-       =  tc1 == tc2
+       =  (tc1 == tc2 || (tyConIsTYPEorCONSTRAINT tc1 && tyConIsTYPEorCONSTRAINT tc2))
+                      -- tyConIsTYPEorCONSTRAINT: see Note [mkRuntimeRepCo]
        && len1 == len2
        && n < len1
        && r == tyConRole (coercionRole co) tc1 n


=====================================
compiler/GHC/Core/Coercion/Opt.hs
=====================================
@@ -44,6 +44,13 @@ import Control.Monad   ( zipWithM )
 %*                                                                      *
 %************************************************************************
 
+This module does coercion optimisation.  See the paper
+
+   Evidence normalization in Systtem FV (RTA'13)
+   https://simon.peytonjones.org/evidence-normalization/
+
+The paper is also in the GHC repo, in docs/opt-coercion.
+
 Note [Optimising coercion optimisation]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Looking up a coercion's role or kind is linear in the size of the


=====================================
compiler/GHC/Core/Opt/Specialise.hs
=====================================
@@ -27,7 +27,7 @@ import GHC.Core
 import GHC.Core.Make      ( mkLitRubbish )
 import GHC.Core.Unify     ( tcMatchTy )
 import GHC.Core.Rules
-import GHC.Core.Utils     ( exprIsTrivial
+import GHC.Core.Utils     ( exprIsTrivial, exprIsTopLevelBindable
                           , mkCast, exprType
                           , stripTicksTop, mkInScopeSetBndrs )
 import GHC.Core.FVs
@@ -64,6 +64,7 @@ import GHC.Unit.Module( Module )
 import GHC.Unit.Module.ModGuts
 import GHC.Core.Unfold
 
+import Data.List( partition )
 import Data.List.NonEmpty ( NonEmpty (..) )
 
 {-
@@ -726,6 +727,33 @@ specialisation (see canSpecImport):
     Specialise even INLINE things; it hasn't inlined yet, so perhaps
     it never will.  Moreover it may have calls inside it that we want
     to specialise
+
+Wrinkle (W1): If we specialise an imported Id M.foo, we make a /local/
+binding $sfoo.  But specImports may further specialise $sfoo. So we end up
+with RULES for both M.foo (imported) and $sfoo (local).  Rules for local
+Ids should be attached to the Ids themselves (see GHC.HsToCore
+Note [Attach rules to local ids]); so we must partition the rules and
+attach the local rules.  That is done in specImports, via addRulesToId.
+
+Note [Glom the bindings if imported functions are specialised]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Suppose we have an imported, *recursive*, INLINABLE function
+   f :: Eq a => a -> a
+   f = /\a \d x. ...(f a d)...
+In the module being compiled we have
+   g x = f (x::Int)
+Now we'll make a specialised function
+   f_spec :: Int -> Int
+   f_spec = \x -> ...(f Int dInt)...
+   {-# RULE  f Int _ = f_spec #-}
+   g = \x. f Int dInt x
+Note that f_spec doesn't look recursive
+After rewriting with the RULE, we get
+   f_spec = \x -> ...(f_spec)...
+BUT since f_spec was non-recursive before it'll *stay* non-recursive.
+The occurrence analyser never turns a NonRec into a Rec.  So we must
+make sure that f_spec is recursive.  Easiest thing is to make all
+the specialisations for imported bindings recursive.
 -}
 
 specImports :: SpecEnv
@@ -740,16 +768,24 @@ specImports top_env (MkUD { ud_binds = dict_binds, ud_calls = calls })
   = do { let env_w_dict_bndrs = top_env `bringFloatedDictsIntoScope` dict_binds
        ; (_env, spec_rules, spec_binds) <- spec_imports env_w_dict_bndrs [] dict_binds calls
 
-             -- Don't forget to wrap the specialized bindings with
-             -- bindings for the needed dictionaries.
-             -- See Note [Wrap bindings returned by specImports]
-             -- and Note [Glom the bindings if imported functions are specialised]
-       ; let final_binds
+             -- Make a Rec: see Note [Glom the bindings if imported functions are specialised]
+             --
+             -- wrapDictBinds: don't forget to wrap the specialized bindings with
+             --   bindings for the needed dictionaries.
+             --   See Note [Wrap bindings returned by specImports]
+             --
+             -- addRulesToId: see Wrinkle (W1) in Note [Specialising imported functions]
+             --               c.f. GHC.HsToCore.addExportFlagsAndRules
+       ; let (rules_for_locals, rules_for_imps) = partition isLocalRule spec_rules
+             local_rule_base = extendRuleBaseList emptyRuleBase rules_for_locals
+             final_binds
                | null spec_binds = wrapDictBinds dict_binds []
-               | otherwise       = [Rec $ flattenBinds $
-                                    wrapDictBinds dict_binds spec_binds]
+               | otherwise       = [Rec $ mapFst (addRulesToId local_rule_base) $
+                                          flattenBinds                          $
+                                          wrapDictBinds dict_binds              $
+                                          spec_binds]
 
-       ; return (spec_rules, final_binds)
+       ; return (rules_for_imps, final_binds)
     }
 
 -- | Specialise a set of calls to imported bindings
@@ -1111,27 +1147,6 @@ And if the call is to the same type, one specialisation is enough.
 Avoiding this recursive specialisation loop is one reason for the
 'callers' stack passed to specImports and specImport.
 
-Note [Glom the bindings if imported functions are specialised]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Suppose we have an imported, *recursive*, INLINABLE function
-   f :: Eq a => a -> a
-   f = /\a \d x. ...(f a d)...
-In the module being compiled we have
-   g x = f (x::Int)
-Now we'll make a specialised function
-   f_spec :: Int -> Int
-   f_spec = \x -> ...(f Int dInt)...
-   {-# RULE  f Int _ = f_spec #-}
-   g = \x. f Int dInt x
-Note that f_spec doesn't look recursive
-After rewriting with the RULE, we get
-   f_spec = \x -> ...(f_spec)...
-BUT since f_spec was non-recursive before it'll *stay* non-recursive.
-The occurrence analyser never turns a NonRec into a Rec.  So we must
-make sure that f_spec is recursive.  Easiest thing is to make all
-the specialisations for imported bindings recursive.
-
-
 
 ************************************************************************
 *                                                                      *
@@ -1500,7 +1515,10 @@ specBind top_lvl env (NonRec fn rhs) do_body
                          = [mkDB $ NonRec b r | (b,r) <- pairs]
                            ++ fromOL dump_dbs
 
-       ; if float_all then
+             can_float_this_one = exprIsTopLevelBindable rhs (idType fn)
+             -- exprIsTopLevelBindable: see Note [Care with unlifted bindings]
+
+       ; if float_all && can_float_this_one then
              -- Rather than discard the calls mentioning the bound variables
              -- we float this (dictionary) binding along with the others
               return ([], body', all_free_uds `snocDictBinds` final_binds)
@@ -1861,6 +1879,28 @@ even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to
 the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to
 preserve laziness.
 
+Note [Care with unlifted bindings]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider (#22998)
+    f x = let x::ByteArray# = <some literal>
+              n::Natural    = NB x
+          in wombat @192827 (n |> co)
+where
+  co :: Natural ~ KnownNat 192827
+  wombat :: forall (n:Nat). KnownNat n => blah
+
+Left to itself, the specialiser would float the bindings for `x` and `n` to top
+level, so we can specialise `wombat`.  But we can't have a top-level ByteArray#
+(see Note [Core letrec invariant] in GHC.Core).  Boo.
+
+This is pretty exotic, so we take a simple way out: in specBind (the NonRec
+case) do not float the binding itself unless it satisfies exprIsTopLevelBindable.
+This is conservative: maybe the RHS of `x` has a free var that would stop it
+floating to top level anyway; but that is hard to spot (since we don't know what
+the non-top-level in-scope binders are) and rare (since the binding must satisfy
+Note [Core let-can-float invariant] in GHC.Core).
+
+
 Note [Specialising Calls]
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 Suppose we have a function with a complicated type:


=====================================
compiler/GHC/Core/Rules.hs
=====================================
@@ -22,7 +22,7 @@ module GHC.Core.Rules (
 
         -- ** Manipulating 'RuleInfo' rules
         extendRuleInfo, addRuleInfo,
-        addIdSpecialisations,
+        addIdSpecialisations, addRulesToId,
 
         -- ** RuleBase and RuleEnv
 
@@ -349,6 +349,14 @@ addIdSpecialisations id rules
   = setIdSpecialisation id $
     extendRuleInfo (idSpecialisation id) rules
 
+addRulesToId :: RuleBase -> Id -> Id
+-- Add rules in the RuleBase to the rules in the Id
+addRulesToId rule_base bndr
+  | Just rules <- lookupNameEnv rule_base (idName bndr)
+  = bndr `addIdSpecialisations` rules
+  | otherwise
+  = bndr
+
 -- | Gather all the rules for locally bound identifiers from the supplied bindings
 rulesOfBinds :: [CoreBind] -> [CoreRule]
 rulesOfBinds binds = concatMap (concatMap idCoreRules . bindersOf) binds


=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -124,7 +124,7 @@ module GHC.Core.Type (
 
         -- *** Levity and boxity
         sORTKind_maybe, typeTypeOrConstraint,
-        typeLevity_maybe,
+        typeLevity_maybe, tyConIsTYPEorCONSTRAINT,
         isLiftedTypeKind, isUnliftedTypeKind, pickyIsLiftedTypeKind,
         isLiftedRuntimeRep, isUnliftedRuntimeRep, runtimeRepLevity_maybe,
         isBoxedRuntimeRep,
@@ -2652,13 +2652,18 @@ isPredTy ty = case typeTypeOrConstraint ty of
                   TypeLike       -> False
                   ConstraintLike -> True
 
------------------------------------------
 -- | Does this classify a type allowed to have values? Responds True to things
 -- like *, TYPE Lifted, TYPE IntRep, TYPE v, Constraint.
 isTYPEorCONSTRAINT :: Kind -> Bool
 -- ^ True of a kind `TYPE _` or `CONSTRAINT _`
 isTYPEorCONSTRAINT k = isJust (sORTKind_maybe k)
 
+tyConIsTYPEorCONSTRAINT :: TyCon -> Bool
+tyConIsTYPEorCONSTRAINT tc
+  = tc_uniq == tYPETyConKey || tc_uniq == cONSTRAINTTyConKey
+  where
+    !tc_uniq = tyConUnique tc
+
 isConstraintLikeKind :: Kind -> Bool
 -- True of (CONSTRAINT _)
 isConstraintLikeKind kind


=====================================
compiler/GHC/HsToCore.hs
=====================================
@@ -362,32 +362,28 @@ deSugarExpr hsc_env tc_expr = do
 addExportFlagsAndRules
     :: Backend -> NameSet -> NameSet -> [CoreRule]
     -> [(Id, t)] -> [(Id, t)]
-addExportFlagsAndRules bcknd exports keep_alive rules = mapFst add_one
+addExportFlagsAndRules bcknd exports keep_alive rules
+  = mapFst (addRulesToId rule_base . add_export_flag)
+        -- addRulesToId: see Note [Attach rules to local ids]
+        -- NB: the binder might have some existing rules,
+        -- arising from specialisation pragmas
+
   where
-    add_one bndr = add_rules name (add_export name bndr)
-       where
-         name = idName bndr
 
     ---------- Rules --------
-        -- See Note [Attach rules to local ids]
-        -- NB: the binder might have some existing rules,
-        -- arising from specialisation pragmas
-    add_rules name bndr
-        | Just rules <- lookupNameEnv rule_base name
-        = bndr `addIdSpecialisations` rules
-        | otherwise
-        = bndr
     rule_base = extendRuleBaseList emptyRuleBase rules
 
     ---------- Export flag --------
     -- See Note [Adding export flags]
-    add_export name bndr
-        | dont_discard name = setIdExported bndr
+    add_export_flag bndr
+        | dont_discard bndr = setIdExported bndr
         | otherwise         = bndr
 
-    dont_discard :: Name -> Bool
-    dont_discard name = is_exported name
+    dont_discard :: Id -> Bool
+    dont_discard bndr = is_exported name
                      || name `elemNameSet` keep_alive
+       where
+         name = idName bndr
 
         -- In interactive mode, we don't want to discard any top-level
         -- entities at all (eg. do not inline them away during


=====================================
compiler/GHC/StgToJS/Prim.hs
=====================================
@@ -573,11 +573,7 @@ genPrim prof bound ty op = case op of
         [ d .! (Add di i) |= s .! (Add si i)
         , postDecrS i
         ]
-  CopySmallMutableArrayOp    -> \[]    [s,si,d,di,n] -> PrimInline $
-      loopBlockS (Sub n one_) (.>=. zero_) \i ->
-        [ d .! (Add di i) |= s .! (Add si i)
-        , postDecrS i
-        ]
+  CopySmallMutableArrayOp    -> \[]    [s,si,d,di,n] -> PrimInline $ appS "h$copyMutableArray" [s,si,d,di,n]
   CloneSmallArrayOp          -> \[r]   [a,o,n]       -> PrimInline $ cloneArray r a (Just o) n
   CloneSmallMutableArrayOp   -> \[r]   [a,o,n]       -> PrimInline $ cloneArray r a (Just o) n
   FreezeSmallArrayOp         -> \[r]   [a,o,n]       -> PrimInline $ cloneArray r a (Just o) n
@@ -719,10 +715,7 @@ genPrim prof bound ty op = case op of
   CopyByteArrayOp -> \[] [a1,o1,a2,o2,n] ->
       PrimInline . boundsChecked bound a1 (Add o1 (Sub n 1))
                  . boundsChecked bound a2 (Add o2 (Sub n 1))
-                $ loopBlockS (Sub n one_) (.>=. zero_) \i ->
-                    [ write_u8 a2 (Add i o2) (read_u8 a1 (Add i o1))
-                    , postDecrS i
-                    ]
+                 $ appS "h$copyMutableByteArray" [a1,o1,a2,o2,n]
   CopyMutableByteArrayOp       -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
   CopyByteArrayToAddrOp        -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
   CopyMutableByteArrayToAddrOp -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs


=====================================
rts/js/mem.js
=====================================
@@ -531,12 +531,17 @@ function h$sliceArray(a, start, n) {
   return r;
 }
 
+//////////////////////////////////////////////////////////
+//
 // copy between two mutable arrays. Range may overlap
+// so we check which offset is bigger to make a front-to-back or
+// back-to-front traversal of the arrays.
+
 function h$copyMutableArray(a1,o1,a2,o2,n) {
   if (n <= 0) return;
 
   if (o1 < o2) {
-    for (var i=n-1;i>=0;i--) { // start from the end to handle potential overlap
+    for (var i=n-1;i>=0;i--) {
       a2[o2+i] = a1[o1+i];
     }
   } else {
@@ -546,6 +551,22 @@ function h$copyMutableArray(a1,o1,a2,o2,n) {
   }
 }
 
+function h$copyMutableByteArray(a1,o1,a2,o2,n) {
+  if (n <= 0) return;
+
+  if (o1 < o2) {
+    for (var i=n-1;i>=0;i--) {
+      a2.u8[o2+i] = a1.u8[o1+i];
+    }
+  } else {
+    for (var i=0;i<n;i++) {
+      a2.u8[o2+i] = a1.u8[o1+i];
+    }
+  }
+}
+
+//////////////////////////////////////////////////////////
+
 function h$memcpy() {
   if(arguments.length === 3) {  // ByteArray# -> ByteArray# copy
     var dst = arguments[0];


=====================================
testsuite/driver/testlib.py
=====================================
@@ -273,6 +273,13 @@ def req_c( name, opts ):
     # JS backend doesn't support C (yet)
     js_skip(name, opts)
 
+def req_cmm( name, opts ):
+    """
+    Mark a test as requiring Cmm support
+    """
+    # JS backend doesn't support Cmm
+    js_skip(name, opts)
+
 def req_ffi_exports( name, opts):
     """
     Mark a test as requiring FFI exports
@@ -771,8 +778,7 @@ def objcpp_src( name, opts ):
 
 def cmm_src( name, opts ):
     opts.cmm_src = True
-    # JS backend doesn't support Cmm
-    js_skip(name, opts)
+    req_cmm(name, opts)
 
 def outputdir( odir ):
     return lambda name, opts, d=odir: _outputdir(name, opts, d)


=====================================
testsuite/tests/cmm/should_compile/T21370/all.T
=====================================
@@ -1,4 +1,4 @@
 test('T21370',
   [ extra_files(["subdir", "test.cmm", "test2.cmm", "Main.hs"])
-  , js_skip # use Cmm
+  , req_cmm
   ], makefile_test, [])


=====================================
testsuite/tests/cmm/should_compile/all.T
=====================================
@@ -1,5 +1,5 @@
 setTestOpts(
-  [ js_skip # Cmm not supported by the JS backend
+  [ req_cmm
   ])
 
 test('selfloop', [cmm_src], compile, ['-no-hs-main'])


=====================================
testsuite/tests/cmm/should_run/all.T
=====================================
@@ -6,7 +6,7 @@ test('HooplPostorder',
 test('cmp64',
      [    extra_run_opts('"' + config.libdir + '"')
      ,    omit_ways(['ghci'])
-     ,    js_skip
+     ,    req_cmm
      ],
      multi_compile_and_run,
      ['cmp64', [('cmp64_cmm.cmm', '')], '-O'])
@@ -21,7 +21,7 @@ test('cmp64',
 test('ByteSwitch',
      [    extra_run_opts('"' + config.libdir + '"')
      ,    omit_ways(['ghci'])
-     ,    js_skip
+     ,    req_cmm
      ],
      multi_compile_and_run,
      ['ByteSwitch', [('ByteSwitch_cmm.cmm', '')], ''])
@@ -29,7 +29,7 @@ test('ByteSwitch',
 test('T22871',
      [    extra_run_opts('"' + config.libdir + '"')
      ,    omit_ways(['ghci'])
-     ,    js_skip
+     ,    req_cmm
      ,    when(arch('i386'), skip) # x86 NCG panics with "iselExpr64(i386)"
      ],
      multi_compile_and_run,


=====================================
testsuite/tests/codeGen/should_compile/all.T
=====================================
@@ -29,7 +29,7 @@ test('T9329', [when(unregisterised(), expect_broken(15467)), cmm_src], compile,
 
 test('debug',
   [ normal,
-    js_skip # requires Cmm
+    req_cmm
   ],
   makefile_test, [])
 
@@ -72,7 +72,7 @@ test('T17334', [ unless(have_ncg() and (arch('x86_64') or arch('i386')), skip)
                ], compile, ['-O'])
 
 test('T14373',
-      [ js_skip # JS backend doesn't produce Cmm
+      [ req_cmm
       ],
      multimod_compile_filter, ['T14373', '-fasm -O2 -c -ddump-cmm-from-stg',
      r'grep -e "const T14373\.._closure+.;"'])
@@ -80,17 +80,17 @@ test('T14373',
 switch_skeleton_only = r'grep -e "switch \[" -e "case " -e "default: " | sed -e "s|\] .*|\]|g" -e "s|goto .*|goto |g"'
 
 test('T14373a',
-      [ js_skip # JS backend doesn't produce Cmm
+      [ req_cmm
       ],
      multimod_compile_filter, ['T14373a', '-fasm -O2 -c -ddump-cmm-from-stg',
      switch_skeleton_only])
 test('T14373b',
-      [ js_skip # JS backend doesn't produce Cmm
+      [ req_cmm
       ],
      multimod_compile_filter, ['T14373b', '-fasm -O2 -c -ddump-cmm-from-stg',
      switch_skeleton_only])
 test('T14373c',
-      [ js_skip # JS backend doesn't produce Cmm
+      [ req_cmm
       ],
      multimod_compile_filter, ['T14373c', '-fasm -O2 -c -ddump-cmm-from-stg',
      switch_skeleton_only])
@@ -99,7 +99,7 @@ switch_skeleton_and_entries_only = (r'grep -e "switch \[" -e "case " -e "default
                                     r'| sed -e "s|\] .*|\]|g" -e "s|goto .*|goto |g" -e "s|R1 = .*_closure+2;.*|R1 = XYZ_closure+2;|g" -e "s|//.*|//|g"')
 
 test('T14373d',
-      [ js_skip # JS backend doesn't produce Cmm
+      [ req_cmm
       ],
      multimod_compile_filter, ['T14373d', '-fasm -O2 -c -ddump-cmm-from-stg',
      switch_skeleton_and_entries_only])


=====================================
testsuite/tests/codeGen/should_compile/cg010/all.T
=====================================
@@ -1,4 +1,4 @@
 test('cg010',
   [ extra_files(['A.hs','Main.hs'])
-  , js_skip # skip with JS backend because Cmm is required
+  , req_cmm
   ], makefile_test, ['cg010'])


=====================================
testsuite/tests/codeGen/should_run/CopySmallArray.hs
=====================================
@@ -76,12 +76,21 @@ test_copyMutableArray =
 -- Perform a copy where the source and destination part overlap.
 test_copyMutableArrayOverlap :: String
 test_copyMutableArrayOverlap =
-    let arr = runST $ do
+    let arr1 = runST $ do
             marr <- fromList inp
             -- Overlap of two elements
             copyMutableArray marr 5 marr 7 8
             unsafeFreezeArray marr
-    in shows (toList arr (length inp)) "\n"
+        arr2 = runST $ do
+            marr <- fromList inp
+            -- Overlap of two elements
+            -- Offset 1 > offset 2 (cf #23033)
+            copyMutableArray marr 7 marr 5 8
+            unsafeFreezeArray marr
+    in shows (toList arr1 (length inp))
+       . showChar '\n'
+       . shows (toList arr2 (length inp))
+       $ "\n"
   where
      -- This case was known to fail at some point.
      inp = [0,169,196,9,16,25,36,16,25,81,100,121,144,169,196]


=====================================
testsuite/tests/codeGen/should_run/CopySmallArray.stdout
=====================================
@@ -3,6 +3,7 @@
 [-1,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256,289,324,361,400,441,484,529,576,625,676,729,784,841,900,961,1024,1089,1156,1225,1296,1369,1444,1521,1600,1681,1764,1849,1936,2025,2116,2209,2304,2401,2500,2601,2704,2809,2916,3025,3136,3249,3364,3481,3600,3721,3844,3969,4096,4225,4356,4489,4624,4761,4900,5041,5184,5329,5476,5625,5776,5929,6084,6241,6400,6561,6724,6889,7056,7225,7396,7569,7744,7921,8100,8281,8464,8649,8836,9025,9216,9409,9604,9801,10000,10201,10404,10609,10816,11025,11236,11449,11664,11881,12100,12321,12544,12769,12996,13225,13456,13689,13924,14161,14400,14641,14884,15129,15376,15625,15876,16129,16384,-1]
 
 [0,169,196,9,16,25,36,25,36,16,25,81,100,121,144]
+[0,169,196,9,16,16,25,81,100,121,144,169,196,169,196]
 
 [1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256,289,324,361,400,441,484,529,576,625,676,729,784,841,900,961,1024,1089,1156,1225,1296,1369,1444,1521,1600,1681,1764,1849,1936,2025,2116,2209,2304,2401,2500,2601,2704,2809,2916,3025,3136,3249,3364,3481,3600,3721,3844,3969,4096,4225,4356,4489,4624,4761,4900,5041,5184,5329,5476,5625,5776,5929,6084,6241,6400,6561,6724,6889,7056,7225,7396,7569,7744,7921,8100,8281,8464,8649,8836,9025,9216,9409,9604,9801,10000,10201,10404,10609,10816,11025,11236,11449,11664,11881,12100,12321,12544,12769,12996,13225,13456,13689,13924,14161,14400,14641,14884,15129,15376,15625,15876,16129,16384]
 


=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -74,7 +74,7 @@ test('cgrun065', normal, compile_and_run, [''])
 test('cgrun066', normal, compile_and_run, [''])
 test('cgrun067', [extra_files(['Cgrun067A.hs'])], compile_and_run, [''])
 test('cgrun069',
-     [ omit_ways(['ghci']), js_skip],
+     [ omit_ways(['ghci']), req_cmm],
      multi_compile_and_run,
      ['cgrun069', [('cgrun069_cmm.cmm', '')], ''])
 test('cgrun070', normal, compile_and_run, [''])
@@ -99,7 +99,7 @@ test('T3207', normal, compile_and_run, [''])
 test('T3561', normal, compile_and_run, [''])
 test('T3677', extra_run_opts('+RTS -K8k -RTS'), compile_and_run, [''])
 test('T4441', normal, compile_and_run, [''])
-test('T5149', [omit_ways(['ghci']), js_skip], multi_compile_and_run,
+test('T5149', [omit_ways(['ghci']), req_cmm], multi_compile_and_run,
                  ['T5149', [('T5149_cmm.cmm', '')], ''])
 test('T5129',
      # The bug is in simplifier when run with -O1 and above, so only run it
@@ -148,8 +148,8 @@ test('T9013', omit_ways(['ghci']),  # ghci doesn't support unboxed tuples
      compile_and_run, [''])
 test('T9340', normal, compile_and_run, [''])
 test('cgrun074', normal, compile_and_run, [''])
-test('CmmSwitchTest32', [unless(wordsize(32), skip),js_skip], compile_and_run, [''])
-test('CmmSwitchTest64', [unless(wordsize(64), skip),js_skip], compile_and_run, [''])
+test('CmmSwitchTest32', [unless(wordsize(32), skip), req_cmm], compile_and_run, [''])
+test('CmmSwitchTest64', [unless(wordsize(64), skip), req_cmm], compile_and_run, [''])
 # Skipping WAY=ghci, because it is not broken.
 test('T10245', normal, compile_and_run, [''])
 test('T10246', normal, compile_and_run, [''])
@@ -163,7 +163,7 @@ test('T10521b', normal, compile_and_run, [''])
 test('T10870', when(wordsize(32), skip), compile_and_run, [''])
 test('PopCnt',
   [omit_ways(['ghci'])
-  , js_skip # use Cmm
+  ,req_cmm
   ], multi_compile_and_run,
                  ['PopCnt', [('PopCnt_cmm.cmm', '')], ''])
 test('T12059',


=====================================
testsuite/tests/codeGen/should_run/cgrun070.hs
=====================================
@@ -74,12 +74,21 @@ test_copyMutableByteArray =
 -- Perform a copy where the source and destination part overlap.
 test_copyMutableByteArrayOverlap :: String
 test_copyMutableByteArrayOverlap =
-    let arr = runST $ do
+    let arr1 = runST $ do
             marr <- fromList inp
             -- Overlap of two elements
             copyMutableByteArray marr 5 marr 7 8
             unsafeFreezeByteArray marr
-    in shows (toList arr (length inp)) "\n"
+        arr2 = runST $ do
+            marr <- fromList inp
+            -- Overlap of two elements
+            -- Offset 1 > offset 2 (cf #23033)
+            copyMutableByteArray marr 7 marr 5 8
+            unsafeFreezeByteArray marr
+    in shows (toList arr1 (length inp))
+       . showChar '\n'
+       . shows (toList arr2 (length inp))
+       $ "\n"
   where
      -- This case was known to fail at some point.
      inp = [0,169,196,9,16,25,36,16,25,81,100,121,144,169,196]


=====================================
testsuite/tests/codeGen/should_run/cgrun070.stdout
=====================================
@@ -3,6 +3,7 @@
 [255,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,255]
 
 [0,169,196,9,16,25,36,25,36,16,25,81,100,121,144]
+[0,169,196,9,16,16,25,81,100,121,144,169,196,169,196]
 
 [255,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,255]
 


=====================================
testsuite/tests/simplCore/should_compile/T23024.hs
=====================================
@@ -0,0 +1,8 @@
+{-# OPTIONS_GHC -fspecialize-aggressively -fexpose-all-unfoldings  #-}
+{-# LANGUAGE RankNTypes #-}
+module T23024 (testPolyn) where
+
+import T23024a
+
+testPolyn :: (forall r. Tensor r => r) -> Vector Double
+testPolyn f = gradientFromDelta f


=====================================
testsuite/tests/simplCore/should_compile/T23024a.hs
=====================================
@@ -0,0 +1,82 @@
+{-# OPTIONS_GHC -fspecialize-aggressively -fexpose-all-unfoldings -Wno-missing-methods #-}
+{-# LANGUAGE FlexibleInstances, FlexibleContexts, UndecidableInstances,
+             DataKinds, MultiParamTypeClasses, RankNTypes, MonoLocalBinds #-}
+module T23024a where
+
+import System.IO.Unsafe
+import Control.Monad.ST ( ST, runST )
+import Foreign.ForeignPtr
+import Foreign.Storable
+import GHC.ForeignPtr ( unsafeWithForeignPtr )
+
+class MyNum a where
+  fi :: a
+
+class (MyNum a, Eq a) => MyReal a
+
+class (MyReal a) => MyRealFrac a  where
+  fun :: a -> ()
+
+class (MyRealFrac a, MyNum a) => MyRealFloat a
+
+instance MyNum Double
+instance MyReal Double
+instance MyRealFloat Double
+instance MyRealFrac Double
+
+newtype Vector a = Vector (ForeignPtr a)
+
+class GVector v a where
+instance Storable a => GVector Vector a
+
+vunstream :: () -> ST s (v a)
+vunstream () = vunstream ()
+
+empty :: GVector v a => v a
+empty = runST (vunstream ())
+{-# NOINLINE empty #-}
+
+instance (Storable a, Eq a) => Eq (Vector a) where
+  xs == ys = idx xs == idx ys
+
+{-# NOINLINE idx #-}
+idx (Vector fp) = unsafePerformIO
+                        $ unsafeWithForeignPtr fp $ \p ->
+                          peekElemOff p 0
+
+instance MyNum (Vector Double)
+instance (MyNum (Vector a), Storable a, Eq a) => MyReal (Vector a)
+instance (MyNum (Vector a), Storable a, Eq a) => MyRealFrac (Vector a)
+instance (MyNum (Vector a), Storable a, MyRealFloat a) => MyRealFloat (Vector a)
+
+newtype ORArray a = A a
+
+instance (Eq a) => Eq (ORArray a) where
+  A x == A y = x == y
+
+instance (MyNum (Vector a)) => MyNum (ORArray a)
+instance (MyNum (Vector a), Storable a, Eq a) => MyReal (ORArray a)
+instance (MyRealFrac (Vector a), Storable a, Eq a) => MyRealFrac (ORArray a)
+instance (MyRealFloat (Vector a), Storable a, Eq a) => MyRealFloat (ORArray a)
+
+newtype Ast r = AstConst (ORArray r)
+
+instance Eq (Ast a) where
+  (==) = undefined
+
+instance MyNum (ORArray a) => MyNum (Ast a) where
+  fi = AstConst fi
+
+instance MyNum (ORArray a) => MyReal (Ast a)
+instance MyRealFrac (ORArray a) => MyRealFrac (Ast a) where
+  {-# INLINE fun #-}
+  fun x = ()
+  
+instance MyRealFloat (ORArray a) => MyRealFloat (Ast a)
+
+class (MyRealFloat a) => Tensor a
+instance (MyRealFloat a, MyNum (Vector a), Storable a) => Tensor (Ast a)
+
+gradientFromDelta :: Storable a => Ast a -> Vector a
+gradientFromDelta _ = empty
+{-# NOINLINE gradientFromDelta #-}


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -475,3 +475,4 @@ test('T22761', normal, multimod_compile, ['T22761', '-O2 -v0'])
 test('T23012', normal, compile, ['-O'])
 
 test('RewriteHigherOrderPatterns', normal, compile, ['-O -ddump-rule-rewrites -dsuppress-all -dsuppress-uniques'])
+test('T23024', normal, multimod_compile, ['T23024', '-O -v0'])


=====================================
testsuite/tests/simplCore/should_run/T22998.hs
=====================================
@@ -0,0 +1,10 @@
+{-# LANGUAGE DataKinds #-}
+module Main where
+
+import Data.Proxy (Proxy(Proxy))
+import GHC.TypeLits (natVal)
+
+main :: IO ()
+main = print x
+  where
+    x = natVal @18446744073709551616 Proxy + natVal @18446744073709551616 Proxy


=====================================
testsuite/tests/simplCore/should_run/T22998.stdout
=====================================
@@ -0,0 +1 @@
+36893488147419103232


=====================================
testsuite/tests/simplCore/should_run/all.T
=====================================
@@ -108,3 +108,5 @@ test('T21575', normal, compile_and_run, ['-O'])
 test('T21575b', [], multimod_compile_and_run, ['T21575b', '-O'])
 test('T20836', normal, compile_and_run, ['-O0']) # Should not time out; See #20836
 test('T22448', normal, compile_and_run, ['-O1'])
+test('T22998', normal, compile_and_run, ['-O0 -fspecialise -dcore-lint'])
+


=====================================
testsuite/tests/typecheck/should_compile/T23018.hs
=====================================
@@ -0,0 +1,9 @@
+module T23018 where
+
+import qualified Control.DeepSeq as DeepSeq
+
+class XX f where
+   rnf :: DeepSeq.NFData a => f a -> ()
+
+instance XX Maybe where
+   rnf = DeepSeq.rnf


=====================================
testsuite/tests/typecheck/should_compile/all.T
=====================================
@@ -863,3 +863,4 @@ test('T22912', normal, compile, [''])
 test('T22924', normal, compile, [''])
 test('T22985a', normal, compile, ['-O'])
 test('T22985b', normal, compile, [''])
+test('T23018', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/e7c86c812d2093494e56867e18307b851865671d...fe3b7cfa51a74e99166b900f5ce56a36a3c42ffc

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/e7c86c812d2093494e56867e18307b851865671d...fe3b7cfa51a74e99166b900f5ce56a36a3c42ffc
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230228/3266b38f/attachment-0001.html>


More information about the ghc-commits mailing list