[Git][ghc/ghc][wip/T22439] Improve the Lint checking for empty cases

Simon Peyton Jones (@simonpj) gitlab at gitlab.haskell.org
Fri Nov 11 13:27:48 UTC 2022



Simon Peyton Jones pushed to branch wip/T22439 at Glasgow Haskell Compiler / GHC


Commits:
1289ccbc by Simon Peyton Jones at 2022-11-11T13:29:36+00:00
Improve the Lint checking for empty cases

- - - - -


4 changed files:

- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Utils.hs


Changes:

=====================================
compiler/GHC/Core/Lint.hs
=====================================
@@ -56,7 +56,7 @@ import GHC.Core.TyCon as TyCon
 import GHC.Core.Coercion.Axiom
 import GHC.Core.Unify
 import GHC.Core.Coercion.Opt ( checkAxInstCo )
-import GHC.Core.Opt.Arity    ( typeArity, exprIsDeadEnd )
+import GHC.Core.Opt.Arity    ( typeArity, tryHardExprIsDeadEnd )
 
 import GHC.Core.Opt.Monad
 
@@ -84,7 +84,6 @@ import GHC.Data.List.SetOps
 import GHC.Utils.Monad
 import GHC.Utils.Outputable as Outputable
 import GHC.Utils.Panic
-import GHC.Utils.Constants (debugIsOn)
 import GHC.Utils.Misc
 import GHC.Utils.Error
 import qualified GHC.Utils.Error as Err
@@ -1227,49 +1226,6 @@ Utterly bogus.  `f` expects an `Int` and we are giving it an `Age`.
 No no no.  Casts destroy the tail-call property.  Henc markAllJoinsBad
 in the (Cast expr co) case of lintCoreExpr.
 
-Note [No alternatives lint check]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Case expressions with no alternatives are odd beasts, and it would seem
-like they would worth be looking at in the linter (cf #10180). We
-used to check two things:
-
-* exprIsHNF is false: it would *seem* to be terribly wrong if
-  the scrutinee was already in head normal form.
-
-* exprIsDeadEnd is true: we should be able to see why GHC believes the
-  scrutinee is diverging for sure.
-
-It was already known that the second test was not entirely reliable.
-Unfortunately (#13990), the first test turned out not to be reliable
-either. Getting the checks right turns out to be somewhat complicated.
-
-For example, suppose we have (comment 8)
-
-  data T a where
-    TInt :: T Int
-
-  absurdTBool :: T Bool -> a
-  absurdTBool v = case v of
-
-  data Foo = Foo !(T Bool)
-
-  absurdFoo :: Foo -> a
-  absurdFoo (Foo x) = absurdTBool x
-
-GHC initially accepts the empty case because of the GADT conditions. But then
-we inline absurdTBool, getting
-
-  absurdFoo (Foo x) = case x of
-
-x is in normal form (because the Foo constructor is strict) but the
-case is empty. To avoid this problem, GHC would have to recognize
-that matching on Foo x is already absurd, which is not so easy.
-
-More generally, we don't really know all the ways that GHC can
-lose track of why an expression is bottom, so we shouldn't make too
-much fuss when that happens.
-
-
 Note [Beta redexes]
 ~~~~~~~~~~~~~~~~~~~
 Consider:
@@ -1434,55 +1390,45 @@ lintTyKind tyvar arg_ty
 -}
 
 lintCaseExpr :: CoreExpr -> Id -> Type -> [CoreAlt] -> LintM (LintedType, UsageEnv)
-lintCaseExpr scrut var alt_ty alts =
-  do { let e = Case scrut var alt_ty alts   -- Just for error messages
+lintCaseExpr scrut case_bndr alt_ty alts =
+  do { let e = Case scrut case_bndr alt_ty alts   -- Just for error messages
 
      -- Check the scrutinee
      ; (scrut_ty, scrut_ue) <- markAllJoinsBad $ lintCoreExpr scrut
           -- See Note [Join points are less general than the paper]
           -- in GHC.Core
-     ; let scrut_mult = varMult var
+     ; let scrut_mult = varMult case_bndr
 
      ; alt_ty <- addLoc (CaseTy scrut) $
                  lintValueType alt_ty
-     ; var_ty <- addLoc (IdTy var) $
-                 lintValueType (idType var)
 
-     -- We used to try to check whether a case expression with no
-     -- alternatives was legitimate, but this didn't work.
-     -- See Note [No alternatives lint check] for details.
+     -- Don't use lintIdBndr on case_bndr, because unboxed tuple is legitimate
+     ; case_bndr_ty <- addLoc (IdTy case_bndr) $
+                       lintValueType (idType case_bndr)
 
      -- Check that the scrutinee is not a floating-point type
      -- if there are any literal alternatives
      -- See GHC.Core Note [Case expression invariants] item (5)
      -- See Note [Rules for floating-point comparisons] in GHC.Core.Opt.ConstantFold
-     ; let isLitPat (Alt (LitAlt _) _  _) = True
-           isLitPat _                     = False
-     ; checkL (not $ isFloatingPrimTy scrut_ty && any isLitPat alts)
+     ; let is_lit_alt (Alt (LitAlt _) _  _) = True
+           is_lit_alt _                     = False
+     ; checkL (not $ isFloatingPrimTy scrut_ty && any is_lit_alt alts)
          (text "Lint warning: Scrutinising floating-point expression with literal pattern in case analysis (see #9238)."
           $$ text "scrut" <+> ppr scrut)
 
-     ; case tyConAppTyCon_maybe (idType var) of
-         Just tycon
-              | debugIsOn
-              , isAlgTyCon tycon
-              , not (isAbstractTyCon tycon)
-              , null (tyConDataCons tycon)
-              , not (exprIsDeadEnd scrut)
-              -> pprTrace "Lint warning: case binder's type has no constructors" (ppr var <+> ppr (idType var))
-                        -- This can legitimately happen for type families
-                      $ return ()
-         _otherwise -> return ()
-
-        -- Don't use lintIdBndr on var, because unboxed tuple is legitimate
+     -- Check whether a case expression with no alternatives is legitimate.
+     -- See Note [No alternatives lint check] for details.
+     ; checkL (not (null alts) || tryHardExprIsDeadEnd scrut) $
+       hang (text "Illegal empty case")
+          2 (ppr case_bndr <+> ppr case_bndr_ty $$ ppr e)
 
      ; subst <- getSubst
-     ; ensureEqTys var_ty scrut_ty (mkScrutMsg var var_ty scrut_ty subst)
+     ; ensureEqTys case_bndr_ty scrut_ty (mkScrutMsg case_bndr case_bndr_ty scrut_ty subst)
        -- See GHC.Core Note [Case expression invariants] item (7)
 
-     ; lintBinder CaseBind var $ \_ ->
+     ; lintBinder CaseBind case_bndr $ \_ ->
        do { -- Check the alternatives
-          ; alt_ues <- mapM (lintCoreAlt var scrut_ty scrut_mult alt_ty) alts
+          ; alt_ues <- mapM (lintCoreAlt case_bndr scrut_ty scrut_mult alt_ty) alts
           ; let case_ue = (scaleUE scrut_mult scrut_ue) `addUE` supUEs alt_ues
           ; checkCaseAlts e scrut_ty alts
           ; return (alt_ty, case_ue) } }
@@ -1584,6 +1530,42 @@ lintCoreAlt case_bndr scrut_ty _scrut_mult alt_ty alt@(Alt (DataAlt con) args rh
   = zeroUE <$ addErrL (mkBadAltMsg scrut_ty alt)
 
 {-
+Note [No alternatives lint check]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Case expressions with no alternatives are odd beasts, and it would seem
+like they would worth be looking at in the linter (cf #10180). We check
+that the scrutinee is definitely diverging, using exprIsDeadEnd.
+
+Note that exprIsDeadEnv checks for /two/ things:
+* Type:  the type is empty; see Note [Empty types] in GHC.Core.Utils
+* Value: the value visibly bottoming; e.g. (error "urk")
+See Note [Bottoming expressions] in GHC.Core.Opt.Arity
+
+--- Historical note ---
+
+We used to check that exprIsHNF is False for a no-alternative case;
+but that's not quite right (#13990).  Consider:
+
+   data T a = T1 !(F a) | T2 Int
+   data DataConCantHappen -- No constructors
+   type instance F Bool = DataConCantHappen
+
+   f (x::T Bool) = case x of
+                     T1 v -> case (v |> co) of {}
+                     T2 x -> blah
+
+   where co :: F Bool ~ Void
+
+Now T1 is in fact inaccessible: it is a strict constructor and there
+are no inhabitants of (F Bool) = DataConCantHappen.  GHC itself uses
+this a lot in HsSyn to make inactive constructors inaccessible.
+
+However in this example `v` is marked "evaluated" because it is a
+strict field; so the scrutinee responds True to exprIsHNF.  It's
+all fine; but don't complain about exprIsHNF.
+
+--- End of historical note ---
+
 Note [Validating multiplicities in a case]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Suppose 'MkT :: a %m -> T m a'.


=====================================
compiler/GHC/Core/Opt/Arity.hs
=====================================
@@ -27,7 +27,9 @@ module GHC.Core.Opt.Arity
    , arityTypeArity, idArityType
 
    -- ** Bottoming things
-   , exprIsDeadEnd, exprBotStrictness_maybe, arityTypeBotSigs_maybe
+   , exprIsDeadEnd, tryHardExprIsDeadEnd
+   , arityTypeBotSigs_maybe
+   , exprBotStrictness_maybe, idBotStrictness_maybe
 
    -- ** typeArity and the state hack
    , typeArity, typeOneShots, typeOneShot
@@ -146,13 +148,21 @@ exprBotStrictness_maybe e = arityTypeBotSigs_maybe (cheapArityType e)
 arityTypeBotSigs_maybe :: ArityType ->  Maybe (Arity, DmdSig, CprSig)
 -- Arity of a divergent function
 arityTypeBotSigs_maybe (AT lams div)
-  | isDeadEndDiv div = Just ( arity
-                            , mkVanillaDmdSig arity botDiv
+  | isDeadEndDiv div = Just (arity
+                            , mkVanillaDmdSig arity div
                             , mkCprSig arity botCpr)
   | otherwise        = Nothing
   where
     arity = length lams
 
+idBotStrictness_maybe :: Id ->  Maybe (Arity, DmdSig, CprSig)
+idBotStrictness_maybe id
+  | isDeadEndDiv div = Just (length dmds, dmd_sig, idCprSig id)
+  | otherwise        = Nothing
+  where
+    (dmds, div) = splitDmdSig dmd_sig
+    dmd_sig     = idDmdSig id
+
 
 {- Note [exprArity for applications]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1652,15 +1662,22 @@ exprArity e = go e
     go _                           = 0
 
 ---------------
-exprIsDeadEnd :: CoreExpr -> Bool
+exprIsDeadEnd, tryHardExprIsDeadEnd :: CoreExpr -> Bool
+tryHardExprIsDeadEnd e = expr_is_dead_end True  e
+exprIsDeadEnd        e = expr_is_dead_end False e
+
+expr_is_dead_end :: Bool -> CoreExpr -> Bool
 -- See Note [Bottoming expressions]
 -- This function is, in effect, just a specialised (and hence cheap)
 --    version of cheapArityType:
 --    exprIsDeadEnd e = case cheapArityType e of
 --                         AT lams div -> null lams && isDeadEndDiv div
+-- The try_hard flag governs how hard it tries to find the right
+-- answer.  It's always OK to say False
+--
 -- See also exprBotStrictness_maybe, which uses cheapArityType
-exprIsDeadEnd e
-  = go 0 e
+expr_is_dead_end try_hard e
+  = go 0 e || isEmptyTy (exprType e)
   where
     go :: Arity -> CoreExpr -> Bool
     -- (go n e) = True <=> expr applied to n value args is bottom
@@ -1675,11 +1692,13 @@ exprIsDeadEnd e
     go n (Lam v e) | isTyVar v   = go n e
                    | otherwise   = False
 
-    go _ (Case _ _ _ alts)       = null alts
-       -- See Note [Empty case alternatives] in GHC.Core
+    go n (Case scrut _ _ alts)
+      | not try_hard = null alts
+      | otherwise    = go 0 scrut || and [ go n rhs | Alt _ _ rhs <- alts ]
+      -- If try_hard then look at scrutinee and all alternatives
+      -- See Note [Empty case alternatives] in GHC.Core
 
     go n (Var v) | isDeadEndAppSig (idDmdSig v) n = True
-                 | isEmptyTy (idType v)           = True
                  | otherwise                      = False
 
 {- Note [Bottoming expressions]
@@ -1692,24 +1711,26 @@ checks for both of these situations:
       (error Int "Hello")
   is visibly bottom.  The strictness analyser also finds out if
   a function diverges or raises an exception, and puts that info
-  in its strictness signature.
+  in its strictness signature.  The `go` function spots this.
 
-* Empty types.  If a type is empty, its only inhabitant is bottom.
-  For example:
-      data T
-      f :: T -> Bool
-      f = \(x:t). case x of Bool {}
+* Empty types. The `isEmptyTy` function spots this. If a type is empty, its
+  only inhabitant is bottom. For example:
+      data Empty   -- No constructors
+      f :: Empty -> Bool
+      f = \(x:Empty). case x of Bool {}
   Since T has no data constructors, the case alternatives are of course
   empty.  However note that 'x' is not bound to a visibly-bottom value;
   it's the *type* that tells us it's going to diverge.
 
-A GADT may also be empty even though it has constructors:
-        data T a where
-          T1 :: a -> T Bool
-          T2 :: T Int
-        ...(case (x::T Char) of {})...
-Here (T Char) is uninhabited.  A more realistic case is (Int ~ Bool),
-which is likewise uninhabited.
+Wrinkles:
+
+(W1) isEmptyTy: a GADT may also be empty even though it has constructors:
+        data G a where
+          G1 :: a -> G Bool
+          g2 :: G Int
+        ...(case (x::G Char) of {})...
+     Here (G Char) is uninhabited.  A more realistic case is (Int ~ Bool),
+     which is likewise uninhabited.  See Note [Empty types] in GHC.Core.Utils
 
 Note [No free join points in arityType]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


=====================================
compiler/GHC/Core/Opt/SetLevels.hs
=====================================
@@ -73,7 +73,7 @@ import GHC.Core.Utils   ( exprType, exprIsHNF
                         , collectMakeStaticArgs
                         , mkLamTypes, extendInScopeSetBndrs
                         )
-import GHC.Core.Opt.Arity   ( exprBotStrictness_maybe, isOneShotBndr )
+import GHC.Core.Opt.Arity   ( exprBotStrictness_maybe, idBotStrictness_maybe, isOneShotBndr )
 import GHC.Core.FVs     -- all of it
 import GHC.Core.Subst
 import GHC.Core.Make    ( sortQuantVars )
@@ -1094,7 +1094,7 @@ lvlBind :: LevelEnv
         -> LvlM (LevelledBind, LevelEnv)
 
 lvlBind env (AnnNonRec bndr rhs)
-  | isTyVar bndr    -- Don't do anything for TyVar binders
+  | is_tyvar        -- Don't do anything for TyVar binders
                     --   (simplifier gets rid of them pronto)
   || isCoVar bndr   -- Difficult to fix up CoVar occurrences (see extendPolyLvlEnv)
                     -- so we will ignore this case for now
@@ -1105,7 +1105,9 @@ lvlBind env (AnnNonRec bndr rhs)
           -- bit brutal, but unlifted bindings aren't expensive either
 
   = -- No float
-    do { rhs' <- lvlRhs env NonRecursive is_bot_lam mb_join_arity rhs
+    do { rhs' <- if is_tyvar
+                 then lvlExpr env rhs
+                 else lvlRhs env NonRecursive is_bot_lam mb_join_arity rhs
        ; let  bind_lvl        = incMinorLvl (le_ctxt_lvl env)
               (env', [bndr']) = substAndLvlBndrs NonRecursive env bind_lvl [bndr]
        ; return (NonRec bndr' rhs', env') }
@@ -1128,6 +1130,8 @@ lvlBind env (AnnNonRec bndr rhs)
        ; return (NonRec (TB bndr2 (FloatMe dest_lvl)) rhs', env') }
 
   where
+    is_tyvar   = isTyVar bndr
+    deann_rhs  = deAnnotate rhs
     bndr_ty    = idType bndr
     ty_fvs     = tyCoVarsOfType bndr_ty
     rhs_fvs    = freeVarsOf rhs
@@ -1135,11 +1139,13 @@ lvlBind env (AnnNonRec bndr rhs)
     abs_vars   = abstractVars dest_lvl env bind_fvs
     dest_lvl   = destLevel env bind_fvs ty_fvs (isFunction rhs) is_bot_lam is_join
 
-    deann_rhs  = deAnnotate rhs
-    mb_bot_str = exprBotStrictness_maybe deann_rhs
+    mb_bot_str = idBotStrictness_maybe bndr
     is_bot_lam = isJust mb_bot_str
-        -- is_bot_lam: looks like (\xy. bot), maybe zero lams
-        -- NB: not isBottomThunk!  See Note [Bottoming floats] point (3)
+    -- The Simplifier pins on strictness info, based on a call to arityType
+    -- Using that is faster and more accurate than calling exprBotStrictness_maybe
+    -- is_bot_lam: looks like (\xy. bot), maybe zero lams
+    -- NB: not isBottomThunk!  See Note [Bottoming floats] point (3)
+    -- NB: these two defns only work for Ids, not TyVars
 
     n_extra    = count isId abs_vars
     mb_join_arity = isJoinId_maybe bndr


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -2281,34 +2281,37 @@ locBind loc b1 b2 diffs = map addLoc diffs
                 | otherwise = ppr b1 <> char '/' <> ppr b2
 
 
-{- *********************************************************************
+{-
+************************************************************************
 *                                                                      *
-\subsection{Determining non-updatable right-hand-sides}
+              Type utilities
 *                                                                      *
 ************************************************************************
 
-Top-level constructor applications can usually be allocated
-statically, but they can't if the constructor, or any of the
-arguments, come from another DLL (because we can't refer to static
-labels in other DLLs).
+Note [Empty types]
+~~~~~~~~~~~~~~~~~~
+For a type `ty`, if (isEmptyTy ty) is True, then bottom is the only
+inhabitant of `ty`.  So if (e :: ty), we know that `e` must diverge.
 
-If this happens we simply make the RHS into an updatable thunk,
-and 'execute' it rather than allocating it statically.
--}
+How can a type be empty?
 
-{-
-************************************************************************
-*                                                                      *
-\subsection{Type utilities}
-*                                                                      *
-************************************************************************
+* It is a data type with no constructors.  e.g.
+     data T a
+  Then (T Int) and (T b) are empty types
+
+* It is a GADT, but the type parameters excludes all the constructors. e.g.
+     data G a where
+        G1 :: G Int
+        G2 :: G Bool
+  Then (G Char) is empty, because no value can have that type.  But G itself
+  isn't an empty type -- it certainly has data constructors.
+
+See Note [Bottoming expressions] in GHC.Core.Opt.Arity
+See Note [No alternatives lint check] in GHC.Core.Lint
 -}
 
--- | True if the type has no non-bottom elements, e.g. when it is an empty
--- datatype, or a GADT with non-satisfiable type parameters, e.g. Int :~: Bool.
--- See Note [Bottoming expressions]
---
--- See Note [No alternatives lint check] for another use of this function.
+-- | True if the type has no non-bottom elements
+-- See Note [Empty types]
 isEmptyTy :: Type -> Bool
 isEmptyTy ty
     -- Data types where, given the particular type parameters, no data



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/1289ccbc7356a5df69944af7c2ed405c71aba34b

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/1289ccbc7356a5df69944af7c2ed405c71aba34b
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221111/7196a565/attachment-0001.html>


More information about the ghc-commits mailing list