[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 3 commits: DeriveFunctor: Check for last type variables using dataConUnivTyVars

Sun Sep 18 08:50:49 UTC 2022

Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
d555c9f2 by Ryan Scott at 2022-09-18T04:50:38-04:00
DeriveFunctor: Check for last type variables using dataConUnivTyVars

Previously, derived instances of `Functor` (as well as the related classes
`Foldable`, `Traversable`, and `Generic1`) would determine which constraints to
infer by checking for fields that contain the last type variable. The problem
was that this last type variable was taken from `tyConTyVars`. For GADTs, the
type variables in each data constructor are _not_ the same type variables as
in `tyConTyVars`, leading to #22167.

This fixes the issue by instead checking for the last type variable using
`dataConUnivTyVars`. (This is very similar in spirit to the fix for #21185,
which also replaced an errant use of `tyConTyVars` with type variables from
each data constructor.)

Fixes #22167.

- - - - -
70860d42 by Vladislav Zavialov at 2022-09-18T04:50:39-04:00
Lexer: pass updated buffer to actions (#22201)

In the lexer, predicates have the following type:
	{ ... } :: user       -- predicate state
		-> AlexInput  -- input stream before the token
		-> Int        -- length of the token
		-> AlexInput  -- input stream after the token
		-> Bool       -- True <=> accept the token
This is documented in the Alex manual.

There is access to the input stream both before and after the token.
But when the time comes to construct the token, GHC passes only the
initial string buffer to the lexer action. This patch fixes it:

	- type Action = PsSpan -> StringBuffer -> Int ->                 P (PsLocated Token)
	+ type Action = PsSpan -> StringBuffer -> Int -> StringBuffer -> P (PsLocated Token)

Now lexer actions have access to the string buffer both before and after
the token, just like the predicates. It's just a matter of passing an
additional function parameter throughout the lexer.

- - - - -
3f0c994c by Vladislav Zavialov at 2022-09-18T04:50:39-04:00
Lexer: define varsym without predicates (#22201)

Before this patch, the varsym lexing rules were defined as follows:

	<0> {
	  @varsym / { precededByClosingToken `alexAndPred` followedByOpeningToken } { varsym_tight_infix }
	  @varsym / { followedByOpeningToken }  { varsym_prefix }
	  @varsym / { precededByClosingToken }  { varsym_suffix }
	  @varsym                               { varsym_loose_infix }
	}

Unfortunately, this meant that the predicates 'precededByClosingToken' and
'followedByOpeningToken' were recomputed several times before we could figure
out the whitespace context.

With this patch, we check for whitespace context directly in the lexer
action:

	<0> {
	  @varsym { with_op_ws varsym }
	}

The checking for opening/closing tokens happens in 'with_op_ws' now,
which is part of the lexer action rather than the lexer predicate.

- - - - -


6 changed files:

- compiler/GHC/Parser/Lexer.x
- compiler/GHC/Tc/Deriv/Functor.hs
- compiler/GHC/Tc/Deriv/Generics.hs
- compiler/GHC/Tc/Deriv/Infer.hs
- + testsuite/tests/deriving/should_compile/T22167.hs
- testsuite/tests/deriving/should_compile/all.T


Changes:

=====================================
compiler/GHC/Parser/Lexer.x
=====================================
@@ -490,20 +490,12 @@ $tab          { warnTab }
   @conid "#"+       / { ifExtension MagicHashBit } { idtoken conid }
 }

--- Operators classified into prefix, suffix, tight infix, and loose infix.
--- See Note [Whitespace-sensitive operator parsing]
-<0> {
-  @varsym / { precededByClosingToken `alexAndPred` followedByOpeningToken } { varsym_tight_infix }
-  @varsym / { followedByOpeningToken }  { varsym_prefix }
-  @varsym / { precededByClosingToken }  { varsym_suffix }
-  @varsym                               { varsym_loose_infix }
-}
-
 -- ToDo: - move `var` and (sym) into lexical syntax?
 --       - remove backquote from $special?
 <0> {
   @qvarsym                                         { idtoken qvarsym }
   @qconsym                                         { idtoken qconsym }
+  @varsym                                          { with_op_ws varsym }
   @consym                                          { consym }
 }
 
@@ -707,6 +699,14 @@ $tab          { warnTab }
 
 {
 
+-- Operator whitespace occurrence. See Note [Whitespace-sensitive operator parsing].
+data OpWs
+  = OpWsPrefix         -- a !b
+  | OpWsSuffix         -- a! b
+  | OpWsTightInfix     -- a!b
+  | OpWsLooseInfix     -- a ! b
+  deriving Show
+
 -- -----------------------------------------------------------------------------
 -- The token type
 
@@ -1093,60 +1093,61 @@ reservedSymsFM = listToUFM $
 -- -----------------------------------------------------------------------------
 -- Lexer actions
 
-type Action = PsSpan -> StringBuffer -> Int -> P (PsLocated Token)
+type Action = PsSpan -> StringBuffer -> Int -> StringBuffer -> P (PsLocated Token)
 
 special :: Token -> Action
-special tok span _buf _len = return (L span tok)
+special tok span _buf _len _buf2 = return (L span tok)
 
 token, layout_token :: Token -> Action
-token t span _buf _len = return (L span t)
-layout_token t span _buf _len = pushLexState layout >> return (L span t)
+token t span _buf _len _buf2 = return (L span t)
+layout_token t span _buf _len _buf2 = pushLexState layout >> return (L span t)
 
 idtoken :: (StringBuffer -> Int -> Token) -> Action
-idtoken f span buf len = return (L span $! (f buf len))
+idtoken f span buf len _buf2 = return (L span $! (f buf len))
 
 qdo_token :: (Maybe FastString -> Token) -> Action
-qdo_token con span buf len = do
+qdo_token con span buf len _buf2 = do
     maybe_layout token
     return (L span $! token)
   where
     !token = con $! Just $! fst $! splitQualName buf len False
 
 skip_one_varid :: (FastString -> Token) -> Action
-skip_one_varid f span buf len
+skip_one_varid f span buf len _buf2
   = return (L span $! f (lexemeToFastString (stepOn buf) (len-1)))
 
 skip_two_varid :: (FastString -> Token) -> Action
-skip_two_varid f span buf len
+skip_two_varid f span buf len _buf2
   = return (L span $! f (lexemeToFastString (stepOn (stepOn buf)) (len-2)))
 
 strtoken :: (String -> Token) -> Action
-strtoken f span buf len =
+strtoken f span buf len _buf2 =
   return (L span $! (f $! lexemeToString buf len))
 
 begin :: Int -> Action
-begin code _span _str _len = do pushLexState code; lexToken
+begin code _span _str _len _buf2 = do pushLexState code; lexToken
 
 pop :: Action
-pop _span _buf _len = do _ <- popLexState
-                         lexToken
+pop _span _buf _len _buf2 =
+  do _ <- popLexState
+     lexToken
 -- See Note [Nested comment line pragmas]
 failLinePrag1 :: Action
-failLinePrag1 span _buf _len = do
+failLinePrag1 span _buf _len _buf2 = do
   b <- getBit InNestedCommentBit
   if b then return (L span ITcomment_line_prag)
        else lexError LexErrorInPragma
 
 -- See Note [Nested comment line pragmas]
 popLinePrag1 :: Action
-popLinePrag1 span _buf _len = do
+popLinePrag1 span _buf _len _buf2 = do
   b <- getBit InNestedCommentBit
   if b then return (L span ITcomment_line_prag) else do
     _ <- popLexState
     lexToken
 
 hopefully_open_brace :: Action
-hopefully_open_brace span buf len
+hopefully_open_brace span buf len buf2
  = do relaxed <- getBit RelaxedLayoutBit
       ctx <- getContext
       (AI l _) <- getInput
@@ -1155,17 +1156,23 @@ hopefully_open_brace span buf len
                  case ctx of
                  Layout prev_off _ : _ -> prev_off < offset
                  _                     -> True
-      if isOK then pop_and open_brace span buf len
+      if isOK then pop_and open_brace span buf len buf2
               else addFatalError $
                      mkPlainErrorMsgEnvelope (mkSrcSpanPs span) PsErrMissingBlock
 
 pop_and :: Action -> Action
-pop_and act span buf len = do _ <- popLexState
-                              act span buf len
+pop_and act span buf len buf2 =
+  do _ <- popLexState
+     act span buf len buf2
 
 -- See Note [Whitespace-sensitive operator parsing]
-followedByOpeningToken :: AlexAccPred ExtsBitmap
-followedByOpeningToken _ _ _ (AI _ buf)
+followedByOpeningToken, precededByClosingToken :: AlexAccPred ExtsBitmap
+followedByOpeningToken _ _ _ (AI _ buf) = followedByOpeningToken' buf
+precededByClosingToken _ (AI _ buf) _ _ = precededByClosingToken' buf
+
+-- The input is the buffer *after* the token.
+followedByOpeningToken' :: StringBuffer -> Bool
+followedByOpeningToken' buf
   | atEnd buf = False
   | otherwise =
       case nextChar buf of
@@ -1179,9 +1186,9 @@ followedByOpeningToken _ _ _ (AI _ buf)
         ('⦇', _) -> True
         (c, _) -> isAlphaNum c
 
--- See Note [Whitespace-sensitive operator parsing]
-precededByClosingToken :: AlexAccPred ExtsBitmap
-precededByClosingToken _ (AI _ buf) _ _ =
+-- The input is the buffer *before* the token.
+precededByClosingToken' :: StringBuffer -> Bool
+precededByClosingToken' buf =
   case prevChar buf '\n' of
     '}' -> decodePrevNChars 1 buf /= "-"
     ')' -> True
@@ -1193,6 +1200,19 @@ precededByClosingToken _ (AI _ buf) _ _ =
     '⦈' -> True
     c -> isAlphaNum c
 
+get_op_ws :: StringBuffer -> StringBuffer -> OpWs
+get_op_ws buf1 buf2 =
+    mk_op_ws (precededByClosingToken' buf1) (followedByOpeningToken' buf2)
+  where
+    mk_op_ws False True  = OpWsPrefix
+    mk_op_ws True  False = OpWsSuffix
+    mk_op_ws True  True  = OpWsTightInfix
+    mk_op_ws False False = OpWsLooseInfix
+
+{-# INLINE with_op_ws #-}
+with_op_ws :: (OpWs -> Action) -> Action
+with_op_ws act span buf len buf2 = act (get_op_ws buf buf2) span buf len buf2
+
 {-# INLINE nextCharIs #-}
 nextCharIs :: StringBuffer -> (Char -> Bool) -> Bool
 nextCharIs buf p = not (atEnd buf) && p (currentChar buf)
@@ -1289,7 +1309,7 @@ alexOrPred p1 p2 userState in1 len in2
   = p1 userState in1 len in2 || p2 userState in1 len in2
 
 multiline_doc_comment :: Action
-multiline_doc_comment span buf _len = {-# SCC "multiline_doc_comment" #-} withLexedDocType worker
+multiline_doc_comment span buf _len _buf2 = {-# SCC "multiline_doc_comment" #-} withLexedDocType worker
   where
     worker input@(AI start_loc _) docType checkNextLine = go start_loc "" [] input
       where
@@ -1335,11 +1355,11 @@ multiline_doc_comment span buf _len = {-# SCC "multiline_doc_comment" #-} withLe
           Nothing -> input
 
 lineCommentToken :: Action
-lineCommentToken span buf len = do
+lineCommentToken span buf len buf2 = do
   b <- getBit RawTokenStreamBit
   if b then do
          lt <- getLastLocComment
-         strtoken (\s -> ITlineComment s lt) span buf len
+         strtoken (\s -> ITlineComment s lt) span buf len buf2
        else lexToken
 
 
@@ -1348,7 +1368,7 @@ lineCommentToken span buf len = do
   using regular expressions.
 -}
 nested_comment :: Action
-nested_comment span buf len = {-# SCC "nested_comment" #-} do
+nested_comment span buf len _buf2 = {-# SCC "nested_comment" #-} do
   l <- getLastLocComment
   let endComment input (L _ comment) = commentEnd lexToken input (Nothing, ITblockComment comment l) buf span
   input <- getInput
@@ -1357,7 +1377,7 @@ nested_comment span buf len = {-# SCC "nested_comment" #-} do
   nested_comment_logic endComment start_decorator input span
 
 nested_doc_comment :: Action
-nested_doc_comment span buf _len = {-# SCC "nested_doc_comment" #-} withLexedDocType worker
+nested_doc_comment span buf _len _buf2 = {-# SCC "nested_doc_comment" #-} withLexedDocType worker
   where
     worker input docType _checkNextLine = nested_comment_logic endComment "" input span
       where
@@ -1496,7 +1516,7 @@ mkHdkCommentSection loc n mkDS = (HdkCommentSection n ds, ITdocComment ds loc)
 -- RULES pragmas turn on the forall and '.' keywords, and we turn them
 -- off again at the end of the pragma.
 rulePrag :: Action
-rulePrag span buf len = do
+rulePrag span buf len _buf2 = do
   setExts (.|. xbit InRulePragBit)
   let !src = lexemeToString buf len
   return (L span (ITrules_prag (SourceText src)))
@@ -1504,26 +1524,26 @@ rulePrag span buf len = do
 -- When 'UsePosPragsBit' is not set, it is expected that we emit a token instead
 -- of updating the position in 'PState'
 linePrag :: Action
-linePrag span buf len = do
+linePrag span buf len buf2 = do
   usePosPrags <- getBit UsePosPragsBit
   if usePosPrags
-    then begin line_prag2 span buf len
+    then begin line_prag2 span buf len buf2
     else let !src = lexemeToString buf len
          in return (L span (ITline_prag (SourceText src)))
 
 -- When 'UsePosPragsBit' is not set, it is expected that we emit a token instead
 -- of updating the position in 'PState'
 columnPrag :: Action
-columnPrag span buf len = do
+columnPrag span buf len buf2 = do
   usePosPrags <- getBit UsePosPragsBit
   let !src = lexemeToString buf len
   if usePosPrags
-    then begin column_prag span buf len
+    then begin column_prag span buf len buf2
     else let !src = lexemeToString buf len
          in return (L span (ITcolumn_prag (SourceText src)))
 
 endPrag :: Action
-endPrag span _buf _len = do
+endPrag span _buf _len _buf2 = do
   setExts (.&. complement (xbit InRulePragBit))
   return (L span ITclose_prag)
 
@@ -1567,11 +1587,11 @@ errBrace (AI end _) span =
               (\srcLoc -> mkPlainErrorMsgEnvelope srcLoc (PsErrLexer LexUnterminatedComment LexErrKind_EOF))
 
 open_brace, close_brace :: Action
-open_brace span _str _len = do
+open_brace span _str _len _buf2 = do
   ctx <- getContext
   setContext (NoLayout:ctx)
   return (L span ITocurly)
-close_brace span _str _len = do
+close_brace span _str _len _buf2 = do
   popContext
   return (L span ITccurly)
 
@@ -1614,7 +1634,7 @@ splitQualName orig_buf len parens = split orig_buf orig_buf
         qual_size = orig_buf `byteDiff` dot_buf
 
 varid :: Action
-varid span buf len =
+varid span buf len _buf2 =
   case lookupUFM reservedWordsFM fs of
     Just (ITcase, _) -> do
       lastTk <- getLastTk
@@ -1660,8 +1680,8 @@ qvarsym buf len = ITqvarsym $! splitQualName buf len False
 qconsym buf len = ITqconsym $! splitQualName buf len False
 
 -- See Note [Whitespace-sensitive operator parsing]
-varsym_prefix :: Action
-varsym_prefix = sym $ \span exts s ->
+varsym :: OpWs -> Action
+varsym OpWsPrefix = sym $ \span exts s ->
   let warnExtConflict errtok =
         do { addPsMessage (mkSrcSpanPs span) (PsWarnOperatorWhitespaceExtConflict errtok)
            ; return (ITvarsym s) }
@@ -1693,10 +1713,7 @@ varsym_prefix = sym $ \span exts s ->
                 (mkSrcSpanPs span)
                 (PsWarnOperatorWhitespace s OperatorWhitespaceOccurrence_Prefix)
             ; return (ITvarsym s) }
-
--- See Note [Whitespace-sensitive operator parsing]
-varsym_suffix :: Action
-varsym_suffix = sym $ \span _ s ->
+varsym OpWsSuffix = sym $ \span _ s ->
   if | s == fsLit "@" -> failMsgP (\srcLoc -> mkPlainErrorMsgEnvelope srcLoc $ PsErrSuffixAT)
      | s == fsLit "." -> return ITdot
      | otherwise ->
@@ -1704,10 +1721,7 @@ varsym_suffix = sym $ \span _ s ->
                 (mkSrcSpanPs span)
                 (PsWarnOperatorWhitespace s OperatorWhitespaceOccurrence_Suffix)
             ; return (ITvarsym s) }
-
--- See Note [Whitespace-sensitive operator parsing]
-varsym_tight_infix :: Action
-varsym_tight_infix = sym $ \span exts s ->
+varsym OpWsTightInfix = sym $ \span exts s ->
   if | s == fsLit "@" -> return ITat
      | s == fsLit ".", OverloadedRecordDotBit `xtest` exts  -> return (ITproj False)
      | s == fsLit "." -> return ITdot
@@ -1716,10 +1730,7 @@ varsym_tight_infix = sym $ \span exts s ->
                 (mkSrcSpanPs span)
                 (PsWarnOperatorWhitespace s (OperatorWhitespaceOccurrence_TightInfix))
             ;  return (ITvarsym s) }
-
--- See Note [Whitespace-sensitive operator parsing]
-varsym_loose_infix :: Action
-varsym_loose_infix = sym $ \_ _ s ->
+varsym OpWsLooseInfix = sym $ \_ _ s ->
   if | s == fsLit "."
      -> return ITdot
      | otherwise
@@ -1729,7 +1740,7 @@ consym :: Action
 consym = sym (\_span _exts s -> return $ ITconsym s)
 
 sym :: (PsSpan -> ExtsBitmap -> FastString -> P Token) -> Action
-sym con span buf len =
+sym con span buf len _buf2 =
   case lookupUFM reservedSymsFM fs of
     Just (keyword, NormalSyntax, 0) ->
       return $ L span keyword
@@ -1760,7 +1771,7 @@ tok_integral :: (SourceText -> Integer -> Token)
              -> Int -> Int
              -> (Integer, (Char -> Int))
              -> Action
-tok_integral itint transint transbuf translen (radix,char_to_int) span buf len = do
+tok_integral itint transint transbuf translen (radix,char_to_int) span buf len _buf2 = do
   numericUnderscores <- getBit NumericUnderscoresBit  -- #14473
   let src = lexemeToString buf len
   when ((not numericUnderscores) && ('_' `elem` src)) $ do
@@ -1802,7 +1813,7 @@ hexadecimal = (16,hexDigit)
 
 -- readSignificandExponentPair can understand negative rationals, exponents, everything.
 tok_frac :: Int -> (String -> Token) -> Action
-tok_frac drop f span buf len = do
+tok_frac drop f span buf len _buf2 = do
   numericUnderscores <- getBit NumericUnderscoresBit  -- #14473
   let src = lexemeToString buf (len-drop)
   when ((not numericUnderscores) && ('_' `elem` src)) $ do
@@ -1837,7 +1848,7 @@ readFractionalLitX readStr b str =
 
 -- we're at the first token on a line, insert layout tokens if necessary
 do_bol :: Action
-do_bol span _str _len = do
+do_bol span _str _len _buf2 = do
         -- See Note [Nested comment line pragmas]
         b <- getBit InNestedCommentBit
         if b then return (L span ITcomment_line_prag) else do
@@ -1888,7 +1899,7 @@ maybe_layout t = do -- If the alternative layout rule is enabled then
 -- by a 'do', then we allow the new context to be at the same indentation as
 -- the previous context.  This is what the 'strict' argument is for.
 new_layout_context :: Bool -> Bool -> Token -> Action
-new_layout_context strict gen_semic tok span _buf len = do
+new_layout_context strict gen_semic tok span _buf len _buf2 = do
     _ <- popLexState
     (AI l _) <- getInput
     let offset = srcLocCol (psRealLoc l) - len
@@ -1907,7 +1918,7 @@ new_layout_context strict gen_semic tok span _buf len = do
                 return (L span tok)
 
 do_layout_left :: Action
-do_layout_left span _buf _len = do
+do_layout_left span _buf _len _buf2 = do
     _ <- popLexState
     pushLexState bol  -- we must be at the start of a line
     return (L span ITvccurly)
@@ -1916,7 +1927,7 @@ do_layout_left span _buf _len = do
 -- LINE pragmas
 
 setLineAndFile :: Int -> Action
-setLineAndFile code (PsSpan span _) buf len = do
+setLineAndFile code (PsSpan span _) buf len _buf2 = do
   let src = lexemeToString buf (len - 1)  -- drop trailing quotation mark
       linenumLen = length $ head $ words src
       linenum = parseUnsignedInteger buf linenumLen 10 octDecDigit
@@ -1943,7 +1954,7 @@ setLineAndFile code (PsSpan span _) buf len = do
   lexToken
 
 setColumn :: Action
-setColumn (PsSpan span _) buf len = do
+setColumn (PsSpan span _) buf len _buf2 = do
   let column =
         case reads (lexemeToString buf len) of
           [(column, _)] -> column
@@ -1969,7 +1980,7 @@ lex_string_prag mkTok = lex_string_prag_comment mkTok'
     mkTok' s _ = mkTok s
 
 lex_string_prag_comment :: (String -> PsSpan -> Token) -> Action
-lex_string_prag_comment mkTok span _buf _len
+lex_string_prag_comment mkTok span _buf _len _buf2
     = do input <- getInput
          start <- getParsedLoc
          l <- getLastLocComment
@@ -1998,7 +2009,7 @@ lex_string_prag_comment mkTok span _buf _len
 -- This stuff is horrible.  I hates it.
 
 lex_string_tok :: Action
-lex_string_tok span buf _len = do
+lex_string_tok span buf _len _buf2 = do
   tok <- lex_string ""
   (AI end bufEnd) <- getInput
   let
@@ -2068,7 +2079,7 @@ lex_char_tok :: Action
 -- (the parser does that).
 -- So we have to do two characters of lookahead: when we see 'x we need to
 -- see if there's a trailing quote
-lex_char_tok span buf _len = do        -- We've seen '
+lex_char_tok span buf _len _buf2 = do        -- We've seen '
    i1 <- getInput       -- Look ahead to first character
    let loc = psSpanStart span
    case alexGetChar' i1 of
@@ -2246,7 +2257,7 @@ getCharOrFail i =  do
 -- QuasiQuote
 
 lex_qquasiquote_tok :: Action
-lex_qquasiquote_tok span buf len = do
+lex_qquasiquote_tok span buf len _buf2 = do
   let (qual, quoter) = splitQualName (stepOn buf) (len - 2) False
   quoteStart <- getParsedLoc
   quote <- lex_quasiquote (psRealLoc quoteStart) ""
@@ -2258,7 +2269,7 @@ lex_qquasiquote_tok span buf len = do
                            mkPsSpan quoteStart end)))
 
 lex_quasiquote_tok :: Action
-lex_quasiquote_tok span buf len = do
+lex_quasiquote_tok span buf len _buf2 = do
   let quoter = tail (lexemeToString buf (len - 1))
                 -- 'tail' drops the initial '[',
                 -- while the -1 drops the trailing '|'
@@ -2297,14 +2308,14 @@ quasiquote_error start = do
 -- Warnings
 
 warnTab :: Action
-warnTab srcspan _buf _len = do
+warnTab srcspan _buf _len _buf2 = do
     addTabWarning (psRealSpan srcspan)
     lexToken
 
 warnThen :: PsMessage -> Action -> Action
-warnThen warning action srcspan buf len = do
+warnThen warning action srcspan buf len buf2 = do
     addPsMessage (RealSrcSpan (psRealSpan srcspan) Strict.Nothing) warning
-    action srcspan buf len
+    action srcspan buf len buf2
 
 -- -----------------------------------------------------------------------------
 -- The Parse Monad
@@ -3401,7 +3412,7 @@ lexToken = do
         let span = mkPsSpan loc1 end
         let bytes = byteDiff buf buf2
         span `seq` setLastToken span bytes
-        lt <- t span buf bytes
+        lt <- t span buf bytes buf2
         let lt' = unLoc lt
         if (isComment lt') then setLastComment lt else setLastTk lt
         return lt
@@ -3490,9 +3501,10 @@ twoWordPrags = Map.fromList [
      ]
 
 dispatch_pragmas :: Map String Action -> Action
-dispatch_pragmas prags span buf len = case Map.lookup (clean_pragma (lexemeToString buf len)) prags of
-                                       Just found -> found span buf len
-                                       Nothing -> lexError LexUnknownPragma
+dispatch_pragmas prags span buf len buf2 =
+  case Map.lookup (clean_pragma (lexemeToString buf len)) prags of
+    Just found -> found span buf len buf2
+    Nothing -> lexError LexUnknownPragma
 
 known_pragma :: Map String Action -> AlexAccPred ExtsBitmap
 known_pragma prags _ (AI _ startbuf) _ (AI _ curbuf)
@@ -3514,13 +3526,13 @@ clean_pragma prag = canon_ws (map toLower (unprefix prag))
                           canon_ws s = unwords (map canonical (words s))
 
 warn_unknown_prag :: Map String Action -> Action
-warn_unknown_prag prags span buf len = do
+warn_unknown_prag prags span buf len buf2 = do
   let uppercase    = map toUpper
       unknown_prag = uppercase (clean_pragma (lexemeToString buf len))
       suggestions  = map uppercase (Map.keys prags)
   addPsMessage (RealSrcSpan (psRealSpan span) Strict.Nothing) $
     PsWarnUnrecognisedPragma unknown_prag suggestions
-  nested_comment span buf len
+  nested_comment span buf len buf2
 
 {-
 %************************************************************************


=====================================
compiler/GHC/Tc/Deriv/Functor.hs
=====================================
@@ -538,8 +538,36 @@ functorLikeTraverse var (FT { ft_triv = caseTrivial,     ft_var = caseVar
 
     go _ _ = (caseTrivial,False)
 
--- Return all syntactic subterms of ty that contain var somewhere
--- These are the things that should appear in instance constraints
+-- | Return all syntactic subterms of a 'Type' that are applied to the 'TyVar'
+-- argument. This determines what constraints should be inferred for derived
+-- 'Functor', 'Foldable', and 'Traversable' instances in "GHC.Tc.Deriv.Infer".
+-- For instance, if we have:
+--
+-- @
+-- data Foo a = MkFoo Int a (Maybe a) (Either Int (Maybe a))
+-- @
+--
+-- Then the following would hold:
+--
+-- * @'deepSubtypesContaining' a Int@ would return @[]@, since @Int@ does not
+--   contain the type variable @a@ at all.
+--
+-- * @'deepSubtypesContaining' a a@ would return @[]@. Although the type @a@
+--   contains the type variable @a@, it is not /applied/ to @a@, which is the
+--   criterion that 'deepSubtypesContaining' checks for.
+--
+-- * @'deepSubtypesContaining' a (Maybe a)@ would return @[Maybe]@, as @Maybe@
+--   is applied to @a at .
+--
+-- * @'deepSubtypesContaining' a (Either Int (Maybe a))@ would return
+--   @[Either Int, Maybe]@. Both of these types are applied to @a@ through
+--   composition.
+--
+-- As used in "GHC.Tc.Deriv.Infer", the 'Type' argument will always come from
+-- 'derivDataConInstArgTys', so it is important that the 'TyVar' comes from
+-- 'dataConUnivTyVars' to match. Make sure /not/ to take the 'TyVar' from
+-- 'tyConTyVars', as these differ from the 'dataConUnivTyVars' when the data
+-- type is a GADT. (See #22167 for what goes wrong if 'tyConTyVars' is used.)
 deepSubtypesContaining :: TyVar -> Type -> [TcType]
 deepSubtypesContaining tv
   = functorLikeTraverse tv


=====================================
compiler/GHC/Tc/Deriv/Generics.hs
=====================================
@@ -93,10 +93,25 @@ gen_Generic_binds gk loc dit = do
 ************************************************************************
 -}
 
+-- | Called by 'GHC.Tc.Deriv.Infer.inferConstraints'; generates a list of
+-- types, each of which must be a 'Functor' in order for the 'Generic1'
+-- instance to work. For instance, if we have:
+--
+-- @
+-- data Foo a = MkFoo Int a (Maybe a) (Either Int (Maybe a))
+-- @
+--
+-- Then @'get_gen1_constrained_tys' a (f (g a))@ would return @[Either Int]@,
+-- as a derived 'Generic1' instance would need to call 'fmap' at that type.
+-- Invoking @'get_gen1_constrained_tys' a@ on any of the other fields would
+-- return @[]@.
+--
+-- 'get_gen1_constrained_tys' is very similar in spirit to
+-- 'deepSubtypesContaining' in "GHC.Tc.Deriv.Functor". Just like with
+-- 'deepSubtypesContaining', it is important that the 'TyVar' argument come
+-- from 'dataConUnivTyVars'. (See #22167 for what goes wrong if 'tyConTyVars'
+-- is used.)
 get_gen1_constrained_tys :: TyVar -> Type -> [Type]
--- called by GHC.Tc.Deriv.Infer.inferConstraints; generates a list of
--- types, each of which must be a Functor in order for the Generic1 instance to
--- work.
 get_gen1_constrained_tys argVar
   = argTyFold argVar $ ArgTyAlg { ata_rec0 = const []
                                 , ata_par1 = [], ata_rec1 = const []


=====================================
compiler/GHC/Tc/Deriv/Infer.hs
=====================================
@@ -178,9 +178,10 @@ inferConstraintsStock dit@(DerivInstTys { dit_cls_tys     = cls_tys
 
               -- Constraints arising from the arguments of each constructor
            con_arg_constraints
-             :: (CtOrigin -> TypeOrKind
-                          -> Type
-                          -> [(ThetaSpec, Maybe Subst)])
+             :: ([TyVar] -> CtOrigin
+                         -> TypeOrKind
+                         -> Type
+                         -> [(ThetaSpec, Maybe Subst)])
              -> (ThetaSpec, [TyVar], [TcType], DerivInstTys)
            con_arg_constraints get_arg_constraints
              = let -- Constraints from the fields of each data constructor.
@@ -195,7 +196,8 @@ inferConstraintsStock dit@(DerivInstTys { dit_cls_tys     = cls_tys
                      , not (isUnliftedType arg_ty)
                      , let orig = DerivOriginDC data_con arg_n wildcard
                      , preds_and_mbSubst
-                         <- get_arg_constraints orig arg_t_or_k arg_ty
+                         <- get_arg_constraints (dataConUnivTyVars data_con)
+                                                orig arg_t_or_k arg_ty
                      ]
                    -- Stupid constraints from DatatypeContexts. Note that we
                    -- must gather these constraints from the data constructors,
@@ -237,21 +239,39 @@ inferConstraintsStock dit@(DerivInstTys { dit_cls_tys     = cls_tys
            is_functor_like = tcTypeKind inst_ty `tcEqKind` typeToTypeKind
                           || is_generic1
 
-           get_gen1_constraints :: Class -> CtOrigin -> TypeOrKind -> Type
-                                -> [(ThetaSpec, Maybe Subst)]
-           get_gen1_constraints functor_cls orig t_or_k ty
+           get_gen1_constraints ::
+                Class
+             -> [TyVar] -- The universally quantified type variables for the
+                        -- data constructor
+             -> CtOrigin -> TypeOrKind -> Type
+             -> [(ThetaSpec, Maybe Subst)]
+           get_gen1_constraints functor_cls dc_univs orig t_or_k ty
               = mk_functor_like_constraints orig t_or_k functor_cls $
-                get_gen1_constrained_tys last_tv ty
-
-           get_std_constrained_tys :: CtOrigin -> TypeOrKind -> Type
-                                   -> [(ThetaSpec, Maybe Subst)]
-           get_std_constrained_tys orig t_or_k ty
+                get_gen1_constrained_tys last_dc_univ ty
+             where
+               -- If we are deriving an instance of 'Generic1' and have made
+               -- it this far, then there should be at least one universal type
+               -- variable, making this use of 'last' safe.
+               last_dc_univ = assert (not (null dc_univs)) $
+                              last dc_univs
+
+           get_std_constrained_tys ::
+                [TyVar] -- The universally quantified type variables for the
+                        -- data constructor
+             -> CtOrigin -> TypeOrKind -> Type
+             -> [(ThetaSpec, Maybe Subst)]
+           get_std_constrained_tys dc_univs orig t_or_k ty
                | is_functor_like
                = mk_functor_like_constraints orig t_or_k main_cls $
-                 deepSubtypesContaining last_tv ty
+                 deepSubtypesContaining last_dc_univ ty
                | otherwise
                = [( [mk_cls_pred orig t_or_k main_cls ty]
                   , Nothing )]
+             where
+               -- If 'is_functor_like' holds, then there should be at least one
+               -- universal type variable, making this use of 'last' safe.
+               last_dc_univ = assert (not (null dc_univs)) $
+                              last dc_univs
 
            mk_functor_like_constraints :: CtOrigin -> TypeOrKind
                                        -> Class -> [Type]
@@ -279,9 +299,6 @@ inferConstraintsStock dit@(DerivInstTys { dit_cls_tys     = cls_tys
                              , tcUnifyTy ki typeToTypeKind
                              )
 
-           rep_tc_tvs      = tyConTyVars rep_tc
-           last_tv         = last rep_tc_tvs
-
            -- Extra Data constraints
            -- The Data class (only) requires that for
            --    instance (...) => Data (T t1 t2)
@@ -320,7 +337,7 @@ inferConstraintsStock dit@(DerivInstTys { dit_cls_tys     = cls_tys
              -- Generic1 needs Functor
              -- See Note [Getting base classes]
           |  is_generic1
-           -> assert (rep_tc_tvs `lengthExceeds` 0) $
+           -> assert (tyConTyVars rep_tc `lengthExceeds` 0) $
               -- Generic1 has a single kind variable
               assert (cls_tys `lengthIs` 1) $
               do { functorClass <- lift $ tcLookupClass functorClassName


=====================================
testsuite/tests/deriving/should_compile/T22167.hs
=====================================
@@ -0,0 +1,24 @@
+module T22167 where
+
+import GHC.Generics (Generic1)
+
+data T1 f a = MkT1 (f a)
+  deriving (Functor, Foldable, Traversable)
+
+data T2 f a where
+  MkT2 :: f a -> T2 f a
+  deriving (Functor, Foldable, Traversable)
+
+-- A slightly more complicated example from the `syntactic` library
+data (sym1 :+: sym2) sig
+  where
+    InjL :: sym1 a -> (sym1 :+: sym2) a
+    InjR :: sym2 a -> (sym1 :+: sym2) a
+  deriving (Functor, Foldable, Traversable)
+
+-- Test Generic1 instances with inferred Functor constraints
+data G1 f g a = MkG1 (f (g a)) deriving Generic1
+
+data G2 f g a where
+  MkG2 :: f (g a) -> G2 f g a
+  deriving Generic1


=====================================
testsuite/tests/deriving/should_compile/all.T
=====================================
@@ -139,3 +139,4 @@ test('T20387', normal, compile, [''])
 test('T20501', normal, compile, [''])
 test('T20719', normal, compile, [''])
 test('T20994', normal, compile, [''])
+test('T22167', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/3763e72b4c951669ebb79a6760fe390a9f9d41dd...3f0c994cb7741b08470f187c67f237d039160708

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/3763e72b4c951669ebb79a6760fe390a9f9d41dd...3f0c994cb7741b08470f187c67f237d039160708
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20220918/a5084f28/attachment-0001.html>