[Git][ghc/ghc][wip/parsing-shift] 3 commits: Require happy >=1.20
Vladislav Zavialov
gitlab at gitlab.haskell.org
Wed Aug 26 15:10:09 UTC 2020
Vladislav Zavialov pushed to branch wip/parsing-shift at Glasgow Haskell Compiler / GHC
Commits:
44338dfc by Vladislav Zavialov at 2020-08-25T21:18:40+03:00
Require happy >=1.20
- - - - -
5bb141e9 by Vladislav Zavialov at 2020-08-25T23:19:01+03:00
Resolve shift/reduce conflicts with %shift (#17232)
- - - - -
3d9297cd by Vladislav Zavialov at 2020-08-26T18:08:14+03:00
WIP: docs
[ci skip]
- - - - -
5 changed files:
- .gitlab-ci.yml
- aclocal.m4
- compiler/GHC/Parser.y
- hadrian/cabal.project
- hadrian/hadrian.cabal
Changes:
=====================================
.gitlab-ci.yml
=====================================
@@ -2,7 +2,7 @@ variables:
GIT_SSL_NO_VERIFY: "1"
# Commit of ghc/ci-images repository from which to pull Docker images
- DOCKER_REV: b65e1145d7c0a62c3533904a88dac14f56fb371b
+ DOCKER_REV: f408f461fcadcb6081a330f6570186425d99ade7
# Sequential version number capturing the versions of all tools fetched by
# .gitlab/ci.sh.
=====================================
aclocal.m4
=====================================
@@ -1021,8 +1021,8 @@ changequote([, ])dnl
])
if test ! -f compiler/GHC/Parser.hs || test ! -f compiler/GHC/Cmm/Parser.hs
then
- FP_COMPARE_VERSIONS([$fptools_cv_happy_version],[-lt],[1.19.10],
- [AC_MSG_ERROR([Happy version 1.19.10 or later is required to compile GHC.])])[]
+ FP_COMPARE_VERSIONS([$fptools_cv_happy_version],[-lt],[1.20.0],
+ [AC_MSG_ERROR([Happy version 1.20.0 or later is required to compile GHC.])])[]
fi
HappyVersion=$fptools_cv_happy_version;
AC_SUBST(HappyVersion)
=====================================
compiler/GHC/Parser.y
=====================================
@@ -95,27 +95,146 @@ import GHC.Builtin.Types ( unitTyCon, unitDataCon, tupleTyCon, tupleDataCon, nil
manyDataConTyCon)
}
-%expect 232 -- shift/reduce conflicts
+%expect 0 -- shift/reduce conflicts
-{- Last updated: 08 June 2020
+{- Note [shift/reduce conflicts]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The 'happy' tool turns this grammar into an efficient parser that follows the
+shift-reduce parsing model. There's a parse stack that contains items parsed so
+far (both terminals and non-terminals). Every next token produced by the lexer
+results in one of two actions:
-If you modify this parser and add a conflict, please update this comment.
-You can learn more about the conflicts by passing 'happy' the -i flag:
+ SHIFT: push the token onto the parse stack
- happy -agc --strict compiler/GHC/Parser.y -idetailed-info
+ REDUCE: pop a few items off the parse stack and combine them
+ with a function (reduction rule)
-How is this section formatted? Look up the state the conflict is
-reported at, and copy the list of applicable rules (at the top, without the
-rule numbers). Mark *** for the rule that is the conflicting reduction (that
-is, the interpretation which is NOT taken). NB: Happy doesn't print a rule
-in a state if it is empty, but you should include it in the list (you can
-look these up in the Grammar section of the info file).
+However, sometimes it's unclear which of the two actions to take.
+Consider this code example:
-Obviously the state numbers are not stable across modifications to the parser,
-the idea is to reproduce enough information on each conflict so you can figure
-out what happened if the states were renumbered. Try not to gratuitously move
-productions around in this file.
+ if x then y else f z
+There are two ways to parse it:
+
+ (if x then y else f) z
+ if x then y else (f z)
+
+How is this determined? At some point, the parser gets to the following state:
+
+ parse stack: 'if' exp 'then' exp 'else' "f"
+ next token: "z"
+
+Scenario A (simplified):
+
+ 1. REDUCE, parse stack: 'if' exp 'then' exp 'else' exp
+ next token: "z"
+
+ 2. REDUCE, parse stack: exp
+ next token: "z"
+
+ 3. SHIFT, parse stack: exp "z"
+ next token: ...
+
+ 4. REDUCE, parse stack: exp
+ next token: ...
+
+ This way we get: (if x then y else f) z
+
+Scenario B (simplified):
+
+ 1. SHIFT, parse stack: 'if' exp 'then' exp 'else' "f" "z"
+ next token: ...
+
+ 2. REDUCE, parse stack: 'if' exp 'then' exp 'else' exp
+ next token: ...
+
+ 3. REDUCE, parse stack: exp
+ next token: ...
+
+ This way we get: if x then y else (f z)
+
+The end result is determined by the chosen action. When Happy detects a state
+in which both actions are possible, it reports a shift/reduce conflict. At the
+top of the file, we have the following directive:
+
+ %expect 0
+
+It means that we expect no unresolved shift/reduce conflicts in this grammar.
+If you modify the grammar and get shift/reduce conflicts, follow the steps
+below to resolve them.
+
+STEP ONE
+ is to figure out what causes the conflict.
+ That's where the -i flag comes in handy:
+
+ happy -agc --strict compiler/GHC/Parser.y -idetailed-info
+
+ By analysing the output of this command, you can figure out which reduction
+ rule causes the issue. At the top of the generated report, you will see a
+ line like this:
+
+ state 147 contains 67 shift/reduce conflicts.
+
+ Scroll down to section State 147 (in your case it could be a different
+ state). The start of the section lists the reduction rules that can fire
+ and shows their context:
+
+ exp10 -> fexp . (rule 492)
+ fexp -> fexp . aexp (rule 498)
+ fexp -> fexp . PREFIX_AT atype (rule 499)
+
+ And then, for every token, it tells you the parsing action:
+
+ ']' reduce using rule 492
+ '::' reduce using rule 492
+ '(' shift, and enter state 178
+ QVARID shift, and enter state 44
+ DO shift, and enter state 182
+ ...
+
+ But if you look closer, some of these tokens also have another parsing action
+ in parentheses:
+
+ QVARID shift, and enter state 44
+ (reduce using rule 492)
+
+ That's how you know rule 492 is causing trouble.
+ Scroll back to the top to see what this rule is:
+
+ ----------------------------------
+ Grammar
+ ----------------------------------
+ ...
+ ...
+ exp10 -> fexp (492)
+ optSemi -> ';' (493)
+ ...
+ ...
+
+ Hence the shift/reduce conflict is caused by this parser production:
+
+ exp10 :: { ECP }
+ : '-' fexp { ... }
+ | fexp { ... } -- problematic rule
+
+STEP TWO
+ is to mark the problematic rule with the %shift pragma. This signals to
+ 'happy' that any shift/reduce conflicts involving this rule must be resolved
+ in favor of a shift.
+
+STEP THREE
+ is to add a dedicated Note for this specific conflict, as is done for all
+ other conflicts below.
+-}
+
+{- Note [%shift: rule_activation -> {- empty -}]
+
+TODO (int-index)
+
+-}
+
+
+{-
-------------------------------------------------------------------------------
state 60 contains 1 shift/reduce conflict.
@@ -1682,7 +1801,8 @@ rule :: { LRuleDecl GhcPs }
-- Rules can be specified to be NeverActive, unlike inline/specialize pragmas
rule_activation :: { ([AddAnn],Maybe Activation) }
- : {- empty -} { ([],Nothing) }
+ -- See Note [%shift: rule_activation -> {- empty -}]
+ : {- empty -} %shift { ([],Nothing) }
| rule_explicit_activation { (fst $1,Just (snd $1)) }
-- This production is used to parse the tilde syntax in pragmas such as
@@ -1718,9 +1838,12 @@ rule_foralls :: { ([AddAnn], Maybe [LHsTyVarBndr () GhcPs], [LRuleBndr GhcPs]) }
>> return ([mu AnnForall $1,mj AnnDot $3,
mu AnnForall $4,mj AnnDot $6],
Just (mkRuleTyVarBndrs $2), mkRuleBndrs $5) }
- | 'forall' rule_vars '.' { ([mu AnnForall $1,mj AnnDot $3],
+
+ -- See Note [%shift: rule_foralls -> 'forall' rule_vars '.']
+ | 'forall' rule_vars '.' %shift { ([mu AnnForall $1,mj AnnDot $3],
Nothing, mkRuleBndrs $2) }
- | {- empty -} { ([], Nothing, []) }
+ -- See Note [%shift: rule_foralls -> {- empty -}]
+ | {- empty -} %shift { ([], Nothing, []) }
rule_vars :: { [LRuleTyTmVar] }
: rule_var rule_vars { $1 : $2 }
@@ -1954,7 +2077,8 @@ is connected to the first type too.
-}
type :: { LHsType GhcPs }
- : btype { $1 }
+ -- See Note [%shift: type -> btype]
+ : btype %shift { $1 }
| btype '->' ctype {% ams $1 [mu AnnRarrow $2] -- See note [GADT decl discards annotations]
>> ams (sLL $1 $> $ HsFunTy noExtField HsUnrestrictedArrow $1 $3)
[mu AnnRarrow $2] }
@@ -1970,7 +2094,8 @@ btype :: { LHsType GhcPs }
: infixtype {% runPV $1 }
infixtype :: { forall b. DisambTD b => PV (Located b) }
- : ftype { $1 }
+ -- See Note [%shift: infixtype -> ftype]
+ : ftype %shift { $1 }
| ftype tyop infixtype { $1 >>= \ $1 ->
$3 >>= \ $3 ->
mkHsOpTyPV $1 $2 $3 }
@@ -1999,7 +2124,8 @@ tyop :: { Located RdrName }
atype :: { LHsType GhcPs }
: ntgtycon { sL1 $1 (HsTyVar noExtField NotPromoted $1) } -- Not including unit tuples
- | tyvar { sL1 $1 (HsTyVar noExtField NotPromoted $1) } -- (See Note [Unit tuples])
+ -- See Note [%shift: atype -> tyvar]
+ | tyvar %shift { sL1 $1 (HsTyVar noExtField NotPromoted $1) } -- (See Note [Unit tuples])
| '*' {% do { warnStarIsType (getLoc $1)
; return $ sL1 $1 (HsStarTy noExtField (isUnicode $1)) } }
@@ -2485,7 +2611,8 @@ exp :: { ECP }
ams (sLL $1 $> $ HsCmdArrApp noExtField $3 $1
HsHigherOrderApp False)
[mu AnnRarrowtail $2] }
- | infixexp { $1 }
+ -- See Note [%shift: exp -> infixexp]
+ | infixexp %shift { $1 }
| exp_prag(exp) { $1 } -- See Note [Pragmas and operator fixity]
infixexp :: { ECP }
@@ -2513,11 +2640,13 @@ exp_prag(e) :: { ECP }
(fst $ unLoc $1) }
exp10 :: { ECP }
- : '-' fexp { ECP $
+ -- See Note [%shift: exp10 -> '-' fexp]
+ : '-' fexp %shift { ECP $
unECP $2 >>= \ $2 ->
amms (mkHsNegAppPV (comb2 $1 $>) $2)
[mj AnnMinus $1] }
- | fexp { $1 }
+ -- See Note [%shift: exp10 -> fexp]
+ | fexp %shift { $1 }
optSemi :: { ([Located Token],Bool) }
: ';' { ([$1],True) }
@@ -2708,7 +2837,8 @@ aexp1 :: { ECP }
aexp2 :: { ECP }
: qvar { ECP $ mkHsVarPV $! $1 }
| qcon { ECP $ mkHsVarPV $! $1 }
- | ipvar { ecpFromExp $ sL1 $1 (HsIPVar noExtField $! unLoc $1) }
+ -- See Note [%shift: aexp2 -> ipvar]
+ | ipvar %shift { ecpFromExp $ sL1 $1 (HsIPVar noExtField $! unLoc $1) }
| overloaded_label { ecpFromExp $ sL1 $1 (HsOverLabel noExtField Nothing $! unLoc $1) }
| literal { ECP $ mkHsLitPV $! $1 }
-- This will enable overloaded strings permanently. Normally the renamer turns HsString
@@ -2750,7 +2880,8 @@ aexp2 :: { ECP }
| SIMPLEQUOTE qcon {% fmap ecpFromExp $ ams (sLL $1 $> $ HsBracket noExtField (VarBr noExtField True (unLoc $2))) [mj AnnSimpleQuote $1,mj AnnName $2] }
| TH_TY_QUOTE tyvar {% fmap ecpFromExp $ ams (sLL $1 $> $ HsBracket noExtField (VarBr noExtField False (unLoc $2))) [mj AnnThTyQuote $1,mj AnnName $2] }
| TH_TY_QUOTE gtycon {% fmap ecpFromExp $ ams (sLL $1 $> $ HsBracket noExtField (VarBr noExtField False (unLoc $2))) [mj AnnThTyQuote $1,mj AnnName $2] }
- | TH_TY_QUOTE {- nothing -} {% reportEmptyDoubleQuotes (getLoc $1) }
+ -- See Note [%shift: aexp2 -> TH_TY_QUOTE]
+ | TH_TY_QUOTE %shift {% reportEmptyDoubleQuotes (getLoc $1) }
| '[|' exp '|]' {% runPV (unECP $2) >>= \ $2 ->
fmap ecpFromExp $
ams (sLL $1 $> $ HsBracket noExtField (ExpBr noExtField $2))
@@ -2892,7 +3023,8 @@ tup_tail :: { forall b. DisambECP b => PV [Located (Maybe (Located b))] }
return ((L (gl $1) (Just $1)) : snd $2) }
| texp { unECP $1 >>= \ $1 ->
return [L (gl $1) (Just $1)] }
- | {- empty -} { return [noLoc Nothing] }
+ -- See Note [%shift: tup_tail -> {- empty -}]
+ | {- empty -} %shift { return [noLoc Nothing] }
-----------------------------------------------------------------------------
-- List expressions
@@ -3403,7 +3535,8 @@ child.
-}
qtyconop :: { Located RdrName } -- Qualified or unqualified
- : qtyconsym { $1 }
+ -- See Note [%shift: qtyconop -> qtyconsym]
+ : qtyconsym %shift { $1 }
| '`' qtycon '`' {% ams (sLL $1 $> (unLoc $2))
[mj AnnBackquote $1,mj AnnVal $2
,mj AnnBackquote $3] }
@@ -3570,7 +3703,8 @@ special_id
| 'capi' { sL1 $1 (fsLit "capi") }
| 'prim' { sL1 $1 (fsLit "prim") }
| 'javascript' { sL1 $1 (fsLit "javascript") }
- | 'group' { sL1 $1 (fsLit "group") }
+ -- See Note [%shift: special_id -> 'group']
+ | 'group' %shift { sL1 $1 (fsLit "group") }
| 'stock' { sL1 $1 (fsLit "stock") }
| 'anyclass' { sL1 $1 (fsLit "anyclass") }
| 'via' { sL1 $1 (fsLit "via") }
=====================================
hadrian/cabal.project
=====================================
@@ -1,7 +1,7 @@
packages: ./
-- This essentially freezes the build plan for hadrian
-index-state: 2020-06-16T03:59:14Z
+index-state: 2020-08-25T12:30:13Z
-- N.B. Compile with -O0 since this is not a performance-critical executable
-- and the Cabal takes nearly twice as long to build with -O1. See #16817.
=====================================
hadrian/hadrian.cabal
=====================================
@@ -148,7 +148,7 @@ executable hadrian
, transformers >= 0.4 && < 0.6
, unordered-containers >= 0.2.1 && < 0.3
build-tools: alex >= 3.1
- , happy >= 1.19.10
+ , happy >= 1.20.0
ghc-options: -Wall
-Wincomplete-record-updates
-Wredundant-constraints
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/81f6b2e28d2626529dc46291b1d3a2bbbbdde11c...3d9297cd4525a2b0b8000580eaf26964cd0fabdc
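
For readers who want to replay Scenario A and Scenario B from Note [shift/reduce conflicts] by hand, here is a minimal sketch of a toy shift-reduce loop. It is not GHC's parser and not the code that 'happy' generates; every name in it (Token, Exp, Item, Policy, parse, example) is hypothetical and exists only for this illustration. The grammar is cut down to variables, application and if/then/else, and the conflict is resolved by an explicit policy argument instead of a %shift annotation.

```haskell
-- Toy illustration only: all names are hypothetical, and this is not
-- the parser that 'happy' generates for GHC.

data Token = TIf | TThen | TElse | TVar String
  deriving (Eq, Show)

data Exp = Var String | App Exp Exp | If Exp Exp Exp
  deriving (Eq, Show)

-- An entry on the parse stack: a terminal or an already-parsed expression.
data Item = Tok Token | E Exp
  deriving (Eq, Show)

-- How to resolve the shift/reduce conflict after a complete if/then/else.
data Policy = PreferShift | PreferReduce
  deriving (Eq, Show)

-- The parse stack is a list whose head is the top of the stack.
parse :: Policy -> [Token] -> Maybe Exp
parse policy = go []
  where
    go :: [Item] -> [Token] -> Maybe Exp
    -- REDUCE: a variable token becomes an atomic expression.
    go (Tok (TVar v) : rest) input = go (E (Var v) : rest) input
    -- REDUCE: two adjacent expressions form an application.
    go (E arg : E fun : rest) input = go (E (App fun arg) : rest) input
    -- Conflict: a complete if/then/else is on the stack. If the next token
    -- is a variable we may either SHIFT it (it becomes an argument inside
    -- the else branch) or REDUCE the conditional first.
    go stack@(E c : Tok TElse : E b : Tok TThen : E a : Tok TIf : rest) input =
      case (input, policy) of
        (t@(TVar _) : more, PreferShift) -> go (Tok t : stack) more
        _                                -> go (E (If a b c) : rest) input
    -- SHIFT: push the next token onto the parse stack.
    go stack (t : more) = go (Tok t : stack) more
    -- Accept when exactly one expression remains and the input is consumed.
    go [E e] [] = Just e
    go _     [] = Nothing

-- Token stream for: if x then y else f z
example :: [Token]
example = [TIf, TVar "x", TThen, TVar "y", TElse, TVar "f", TVar "z"]
```

In GHCi, parse PreferReduce example yields the Scenario A tree, App (If (Var "x") (Var "y") (Var "f")) (Var "z"), while parse PreferShift example yields the Scenario B tree, If (Var "x") (Var "y") (App (Var "f") (Var "z")). The second result corresponds to resolving the conflict in favor of a shift, which is what the %shift pragma requests.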