[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 2 commits: Profiling: Adds an option to not start time profiling at startup
Marge Bot (@marge-bot)
gitlab at gitlab.haskell.org
Tue Jan 16 05:38:46 UTC 2024
Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC
Commits:
f7059ad2 by Javier Sagredo at 2024-01-16T00:38:38-05:00
Profiling: Adds an option to not start time profiling at startup
Using the functionality provided by
d89deeba47ce04a5198a71fa4cbc203fe2c90794, this patch creates a new rts
flag `--no-automatic-time-samples` which disables the time profiling
when starting a program. It is then expected that the user starts it
whenever it is needed.
Fixes #24337
- - - - -
2c46fb3c by Simon Peyton Jones at 2024-01-16T00:38:38-05:00
Improve SpecConstr (esp nofib/spectral/ansi)
This MR makes three improvements to SpecConstr: see #24282
* It fixes an outright (and recently-introduced) bug in `betterPat`, which
was wrongly forgetting to compare the lengths of the argument lists.
* It enhances ConVal to inclue a boolean for work-free-ness, so that the
envt can contain non-work-free constructor applications, so that we
can do more: see Note [ConVal work-free-ness]
* It rejigs `subsumePats` so that it doesn't reverse the list. This can
make a difference because, when patterns overlap, we arbitrarily pick
the first. There is no "right" way, but this retains the old
pre-subsumePats behaviour, thereby "fixing" the regression in #24282.
Nofib results
+========================================
| spectral/ansi -21.14%
| spectral/hartel/comp_lab_zift -0.12%
| spectral/hartel/parstof +0.09%
| spectral/last-piece -2.32%
| spectral/multiplier +6.03%
| spectral/para +0.60%
| spectral/simple -0.26%
+========================================
| geom mean -0.18%
+----------------------------------------
The regression in `multiplier` is sad, but it simply replicates GHC's
previous behaviour (e.g. GHC 9.6).
- - - - -
9 changed files:
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Rules.hs
- compiler/GHC/CoreToStg/Prep.hs
- docs/users_guide/9.10.1-notes.rst
- docs/users_guide/profiling.rst
- libraries/base/src/GHC/Profiling.hs
- rts/Profiling.c
- rts/RtsFlags.c
- rts/include/rts/Flags.h
Changes:
=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -789,47 +789,70 @@ scTopBinds env (b:bs) = do { (usg, b', bs') <- scBind TopLevel env b $
* *
************************************************************************
-Note [Work-free values only in environment]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The sc_vals field keeps track of in-scope value bindings, so
-that if we come across (case x of Just y ->...) we can reduce the
-case from knowing that x is bound to a pair.
-
-But only *work-free* values are ok here. For example if the envt had
- x -> Just (expensive v)
-then we do NOT want to expand to
- let y = expensive v in ...
-because the x-binding still exists and we've now duplicated (expensive v).
-
-This seldom happens because let-bound constructor applications are
-ANF-ised, but it can happen as a result of on-the-fly transformations in
-SpecConstr itself. Here is #7865:
-
- let {
- a'_shr =
- case xs_af8 of _ {
- [] -> acc_af6;
- : ds_dgt [Dmd=<L,A>] ds_dgu [Dmd=<L,A>] ->
- (expensive x_af7, x_af7
- } } in
- let {
- ds_sht =
- case a'_shr of _ { (p'_afd, q'_afe) ->
- TSpecConstr_DoubleInline.recursive
- (GHC.Types.: @ GHC.Types.Int x_af7 wild_X6) (q'_afe, p'_afd)
- } } in
-
-When processed knowing that xs_af8 was bound to a cons, we simplify to
- a'_shr = (expensive x_af7, x_af7)
-and we do NOT want to inline that at the occurrence of a'_shr in ds_sht.
-(There are other occurrences of a'_shr.) No no no.
-
-It would be possible to do some on-the-fly ANF-ising, so that a'_shr turned
-into a work-free value again, thus
- a1 = expensive x_af7
- a'_shr = (a1, x_af7)
-but that's more work, so until its shown to be important I'm going to
-leave it for now.
+Note [ConVal work-free-ness]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The sc_vals field keeps track of in-scope value bindings, and is used in
+two ways:
+
+(1) To do case-of-known-constructor in a case expression. E.g. if sc_vals
+ includes [x :-> ConVal Just e], then we can simplify
+ case x of Just y -> ...
+ with the case-of-known-constructor transformation. (Yes this is
+ done by the Simplifier, but SpecConstr creates new opportunities when
+ it makes a specialised RHS for a function.)
+
+ For (1) it is crucial that the arguments are /work-free/; see (CV1)
+ below.
+
+(2) To figure out call pattresns. E.g. if sc_vals includes
+ [x :-> ConVal Just e], and we have call (f x), then we might want
+ to specialise `f (Just _)`
+
+ For (2) it is /not/ important that the constructor arguments are work-free;
+ indeed, it would be bad to insist on that. For example
+ let x = Just <expensive>
+ in ....(f x)...
+ Here we want to specialise for `f (Just _)`, and we won't do so if we
+ don't allow [x :-> ConVal Just e] into the environment. Does this ever happen?
+ Yes: see #24282.
+
+ (Yes, the Simplifier will ANF that let-binding, but SpecConstr can
+ make more: see (CV1) for an example.)
+
+Wrinkle:
+
+(CV1) Why is work-free-ness important for (1)? In the example in (1) above, of `e` is
+ expensive, we do /not/ want to simplify
+ case x of { Just y -> ... } ==> let y = e in ...
+ because the x-binding still exists and we've now duplicated `e`.
+
+ This seldom happens because let-bound constructor applications are ANF-ised, but
+ it can happen as a result of on-the-fly transformations in SpecConstr itself.
+ Here is #7865:
+
+ let { a'_shr =
+ case xs_af8 of _ {
+ [] -> acc_af6;
+ : ds_dgt [Dmd=<L,A>] ds_dgu [Dmd=<L,A>] ->
+ (expensive x_af7, x_af7
+ } } in
+ let { ds_sht =
+ case a'_shr of _ { (p'_afd, q'_afe) ->
+ TSpecConstr_DoubleInline.recursive
+ (GHC.Types.: @ GHC.Types.Int x_af7 wild_X6) (q'_afe, p'_afd)
+ } } in
+
+ When processed knowing that xs_af8 was bound to a cons, we simplify to
+ a'_shr = (expensive x_af7, x_af7)
+ and we do NOT want to inline that at the occurrence of a'_shr in ds_sht.
+ (There are other occurrences of a'_shr.) No no no.
+
+ It would be possible to do some on-the-fly ANF-ising, so that a'_shr turned
+ into a work-free value again, thus
+ a1 = expensive x_af7
+ a'_shr = (a1, x_af7)
+ but that's more work, so until its shown to be important I'm going to
+ leave it for now.
Note [Making SpecConstr keener]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -910,10 +933,6 @@ data ScEnv = SCE { sc_opts :: !SpecConstrOpts,
sc_vals :: ValueEnv,
-- Domain is OutIds (*after* applying the substitution)
-- Used even for top-level bindings (but not imported ones)
- -- The range of the ValueEnv is *work-free* values
- -- such as (\x. blah), or (Just v)
- -- but NOT (Just (expensive v))
- -- See Note [Work-free values only in environment]
sc_annotations :: UniqFM Name SpecConstrAnnotation
}
@@ -922,14 +941,22 @@ data ScEnv = SCE { sc_opts :: !SpecConstrOpts,
type HowBoundEnv = VarEnv HowBound -- Domain is OutVars
---------------------
-type ValueEnv = IdEnv Value -- Domain is OutIds
-data Value = ConVal AltCon [CoreArg] -- _Saturated_ constructors
- -- The AltCon is never DEFAULT
- | LambdaVal -- Inlinable lambdas or PAPs
+type ValueEnv = IdEnv Value -- Domain is OutIds
+
+data Value = ConVal -- Constructor application
+ Bool -- True <=> all args are work-free
+ -- See Note [ConVal work-free-ness]
+ AltCon -- Never DEFAULT
+ [CoreArg] -- Saturates the constructor
+ | LambdaVal -- Inlinable lambdas or PAPs
instance Outputable Value where
- ppr (ConVal con args) = ppr con <+> interpp'SP args
- ppr LambdaVal = text "<Lambda>"
+ ppr LambdaVal = text "<Lambda>"
+ ppr (ConVal wf con args) = ppr con <> braces pp_wf <+> interpp'SP args
+ where
+ pp_wf | wf = text "wf"
+ | otherwise = text "not-wf"
+
---------------------
initScOpts :: DynFlags -> Module -> SpecConstrOpts
@@ -1058,11 +1085,10 @@ extendBndr env bndr = (env { sc_subst = subst' }, bndr')
(subst', bndr') = substBndr (sc_subst env) bndr
extendValEnv :: ScEnv -> Id -> Maybe Value -> ScEnv
-extendValEnv env _ Nothing = env
-extendValEnv env id (Just cv)
- | valueIsWorkFree cv -- Don't duplicate work!! #7865
- = env { sc_vals = extendVarEnv (sc_vals env) id cv }
-extendValEnv env _ _ = env
+extendValEnv env id mb_val
+ = case mb_val of
+ Nothing -> env
+ Just cv -> env { sc_vals = extendVarEnv (sc_vals env) id cv }
extendCaseBndrs :: ScEnv -> OutExpr -> OutId -> AltCon -> [Var] -> (ScEnv, [Var])
-- When we encounter
@@ -1089,8 +1115,8 @@ extendCaseBndrs env scrut case_bndr con alt_bndrs
cval = case con of
DEFAULT -> Nothing
- LitAlt {} -> Just (ConVal con [])
- DataAlt {} -> Just (ConVal con vanilla_args)
+ LitAlt {} -> Just (ConVal True con [])
+ DataAlt {} -> Just (ConVal True con vanilla_args)
where
vanilla_args = map Type (tyConAppArgs (idType case_bndr)) ++
varsToCoreExprs alt_bndrs
@@ -1497,8 +1523,11 @@ scExpr' env (Let bind body)
scExpr' env (Case scrut b ty alts)
= do { (scrut_usg, scrut') <- scExpr env scrut
; case isValue (sc_vals env) scrut' of
- Just (ConVal con args) -> sc_con_app con args scrut'
- _other -> sc_vanilla scrut_usg scrut'
+ Just (ConVal args_are_work_free con args)
+ | args_are_work_free -> sc_con_app con args scrut'
+ -- Don't duplicate work!! #7865
+ -- See Note [ConVal work-free-ness] (1)
+ _other -> sc_vanilla scrut_usg scrut'
}
where
sc_con_app con args scrut' -- Known constructor; simplify
@@ -2608,7 +2637,8 @@ argToPat1 env in_scope val_env (Cast arg co) arg_occ arg_str
-- Check for a constructor application
-- NB: this *precedes* the Var case, so that we catch nullary constrs
argToPat1 env in_scope val_env arg arg_occ _arg_str
- | Just (ConVal (DataAlt dc) args) <- isValue val_env arg
+ | Just (ConVal _wf (DataAlt dc) args) <- isValue val_env arg
+ -- Ignore `_wf` here; see Note [ConVal work-free-ness] (2)
, not (ignoreDataCon env dc) -- See Note [NoSpecConstr]
, Just arg_occs <- mb_scrut dc
= do { let (ty_args, rest_args) = splitAtList (dataConUnivTyVars dc) args
@@ -2726,7 +2756,7 @@ wildCardPat ty str
isValue :: ValueEnv -> CoreExpr -> Maybe Value
isValue _env (Lit lit)
| litIsLifted lit = Nothing
- | otherwise = Just (ConVal (LitAlt lit) [])
+ | otherwise = Just (ConVal True (LitAlt lit) [])
isValue env (Var v)
| Just cval <- lookupVarEnv env v
@@ -2757,7 +2787,7 @@ isValue _env expr -- Maybe it's a constructor application
DataConWorkId con | args `lengthAtLeast` dataConRepArity con
-- Check saturated; might be > because the
-- arity excludes type args
- -> Just (ConVal (DataAlt con) args)
+ -> Just (ConVal (all exprIsWorkFree args) (DataAlt con) args)
DFunId {} -> Just LambdaVal
-- DFunId: see Note [Specialising on dictionaries]
@@ -2770,34 +2800,43 @@ isValue _env expr -- Maybe it's a constructor application
isValue _env _expr = Nothing
-valueIsWorkFree :: Value -> Bool
-valueIsWorkFree LambdaVal = True
-valueIsWorkFree (ConVal _ args) = all exprIsWorkFree args
-
betterPat :: InScopeSet -> CallPat -> CallPat -> Bool
-- pat1 f @a (Just @a (x::a))
-- is better than
-- pat2 f @Int (Just @Int (x::Int))
--- That is, we can instantiate pat1 to get pat2
+-- That is, we can instantiate pat1 to get pat2, using only type instantiate
-- See Note [Pattern duplicate elimination]
betterPat is (CP { cp_qvars = vs1, cp_args = as1 })
(CP { cp_qvars = vs2, cp_args = as2 })
+ | equalLength as1 as2
= case matchExprs ise vs1 as1 as2 of
Just (_, ms) -> all exprIsTrivial ms
Nothing -> False
+
+ | otherwise -- We must handle patterns of unequal length separately (#24282)
+ = False -- For the pattern with more args, the last arg is "interesting"
+ -- but the corresponding one on the other is "not interesting";
+ -- So we can't get from one to the other with only exprIsTrivial
+ -- instantiation. Example nofib/spectral/ansi, function `loop`:
+ -- P1: loop (I# x) (a : b)
+ -- P2: loop (I# y) -- Pattern eta-reduced
+ -- Neither is better than the other, in the sense of betterPat
where
ise = ISE (is `extendInScopeSetList` vs2) (const noUnfolding)
subsumePats :: InScopeSet -> [CallPat] -> [CallPat]
-- Remove any patterns subsumed by others
-- See Note [Pattern duplicate elimination]
-subsumePats is pats = foldr add [] pats
+-- Other than deleting subsumed patterns, this operation is a no-op;
+-- in particular it does not reverse the input. It should not matter
+-- but in #24282 it did; doing it this way keeps the existing behaviour.
+subsumePats is pats = foldl add [] pats
where
- add :: CallPat -> [CallPat] -> [CallPat]
- add ci [] = [ci]
- add ci1 (ci2:cis) | betterPat is ci2 ci1 = ci2:cis
- | betterPat is ci1 ci2 = ci1:cis
- | otherwise = ci2 : add ci1 cis
+ add :: [CallPat] -> CallPat -> [CallPat]
+ add [] ci = [ci]
+ add (ci1:cis) ci2 | betterPat is ci1 ci2 = ci1 : cis
+ | betterPat is ci2 ci1 = ci2 : cis
+ | otherwise = ci1 : add cis ci2
{-
Note [Pattern duplicate elimination]
=====================================
compiler/GHC/Core/Rules.hs
=====================================
@@ -605,10 +605,8 @@ isMoreSpecific :: InScopeSet -> CoreRule -> CoreRule -> Bool
isMoreSpecific _ (BuiltinRule {}) _ = False
isMoreSpecific _ (Rule {}) (BuiltinRule {}) = True
isMoreSpecific in_scope (Rule { ru_bndrs = bndrs1, ru_args = args1 })
- (Rule { ru_bndrs = bndrs2, ru_args = args2
- , ru_name = rule_name2, ru_rhs = rhs2 })
- = isJust (matchN in_scope_env
- rule_name2 bndrs2 args2 args1 rhs2)
+ (Rule { ru_bndrs = bndrs2, ru_args = args2 })
+ = isJust (matchExprs in_scope_env bndrs2 args2 args1)
where
full_in_scope = in_scope `extendInScopeSetList` bndrs1
in_scope_env = ISE full_in_scope noUnfoldingFun
=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -1983,8 +1983,12 @@ zipManyFloats :: [Floats] -> Floats
zipManyFloats = foldr zipFloats emptyFloats
mkNonRecFloat :: CorePrepEnv -> Demand -> Bool -> Id -> CpeRhs -> FloatingBind
-mkNonRecFloat env dmd is_unlifted bndr rhs = -- pprTraceWith "mkNonRecFloat" ppr $
- Float (NonRec bndr' rhs) bound info
+mkNonRecFloat env dmd is_unlifted bndr rhs
+ = -- pprTrace "mkNonRecFloat" (ppr bndr <+> ppr (bound,info)
+ -- <+> ppr is_lifted <+> ppr is_strict
+ -- <+> ppr ok_for_spec
+ -- $$ ppr rhs) $
+ Float (NonRec bndr' rhs) bound info
where
bndr' = setIdDemandInfo bndr dmd -- See Note [Pin demand info on floats]
(bound,info)
=====================================
docs/users_guide/9.10.1-notes.rst
=====================================
@@ -140,6 +140,9 @@ Runtime system
See :ghc-ticket:`23340`.
:rts-flag:`--nonmoving-dense-allocator-count=⟨count⟩` has been added to fine-tune this behaviour.
+- Add a :rts-flag:`--no-automatic-time-samples` flag which stops time profiling samples being automatically started on
+ startup. Time profiling can be controlled manually using functions in ``GHC.Profiling``.
+
``base`` library
~~~~~~~~~~~~~~~~
=====================================
docs/users_guide/profiling.rst
=====================================
@@ -1009,6 +1009,13 @@ There are three more options which relate to heap profiling:
option is enabled, it's expected that the user will manually start heap
profiling or request specific samples using functions from ``GHC.Profiling``.
+.. rts-flag:: --no-automatic-time-samples
+
+ :since: 9.10.1
+
+ Don't start time profiling from the start of program execution. If this
+ option is enabled, it's expected that the user will manually start time
+ profiling or request specific samples using functions from ``GHC.Profiling``.
.. rts-flag:: --null-eventlog-writer
=====================================
libraries/base/src/GHC/Profiling.hs
=====================================
@@ -20,7 +20,7 @@ import GHC.Base
foreign import ccall stopProfTimer :: IO ()
-- | Start attributing ticks to cost centres. This is called by the RTS on
--- startup.
+-- startup but can be disabled using the rts flag @--no-automatic-time-samples at .
--
-- @since 4.7.0.0
foreign import ccall startProfTimer :: IO ()
=====================================
rts/Profiling.c
=====================================
@@ -296,8 +296,10 @@ void
initTimeProfiling(void)
{
traceProfBegin();
- /* Start ticking */
- startProfTimer();
+ if (RtsFlags.ProfFlags.startTimeProfileAtStartup) {
+ /* Start ticking */
+ startProfTimer();
+ }
};
void
=====================================
rts/RtsFlags.c
=====================================
@@ -216,6 +216,7 @@ void initRtsFlagsDefaults(void)
RtsFlags.ProfFlags.doHeapProfile = false;
RtsFlags.ProfFlags.heapProfileInterval = USToTime(100000); // 100ms
RtsFlags.ProfFlags.startHeapProfileAtStartup = true;
+ RtsFlags.ProfFlags.startTimeProfileAtStartup = true;
#if defined(PROFILING)
RtsFlags.ProfFlags.showCCSOnException = false;
@@ -1154,6 +1155,12 @@ error = true;
RtsFlags.ProfFlags.startHeapProfileAtStartup = false;
break;
}
+ else if (strequal("no-automatic-time-samples",
+ &rts_argv[arg][2])) {
+ OPTION_SAFE;
+ RtsFlags.ProfFlags.startTimeProfileAtStartup = false;
+ break;
+ }
else {
OPTION_SAFE;
errorBelch("unknown RTS option: %s",rts_argv[arg]);
=====================================
rts/include/rts/Flags.h
=====================================
@@ -148,6 +148,7 @@ typedef struct _PROFILING_FLAGS {
Time heapProfileInterval; /* time between samples */
uint32_t heapProfileIntervalTicks; /* ticks between samples (derived) */
bool startHeapProfileAtStartup; /* true if we start profiling from program startup */
+ bool startTimeProfileAtStartup; /* true if we start profiling from program startup */
bool showCCSOnException;
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/465671991bb05c5dfb486ebb0d434ccd3a3b4198...2c46fb3cc0e87cf79d962c08e2a67a86440fd2d9
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/465671991bb05c5dfb486ebb0d434ccd3a3b4198...2c46fb3cc0e87cf79d962c08e2a67a86440fd2d9
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240116/9910a8be/attachment-0001.html>
More information about the ghc-commits
mailing list