<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>My guess is that most of the "noise" is not run time, but the
      compiled code changing in hard-to-predict ways.</p>
    <p><a class="moz-txt-link-freetext" href="https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1776/diffs">https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1776/diffs</a>
      for example was a very small MR that took *months* of on-off work
      to get passing metrics tests. In the end, binding `is_boot` twice
      helped a bit, and dumb luck helped a little bit more. No matter
      how you analyze that, that's a lot of pain for what's manifestly a
      performance-irrelevant MR. No one is writing 10,000 default
      methods or whatever could possibly make this micro-optimizing
      worth it!</p>
    <p>Perhaps this is an extreme example, but my rough sense is that
      it's not an isolated outlier.<br>
    </p>
    <p>John<br>
    </p>
    <div class="moz-cite-prefix">On 3/18/21 1:39 PM, davean wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CABFs2VhCGMLF8hv64m4L-b1ORgn_jTO-eo7OiOXWjPVvjuMmLg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div>I left wiggle room for things like longer wall time
          causing more time events in the IO Manager/RTS, which can be a
          thermal/HW issue.</div>
        <div>They're small and indirect, though.<br>
        </div>
        <div><br>
        </div>
        <div>-davean<br>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, Mar 18, 2021 at 1:37
          PM Sebastian Graf <<a href="mailto:sgraf1337@gmail.com"
            moz-do-not-send="true">sgraf1337@gmail.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div dir="ltr">To be clear: All performance tests that run as
            part of CI measure allocations only. No wall clock time.<br>
            Those measurements are (mostly) deterministic and
            reproducible between compiles of the same worktree and not
            impacted by thermal issues/hardware at all.<br>
          </div>
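          <p>As a rough illustration of what an allocation-only check
            amounts to, here is a minimal sketch that compares a measured
            byte count against a baseline using a percentage window. The
            function names and the numbers are hypothetical, and this is not
            the GHC testsuite driver, only the general shape of such a
            comparison.</p>
<pre>
```python
def percent_change(baseline, measured):
    """Relative change of measured versus baseline, in percent."""
    return (measured - baseline) / baseline * 100.0


def exceeds_window(baseline, measured, tolerance_pct):
    """True when the change falls outside the +/- tolerance window."""
    return abs(percent_change(baseline, measured)) > tolerance_pct


# Hypothetical numbers: a baseline allocation count and a new measurement.
baseline_bytes = 1_000_000_000
measured_bytes = 1_030_000_000

change = percent_change(baseline_bytes, measured_bytes)
print(f"change: {change:+.2f}%")
for tolerance in (2, 5, 10):
    outside = exceeds_window(baseline_bytes, measured_bytes, tolerance)
    verdict = "fail" if outside else "pass"
    print(f"tolerance {tolerance}%: {verdict}")
```
</pre>
          <p>Because the input is a deterministic byte count rather than a
            timing, the same worktree should give the same verdict on every
            run, whatever window is chosen.</p>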
          <br>
          <div class="gmail_quote">
            <div dir="ltr" class="gmail_attr">On Thu, Mar 18, 2021 at
              6:09 PM davean <<a
                href="mailto:davean@xkcd.com" target="_blank"
                moz-do-not-send="true">davean@xkcd.com</a>> wrote:<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div dir="ltr">
                <div>That really shouldn't be near system noise for a
                  well-constructed performance test. You might be seeing
                  things like thermal issues, etc., though; good
                  benchmarking is a serious subject.</div>
                <div>Also, we're not talking about wall clock tests, we're
                  talking about specific metrics. The machines do tend to be
                  bare metal, but many of these metrics are entirely
                  independent of CPU performance, memory timing,
                  etc. Well, not quite, but that's a longer discussion.</div>
                <div><br>
                </div>
                <div>The investigation of Haskell code performance is a
                  very good thing to do BTW, but you'd still want to
                  avoid regressions in the improvements you made. How
                  well we can do that and the cost of it is the primary
                  issue here.<br>
                </div>
                <div><br>
                </div>
                <div>-davean<br>
                </div>
                <br>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Wed, Mar 17, 2021
                  at 6:22 PM Karel Gardas <<a
                    href="mailto:karel.gardas@centrum.cz"
                    target="_blank" moz-do-not-send="true">karel.gardas@centrum.cz</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px 0.8ex;border-left:1px solid
                  rgb(204,204,204);padding-left:1ex">On 3/17/21 4:16 PM,
                  Andreas Klebinger wrote:<br>
                  > Now that isn't really an issue anyway I think.
                  The question is rather is<br>
                  > 2% a large enough regression to worry about? 5%?
                  10%?<br>
                  <br>
                  5-10% is still around system noise even on a lightly
                  loaded workstation.<br>
                  I'm not sure whether CI runs on shared cloud
                  resources, where it may be<br>
                  even higher.<br>
                  <br>
                  I've done a simple experiment of pinning ghc while it
                  compiles ghc-cabal, and I've<br>
                  been able to "speed" it up by 5-10% on a W-2265.<br>
                  <br>
                  Also, following this CI/performance regression
                  discussion, I'm not entirely sure<br>
                  whether this is not just a witch-hunt that mostly hurts
                  the most active GHC<br>
                  developers. Another idea may be to give up on CI doing
                  perf regression testing<br>
                  altogether and invest the saved resources into proper
                  investigation of<br>
                  GHC/Haskell program performance. I'm not sure whether
                  this would not be more<br>
                  beneficial in the longer term.<br>
                  <br>
                  Just one random number thrown into the ring: Linux's
                  perf claims that<br>
                  nearly every second L3 cache access in the example
                  above ends in a cache<br>
                  miss. Is that a good number or a bad number? See the
                  stats below (perf stat -d<br>
                  on ghc with '+RTS -T -s -RTS').<br>
                  <br>
                  Good luck to anybody working on that!<br>
                  <br>
                  Karel<br>
                  <br>
                  <br>
<pre>
Linking utils/ghc-cabal/dist/build/tmp/ghc-cabal ...
  61,020,836,136 bytes allocated in the heap
   5,229,185,608 bytes copied during GC
     301,742,768 bytes maximum residency (19 sample(s))
       3,533,000 bytes maximum slop
             840 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2012 colls,     0 par    5.725s   5.731s     0.0028s    0.1267s
  Gen  1        19 colls,     0 par    1.695s   1.696s     0.0893s    0.2636s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   27.849s  ( 32.163s elapsed)
  GC      time    7.419s  (  7.427s elapsed)
  EXIT    time    0.000s  (  0.010s elapsed)
  Total   time   35.269s  ( 39.601s elapsed)

  Alloc rate    2,191,122,004 bytes per MUT second

  Productivity  79.0% of total user, 81.2% of total elapsed
</pre>
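                  <p>The productivity line can be re-derived from the
                    timing rows above. This is a small sketch that assumes
                    productivity is simply the mutator's share of total time,
                    which matches the printed figures here; the variable names
                    are illustrative only.</p>
<pre>
```python
# Figures copied from the +RTS -s output above: (CPU seconds, elapsed seconds).
mut_cpu, mut_elapsed = 27.849, 32.163
gc_cpu, gc_elapsed = 7.419, 7.427
total_cpu, total_elapsed = 35.269, 39.601

# Assumption: productivity is the mutator's share of total time.
productivity_cpu = mut_cpu / total_cpu * 100
productivity_elapsed = mut_elapsed / total_elapsed * 100

# The remainder of the CPU time is almost entirely garbage collection.
gc_share_cpu = gc_cpu / total_cpu * 100

print(f"productivity: {productivity_cpu:.1f}% of user, "
      f"{productivity_elapsed:.1f}% of elapsed")
print(f"GC share of CPU time: {gc_share_cpu:.1f}%")
```
</pre>
                  <p>On these numbers, roughly a fifth of the CPU time goes
                    to garbage collection.</p>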
                  <br>
                  <br>
                   Performance counter stats for<br>
                  '/export/home/karel/sfw/ghc-8.10.3/bin/ghc -H32m -O
                  -Wall -optc-Wall -O0<br>
                  -hide-all-packages -package ghc-prim -package base
                  -package binary<br>
                  -package array -package transformers -package time
                  -package containers<br>
                  -package bytestring -package deepseq -package process
                  -package pretty<br>
                  -package directory -package filepath -package
                  template-haskell -package<br>
                  unix --make utils/ghc-cabal/Main.hs -o<br>
                  utils/ghc-cabal/dist/build/tmp/ghc-cabal
                  -no-user-package-db -Wall<br>
                  -fno-warn-unused-imports
                  -fno-warn-warnings-deprecations<br>
                  -DCABAL_VERSION=3,4,0,0 -DBOOTSTRAPPING -odir
                  bootstrapping -hidir<br>
                  bootstrapping
                  libraries/Cabal/Cabal/Distribution/Fields/Lexer.hs<br>
                  -ilibraries/Cabal/Cabal -ilibraries/binary/src
                  -ilibraries/filepath<br>
                  -ilibraries/hpc -ilibraries/mtl -ilibraries/text/src<br>
                  libraries/text/cbits/cbits.c -Ilibraries/text/include<br>
                  -ilibraries/parsec/src +RTS -T -s -RTS':<br>
                  <br>
<pre>
         39,632.99 msec task-clock                #    0.999 CPUs utilized
            17,191      context-switches          #    0.434 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           899,930      page-faults               #    0.023 M/sec
   177,636,979,975      cycles                    #    4.482 GHz                      (87.54%)
   181,945,795,221      instructions              #    1.02  insn per cycle           (87.59%)
    34,033,574,511      branches                  #  858.718 M/sec                    (87.42%)
     1,664,969,299      branch-misses             #    4.89% of all branches          (87.48%)
    41,522,737,426      L1-dcache-loads           # 1047.681 M/sec                    (87.53%)
     2,675,319,939      L1-dcache-load-misses     #    6.44% of all L1-dcache hits    (87.48%)
       372,370,395      LLC-loads                 #    9.395 M/sec                    (87.49%)
       173,614,140      LLC-load-misses           #   46.62% of all LL-cache hits     (87.46%)

      39.663103602 seconds time elapsed

      38.288158000 seconds user
       1.358263000 seconds sys
</pre>
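                  <p>The derived ratios in the perf output, including the
                    "nearly every second L3 cache access" figure, can be
                    checked from the raw counters. A minimal sketch follows;
                    the dictionary and helper names are illustrative, and the
                    values are copied from the output above.</p>
<pre>
```python
# Counter values copied from the perf stat -d output above.
counters = {
    "cycles": 177_636_979_975,
    "instructions": 181_945_795_221,
    "branches": 34_033_574_511,
    "branch-misses": 1_664_969_299,
    "L1-dcache-loads": 41_522_737_426,
    "L1-dcache-load-misses": 2_675_319_939,
    "LLC-loads": 372_370_395,
    "LLC-load-misses": 173_614_140,
}


def ratio(numerator, denominator):
    """Quotient of two named counters."""
    return counters[numerator] / counters[denominator]


ipc = ratio("instructions", "cycles")
branch_miss = ratio("branch-misses", "branches")
l1_miss = ratio("L1-dcache-load-misses", "L1-dcache-loads")
llc_miss = ratio("LLC-load-misses", "LLC-loads")

print(f"instructions per cycle: {ipc:.2f}")
print(f"branch miss ratio:      {branch_miss:.2%}")
print(f"L1 dcache miss ratio:   {l1_miss:.2%}")
print(f"LLC miss ratio:         {llc_miss:.2%}")
```
</pre>
                  <p>The last ratio comes out just under one half. The
                    percentages in parentheses in the perf output show how
                    long each counter was actually running, so the counts are
                    scaled estimates rather than exact totals.</p>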
                  _______________________________________________<br>
                  ghc-devs mailing list<br>
                  <a href="mailto:ghc-devs@haskell.org" target="_blank"
                    moz-do-not-send="true">ghc-devs@haskell.org</a><br>
                  <a
                    href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs"
                    rel="noreferrer" target="_blank"
                    moz-do-not-send="true">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
                </blockquote>
              </div>
            </blockquote>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
  </body>
</html>