[GHC] #14257: Heap profiling with ghc and hp2ps and strict function application ($!) gives samples out of sequence (regression)

GHC ghc-devs at haskell.org
Wed Sep 20 21:42:40 UTC 2017


#14257: Heap profiling with ghc and hp2ps and strict function application ($!)
gives samples out of sequence (regression)
-------------------------------------+-------------------------------------
        Reporter:  carlostome        |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Runtime System    |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #14006            |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 So the bad sample is being written by `endHeapProfiling` which gets its
 timestamp from `mut_user_time`. This is in contrast to the samples
 produced during the bulk of the program's execution, which come from
 `dumpCensus`, which takes its timestamp from `mut_user_time_until(t)`,
 where `t` comes from `getProcessTimes` (by way of `gct->gc_start_cpu`).

 I've added a bit of instrumentation,
 {{{#!patch
 --- a/rts/ProfHeap.c
 +++ b/rts/ProfHeap.c
 @@ -363,6 +363,12 @@ void endProfiling( void )
  static void
  printSample(bool beginSample, StgDouble sampleValue)
  {
 +    static StgDouble lastValue = 0;
 +    CHECK(sampleValue >= lastValue);
 +    printf("sampleValue=%f\n", sampleValue);
 +    lastValue = sampleValue;
 +    (void) lastValue;
 +
      fprintf(hp_file, "%s %f\n",
              (beginSample ? "BEGIN_SAMPLE" : "END_SAMPLE"),
              sampleValue);
 diff --git a/rts/Stats.c b/rts/Stats.c
 index 6a5f80130e..773c55e431 100644
 --- a/rts/Stats.c
 +++ b/rts/Stats.c
 @@ -77,6 +77,7 @@ Time stat_getElapsedTime(void)
  double
  mut_user_time_until( Time t )
  {
 +    printf("mut_user_time_until %f %f \n", TimeToSecondsDbl(t),
 TimeToSecondsDbl(stats.gc_cpu_ns));
      return TimeToSecondsDbl(t - stats.gc_cpu_ns);
      // heapCensus() time is included in GC_tot_cpu, so we don't need
      // to subtract it here.
 @@ -331,6 +332,7 @@ stat_endGC (Capability *cap, gc_thread *gct,
          stats.cumulative_par_balanced_copied_bytes +=
              stats.gc.par_balanced_copied_bytes;
      }
 +    printf("end gc %f\n", TimeToSecondsDbl(NSToTime(stats.gc.cpu_ns)));
      stats.gc_cpu_ns += stats.gc.cpu_ns;
      stats.gc_elapsed_ns += stats.gc.elapsed_ns;
 }}}

 Which produces the following,
 {{{
 ...
 end gc 0.000205
 end gc 0.000195
 mut_user_time_until 0.930493 0.603123
 sampleValue=0.327370
 sampleValue=0.327370
 end gc 0.000200
 True
 end gc 0.014411
 mut_user_time_until 0.945025 0.617734
 hi: internal error: ASSERTION FAILED: file rts/ProfHeap.c, line 367

     (GHC version 8.3.20170919 for x86_64_unknown_linux)
     Please report this as a GHC bug:
 http://www.haskell.org/ghc/reportabug
 Aborted
 }}}

 I believe what is happening here is that we are producing the second-to-
 last sample before `stat_endGC` has been run. This means that
 `stats.gc_cpu_ns` is stale as it does not include the time spent during
 this present GC. Then, some time later we go to emit the final sample.
 However, now `stats.gc_cpu_ns` includes the time spent in the final GC,
 meaning that it may appear that we have "gone backwards" in time.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14257#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list