[RFC] Compiler pipeline timings visualization
Simon Peyton Jones
simonpj at microsoft.com
Fri May 1 10:10:29 UTC 2020
This sounds great!
Event logging needn't add much cost if we don't log much -- eg just compiler phases. It simply depends on how many events you generate!
For naïve uses, the log you show in [9] is a bit complicated. Something even simpler might be good.
Simon
| -----Original Message-----
| From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of Andreas
| Klebinger
| Sent: 01 May 2020 11:04
| To: Sergej Jaskiewicz <jaskiewiczs at icloud.com>
| Cc: ghc-devs at haskell.org
| Subject: Re: [RFC] Compiler pipeline timings visualization
|
| Hi Sergej,
|
| I think this is a good idea in general, and it seems you did some great
| work there already.
| Something like this can also help with pinpointing performance issues
| inside the compiler
| so would not just be useful to end users.
|
| intuitively I would assume that instead of adding another way
| to produce compiler events we should:
| * Ship GHC with eventlog enabled by default
| * Provide an official converter from eventLog to the chrome trace format.
|
| The eventlog format is quite flexible, and if it's unsuitable to what
| you want I would
| prefer to extend it rather than adding support for additional formats
| inside GHC itself.
|
| This way we:
| * Continue to have one unified way to dump events from haskell apps (the
| eventlog)
| * Users need not go to third party apps, as the converter could
| reasonably come with GHC (like hp2ps)
| * We are free to include information that can't (easily) be encoded in
| the chrome format.
|
| The downside is that users will have to invoke ghc, followed by some
| other tool to get the
| chrome trace, but to me that seems like a worthwhile tradeoff for
| keeping the compiler and
| various visualization formats somewhat separated.
|
| The obvious question there is how much enabling the eventlog by default
| would impact non-logging ghc
| invocations. I have not measured this and it might rule out this
| approach if it has a big impact and isn't
| easily corrected.
|
| As a last point I want to encourage you to open a ticket about this.
| Mailing list threads tend to be harder to follow and find down the line
| then tickets in my experience.
|
| Cheers,
| Andreas
|
| Sergej Jaskiewicz via ghc-devs schrieb am 01.05.2020 um 11:09:
| > tl;dr: I propose adding a new GHC flag for generating a log that allows
| > visualizing how much time each stage in the compiler pipleline took,
| similar
| > to Clang's -ftime-trace.
| >
| > Hello, everyone.
| >
| > I'd like to tell you about a feature I've been working on recently,
| namely,
| > the ability to generate and visualize how long each stage of the
| compilation
| > pipeline takes to complete. This is basically about the same as
| > the -ftime-trace flag that has landed in Clang several months ago [1].
| >
| > The initial motivation for this feature was the desire to understand
| why
| > compilation times are so long when using LLVM as backend. But later I
| realized
| > that this functionality is useful on its own for end users, not just
| GHC devs,
| > so it would make sense to add a new flag -ddump-time-trace.
| >
| > Since not only does Haskell have complex type system, but also there is
| > a variety of language extensions, we, the Haskell users, often
| experience
| > very long compilation times. For instance, the usage of the
| TypeFaimilies
| > extension can considerably slow down the compilation process [2].
| > It is useful to understand how you can fix your code so that it
| compiles faster.
| >
| > There are two options for that right now:
| > - Building GHC from source for profiling and using the just-built GHC
| for
| > compiling your problem code.
| > - Building the compiler from source with event log support [3].
| >
| > The former option just doesn't do it, since the output would be
| > "which GHC function calls took how much time", but there'd be no
| information
| > about which part of the user code was being compiled.
| >
| > The latter option is much closer to what we need. If we link the GHC
| executable
| > with the -eventlog flag, then various significant events will be
| written to
| > a special log file. For example, "Parser began parsing the Main.hs file
| after
| > 5 ms since GHC has started, and ended parsing it 3 ms after that".
| > The resulting log file can be parsed with the ghc-events library [4],
| and also
| > visualized using the ThreadScope app [5].
| >
| > Bu this approach has its disadvantages.
| >
| > Firstly, if the user wants visualization, they'd have to install
| ThreadScope.
| > Some companies' internal policies prohibit installing third-party apps
| from
| > the internet. It would be good if we could generate a log that could be
| > visualized on any computer with a browser. For that we could use
| > the Chrome Trace Event format [6]. This is an ordinary JSON file with a
| specific
| > structure that can be viewed in the Chrome browser by going to
| > the chrome://tracing page, or on https://speedscope.app. A file in
| exactly this
| > format would be generated by Clang if you passed it the -ftime-trace
| flag.
| >
| > Secondly, the event log contains many events that are not relevant to
| the end
| > user, for example, thread creation, memory allocation, etc.
| >
| > As an initial proof of concept, I've developed a command line tool for
| > transforming event log files to Chrome Trace Event files [7].
| >
| > Thirdly, in order for the event log to be generated, you'd still have
| to build
| > GHC from source. The majority of the GHC users won't go this way. Not
| only
| > would it require some basic understanding of the GHC building process,
| but also
| > building itself takes quite some time. It would be great if the needed
| > functionality came with GHC out of the box.
| >
| > This is why I've added support for generating Trace Event files into my
| fork
| > of GHC [8], and I would like to propose including this feature into the
| mainline
| > GHC.
| >
| > (Note that my implementation is still a bit buggy, for example, it only
| works
| > in -j mode. This will be fixed.)
| >
| > You can now execute the following command:
| >
| > ghc -ddump-to-file -ddump-file-prefix="Main." -ddump-time-trace -j -O3
| Main.hs
| >
| > And receive this image [9]. Here, we've compiled two modules, one of
| which
| > depends on the other.
| >
| > One more difference from event log is that now various metadata
| > (like file and module names) are emitted as a separate JSON attribute,
| instead
| > of being encoded in the name of an event. For example, parsing the
| Main.hs file
| > and parsing the QSort.hs file in one compilation are events with the
| same name,
| > but different metadata. We can group them together if we want to know
| how much
| > time the parsing took overall. The event log format doesn't allow us to
| do it.
| >
| > So, we can now generate a Trace Event file from a single GHC
| invocation.
| > However, most projects use build systems that invoke the compiler many
| times.
| > It would be good if we could form a log for the whole project.
| >
| > This is solved by merging logs. However, there is one subtlety: the
| events in
| > logs use relative timestamps (the number of microseconds since the
| process
| > has started). To combine logs from multiple processes, we add a clock
| > synchronization point into each trace log in the form of an additional
| > 'beginningOfTime' JSON attribute that contains the absolute time when
| > the proccess has started.
| >
| > There is a Python script that performs the actual merging [10].
| >
| > This is how it looks like with multiple processes [11].
| >
| > Also, I've implemented generation of the 'beginningOfTime' attribute in
| > LLVM [12], so build systems could take advantage of that and combine
| GHC trace
| > logs with llc/opt trace logs when GHC uses LLVM as backend.
| >
| > Thoughts?
| >
| > Sergej.
| >
| > [1] https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-
| chart-profiler-for-Clang/
| > [2] https://gitlab.haskell.org/ghc/ghc/issues/8095
| > [3] https://gitlab.haskell.org/ghc/ghc/-/wikis/event-log
| > [4] http://hackage.haskell.org/package/ghc-events
| > [5] https://wiki.haskell.org/ThreadScope
| > [6] https://docs.google.com/document/d/1CvAClvFfyA5R-
| PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit
| > [7] https://github.com/broadwaylamb/ghc-eventlog-chrome
| > [8] https://gitlab.haskell.org/broadwaylamb/ghc/-/commits/timetrace-
| better-metadata
| > [9] https://user-images.githubusercontent.com/16309982/79079705-
| 39775e00-7d19-11ea-9507-eb0f11581c63.png
| > [10] https://github.com/broadwaylamb/merge_trace_events
| > [11] https://user-images.githubusercontent.com/16309982/79080338-
| ad673580-7d1c-11ea-9e30-5e6f72e77555.png
| > [12] https://github.com/llvm/llvm-
| project/commit/2899103108d38215af8aae377cd0e54794278209
| >
| > _______________________________________________
| > ghc-devs mailing list
| > ghc-devs at haskell.org
| > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
|
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
More information about the ghc-devs
mailing list