A small but useful tool for performance characterisation

Sun Jan 5 03:34:07 UTC 2020

The "useful tools" page [1] has mentioned the ghc-utils repository, where the aforementioned script lives, for a few years now. That said, I get the impression that not many people have found it via this page. Everyone I know of who has used anything in ghc-utils discovered it by word of mouth.

I'm not sure what to do about this. The page isn't *that* buried: from the wiki home page one arrives at it via the link path Working Conventions/Various tools.

Cheers,

- Ben 

On January 4, 2020 8:51:07 PM EST, Richard Eisenberg <rae at richarde.dev> wrote:
>Hi Ben,
>
>This sounds great. Is there a place on the wiki to catalog tools like
>this?
>
>Thanks for telling us about it!
>Richard
>
>> On Jan 4, 2020, at 7:37 PM, Ben Gamari <ben at well-typed.com> wrote:
>> 
>> Hi everyone,
>> 
>> I have recently been doing a fair amount of performance characterisation
>> and have long wanted a convenient means of collecting GHC runtime
>> statistics for later analysis. For this I quickly developed a small
>> wrapper utility [1].
>> 
>> To see what it does, let's consider an example. Say we made a change to
>> GHC which we believe might affect the runtime performance of Program.hs.
>> We could quickly check this by running,
>> 
>>    $ ghc-before/_build/stage1/bin/ghc -O Program.hs
>>    $ ghc_perf.py -o before.json ./Program
>>    $ ghc-after/_build/stage1/bin/ghc -O Program.hs
>>    $ ghc_perf.py -o after.json ./Program
>> 
>> This will produce two files, before.json and after.json, which contain
>> the various runtime statistics emitted by +RTS -s --machine-readable.
>> These files are in the same format as is used by my nofib branch [2] and
>> therefore can be compared using `nofib-compare` from that branch.
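
As a rough illustration of what such a before/after comparison amounts to, here is a minimal sketch. It is not how `nofib-compare` works, and it assumes each file is a flat JSON object mapping metric names to numbers, which may not match the real format. The names `load_metrics` and `relative_changes` are hypothetical.

```python
import json
import sys


def load_metrics(path):
    """Load one metrics file (assumed: a flat JSON object of name -> number)."""
    with open(path) as f:
        return json.load(f)


def relative_changes(before, after):
    """Return {metric: (after - before) / before} for metrics present in both."""
    changes = {}
    for name in sorted(before.keys() & after.keys()):
        old, new = before[name], after[name]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (old, new)
        )
        # Skip non-numeric entries and zero baselines (no meaningful ratio).
        if not numeric or old == 0:
            continue
        changes[name] = (new - old) / old
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    before, after = load_metrics(sys.argv[1]), load_metrics(sys.argv[2])
    for name, change in relative_changes(before, after).items():
        print(f"{name:30s} {change:+.2%}")
```

Saved under a name of your choosing, it would be run with before.json and after.json as its two arguments. Metrics that appear in only one file are ignored.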
>> 
>> In addition to being able to collect runtime metrics, ghc_perf is also
>> able to collect performance counters (on Linux only) using perf. For
>> instance,
>> 
>>    $ ghc_perf.py -o program.json \
>>        -e instructions,cycles,cache-misses ./Program
>> 
>> will produce program.json containing not only RTS statistics but also
>> event counts from the perf instructions, cycles, and cache-misses
>> events. Alternatively, passing simply `ghc_perf.py --perf` enables a
>> reasonable default set of events (namely instructions, cycles,
>> cache-misses, branches, and branch-misses).
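
For readers who want to collect such counts by hand, one option is `perf stat -x, -e instructions,cycles -o perf.csv ./Program`, which writes comma-separated counter lines to perf.csv. Whether ghc_perf.py invokes perf this way is not stated here. The sketch below parses that CSV output, assuming the counter value is in the first field and the event name in the third; `parse_perf_stat_csv` is a hypothetical name.

```python
import csv
import io


def parse_perf_stat_csv(text):
    """Map event name -> count from `perf stat -x,` style CSV text."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, comments, and rows too short to hold an event.
        if len(row) < 3 or row[0].lstrip().startswith("#"):
            continue
        raw_value, event = row[0].strip(), row[2].strip()
        try:
            value = int(raw_value)
        except ValueError:
            try:
                value = float(raw_value)
            except ValueError:
                # Markers such as "<not supported>" carry no count.
                continue
        counts[event] = value
    return counts
```

Events that perf reports as unsupported or not counted are dropped rather than recorded as zero, so a missing key means "no measurement" and not "no events".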
>> 
>> Finally, ghc_perf can also handle repeated runs. For instance,
>> 
>>    $ ghc_perf.py -o program.json -r 5 --summarize \
>>         -e instructions,cycles,cache-misses ./Program
>> 
>> will run Program 5 times, emit all of the collected samples to
>> program.json, and produce a (very basic) statistical summary of what it
>> collected on stdout.
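
To give a sense of what a very basic summary over repeated runs could look like, here is a minimal sketch. It is not the code behind `--summarize`; it assumes each run yields a dict of metric name to number, and `summarize` is a hypothetical name.

```python
import statistics


def summarize(samples):
    """Return {metric: (mean, sample standard deviation)} across runs.

    `samples` is a list of dicts, one per run, mapping metric name -> number.
    """
    summary = {}
    names = sorted({name for sample in samples for name in sample})
    for name in names:
        values = [sample[name] for sample in samples if name in sample]
        mean = statistics.mean(values)
        # The sample standard deviation needs at least two data points.
        stdev = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean, stdev)
    return summary
```

With only a handful of runs the standard deviation is a crude indicator of noise, so it is best read as a sanity check rather than a confidence interval.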
>> 
>> Note that there are a few possible TODOs that I've been considering:
>> 
>> * I chose JSON as the output format to accommodate structured data (e.g.
>>   to capture experimental parameters in a structured way). However, in
>>   practice this choice has led to significantly more inconvenience
>>   than I would like, especially given that so far I've only used the
>>   format to capture basic key/value pairs. Perhaps reverting to CSV
>>   would be preferable.
>> 
>> * It might be nice to also add support for cachegrind.
>> 
>> Anyways, I hope that others find this as useful as I have.
>> 
>> Cheers,
>> 
>> - Ben
>> 
>> 
>> [1] https://gitlab.haskell.org/bgamari/ghc-utils/blob/master/ghc_perf.py
>> [2] https://gitlab.haskell.org/ghc/nofib/merge_requests/24
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
