Diagnosing excessive memory usage / crash when compiling - 9.8.1

Simon Peyton Jones simon.peytonjones at gmail.com
Fri Feb 16 08:47:55 UTC 2024


Sorry about that!

Maybe you have a giant data type with deriving(Generic)? GHC tends to
behave badly on those. And yes, you seem to have a lot of type-family
stuff going on! Usually we see coercion sizes around 10k; you have nearly
400 million.
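
For example, the sort of shape I mean (purely illustrative, certainly not
your actual code) is a wide record whose field types go through a type
family, with a derived Generic instance on top:

```haskell
{-# LANGUAGE DataKinds, DeriveGeneric, TypeFamilies #-}
-- Hypothetical module, just to show the shape that tends to blow up.
module Example where

import GHC.Generics (Generic)

-- A closed type family computing field types from a tag.
type family Field tag where
  Field "callsign" = String
  Field "altitude" = Int

-- Imagine a few hundred fields like these: the derived Generic instance,
-- plus the coercions needed to reduce each type-family application, is
-- where enormous coercion sizes tend to come from.
data Message = Message
  { msgCallsign :: Field "callsign"
  , msgAltitude :: Field "altitude"
  } deriving (Generic)
```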

Quite a lot of improvements have happened in this area, which may (or may
not) help. Once you have whittled the reproducer down a bit, perhaps it'd
be possible to test with HEAD?

This was better with ... 9.6?  9.4?

Simon

On Fri, 16 Feb 2024 at 01:36, Justin Bailey <jgbailey at gmail.com> wrote:

> Well, after running with these flags, one of the `.dump-simpl` files
> is 26 GB! That's also the module it seems to hang on, so pretty sure
> something is going wrong there!
>
> I was seeing output indicating GHC had allocated 146 GB during some of
> the passes:
>
> ```
>
> *** Simplifier [xxx.AirTrafficControl.Types.ATCMessage]:
> Result size of Simplifier iteration=1
>   = {terms: 9,134,
>      types: 49,937,
>      coercions: 388,802,399,
>      joins: 53/289}
> Result size of Simplifier iteration=2
>   = {terms: 8,368,
>      types: 46,864,
>      coercions: 176,356,474,
>      joins: 25/200}
> Result size of Simplifier
>   = {terms: 8,363,
>      types: 46,848,
>      coercions: 176,356,474,
>      joins: 25/200}
> !!! Simplifier [xxx.AirTrafficControl.Types.ATCMessage]: finished in
> 294595.62 milliseconds, allocated 146497.087 megabytes
> ```
>
> So anyway, I'll continue whittling this down. This module does use a lot
> of higher-kinded types and fancy stuff (roughly the shape sketched below).
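>
> A heavily simplified, hypothetical sketch of what I mean (not the real
> type; names made up):
>
> ```haskell
> {-# LANGUAGE DeriveGeneric, TypeFamilies #-}
> -- Hypothetical "higher-kinded data" style module, for illustration only.
> module Sketch where
>
> import Data.Functor.Identity (Identity)
> import GHC.Generics (Generic)
>
> -- Collapse the wrapper when it is Identity, otherwise keep it.
> type family HKD f a where
>   HKD Identity a = a
>   HKD f        a = f a
>
> -- The real type has far more fields and nesting; every field's type is
> -- a type-family application the simplifier must push coercions through.
> data SomeMessage f = SomeMessage
>   { callsign :: HKD f String
>   , altitude :: HKD f Int
>   } deriving (Generic)
> ```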
>
> On Thu, Feb 15, 2024 at 3:56 PM Simon Peyton Jones
> <simon.peytonjones at gmail.com> wrote:
> >
> > Using `-dshow-passes` is very helpful too. It shows the program size
> after each pass of the compiler.
> >
> > Simon
> >
> > On Thu, 15 Feb 2024 at 19:36, Teofil Camarasu <teofilcamarasu at gmail.com>
> wrote:
> >>
> >> Hi Justin,
> >>
> >> From your description, it sounds to me like there's something in your
> source code that's causing the optimiser to generate too much code, which
> then crashes the compiler through memory exhaustion (though I might be
> wrong about this).
> >> In the past, when I've run into similar things, I've followed the rough
> process below to help find a minimal reproducer of the issue.
> >>
> >> - Pass `-ddump-simpl -ddump-timings -ddump-to-file` to GHC (see the
> docs on these flags at
> https://downloads.haskell.org/ghc/latest/docs/users_guide/debugging.html,
> and the sketch after this list).
> >> These will write some extra debugging information into either your
> `dist-newstyle` or `.stack-work` directory, depending on whether you use
> cabal or stack.
> >> For each source file they will create a `.dump-simpl` file containing
> the compiler's intermediate (Core) output, and a `.dump-timings` file
> showing how long each phase of compilation took.
> >>
> >> - The first step is to home in on the problematic module or modules.
> Maybe you already have a good idea, from where in the build the compiler
> crashes.
> >> If not, you can use the `.dump-timings` files, and/or a tool that
> summarises them such as https://github.com/codedownio/time-ghc-modules,
> to get a sense of where the problem lies.
> >>
> >> - Once you've found your worst module, the next step is to determine
> what about that module is causing the issue.
> >> Often you can simply look for which top-level identifiers in its
> `.dump-simpl` file are big; that gives a good idea of which part of your
> source code is to blame.
> >> Then I tend to delete everything that is irrelevant and check again.
> Incrementally you get something smaller and smaller, and in time you end
> up with something small enough to write up as a ticket.
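> >>
> >> As a concrete sketch of the first step: if you only want dumps for one
> >> suspect module, the same flags can also be set per-module with an
> >> OPTIONS_GHC pragma (the module name below is made up):
> >>
> >> ```haskell
> >> -- Hypothetical suspect module: dump Core and per-phase timings to
> >> -- files rather than printing them to the terminal.
> >> {-# OPTIONS_GHC -ddump-simpl -ddump-timings -ddump-to-file #-}
> >> module Suspect.Types where
> >>
> >> -- ... the definitions under investigation ...
> >> ```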
> >>
> >> I hope that helps. I've found this process to work quite well for
> hunting down issues where GHC's optimiser goes wrong, but it is a bit
> labour-intensive.
> >>
> >> One last thing: you mention that you are on an M2. If it's easily
> doable for you, try to reproduce on x86_64, just to make sure it's not
> some bug specific to the M2.
> >>
> >> Cheers,
> >> Teo
> >>
> >> On Thu, Feb 15, 2024 at 7:08 PM Justin Bailey <jgbailey at gmail.com>
> wrote:
> >>>
> >>> Hi!
> >>>
> >>> I'm trying to upgrade our (large) codebase to use 9.8.1. (I'm on an
> M2).
> >>>
> >>> When building with -O1, memory use of the GHC process climbs until it
> >>> reaches the limit of my machine (64G) and then GHC crashes with a
> >>> segfault.
> >>>
> >>> With -O0, that does not happen.
> >>>
> >>> How would I go about diagnosing what's happening? Using RTS flags to
> >>> limit the heap to 32G produced the same behavior, just faster.
> >>>
> >>> Strangely, `-v5` does not produce any more output in the console
> >>> (passed via cabal's --ghc-options). Maybe I'm doing it wrong?
> >>>
> >>> Pointers to existing issues or documentation welcome! Thank you!
> >>>
> >>> Justin
>