Diagnosing excessive memory usage / crash when compiling - 9.8.1

Justin Bailey jgbailey at gmail.com
Thu Feb 15 21:31:13 UTC 2024


I did notice this in CI (which runs on Linux x86_64 machines), so at
least it is not limited to the M2.

Great tips! Much appreciated!

On Thu, Feb 15, 2024 at 11:36 AM Teofil Camarasu
<teofilcamarasu at gmail.com> wrote:
>
> Hi Justin,
>
> From your description, it sounds to me like there's something in your source code that's causing the optimiser to generate too much code, which then causes the crash because of memory exhaustion (though I might be wrong about this).
> In the past, when I've run into similar things, I've followed the rough process below to find a minimal reproducer of the issue.
>
> - pass `-ddump-simpl -ddump-timings -ddump-to-file` to GHC. (See here for docs on these flags: https://downloads.haskell.org/ghc/latest/docs/users_guide/debugging.html)
> These will write some extra debugging information to your `dist-newstyle` or `.stack-work` directory, depending on whether you use cabal or stack.
> For each source file they will create a `.dump-simpl` file containing the compiler's intermediate Core output, and a `.dump-timings` file showing how long each phase of compilation took.
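>
> For example, passing the flags through cabal might look something like this (where "mypackage" is just a placeholder for your own target):
>
>   # "mypackage" is a placeholder; substitute your package or component
>   cabal build mypackage --ghc-options="-ddump-simpl -ddump-timings -ddump-to-file"
>
> or with stack:
>
>   stack build --ghc-options "-ddump-simpl -ddump-timings -ddump-to-file"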
>
> - The first step is to home in on the problematic module or modules. Maybe you already have a good idea of where in your build the compiler crashes.
> But if not, you can use the `.dump-timings` files and/or a tool that summarises them, like https://github.com/codedownio/time-ghc-modules, to get a sense of where the problem lies.
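>
> For instance, to eyeball the per-phase timings directly without any extra tooling (assuming a cabal project, and that the "time=" fields are milliseconds, as I remember):
>
>   # gather all dump-timings output and show the slowest phases last;
>   # each line names the phase and module, e.g. "CodeGen [Main]: ..."
>   find dist-newstyle -name '*.dump-timings' -exec cat {} + \
>     | sort -t= -k3 -n | tail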
>
> - Once you've found your worst module, the next step is to determine what about that module is causing the issue.
> I find that often you can just look for which top-level identifiers in your `.dump-simpl` file are big. This will give you a good idea of which part of your source code is to blame.
> Then I tend to delete everything that is irrelevant and check again. Incrementally you get something smaller and smaller, and in time you tend to end up with something small enough to write up as a ticket.
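>
> As a concrete sketch (the filename is illustrative): each top-level binding in the Core dump is preceded by a size comment, so you can sort those to find the biggest offenders:
>
>   # each binding in the dump is preceded by a comment like:
>   #   -- RHS size: {terms: 1234, types: 567, coercions: 0, joins: 0/0}
>   # sort by term count (the 4th ':'-separated field), largest last,
>   # keeping grep's line numbers so you can jump straight to them
>   grep -n 'RHS size' MyModule.dump-simpl | sort -t: -k4 -n | tail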
>
> I hope that helps. I've found this process to work quite well for hunting down issues where GHC's optimiser goes wrong, but it is a rather labour-intensive process.
>
> One last thing: you mention that you are on an M2. If it's easily doable for you, try to reproduce on x86_64 just to make sure it's not some bug specific to the M2.
>
> Cheers,
> Teo
>
> On Thu, Feb 15, 2024 at 7:08 PM Justin Bailey <jgbailey at gmail.com> wrote:
>>
>> Hi!
>>
>> I'm trying to upgrade our (large) codebase to use 9.8.1. (I'm on an M2).
>>
>> When building with -O1, memory use of the GHC process climbs until it
>> reaches the limit of my machine (64G), and then GHC crashes with a
>> segfault.
>>
>> With -O0, that does not happen.
>>
>> How would I go about diagnosing what's happening? Using RTS flags to
>> limit the heap to 32G produced the same behavior, just faster.
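>>
>> (For reference, I limited the heap by passing RTS options through to
>> GHC, something like:
>>
>>   # -M caps GHC's own max heap; adjust the size to taste
>>   cabal build --ghc-options="+RTS -M32G -RTS"
>>
>> though I may have gotten the incantation slightly wrong.)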
>>
>> Strangely, `-v5` does not produce any more output in the console
>> (passed via cabal's --ghc-options). Maybe I'm doing it wrong?
>>
>> Pointers to existing issues or documentation welcome! Thank you!
>>
>> Justin
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

