GHC memory usage when typechecking from source vs. loading ModIfaces
Erdi, Gergo
Gergo.Erdi at sc.com
Thu Jan 23 03:19:25 UTC 2025
Hi Matt & Zubin,
Thanks for the help on this so far!
I managed to hack the linked MR onto 9.8.4 (see https://gitlab.haskell.org/cactus/ghc/-/tree/cactus/backport-13675), and it basically seems to do what it says on the tin on a small example (see the attached heap profiles for typechecking 4313 modules). However, I am unsure how to actually use it.
My understanding of the improvement is that, since there is now only a single HPT [*], I should be able to avoid unnecessary ballooning by doing two things:
* Evicting wholesale from the HPT those `HomeModInfo`s that are not going to be needed anymore, because I am done with all modules that would transitively depend on them. This of course only makes sense when typechecking a forest.
* Replacing the remaining `HomeModInfo`s with new ones that keep the same `ModIface` but whose `ModDetails` is replaced with a fresh one from `initModDetails`.
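To make the two ideas concrete, here is a minimal sketch on a toy model. Nothing in it is GHC API: `Iface`, `Details`, `HomeModInfo` and `HPT` are simplified stand-ins, `rehydrate` is a hypothetical pure placeholder for `initModDetails`, and the dependency lists are assumed to be transitively closed already. It only illustrates the bookkeeping (count pending dependents, evict at zero, swap details for fresh thunks), not how the real HPT is stored.

```haskell
import qualified Data.Map.Strict as M
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

type ModName = String

-- Toy stand-ins only; the real GHC types carry far more.
newtype Iface   = Iface   { ifaceDeps :: [ModName] }  -- assumed transitively closed
newtype Details = Details { detailsSize :: Int }

data HomeModInfo = HomeModInfo
  { hmIface   :: Iface
  , hmDetails :: Details   -- lazy field: stays a thunk until demanded
  }

type HPT = M.Map ModName HomeModInfo

-- Hypothetical pure placeholder for rebuilding details from an interface.
rehydrate :: Iface -> Details
rehydrate = Details . length . ifaceDeps

mkInfo :: [ModName] -> HomeModInfo
mkInfo ds = let i = Iface ds in HomeModInfo i (rehydrate i)

-- For each module, how many unfinished modules still depend on it.
pendingCounts :: M.Map ModName [ModName] -> M.Map ModName Int
pendingCounts deps = M.fromListWith (+) [ (d, 1) | ds <- M.elems deps, d <- ds ]

-- A module with dependencies `ds` has finished: decrement their counts
-- and report the ones that no remaining module needs.
finish :: [ModName] -> M.Map ModName Int -> (M.Map ModName Int, [ModName])
finish ds counts = (counts', [ d | d <- ds, M.lookup d counts' == Just 0 ])
  where counts' = foldr (M.adjust (subtract 1)) counts ds

-- Idea 1: drop entries nobody will ask for again.
evict :: [ModName] -> HPT -> HPT
evict ms hpt = foldr M.delete hpt ms

-- Idea 2: keep each interface, replace the details with a fresh thunk.
resetDetails :: HPT -> HPT
resetDetails = M.map (\hmi -> hmi { hmDetails = rehydrate (hmIface hmi) })

-- Apply both to a single shared table after a module is done.
stepHPT :: IORef HPT -> [ModName] -> IO ()
stepHPT ref evictable = modifyIORef' ref (resetDetails . evict evictable)

-- Tiny walk-through: B depends on A; C depends on A and B.
demo :: IO [ModName]
demo = do
  let deps = M.fromList [("A", []), ("B", ["A"]), ("C", ["A", "B"])]
  ref <- newIORef (M.map mkInfo deps)
  let (c1, e1) = finish ["A"] (pendingCounts deps)  -- B is done
      (_,  e2) = finish ["A", "B"] c1               -- C is done
  stepHPT ref e1
  stepHPT ref e2
  M.keys <$> readIORef ref
```

In this sketch `resetDetails` runs after every module, which corresponds to the most aggressive setting and is the one that would trade the most rehydration work for space.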
The attached `-after` profile shows typechecking with both of these ideas implemented. The first one doesn’t seem to help much on its own, but it’s tricky to evaluate that because it is very dependent on the shape of the workload (how tree-y it is). But the second one shows some serious promise in curtailing memory usage. However, it is also very slow – even on this small example, you can see its effect. On my full 35k+ module example, it more than doubles the runtime.
What would be a good policy on when to replace ModDetails with thunks to avoid both the space leak and excessive rehydration churn?
Also, perhaps unrelated, perhaps not – what’s with all those lists?!
Thanks,
Gergo
[*] BTW is it normal that I am still seeing several (17 in a small test case involving a couple hundred modules) HPT constructors in the heap? (I hacked it locally to be a datatype instead of a newtype just so I can see it in the heap). I expected to see only one.
From: Matthew Pickering <matthewtpickering at gmail.com>
Sent: Tuesday, January 21, 2025 8:24 PM
To: ÉRDI Gergő <gergo at erdi.hu>
Cc: Zubin Duggal <zubin at well-typed.com>; Erdi, Gergo <Gergo.Erdi at sc.com>; Montelatici, Raphael Laurent <Raphael.Montelatici at sc.com>; GHC Devs <ghc-devs at haskell.org>
Subject: [External] Re: GHC memory usage when typechecking from source vs. loading ModIfaces
Thanks, Gergo. I think that unless we have access to your code base or a realistic example, the before-vs-after snapshot will not be that helpful. It's known that `ModDetails` will leak space like this.
Let us know how it goes for you.
Cheers,
Matt
On Fri, Jan 17, 2025 at 11:30 AM ÉRDI Gergő <gergo at erdi.hu> wrote:
On Fri, 17 Jan 2025, Matthew Pickering wrote:
> 1. As Zubin points out we have recently been concerned with improving the memory usage
> of large module sessions (#25511, !13675, !13593)
>
> I imagine all these patches will greatly help the memory usage in your use case.
I'll try these out and report back.
> 2. You are absolutely right that ModDetails can get forced and is never reset.
>
> If you try !13675, it should be much more easily possible to reset the ModDetails by
> writing into the IORef which stores each home package.
Yes, that makes sense.
> 3. If you share your example or perhaps even a trace from ghc-debug then I will be
> happy to investigate further as it seems like a great test case for the work we have
> recently been doing.
Untangling just the parts that exercise the GHC API from all the other
in-house bits will be quite a lot of work. But if just a ghc-debug
snapshot of e.g. a small example from scratch vs. from existing ModIfaces
would be helpful (with e.g. the top HscEnv at the time of finishing all
typechecking as a saved closure), I can provide that no prob.
Thanks,
Gergo
----------------------------------------------------------------------
This email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please delete all copies and notify the sender immediately. You may wish to refer to the incorporation details of Standard Chartered PLC, Standard Chartered Bank and their subsidiaries together with Standard Chartered Bank’s Privacy Policy via our public website.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ghc-mu-core-to-exp-before.pdf
Type: application/pdf
Size: 432226 bytes
Desc: ghc-mu-core-to-exp-before.pdf
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20250123/48ac1cc7/attachment-0002.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ghc-mu-core-to-exp-after.pdf
Type: application/pdf
Size: 660335 bytes
Desc: ghc-mu-core-to-exp-after.pdf
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20250123/48ac1cc7/attachment-0003.pdf>