GHC memory usage when typechecking from source vs. loading ModIfaces
Matthew Pickering
matthewtpickering at gmail.com
Thu Jan 23 09:50:55 UTC 2025
That's good news.
I don't think the first idea will do very much as there are other
references to the final "HomeModInfo" not stored in the HPT.
Have you constructed a time profile to determine why the runtime is higher?
With the second approach you are certainly trading space usage for
repeating work.
If you actually do have a forest, then ideally you would replace the
ModDetails of a module as soon as it is certain that it will never be
used again.
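One possible way to decide when a module "will never be used again" is to count, for each module, the importers that are still live, and to release a module only once every importer has itself been released. The sketch below is a toy model under stated assumptions, not GHC API code: the names ModName, Deps, pendingImporters, release, finish and simulate are all hypothetical, the module graph is a plain Map, and "releasing" merely stands for dropping or resetting that module's ModDetails in the HPT. It also assumes modules are processed in dependency order.

```haskell
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Hypothetical toy types; these are NOT GHC API types.
type ModName = String

-- | Each module mapped to its direct imports.
type Deps = M.Map ModName [ModName]

-- | For every module, the number of direct importers not yet released.
pendingImporters :: Deps -> M.Map ModName Int
pendingImporters deps =
  M.unionWith (+)
    (M.map (const 0) deps)
    (M.fromListWith (+) [ (d, 1) | ds <- M.elems deps, d <- ds ])

-- | Release a module, then cascade to any import whose last live
-- importer this was. Returns the released modules and the new counts.
release :: Deps -> ModName -> M.Map ModName Int -> ([ModName], M.Map ModName Int)
release deps m counts = foldl' step ([m], counts) (M.findWithDefault [] m deps)
  where
    step (freed, cs) d =
      let cs' = M.adjust (subtract 1) d cs
      in if M.findWithDefault 0 d cs' == 0
           then let (freed', cs'') = release deps d cs'
                in (freed ++ freed', cs'')
           else (freed, cs')

-- | Called when a module finishes typechecking. It is released right
-- away only if nothing that imports it is still live.
finish :: Deps -> ModName -> M.Map ModName Int -> ([ModName], M.Map ModName Int)
finish deps m counts
  | M.findWithDefault 0 m counts > 0 = ([], counts)
  | otherwise                        = release deps m counts

-- | Walk the modules in dependency order, recording which modules
-- become releasable after each one finishes.
simulate :: Deps -> [ModName] -> [(ModName, [ModName])]
simulate deps = go (pendingImporters deps)
  where
    go _  []     = []
    go cs (m:ms) = let (freed, cs') = finish deps m cs
                   in (m, freed) : go cs' ms

-- | Small example: B and C import A, D imports B.
exampleDeps :: Deps
exampleDeps = M.fromList [("A", []), ("B", ["A"]), ("C", ["A"]), ("D", ["B"])]

-- simulate exampleDeps ["A","B","C","D"]
--   == [("A",[]),("B",[]),("C",["C"]),("D",["D","B","A"])]
```

In this model nothing is released until a module with no importers finishes, so a workload that is one big tree hanging off a single root frees nothing until the very end, while a forest of independent roots frees each subtree as its root completes. Because a module is only released after all of its importers have been, no released module would need rehydrating later.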
You are likely also missing other patches important for memory usage.
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12582
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12347
I can't comment on the 17 HPT constructors; what do the retainer stacks
look like in ghc-debug?
PS. Please use eventlog2html so the profiles are readable! You can use it
on .hp profiles.
Cheers,
Matt
On Thu, Jan 23, 2025 at 3:19 AM Erdi, Gergo <Gergo.Erdi at sc.com> wrote:
> Hi Matt & Zubin,
>
>
>
> Thanks for the help on this so far!
>
>
>
> I managed to hack the linked MR onto 9.8.4 (see
> https://gitlab.haskell.org/cactus/ghc/-/tree/cactus/backport-13675) and
> basically it seems to do what it says on the tin on a small example (see
> attached heap profile examples for typechecking 4313 modules), but I am
> unsure how to actually use it.
>
>
>
> So my understanding of the improvement here is that since now there is
> only one single HPT [*], I should be able to avoid unnecessary ballooning
> by doing two things:
>
>
>
> - Evicting `HomeModInfo`s wholesale from the HPT that are not going to
> be needed anymore, because I am done with all modules that would
> transitively depend on them. This of course only makes sense when
> typechecking a forest.
> - Replacing remaining `HomeModInfo`s with new ones that contain the
> same ModIface but with the ModDetails replaced by a fresh one from
> initModDetails.
>
>
>
> The attached `-after` profile shows typechecking with both of these ideas
> implemented. The first one doesn’t seem to help much on its own, but it’s
> tricky to evaluate that because it is very dependent on the shape of the
> workload (how tree-y it is). But the second one shows some serious promise
> in curtailing memory usage. However, it is also very slow – even on this
> small example, you can see its effect. On my full 35k+ module example, it
> more than doubles the runtime.
>
>
>
> What would be a good policy on when to replace ModDetails with thunks to
> avoid both the space leak and excessive rehydration churn?
>
>
>
> Also, perhaps unrelated, perhaps not – what’s with all those lists?!
>
>
>
> Thanks,
>
> Gergo
>
>
>
> [*] BTW is it normal that I am still seeing several (17 in a small test
> case involving a couple hundred modules) HPT constructors in the heap? (I
> hacked it locally to be a datatype instead of a newtype just so I can see
> it in the heap). I expected to see only one.
>
>
>
> *From:* Matthew Pickering <matthewtpickering at gmail.com>
> *Sent:* Tuesday, January 21, 2025 8:24 PM
> *To:* ÉRDI Gergő <gergo at erdi.hu>
> *Cc:* Zubin Duggal <zubin at well-typed.com>; Erdi, Gergo <Gergo.Erdi at sc.com>;
> Montelatici, Raphael Laurent <Raphael.Montelatici at sc.com>; GHC Devs <
> ghc-devs at haskell.org>
> *Subject:* [External] Re: GHC memory usage when typechecking from source
> vs. loading ModIfaces
>
>
>
> Thanks Gergo, I think that unless we have access to your code base or a
> realistic example then the before vs after snapshot will not be so helpful.
> It's known that `ModDetails` will leak space like this.
>
>
>
> Let us know how it goes for you.
>
>
>
> Cheers,
>
>
>
> Matt
>
> On Fri, Jan 17, 2025 at 11:30 AM ÉRDI Gergő <gergo at erdi.hu> wrote:
>
> On Fri, 17 Jan 2025, Matthew Pickering wrote:
>
> > 1. As Zubin points out we have recently been concerned with improving
> > the memory usage of large module sessions (#25511, !13675, !13593)
> >
> > I imagine all these patches will greatly help the memory usage in your
> > use case.
>
> I'll try these out and report back.
>
> > 2. You are absolutely right that ModDetails can get forced and is never
> > reset.
> >
> > If you try !13675, it should be much more easily possible to reset the
> > ModDetails by writing into the IORef which stores each home package.
>
> Yes, that makes sense.
>
> > 3. If you share your example or perhaps even a trace from ghc-debug then
> > I will be happy to investigate further as it seems like a great test
> > case for the work we have recently been doing.
>
> Untangling just the parts that exercise the GHC API from all the other
> in-house bits will be quite a lot of work. But if just a ghc-debug
> snapshot of e.g. a small example from scratch vs. from existing ModIfaces
> would be helpful (with e.g. the top HscEnv at the time of finishing all
> typechecking as a saved closure), I can provide that no prob.
>
> Thanks,
> Gergo
>
> ------------------------------
> This email and any attachments are confidential and may also be
> privileged. If you are not the intended recipient, please delete all copies
> and notify the sender immediately. You may wish to refer to the
> incorporation details of Standard Chartered PLC, Standard Chartered Bank
> and their subsidiaries together with Standard Chartered Bank’s Privacy
> Policy via our public website.
> ------------------------------