[External] Re: GHC memory usage when typechecking from source vs. loading ModIfaces
Erdi, Gergo
Gergo.Erdi at sc.com
Wed Feb 5 07:02:18 UTC 2025
PUBLIC
Hi Matt,
Thanks for your help so far!
One vacation later, I am back looking at this. Unfortunately, the latest results I am seeing only confuse me more.
I have this small test load of a 4313-module forest that I am typechecking. The baseline resource usage, i.e. before any tricks with rehydrating the ModDetails in the HPT, is 1 GB maximum residency, 113s MUT time and 87s GC time. My aim is to reduce the maximum residency with as little disruption as possible to the total runtime.
My first test was the completely brute-force approach of rehydrating every single ModDetails in the HPT after typechecking every single module. Of course, this has catastrophic runtime performance, since I end up re-re-re-re-rehydrating ModDetails a total of 8,443,380 times (not counting the initial rehydration of each module just after typechecking, when it is put in the HPT). So I get 290s MUT time and 252s GC time. But the max residency goes down to 490 MB, showing that the idea, at least in principle, has legs.
So far so good. But then my problem starts -- how do I get this max residency improvement with acceptable runtime? My idea was that typechecking a module should only unfold the ModDetails of its transitive dependencies, so it should be enough to rehydrate only those ModDetails. Since this still results in 3,603,206 rehydrations, I shouldn't be too optimistic about its performance, but it should still cut the overhead roughly in half. When I try this out, I get MUT time of 257s, GC time of 186s. However, the max residency is 883 MB! But how is it possible that max residency is not the same 490 MB?!?! Does that mean typechecking a module can unfold parts of ModDetails that are not transitive dependencies of it? How would I track this down?
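To make the two rehydration counts concrete, here is a minimal sketch on a toy dependency graph. It does not use the GHC API and does not try to reproduce the figures above; `Graph`, `transDeps`, `bruteForceCount` and `transitiveCount` are hypothetical names, and the counting convention (the initial rehydration of each module is excluded) is an assumption.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Toy dependency graph: module name -> direct imports (assumed acyclic).
type Graph = M.Map String [String]

-- Transitive dependencies of a module, excluding the module itself.
transDeps :: Graph -> String -> S.Set String
transDeps g m = go S.empty (M.findWithDefault [] m g)
  where
    go seen [] = seen
    go seen (x : xs)
      | x `S.member` seen = go seen xs
      | otherwise         = go (S.insert x seen) (M.findWithDefault [] x g ++ xs)

-- Brute force: at each step, rehydrate every module loaded before it,
-- so the k-th module (0-based) costs k rehydrations.
bruteForceCount :: [String] -> Int
bruteForceCount order = sum [0 .. length order - 1]

-- Targeted: at each step, rehydrate only the transitive dependencies
-- of the module about to be typechecked.
transitiveCount :: Graph -> [String] -> Int
transitiveCount g order = sum [S.size (transDeps g m) | m <- order]
```

For the graph `a`, `b -> a`, `c -> a`, `d -> b` processed in the order `a, b, c, d`, the brute-force count is 6 and the targeted count is 4. The first grows quadratically with the number of modules regardless of shape, while the second depends on how deep the dependency chains are.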
For reference, here is how I do the rehydration of the HPT, let me know if it seems fishy:
```
recreateModDetailsInHpt :: HscEnv -> [ModuleName] -> IO ()
recreateModDetailsInHpt hsc_env mods = do
    hpt <- readIORef hptr
    fixIO \hpt' -> do
        writeIORef hptr hpt'
        traverse recreate_hmi hpt
    pure ()
  where
    hpt@HPT{ table = hptr } = hsc_HPT hsc_env

    recreate_hmi hmi@(HomeModInfo iface _details linkable)
      | moduleName mod `elem` mods
      = do
          !fresh_details <- genModDetails hsc_env iface
          pure $ HomeModInfo iface fresh_details linkable
      | otherwise
      = pure hmi
      where
        mod = mi_module iface
```
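The knot-tying in the function above can be illustrated with a toy model that does not touch the GHC API. In this sketch `Iface`, `Details`, `Info`, `freshDetails` and `rehydrate` are all hypothetical stand-ins (loosely for `ModIface`, `ModDetails`, `HomeModInfo`, `genModDetails` and `recreateModDetailsInHpt`), and the table is a plain `IORef` holding a `Map`. It only shows how `fixIO` lets fresh details refer lazily to the table that is still being built; it says nothing about what GHC itself retains.

```haskell
import Data.IORef
import qualified Data.Map.Lazy as M
import System.IO (fixIO)

-- Hypothetical stand-ins for ModIface, ModDetails and HomeModInfo.
data Iface   = Iface   { ifName :: String, ifDeps :: [String] }
data Details = Details { detSummary :: String }   -- field is a lazy thunk
data Info    = Info    { infoIface :: Iface, infoDetails :: Details }

-- Fresh details look up their dependencies in the *final* table, lazily:
-- nothing is read from 'final' until the summary is demanded.
freshDetails :: M.Map String Info -> Iface -> Details
freshDetails final iface =
    Details (ifName iface ++ concatMap depSummary (ifDeps iface))
  where
    depSummary d = "<" ++ detSummary (infoDetails (final M.! d)) ++ ">"

-- Replace the details of the named entries with fresh, unforced ones.
rehydrate :: IORef (M.Map String Info) -> [String] -> IO ()
rehydrate ref names = do
    old <- readIORef ref
    _ <- fixIO $ \new -> do
        writeIORef ref new               -- publish the not-yet-built table
        pure (M.map (recreate new) old)  -- lazy in the values
    pure ()
  where
    recreate new info
      | ifName iface `elem` names = Info iface (freshDetails new iface)
      | otherwise                 = info
      where
        iface = infoIface info
```

With entries `a` (no dependencies) and `b` (depending on `a`), both starting with the summary `"stale"`, rehydrating only `b` gives it the summary `"b<stale>"`, because it still sees the old details of `a`. Rehydrating both gives `"b<a>"`. Entries that are not rehydrated keep whatever they had already forced, which is one place a residency difference between the two strategies could come from.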
In summary, my questions going forward are:
* How come rehydrating transitive dependencies doesn't help as much for max residency as rehydrating all already-loaded modules?
* What exactly does GHC itself do to use this new mutable HPT feature to good effect? I'm sure it doesn't suffer from the above-described quadratic slowdown.
Thanks for the tip on the other two memory usage improvement MRs -- I haven't had time yet to backport them. !12582 in particular seems like it will need quite a bit of work to be applied on 9.8.
Unfortunately, I couldn't get eventlog2html to work -- if I pass an .hp file with the `-p` parameter, I get an HTML file that claims "This eventlog was generated without heap profiling.".
Thanks,
Gergo
From: Matthew Pickering <matthewtpickering at gmail.com>
Sent: Thursday, January 23, 2025 5:51 PM
To: Erdi, Gergo <Gergo.Erdi at sc.com>
Cc: ÉRDI Gergő <gergo at erdi.hu>; Zubin Duggal <zubin at well-typed.com>; Montelatici, Raphael Laurent <Raphael.Montelatici at sc.com>; GHC Devs <ghc-devs at haskell.org>
Subject: [External] Re: GHC memory usage when typechecking from source vs. loading ModIfaces
That's good news.
I don't think the first idea will do very much as there are other references to the final "HomeModInfo" not stored in the HPT.
Have you constructed a time profile to determine why the runtime is higher? With the second approach you are certainly trading space usage for repeating work.
If you actually do have a forest, then ideally you would replace the ModDetails once it will never be used again.
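One way to read "never be used again" is as a last-use computation over the typechecking order. The sketch below is a toy model, not GHC code: `Graph`, `reachable`, `lastUse` and `releasableAfter` are hypothetical names, and it assumes that typechecking a module only needs the ModDetails of its transitive dependencies, which is exactly the assumption questioned elsewhere in this thread.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Toy dependency graph: module name -> direct imports (assumed acyclic).
type Graph = M.Map String [String]

-- Everything reachable from a module through imports, excluding itself.
reachable :: Graph -> String -> S.Set String
reachable g m = go S.empty (M.findWithDefault [] m g)
  where
    go seen [] = seen
    go seen (x : xs)
      | x `S.member` seen = go seen xs
      | otherwise         = go (S.insert x seen) (M.findWithDefault [] x g ++ xs)

-- For each module, the index (in the typechecking order) of the last
-- module that needs it, counting the module itself.
lastUse :: Graph -> [String] -> M.Map String Int
lastUse g order = M.fromListWith max
  [ (d, i) | (i, m) <- zip [0 ..] order, d <- m : S.toList (reachable g m) ]

-- After typechecking each module, which entries could be reset or evicted.
releasableAfter :: Graph -> [String] -> [(String, [String])]
releasableAfter g order =
  [ (m, [d | (d, j) <- M.toList lu, j == i]) | (i, m) <- zip [0 ..] order ]
  where
    lu = lastUse g order
```

For a forest of two independent trees, `b -> a` and `y -> x`, typechecked in the order `a, b, x, y`, this reports that `a` and `b` can be released right after `b`, and `x` and `y` right after `y`. Each entry is touched once, so the cost is linear in the number of modules rather than quadratic.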
You are likely also missing other patches important for memory usage.
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12582
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12347
I can't comment on the 17 HPT constructors. What do the retainer stacks look like in ghc-debug?
PS. Please use eventlog2html so the profiles are readable! You can use it on .hp profiles.
Cheers,
Matt
On Thu, Jan 23, 2025 at 3:19 AM Erdi, Gergo <Gergo.Erdi at sc.com> wrote:
PUBLIC
Hi Matt & Zubin,
Thanks for the help on this so far!
I managed to hack the linked MR onto 9.8.4 (see https://gitlab.haskell.org/cactus/ghc/-/tree/cactus/backport-13675) and basically it seems to do what it says on the tin on a small example (see attached heap profile examples for typechecking 4313 modules), but I am unsure how to actually use it.
So my understanding of the improvement here is that since now there is only one single HPT [*], I should be able to avoid unnecessary ballooning by doing two things:
• Evicting `HomeModInfo`s wholesale from the HPT that are not going to be needed anymore, because I am done with all modules that would transitively depend on them. This of course only makes sense when typechecking a forest.
• Replacing remaining `HomeModInfo`s with new ones that contain the same ModInterface but the ModDetails is replaced with a fresh one from initModDetails.
The attached `-after` profile shows typechecking with both of these ideas implemented. The first one doesn’t seem to help much on its own, but it’s tricky to evaluate that because it is very dependent on the shape of the workload (how tree-y it is). But the second one shows some serious promise in curtailing memory usage. However, it is also very slow – even on this small example, you can see its effect. On my full 35k+ module example, it more than doubles the runtime.
What would be a good policy on when to replace ModDetails with thunks to avoid both the space leak and excessive rehydration churn?
Also, perhaps unrelated, perhaps not – what’s with all those lists?!
Thanks,
Gergo
[*] BTW is it normal that I am still seeing several (17 in a small test case involving a couple hundred modules) HPT constructors in the heap? (I hacked it locally to be a datatype instead of a newtype just so I can see it in the heap). I expected to see only one.
From: Matthew Pickering <matthewtpickering at gmail.com>
Sent: Tuesday, January 21, 2025 8:24 PM
To: ÉRDI Gergő <gergo at erdi.hu>
Cc: Zubin Duggal <zubin at well-typed.com>; Erdi, Gergo <Gergo.Erdi at sc.com>; Montelatici, Raphael Laurent <Raphael.Montelatici at sc.com>; GHC Devs <ghc-devs at haskell.org>
Subject: [External] Re: GHC memory usage when typechecking from source vs. loading ModIfaces
Thanks Gergo. I think that unless we have access to your code base or a realistic example, the before vs after snapshot will not be so helpful. It's known that `ModDetails` will leak space like this.
Let us know how it goes for you.
Cheers,
Matt
On Fri, Jan 17, 2025 at 11:30 AM ÉRDI Gergő <gergo at erdi.hu> wrote:
On Fri, 17 Jan 2025, Matthew Pickering wrote:
> 1. As Zubin points out we have recently been concerned with improving the memory usage
> of large module sessions (#25511, !13675, !13593)
>
> I imagine all these patches will greatly help the memory usage in your use case.
I'll try these out and report back.
> 2. You are absolutely right that ModDetails can get forced and is never reset.
>
> If you try !13675, it should be much more easily possible to reset the ModDetails by
> writing into the IORef which stores each home package.
Yes, that makes sense.
> 3. If you share your example or perhaps even a trace from ghc-debug then I will be
> happy to investigate further as it seems like a great test case for the work we have
> recently been doing.
Untangling just the parts that exercise the GHC API from all the other
in-house bits will be quite a lot of work. But if just a ghc-debug
snapshot of e.g. a small example from scratch vs. from existing ModIfaces
would be helpful (with e.g. the top HscEnv at the time of finishing all
typechecking as a saved closure), I can provide that no prob.
Thanks,
Gergo
________________________________________
This email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please delete all copies and notify the sender immediately. You may wish to refer to the incorporation details of Standard Chartered PLC, Standard Chartered Bank and their subsidiaries together with Standard Chartered Bank’s Privacy Policy via our public website.