[Haskell] Realistic max size of GHC heap

Karl Grapone kgrapone at gmail.com
Thu Sep 15 18:36:13 EDT 2005


On 9/15/05, Simon Marlow <simonmar at microsoft.com> wrote:
> 
> On 15 September 2005 01:04, Karl Grapone wrote:
> 
> > I'm considering using haskell for a system that could, potentially,
> > need 5GB-10GB of live data.
> > My intention is to use GHC on Opteron boxes which will give me a max
> > of 16GB-32GB of real ram. I gather that GHC is close to being ported
> > to amd64.
> >
> > Is it a realistic goal to operate with a heap size this large in GHC?
> > The great majority of this data will be very long tenured, so I'm
> > hoping that it'll be possible to configure the GC to not need too much
> > peak memory during the collection phase.
> 
> It'll be a good stress test for the GC, at least. 


Ouch! It scares me when people say that something will be a good stress 
test! :) 

> There are no reasons
> in principle why you can't have a heap this big, but major collections
> are going to take a long time. It sounds like in your case most of this
> data is effectively static, so in fact a major collection will be of
> little use.


You're correct, the system will gradually accrue permanent data. I foresee
there being two distinct generations: a fairly constant-sized short-lived
one, and a gradually increasing set of immortal allocations.
Response times will be critical, but hopefully the GC can be tweaked to a
sweet spot.

> Generational collection tries to deal with this in an adaptive way:
> long-lived data gets traversed less and less often as the program runs,
> as long as you have enough generations. But if the programmer really
> knows that a large chunk of data is going to be live for a long time, it
> would be interesting to see whether this information could be fed back
> in a way that the GC can take advantage of it. I'm sure there must be
> existing techniques for this sort of thing.

Well, I would naively say I only need two, maybe three, generations, as any
memory that has been around for more than a couple of hours is definitely
going to be around until system shutdown. But I'm completely new to Haskell
and I don't know if that holds for a lazy language. My hope was that
laziness would allow for better response times, but it certainly seems to
muddy the GC waters.
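
A minimal sketch of how the two-population workload described above could
be tried out at small scale, assuming a recent GHC with the containers
package. The names `immortal` and `churn` and all sizes are hypothetical,
chosen only for illustration; this is not a measurement of the real system.

```haskell
-- Sketch only: `immortal`, `churn`, and the sizes below are illustrative
-- assumptions, not part of the system discussed in this thread.
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Long-lived data: built once and kept alive for the rest of the run.
-- Data.Map.Strict evaluates values on insert, so no thunks pile up here.
immortal :: Int -> M.Map Int Int
immortal n = foldl' (\m k -> M.insert k (k * 2) m) M.empty [1 .. n]

-- Short-lived work: each "request" builds a temporary list of lookups
-- and discards it as soon as the sum has been computed.
churn :: M.Map Int Int -> Int -> Int
churn m i = sum [ M.findWithDefault 0 k m | k <- [i .. i + 99] ]

main :: IO ()
main = do
  let store = immortal 100000
  -- Force the map up front so it is fully built before the churn starts.
  print (M.size store)
  -- Run many short-lived requests against the long-lived store.
  print (foldl' (\acc i -> acc + churn store i) 0 [1 .. 1000])
```

Built with `ghc -O2 -rtsopts Main.hs`, it can be run as
`./Main +RTS -G3 -A64m -s -RTS`: `-G` sets the number of generations, `-A`
the size of the allocation area, and `-s` prints a GC summary on exit, so
the effect of two versus three generations can be compared directly.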

I'd like to recommend Haskell, but I just don't know enough to be
comfortable yet... more research, methinks.

Thanks for your responses.
Karl