garbage collection

Wed Apr 20 10:56:27 EDT 2005

Hello Simon,

Tuesday, April 19, 2005, 4:15:53 PM, you wrote:

>> 1) can you add disableGC and enableGC procedures? this can
>> significantly improve performance in some cases

SM> Sure.  I imagine you want to do this to avoid a major collection right
SM> at the peak of a residency spike.

SM> You probably only want to disable major collections though: it's safe
SM> for minor collections to happen.

no, in that particular case i have a very simple and fast algorithm
which allocates plenty of memory. minor GCs in such a situation are just
a waste of time. so i want to do:

disableGC
result <- eatMemory
enableGC

with the effect that all memory allocated in the 'eatMemory' procedure
will be garbage collected only after this procedure returns. currently
i have these stats:

  INIT  time    0.01s  (  0.00s elapsed)
  MUT   time    0.57s  (  0.60s elapsed)
  GC    time    1.41s  (  1.41s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.99s  (  2.01s elapsed)

  %GC time      70.8%  (70.1% elapsed)

  Alloc rate    171,249,142 bytes per MUT second

  Productivity  28.7% of total user, 28.4% of total elapsed

as you can see, it is very inefficient: GC takes 1.41s of the 1.99s total
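A minimal sketch of how the disableGC / eatMemory / enableGC sequence above could be wrapped so that collection is always switched back on, even if the procedure throws. GHC's RTS has no such switch, so `disableGC` and `enableGC` here are hypothetical stand-ins that only flip a flag, and `withoutGC` and the body of `eatMemory` are likewise invented for illustration.

```haskell
import Control.Exception (bracket_)
import Data.IORef (IORef, readIORef, newIORef, writeIORef)

-- HYPOTHETICAL: the RTS exposes no such primitives. These stand-ins
-- only record whether collection would currently be allowed.
disableGC :: IORef Bool -> IO ()
disableGC gcAllowed = writeIORef gcAllowed False

enableGC :: IORef Bool -> IO ()
enableGC gcAllowed = writeIORef gcAllowed True

-- Run an action with collection switched off, and switch it back on
-- afterwards even if the action throws an exception.
withoutGC :: IORef Bool -> IO a -> IO a
withoutGC gcAllowed = bracket_ (disableGC gcAllowed) (enableGC gcAllowed)

-- Stand-in for the allocation-heavy procedure from the message.
eatMemory :: Int -> IO Int
eatMemory n = return $! sum (map (* 2) [1 .. n])
```

With this, the three-line sequence becomes `result <- withoutGC flag (eatMemory n)`. Without a real switch, the nearest existing knob is a larger allocation area (`+RTS -A<size>`), which makes minor collections rarer rather than disabling them.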

>> 2) if, for example, a program's data before GC is 80 mb and after GC is
>> 60 mb then the program will occupy the whole 140 mb after GC and ALL
>> this space will be marked by the OS as used! if there's a memory shortage,
>> old program data can even be swapped to disk despite the fact that we
>> absolutely don't need it! that behaviour significantly enlarges the
>> memory needs of GHC-compiled programs
>> 
>> if this unused memory is returned to the OS or marked as unneeded
>> after GC then the problem will go away. preferably this must be done
>> during the GC, on each page whose contents have already been
>> compacted. in this case such a program will never use more than 80mb of
>> real memory (+ 1 page + memory for GC)

SM> I guess you're proposing using madvise(M_FREE) (or whatever the
SM> equivalent is on your favourite OS).  This would certainly be a good
SM> idea if the program is swapping, but might impose an overhead when
SM> running in memory.  I don't know, I haven't tried.

i don't see any reason why this would be slower. we will be "good citizens":
return memory that is not used at the moment and reallocate
memory when needed. the current implementation only allows memory usage to
grow, and that is not perfect either. imho it would be better to release
unneeded memory after a major GC and perform the next major GC after
allocating a fixed amount of memory or, say, after the used memory area has doubled
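The two trigger policies proposed above can be sketched as plain arithmetic. This is an illustration only: `Policy`, `nextMajorAt` and `releasable` are hypothetical names, not part of the RTS, and sizes are in bytes.

```haskell
-- Illustrative policies for scheduling the next major GC.
data Policy
  = FixedIncrement Integer  -- collect after allocating this many more bytes
  | GrowthFactor Rational   -- collect when usage reaches factor * live data

-- Heap size at which the next major GC should run, given the amount of
-- live data left by the previous one.
nextMajorAt :: Policy -> Integer -> Integer
nextMajorAt (FixedIncrement inc) live = live + inc
nextMajorAt (GrowthFactor f)     live = ceiling (fromInteger live * f)

-- Memory that could be handed back to the OS right after a major GC:
-- everything held beyond the surviving data.
releasable :: Integer -> Integer -> Integer
releasable heldBefore live = max 0 (heldBefore - live)
```

For the 80 mb / 60 mb example, `releasable` applied to the 140 mb held and the 60 mb live gives 80 mb to return, and the doubling policy (`GrowthFactor 2`) schedules the next major GC at 120 mb.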

SM> Also, you might be better off using +RTS -c or +RTS -M<size> to avoid
SM> swapping in the first place.

it seems to me that we can combine the benefits of the compacting and copying
algorithms. at least this combined algorithm will be better than
compacting in all cases and better than copying in almost all cases.
am i right, right or right? :)))

currently, the compacting algorithm works "in place" and the copying
algorithm uses additional memory. the combined algorithm would use
additional logical memory but work almost in place in terms of physical memory
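A rough model of the physical-memory claim, counting resident pages only. It rests on strong assumptions that a real collector may not satisfy: from-space pages are evacuated in address order, a page can be released the moment its live data has been copied, and forwarding pointers and the collector's own bookkeeping are ignored. The 4096-byte page size and all names are illustrative.

```haskell
pageSize :: Int
pageSize = 4096  -- assumed page size in bytes

-- Pages needed to hold a given number of bytes.
pagesFor :: Int -> Int
pagesFor bytes = (bytes + pageSize - 1) `div` pageSize

-- Each list element is the number of live bytes on one from-space page.

-- Plain copying: all of from-space stays resident until to-space is full.
peakCopying :: [Int] -> Int
peakCopying live = length live + pagesFor (sum live)

-- Copying with release: a from-space page is given back to the OS as soon
-- as its live data has been evacuated.
peakReleasing :: [Int] -> Int
peakReleasing live = maximum (0 : zipWith (+) fromResident toResident)
  where
    n            = length live
    copied       = scanl1 (+) live   -- bytes in to-space after each page
    toResident   = map pagesFor copied
    fromResident = [n, n - 1 .. 1]   -- pages not yet released, incl. current
```

For 80 mb of from-space (20480 pages) with 60 mb live spread evenly, `peakCopying (replicate 20480 3072)` is 35840 pages (140 mb), while `peakReleasing` on the same input is 20481 pages, that is 80 mb plus one page.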

-- 
Best regards,
 Bulat                            mailto:bulatz at HotPOP.com