[Haskell-cafe] Re: [Haskell] ANNOUNCE: Haskell 2010 Report (final)
john at repetae.net
Wed Jul 14 19:26:05 EDT 2010
On Wed, Jul 14, 2010 at 10:35:50AM +0100, Simon Marlow wrote:
>> Yeah, I didn't realize how important the allocator was until I started
>> benchmarking, spending time cutting the cost of marking garbage in
>> half didn't help nearly as much as shaving a few cycles off the
>> allocator. The fast pass of the allocator is actually very fast, each
>> object type has its own allocator and free list so allocation is pretty
>> much just pulling an object off of the free list, it is already of the
>> appropriate size and its constant fields are pre-initialized as they
>> arn't cleared during garbage collection. (there is a heuristic that
>> claims full pages back to the general pool sometimes).
>> The slow path has to grab a full page and initialize it, but that isn't
>> really much slower as I can prefetch the cache lines needed so the cost
>> is on the order of another cache line fill. (thinking about
>> computational complexity in terms of O(cache line fills) rather than
>> O(operations) is much more appropriate on todays architectures.).
> Right, you can see how important locality is by looking at these graphs
> that Don produced recently:
> generational GC these days is important more for locality than for the
> benefits of avoiding repeated tracing.
Yeah, jhc's GC is currently not generational, it is certainly something
I want to do, something I have considered as a quick hack was splitting
up allocations based on whether they were updatable or not. allocating
things that can statically be determined to be in HNF into an "older"
generation and allocating updatable thunks in a younger one. The theory
being updatable thunks are more likely to be overwritten by a
indirection and become garbage soon, I was thinking this could give me
some of the benefits of a generational collector without having to
rewrite my GC around being able to promote objects to an older
generation. I am not sure if it will work, but it is an idea I have been
toying with. Implementing better heap profiling support for jhc generated
code is a prerequisite in any case.
> Speaking of prefetching, we get a lot of benefit in GHC from the
> automatic prefetching done by modern CPUs; I'm not sure how this would
> be affected by having multiple allocation regions. Manual prefetching
> is almost impossible to get right in my experience, see also
> Nethercote/Mycroft where they did some prefetching experiments with GHC:
Ah, this paper looks very interesting, I was wondering if you had
experimented with prefetching just ahead of the allocation pointer.
Looks like it helped :)
John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
More information about the Haskell-Cafe