aha! I think.
John Meacham
john at repetae.net
Thu Oct 27 04:04:57 EDT 2005
On Thu, Oct 27, 2005 at 08:44:10AM +0100, Simon Marlow wrote:
> I'd be surprised if this is an issue. GHC doesn't normally touch the
> info tables during execution (with one exception - getting the tag from
> a constructor in a datatype with >8 constructors). It touches the info
> tables during GC, but it doesn't touch the code during GC. So we might
> push some code out of the cache on a GC, but that shouldn't have a large
> effect.
Yeah, you are right. I realized this after some more thought: we don't
make a new copy of the code for each thunk :)
> It could be an alignment issue, I suppose. Or passing arguments in
> registers (we don't, at the moment, on x86_64).
I tried some experiments using regparm on jhc output on i386 and it
did not cause the dramatic effect noticed with x86_64, so I don't think
it is just that. Well, it is possible: the x86_64 core might be
optimized assuming things are passed in registers while the i386 core
might keep the top few stack members in phantom registers or
something...
But an alignment issue sounds more likely. If our 8 byte pointers and
ints end up only 4 byte aligned, and so straddle their natural 8 byte
boundaries, that could affect things very much. It is the number one
cause of performance problems according to the AMD optimization manual.
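
As a rough illustration of the alignment concern, here is a minimal C
sketch of the kind of check one could run over addresses. It is not
taken from GHC or jhc; the helper names is_aligned and straddles are
made up for this example, and it only classifies addresses, it does not
measure any performance effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of align.
   align must be a nonzero power of two. */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (uintptr_t)(align - 1)) == 0;
}

/* Hypothetical helper: returns 1 if an object of `size` bytes starting
   at `addr` crosses a multiple of `boundary`, i.e. its first and last
   bytes fall into different boundary-sized blocks.
   boundary must be a nonzero power of two, size must be nonzero. */
static int straddles(uintptr_t addr, size_t size, size_t boundary)
{
    assert(size != 0);
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    uintptr_t mask  = ~(uintptr_t)(boundary - 1);
    uintptr_t first = addr & mask;               /* block of first byte */
    uintptr_t last  = (addr + size - 1) & mask;  /* block of last byte  */
    return first != last;
}
```

For example, an 8 byte pointer stored at address 12 is 4 byte aligned
but not 8 byte aligned, so straddles(12, 8, 8) reports a crossing, while
the same pointer at address 16 does not cross. Passing 64 as the
boundary gives the same kind of check for a 64 byte cache line, which
is an assumed line size here, not a figure from this thread.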
>
> If you have any handy test programs, can you try fiddling with the
> alignment of code blocks and see if you get a measurable difference?
I will try that.
> (I'm still digesting your other message, I'll reply in due course).
I am digesting the c-- papers at the moment :)
John
--
John Meacham - ⑆repetae.net⑆john⑈
More information about the Glasgow-haskell-users
mailing list