Alignment of doubles in ghc

Mon, 12 May 2003 03:58:50 -0700 (PDT)

I've recently browsed some assembly code generated by GHC (via gcc). It appeared that most values were accessed via memory, because the stack is managed explicitly by GHC. Even intermediate results seem to be accessed via memory, probably due to shortcomings of the aliasing analysis in gcc. (?)

This brings up the topic of eval/apply vs push/enter. In the eval/apply model, the stack management can (possibly) be left to an underlying compiler, which would remove most memory accesses and leave most of the alignment issue to the back-end. (http://research.microsoft.com/Users/simonpj/papers/eval-apply/)

From that article, I can only assume that GHC will switch to the eval/apply model. Can we expect that soon? The "alignment" thing would become easier to deal with...

Cheers,
Jean-Philippe.

Simon Marlow <simonmar@microsoft.com> wrote:
> I was wondering if anyone has investigated the alignment of doubles in
> ghc.
> 
> On x86 machines, doubles can be aligned on 4 byte boundaries, but there
> is a performance improvement if they have 8 byte alignment. As far as I
> can tell, ghc uses 4 byte alignment for doubles.
> 
> I started to look into the changes needed to go from 4 to 8 byte
> alignment...

It's quite a difficult task. You would need to arrange that the stack
pointer and heap pointer are always 8-byte aligned, which is something
we don't do at the moment: they're always 4-byte aligned only. This
would mean changes to the code generator to add alignment padding to
stack frames, and to pad the heap pointer to an 8-byte boundary after
each allocation.
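
To make the padding idea above concrete, here is a minimal sketch in C of a bump allocator that rounds its heap pointer up to an 8-byte boundary after each allocation. It is only an illustration under stated assumptions: the names `bump_heap`, `bump_alloc` and `align_up` are hypothetical and are not taken from GHC's code generator or runtime, and addresses are modelled as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
   align must be a non-zero power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical bump allocator state (not GHC's actual layout). */
typedef struct {
    uintptr_t hp;    /* heap pointer: next free address */
    uintptr_t limit; /* one past the last usable address */
} bump_heap;

/* Allocate size bytes, then pad the heap pointer so that it is
   8-byte aligned again. Returns the start address of the object,
   or 0 if the request does not fit. */
static uintptr_t bump_alloc(bump_heap *h, size_t size)
{
    uintptr_t start = h->hp;
    uintptr_t next = align_up(start + size, 8);
    if (next > h->limit || next < start)
        return 0; /* out of space, or the address wrapped around */
    h->hp = next;
    return start;
}
```

With a heap starting at an 8-byte aligned address, a 12-byte allocation costs 16 bytes in this scheme: the extra 4 bytes are the alignment padding, which is the space overhead such a change would introduce.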

We haven't looked into it, but you're welcome to try!

We *have* seen examples where gcc spilled some intermediate double
values to the stack, and it made a big difference whether the stack
address used was 8-byte aligned or not. I seem to recall this was the
mysterious cause of a 40% or so difference in speed between two runs of
the same binary at different times of the day :-)
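
For anyone who wants to check whether a particular address (for example a spill slot) happens to be 8-byte aligned, a small sketch in C follows. The helper names `is_aligned` and `padding_for` are made up for illustration; this only inspects an address and says nothing about how gcc actually chooses spill slots.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns non-zero if p is a multiple of align (align must be non-zero). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Number of padding bytes needed to move p up to the next
   multiple of align; 0 if p is already aligned. */
static size_t padding_for(const void *p, size_t align)
{
    size_t rem = (size_t)((uintptr_t)p % align);
    return rem == 0 ? 0 : align - rem;
}
```

On a 4-byte aligned stack, an address is either already 8-byte aligned or needs exactly 4 bytes of padding, so whether a double lands on an 8-byte boundary depends on where the stack happens to start.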

Cheers,
Simon

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
