<p dir="ltr">The relief comes when we can confirm, explain, and hopefully avoid it :)</p>
<div class="gmail_quote">On Jun 19, 2013 3:20 PM, "Jan Stolarek" <<a href="mailto:jan.stolarek@p.lodz.pl">jan.stolarek@p.lodz.pl</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Nicolas, I kinda like that explanation, because it relieves me of any responsibility for this<br>
problem :) Still, I have reasons to suspect that this might actually be my fault. Generated Core<br>
is slightly different - the generated worker function accepts parameters in different order - and<br>
I don't know why that happens. I also don't see why this would impact performance. Looks like I<br>
will need to become familiar with the profiling tools that you mentioned.<br>
<br>
Janek<br>
<br>
On Wednesday, 19 June 2013, Nicolas Frisby wrote:<br>
> I'm also seeing performance regressions in the shootout benchmarks that I<br>
> can't identify in the asm. The new asm looks better but performs worse,<br>
> with a ~15% slowdown.<br>
><br>
> I fired up the performance counters in my CPU and the free Intel code for<br>
> inspecting them showed that my CPU utilization took about a 10% hit, even<br>
> while executing fewer total instructions.<br>
><br>
> 1) Jan, perhaps we're seeing the same sort of behavior — the shootout<br>
> benchmarks have extremely hot loops (hundreds of millions of iterations<br>
> IIRC). I used ticky profiling too, and saw no suspicious changes in any<br>
> counters.<br>
><br>
> 2) Dear Low-level Gurus: How feasible is it that a ~15% slowdown in a<br>
> program with a very hot loop is due to incidentally inhibiting some caching<br>
> behavior (instr? data?)? Or perhaps affecting alignment? FTR my CPU is a<br>
> Core i7-2620M, Sandy Bridge.<br>
><br>
> Thanks all.<br>
><br>
> On Wed, Jun 19, 2013 at 9:27 AM, Jan Stolarek <<a href="mailto:jan.stolarek@p.lodz.pl">jan.stolarek@p.lodz.pl</a>> wrote:<br>
> > > If it's not sorted out, can you open a ticket, put in the relevant info<br>
> > > (so we don't need to look at the email trail), and we can tackle it when<br>
> > > you get here.<br>
> ><br>
> > Currently there's a temporary workaround: I'm using new folding rules for<br>
> > all primitive types except for Integer, in which case I left the old<br>
> > folding rules unchanged. This of course should be modified to make all<br>
> > rules uniform, but for now it at least passes validation. I didn't file<br>
> > a ticket, because the bug does not exist yet :) It only manifests itself<br>
> > in my patches, which have not been applied yet. I'll add all the<br>
> > information from this discussion to my GitHub fork of GHC and then move<br>
> > it to Trac once the bug makes it to HEAD.<br>
> ><br>
> > What worries me more about my patches is the performance regression in<br>
> > kahan, because I see no obvious differences in the generated assembly.<br>
> ><br>
> > Janek<br>
> ><br>
> > > Simon<br>
> > ><br>
> > > -----Original Message-----<br>
> > > From: <a href="mailto:ghc-devs-bounces@haskell.org">ghc-devs-bounces@haskell.org</a><br>
> > > [mailto:<a href="mailto:ghc-devs-bounces@haskell.org">ghc-devs-bounces@haskell.org</a>] On Behalf Of Jan Stolarek<br>
> > > Sent: 20 May 2013 12:35<br>
> > > To: Ian Lynagh<br>
> > > Cc: <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>
> > > Subject: Re: Integer constant folding in the presence of new primops<br>
> > ><br>
> > > > If you remove everything but the quotInteger test from<br>
> > > > integerConstantFolding and compile with -ddump-rule-rewrites then<br>
> > > > you'll see that the eqInteger rule fires before quotInteger. This is<br>
> > > > presumably comparing against 0, as the definition of quot for Integer<br>
> > > > (in GHC.Real) is<br>
> > > > _ `quot` 0 = divZeroError<br>
> > > > n `quot` d = n `quotInteger` d<br>
> > ><br>
> > > Yes, I noticed these two rules firing together - perhaps that's the<br>
> > > explanation why. I created a small program for testing:<br>
> > ><br>
> > > main = print quotInt<br>
> > > quotInt :: Integer<br>
> > > quotInt = 100063 `quot` 156<br>
> > ><br>
> > > I noticed that when I define eqInteger wrapper to be NOINLINE, the call to<br>
> > > quot is translated to Core as:<br>
> > ><br>
> > > Main.quotInt =<br>
> > > GHC.Real.$fIntegralInteger_$cquot<br>
> > > (__integer 100063) (__integer 156)<br>
> > ><br>
> > > but when I change the wrapper to INLINE I get:<br>
> > ><br>
> > > Main.quotInt =<br>
> > > GHC.Real.$fNumRatio_$cquot <-------- NumRatio instead of IntegralInteger<br>
> > > (__integer 100063) (__integer 156)<br>
> > ><br>
> > > All rule firing happens later (I used -ddump-simpl-iterations<br>
> > > -ddump-rule-firings), except that for $fNumRatio_$cquot the quot rules<br>
> > > don't fire.<br>
> > ><br>
> > > > Do you also still have eqInteger wired in? It sounds like you might<br>
> > > > have given them both the same unique?<br>
> > ><br>
> > > No, they didn't have the same unique. I modified the existing rules to work<br>
> > > on the new primops and ignore their wrappers. For now I have reverted<br>
> > > these changes so that I can make progress and leave this problem for later.<br>
> ><br>
> > > Janek<br>
> > ><br>
> > > _______________________________________________<br>
> > > ghc-devs mailing list<br>
> > > <a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a><br>
> > > <a href="http://www.haskell.org/mailman/listinfo/ghc-devs" target="_blank">http://www.haskell.org/mailman/listinfo/ghc-devs</a><br>
> ><br>
<br>
<br>
</blockquote></div>
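<p dir="ltr">The small test program quoted in the thread can be written as a self-contained module for anyone who wants to watch the rules fire. This is only a sketch: the module header and the file name <code>QuotTest.hs</code> are assumptions and do not come from the original messages, while the definition of <code>quotInt</code> is the one quoted above.</p>

```haskell
-- QuotTest.hs (hypothetical file name, not from the thread)
module Main (main) where

-- Same constant expression as in the quoted message. For Integer,
-- `quot` first checks the divisor against 0 and then calls quotInteger,
-- so both the eqInteger and quotInteger folding rules are candidates here.
quotInt :: Integer
quotInt = 100063 `quot` 156

main :: IO ()
main = print quotInt  -- prints 641
```

<p dir="ltr">Compiling with something like <code>ghc -O -ddump-rule-firings -ddump-simpl-iterations QuotTest.hs</code> should show which rules fire and in which simplifier iteration; the <code>-O</code> here is an assumption, since the thread names only the dump flags.</p>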