Integer constant folding in the presence of new primops
Nicolas Frisby
nicolas.frisby at gmail.com
Wed Jun 19 18:46:33 CEST 2013
Thanks Austin.
The program exhibiting these behaviors is shootout/reverse-complement. The
performance monitoring I used was Intel's pcm from
http://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-cpu-utilization
I've been working only on my MBP, so no perfmon yet. I plan to investigate
this with different architectures/machines when this issue percolates back
up my todo list.
On Wed, Jun 19, 2013 at 11:39 AM, Austin Seipp <aseipp at pobox.com> wrote:
> I mean, it certainly *seems* reasonable a 15% hit could come from
> pipelining changes or cache behavior or something. I don't think
> alignment would really be a huge issue; post-Nehalem I believe
> non-aligned writes/reads are extremely cheap. Non-intuitive behavior
> can totally happen too: I've seen cases of adding instructions to a
> loop which speeds things up (e.g. by taking the extra step, you may
> mitigate a dependency stall, which massively helps pipelining across
> the loop body etc.)
>
> Nicolas, can I ask what benchmark you're looking at? And what
> performance tools are you using, Intel's? If you're on Linux, the
> 'perf' tool on a modern kernel can be used to quickly get an overview
> of how many cache misses/hits your process has, how many pipeline
> stalls occur, etc. You can then use it to drill down a bit into the
> assembly that's problematic.
>
> That might not give you an exact culprit (it could be many changes with
> cumulative hits), but it's a start.
>
> On Wed, Jun 19, 2013 at 10:43 AM, Nicolas Frisby
> <nicolas.frisby at gmail.com> wrote:
> > I'm also seeing performance regressions in the shootout benchmarks that I
> > can't identify in the asm. The new asm looks better but performs worse,
> > with a ~15% slowdown.
> >
> > I fired up the performance counters in my CPU and the free Intel code for
> > inspecting them showed that my CPU utilization took about a 10% hit, even
> > while executing fewer total instructions.
> >
> > 1) Jan, perhaps we're seeing the same sort of behavior — the shootout
> > benchmarks have extremely hot loops (hundreds of millions of iterations
> > IIRC). I used ticky profiling too, and saw no suspicious changes in any
> > counters.
> >
> > 2) Dear Low-level Gurus: How feasible is it that a ~15% slowdown in a
> > program with a very hot loop is due to incidentally inhibiting some
> > caching behavior (instr? data?)? Or perhaps affecting alignment? FTR my
> > CPU is a Core i7-2620M, Sandy Bridge.
> >
> > Thanks all.
> >
> > On Wed, Jun 19, 2013 at 9:27 AM, Jan Stolarek <jan.stolarek at p.lodz.pl>
> > wrote:
> >>
> >> > If it's not sorted out, can you open a ticket, put in the relevant
> >> > info (so we don't need to look at the email trail), and we can tackle
> >> > it when you get here.
> >>
> >> Currently there's a temporary workaround: I'm using new folding rules for
> >> all primitive types except Integer, for which I left the old folding
> >> rules unchanged. This of course should be modified to make all rules
> >> uniform, but for now it at least passes validation. I didn't file a
> >> ticket, because the bug does not exist yet :) It only manifests itself in
> >> my patches, which have not been applied yet. I'll add all the information
> >> from this discussion to my github fork of GHC and then move it to Trac
> >> once the bug makes it to HEAD.
> >>
> >> What worries me more about my patches is the performance regression in
> >> kahan, because I see no obvious differences in the generated assembly.
> >>
> >> Janek
> >>
> >> >
> >> > Simon
> >> >
> >> > -----Original Message-----
> >> > From: ghc-devs-bounces at haskell.org
> >> > [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Jan Stolarek
> >> > Sent: 20 May 2013 12:35
> >> > To: Ian Lynagh
> >> > Cc: ghc-devs at haskell.org
> >> > Subject: Re: Integer constant folding in the presence of new primops
> >> >
> >> > > If you remove everything but the quotInteger test from
> >> > > integerConstantFolding and compile with -ddump-rule-rewrites then
> >> > > you'll see that the eqInteger rule fires before quotInteger. This is
> >> > > presumably comparing against 0, as the definition of quot for
> >> > > Integer (in GHC.Real) is
> >> > >     _ `quot` 0 = divZeroError
> >> > >     n `quot` d = n `quotInteger` d
> >> >
> >> > Yes, I noticed these two rules firing together - perhaps that's the
> >> > explanation why. I created a small program for testing:
> >> >
> >> > main = print quotInt
> >> > quotInt :: Integer
> >> > quotInt = 100063 `quot` 156
> >> >
> >> > I noticed that when I define the eqInteger wrapper to be NOINLINE, the
> >> > call to quot is translated to Core as:
> >> >
> >> > Main.quotInt =
> >> >   GHC.Real.$fIntegralInteger_$cquot
> >> >     (__integer 100063) (__integer 156)
> >> >
> >> > but when I change the wrapper to INLINE I get:
> >> >
> >> > Main.quotInt =
> >> >   GHC.Real.$fNumRatio_$cquot    <-------- NumRatio instead of IntegralInteger
> >> >     (__integer 100063) (__integer 156)
> >> >
> >> > All rule firing happens later (I used -ddump-simpl-iterations
> >> > -ddump-rule-firings), except that for $fNumRatio_$cquot the quot rules
> >> > don't fire.
> >> >
> >> > > Do you also still have eqInteger wired in? It sounds like you might
> >> > > have given them both the same unique?
> >> >
> >> > No, they didn't have the same unique. I modified the existing rules to
> >> > work on the new primops and ignore their wrappers. For now I have
> >> > reverted these changes so that I can make progress and leave this
> >> > problem for later.
> >> >
> >> > Janek
> >> >
> >> > _______________________________________________
> >> > ghc-devs mailing list
> >> > ghc-devs at haskell.org
> >> > http://www.haskell.org/mailman/listinfo/ghc-devs
> --
> Regards,
> Austin - PGP: 4096R/0x91384671
>
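
For reference, a minimal sketch of the point made earlier in the thread about
the eqInteger rule firing before quotInteger. It assumes the two-clause
definition of quot quoted from GHC.Real is representative. The name quotSketch
is a hypothetical stand-in defined here, not the library function; error
stands in for divZeroError and the Prelude quot stands in for quotInteger. The
constants are the ones from Janek's test program.

```haskell
module Main where

-- Hypothetical stand-in for the Integer `quot` quoted from GHC.Real.
-- Matching on the literal 0 is an equality test on the divisor, so a
-- comparison is reached before the division itself.
quotSketch :: Integer -> Integer -> Integer
quotSketch _ 0 = error "divide by zero"  -- stands in for divZeroError
quotSketch n d = n `quot` d              -- stands in for quotInteger

-- Same constants as the test program in the thread.
quotInt :: Integer
quotInt = quotSketch 100063 156

main :: IO ()
main = print quotInt  -- prints 641
```

Compiling a module like this with optimisation and the flags already named in
the thread (-ddump-rule-firings, -ddump-rule-rewrites) is one way to watch the
order in which rules fire; what the dump shows will depend on the GHC version
and on whether the patches under discussion are applied.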