performance regressions
Simon Peyton Jones
simonpj at microsoft.com
Wed Dec 17 09:15:07 UTC 2014
If you use INLINEABLE, that should make the function specialisable to a particular monad, even if it's in a different module. You shouldn't need INLINE for that.
I don't understand the difference between cases (2) and (3).
I am still suspicious of why there are so many calls to this one function that it, alone, is allocating a significant proportion of compilation of the entire run of GHC. Are you sure there isn't an algorithmic improvement to be had, to simply reduce the number of calls?
Simon
| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
| Richard Eisenberg
| Sent: 16 December 2014 21:46
| To: Joachim Breitner
| Cc: ghc-devs at haskell.org
| Subject: Re: performance regressions
|
| I've learned several very interesting things in this analysis.
|
| - Inlining polymorphic methods is very important. Here are some data
| points to back up that claim:
| * Original implementation using zipWithAndUnzipM: 8,472,613,440
| bytes allocated in the heap
| * Adding {-# INLINE #-} to the definition thereof: 6,639,253,488
| bytes allocated in the heap
| * Using `inline` at call site to force inlining: 6,281,539,792
| bytes allocated in the heap
|
| The middle step above allowed GHC to specialize zipWithAndUnzipM to my
| particular monad, but GHC didn't see fit to actually inline the
| function. Using `inline` forced it, to good effect. (I did not collect
| data on code sizes, but it wouldn't be hard to.)
|
| By comparison:
| * Hand-written recursion: 6,587,809,112 bytes allocated in the
| heap
| Interestingly, this is *not* the best result!
|
| Conclusion: We should probably add INLINE pragmas to Util and
| MonadUtils.
|
|
| - I then looked at rejiggering the algorithm to keep the common case
| fast. This had a side effect of changing the zipWithAndUnzipM to
| mapAndUnzipM, from Control.Monad. To my surprise, this brought
| disaster!
| * Using `inline` and mapAndUnzipM: 7,463,047,432 bytes
| allocated in the heap
| * Hand-written recursion: 5,848,602,848 bytes
| allocated in the heap
|
| That last number is better than the numbers above because of the
| algorithm streamlining. But, the inadequacy of mapAndUnzipM surprised
| me -- it already has an INLINE pragma in Control.Monad of course.
| Looking at -ddump-simpl, it seems that mapAndUnzipM was indeed getting
| inlined, but a call to `map` remained, perhaps causing extra
| allocation.
|
| Conclusion: We should examine the implementation of mapAndUnzipM (and
| similar functions) in Control.Monad. Is it as fast as possible?
|
|
|
| In the end, I was unable to bring the allocation numbers down to where
| they were before my work. This is because the flattener now deals in
| roles. Most of its behavior is the same between nominal and
| representational roles, so it seems silly (though very possible) to
| specialize the code to nominal to keep that path fast. Instead, I
| identified one key spot and made that go fast.
|
| Thus, there is a 7% bump to memory usage on very-type-family-heavy
| code, compared to before my commit on Friday. (On more ordinary code,
| there is no noticeable change.)
|
| Validating my patch locally now; will push when that's done.
|
| Thanks,
| Richard
|
| On Dec 16, 2014, at 10:41 AM, Joachim Breitner <mail at joachim-
| breitner.de> wrote:
|
| > Hi,
| >
| >
| > Am Dienstag, den 16.12.2014, 09:59 -0500 schrieb Richard Eisenberg:
| >> On Dec 16, 2014, at 4:01 AM, Joachim Breitner <mail at joachim-
| breitner.de> wrote:
| >>
| >>> another guess (without looking at the code, sorry): Are they in
| the
| >>> same module? I.e., can GHC specialize the code to your particular
| Monad?
| >
| >> No, they're not in the same module. I could also try moving the
| >> zipWithAndUnzipM function to the same module, and even specializing
| >> it by hand to the right monad.
| >
| > I did mean zipWithAndUnzipM, so maybe yes: Try that.
| >
| > (I find it hard to believe that any polymorphic monadic code should
| > perform well, with those many calls to an unknown (>>=) with a
| > function parameter, but maybe I'm too pessimistic here.)
| >
| >
| >> Could that be preventing the fusing?
| >
| > There is not going to be any fusing here, at least not list fusion;
| > that would require your code to be written in terms of functions
| with
| > fusion rules.
| >
| > Greetings,
| > Joachim
| >
| > --
| > Joachim "nomeata" Breitner
| > mail at joachim-breitner.de * http://www.joachim-breitner.de/
| > Jabber: nomeata at joachim-breitner.de * GPG-Key: 0xF0FBF51F Debian
| > Developer: nomeata at debian.org
| >
| > _______________________________________________
| > ghc-devs mailing list
| > ghc-devs at haskell.org
| > http://www.haskell.org/mailman/listinfo/ghc-devs
|
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://www.haskell.org/mailman/listinfo/ghc-devs
More information about the ghc-devs
mailing list