[Haskell-cafe] How to make Repa do stencils unboxed?
benl at ouroborus.net
Mon Jul 4 04:37:33 UTC 2016
> On 3 Jul 2016, at 8:46 AM, William Yager <will.yager at gmail.com> wrote:
> 2. Regrettably, using Dimensional does seem to have some negative effect on performance. It's about 1.5x slower with Dimensional. The fragility of our currently used fusion techniques renders empty the promise of "overhead-free" newtype abstractions.
Yes, this is a massive problem with the Repa approach to array fusion. The compilation method (and resulting runtime performance) depends on the GHC simplifier acting in a certain way — yet there is no specification of exactly what the simplifier should do, and no easy way to check that it did what was expected other than eyeballing the intermediate code.
We really need a different approach to program optimisation, such as a working supercompiler, instead of just “a lot of bullets in the gun” that might hit the target or not. The latter is fine for general purpose code optimisation but not “compile by transformation” where we really depend on the transformations doing what they’re supposed to. The use of syntactic witnesses of type equality in the core language doesn’t help either, as they tend to get in the way of other program transformations.
I think the way the Repa API is *set up* is fine, but expecting a general purpose compiler to always successfully fuse such code is too optimistic. We had the same sort of problems in DPH. Live and learn.
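For anyone who wants to do the eyeballing mentioned above, here is a minimal sketch of the workflow. It uses plain lists rather than Repa, and the function name and pipeline are made up for illustration. The flags in the comment are standard GHC debugging flags. Whether the intermediate lists actually get fused away is exactly the thing you cannot rely on and have to check in the dump.

```haskell
-- Build with optimisation and ask GHC to dump the simplified Core, e.g.
--   ghc -O2 -ddump-simpl -dsuppress-all -ddump-to-file FusionCheck.hs
-- then read the generated .dump-simpl file by hand.
import Data.List (foldl')

-- Hypothetical pipeline (not from the original thread): the sum of the
-- squares of the even numbers in [1 .. n]. Written as filter/map/fold so
-- that there are intermediate lists for the simplifier to remove, or not.
sumSquaresOfEvens :: Int -> Int
sumSquaresOfEvens n = foldl' (+) 0 (map (\x -> x * x) (filter even [1 .. n]))

main :: IO ()
main = print (sumSquaresOfEvens 10)
```

If fusion worked, I would expect the dumped Core for sumSquaresOfEvens to be a single loop with no list constructors in it. If it did not, the cons cells will still be there. Either way the only evidence is the dump itself.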
> 3. I got a *huge* performance boost by calculating `outputs` through `runIdentity` rather than treating it as an IO action. Several times faster. This makes sense, but I'm surprised the results are so drastic.
If a source level change causes some syntactic wibble in the intermediate code that helps fusion, then the result will be drastic.
> Can anyone explain to me how the Identity monad manages to guarantee the sequencing/non-nesting requirements of computeP?
It doesn’t. I wrapped computeP in an identity monad to discourage people from trying to perform a computeP in the worker function of another parallel operator (like map).
> As far as I can tell, it should be reduced to plain old function application, which is the same as nesting computeP, which is what we were trying to prevent.
Yes. Doing this would create a nested parallel computation, which Repa does not support (as you mentioned). If you were to use ‘runIdentity’ to discharge the identity constructor, then create a nested parallel computation anyway, then you’d get a runtime warning on the console.
> In other words, what effect does the Identity monad have over not having a monad at all? Its bind definition has no sequencing effects or anything, so I can't imagine that it actually accomplishes anything.
Its purpose is psychological. It forces the user to question what is really going on.
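To make the point concrete, here is a small sketch that needs only base. computeLike is a hypothetical stand-in for computeP: it shares the "works in any Monad" shape of the real type, but it operates on lists and does nothing parallel. It is only meant to show that the Identity wrapper adds no sequencing and that the types do not rule out the nested use.

```haskell
import Data.Functor.Identity (runIdentity)

-- Hypothetical stand-in for computeP (an assumption, not the Repa function):
-- polymorphic in an arbitrary Monad, it forces the elements by summing them
-- and then returns the list unchanged.
computeLike :: Monad m => [Int] -> m [Int]
computeLike xs = sum xs `seq` return xs

-- Running the "monadic" code in Identity is ordinary function application;
-- the bind imposes no ordering beyond the data dependency on xs.
doubled :: [Int]
doubled = runIdentity $ do
  xs <- computeLike [1, 2, 3]
  computeLike (map (* 2) xs)

-- Nothing in the types stops you from discharging Identity inside the worker
-- of another operator. With Repa, this shape is the unsupported nested case.
nested :: [[Int]]
nested = map (\n -> runIdentity (computeLike [1 .. n])) [1, 2, 3]

main :: IO ()
main = do
  print doubled
  print nested
```

Both definitions typecheck and run, which is the point: the monad is a speed bump for the reader and not a guarantee from the type system.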