Issues resulting from inlining build

Joachim Breitner mail at joachim-breitner.de
Sun Jul 27 19:48:28 UTC 2014


Hi,

Am Sonntag, den 27.07.2014, 15:29 -0400 schrieb David Feuer:
> I think I finally figured out the deeper problem behind an issue I was
> having getting some things to fuse fully. I'm hoping someone has a
> brilliant idea for getting around it. Of course, it may be that
> someone else has thoroughly studied the matter and determined that
> there's no good solution. Suppose we write
> 
> crazy :: T -> [U]
> crazy x = some $ long $ fusion $ pipeline
> 
> oops f = map f $ crazy x
> uhOh c n = foldr c n $ crazy Pig
> ohMy = take bignum $ crazy amigo
> 
> When GHC compiles crazy, it rewrites the pieces of the pipeline to
> build/foldr forms, fuses them, and produces
> 
> crazy x = build someBigFunction
> 
> It then inlines build, producing
> 
> crazy x = anotherBiggy
>
> So far, this looks reasonably sensible, but it's likely bad.

It is good for the uses of crazy where no further fusion happens.

For the other cases, I believe GHC will rather try to get the original
definition inlined. Maybe "some $ long $ fusion $ pipeline" was already
deemed too big to be inlined – in that case an {-# INLINEABLE crazy #-}
could help.
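
For concreteness, a small sketch of what I mean, reusing David's
placeholder names (only the pragma is new):

  {-# INLINEABLE crazy #-}
  crazy :: T -> [U]
  crazy x = some $ long $ fusion $ pipeline

With INLINEABLE, GHC keeps the original, unoptimised right-hand side
around (also in the interface file), so at a use site like
"oops f = map f $ crazy x" it can still consider inlining the pipeline
and fusing it with the map, even if the optimised body ("anotherBiggy")
has grown too big to inline.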

>  The problem is that GHC will (rightly) conclude that `build
> someBigFunction` is too big to inline, and therefore the fusion will
> break at that boundary and we'll produce intermediate lists in the
> functions that use crazy. Now if we were playing the part of the
> compiler by hand, we would likely factor out someBigFunction and then
> *refrain from inlining build*. That is, we would get
> 
> {-# NOINLINE crazyB #-}
> crazyB x = someBigFunction
> 
> crazy = nonInliningBuild crazyB
> 
> Since we've factored out someBigFunction into crazyB, we can now
> safely inline crazy itself, allowing the pipeline to continue beyond
> it. The problem, of course, is that when we *don't* fuse beyond, there
> is some performance penalty (I have not tried to measure it yet) to
> passing in (:) and [] at runtime instead of fixing them at compile
> time.

There is another downside to this: with fusion you do get rid of the
intermediate list, but someBigFunction now makes calls to unknown
functions, which is slow. The nice thing about fusion is not just
getting rid of the list, but also the local optimizations that happen
when the new “cons” and “nil” meet the code in someBigFunction.
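
To make that concrete, here is a toy example (the names upTo and upToG
are my own, not from David's mail):

  import GHC.Exts (build)

  -- toy producer, written against an abstract cons and nil
  upToG :: Int -> (Int -> b -> b) -> b -> b
  upToG n cons nil = go 1
    where
      go i | i > n     = nil
           | otherwise = cons i (go (i + 1))

  upTo :: Int -> [Int]
  upTo n = build (upToG n)
  {-# INLINE upTo #-}

With upTo inlined, "foldr (+) 0 (upTo n)" rewrites via the fold/build
rule to "upToG n (+) 0", where (+) and 0 are statically known and the
whole thing can turn into a tight arithmetic loop. If the producer were
hidden behind a NOINLINE worker instead, cons and nil would be passed at
run time, and every “cons” in the loop would be a call to an unknown
function.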


So the approach GHC takes seems to be: either inline "some $ long $
fusion $ pipeline" and do all the transformation at the use site, or
call "anotherBiggy", but nothing in between.

Greetings,
Joachim


-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org
