[Haskell-cafe] Profiling nested case
Max Bolingbroke batterseapower at hotmail.com
Fri Jul 18 15:57:38 EDT 2008
2008/7/18 Mitar <mmitar at gmail.com>:
> On Sat, Jul 12, 2008 at 3:33 AM, Max Bolingbroke
> <batterseapower at hotmail.com> wrote:
>> If findColor had been a function defined in terms of foldr rather than
>> using explicit recursion, then there's a good chance GHC 6.9 would have
>> fused it with the list to yield your optimized, loop unrolled,
> My first version was with msum. Is this also covered by this fusion?
Note that, as I said, the fusion only applies from 6.9 onwards. However,
assuming you were using msum in the list monad, then since msum there is
just concat, which is defined in terms of foldr, there is a chance it
could happen. The best guide to this kind of thing is not asking me but
rather looking at the Core output by GHC for your particular program.
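
To make that concrete, here is a minimal sketch of the three shapes
under discussion. I don't have your code in front of me, so hit, the
Int "colours" and all three findColor* names are hypothetical
stand-ins. Compiling each variant with something like
ghc -O2 -ddump-simpl and comparing the Core is the only reliable way to
see which ones fuse on your compiler version.

```haskell
import Control.Monad (msum)

-- Hypothetical stand-in for a per-object intersection test:
-- returns a "colour" when the object is hit, Nothing otherwise.
hit :: Int -> Maybe Int
hit x
  | x `mod` 7 == 0 = Just (x * 2)
  | otherwise      = Nothing

-- Explicit recursion: the shape that list fusion cannot see through.
findColorRec :: [Int] -> Maybe Int
findColorRec []       = Nothing
findColorRec (x : xs) = case hit x of
  Just c  -> Just c
  Nothing -> findColorRec xs

-- The same search as a foldr, which is the form fusion rules target.
findColorFoldr :: [Int] -> Maybe Int
findColorFoldr = foldr (\x rest -> maybe rest Just (hit x)) Nothing

-- The same search via msum, here used at the Maybe monad.
findColorMsum :: [Int] -> Maybe Int
findColorMsum = msum . map hit

main :: IO ()
main = do
  let objs = [1 .. 20]
  print (findColorRec objs, findColorFoldr objs, findColorMsum objs)
  -- At the list monad, msum gives the same result as concat.
  print (msum [[1, 2], [3 :: Int]] == concat [[1, 2], [3]])
```

All three functions return the same answer; whether they compile to the
same loop is exactly what the Core dump will tell you.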
> (And it is interesting that my own recursion version is faster than
> the version with msum. Why?)
I don't know why you find this surprising :-). Your own version is
specialized exactly for the situation you wish to use it in. msum is
a generic combinator, which naturally makes it less amenable to
optimization because there is less information available about its
usage pattern. Note that GHC will of course try to do its best to
specialize the msum function for any particular scenario it's used
in, through inlining etc., but there are no guarantees.
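
As a rough illustration of the generic versus specialized distinction,
the sketch below puts a hand-written search next to a generic
combinator. The names firstOf and firstJust are made up for the
example, and firstOf is only written the way msum is usually described,
not copied from the library. The SPECIALIZE pragma is one way to ask
GHC for a monomorphic copy explicitly; it is my addition here, not
something the optimizer is guaranteed to need or to act on in the way
you hope.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)

-- Hypothetical generic combinator: works for any MonadPlus, so when
-- compiled on its own GHC knows nothing about which monad is in use.
firstOf :: MonadPlus m => [m a] -> m a
firstOf = foldr mplus mzero
-- Request a copy specialized to the one type this program cares about.
{-# SPECIALIZE firstOf :: [Maybe Int] -> Maybe Int #-}

-- Hand-written version: the type and the short-circuit on the first
-- Just are both fixed in the source, so nothing has to be discovered.
firstJust :: [Maybe Int] -> Maybe Int
firstJust []               = Nothing
firstJust (Just c : _)     = Just c
firstJust (Nothing : rest) = firstJust rest

main :: IO ()
main = do
  let xs = [Nothing, Just 3, Just 5]
  print (firstOf xs == firstJust xs)
```

The two agree on results; any difference in speed comes down to how
much of the generic machinery the compiler manages to strip away.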
> It is a little more tricky. I choose in an IO monad which scene it
> will render (selected by a program argument). So at compile time it
> does not yet know which one it will use. But there is a finite number
> of possibilities (in my case two) - why not inline both versions and
> at run time choose one?
If there were only one static occurrence of each identifier in your
module then it would do that. This is because there are no code size
implications for inlining a function into its single use site.
Inlining is not a cure-all for performance though: if you inline too
much then you increase code size, and hence increase the amount of main
memory you're reading and reduce instruction cache hits, not to
mention filling up your disk with multi-MB binaries.
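
If you want to nudge GHC towards the "inline both and choose at run
time" arrangement, something along these lines might work. It is only a
sketch: Scene, sceneA, sceneB, render and renderNamed are invented for
the example, and the INLINE pragma is a request whose effect you would
have to confirm in the Core. The idea is that each scene is named at
its own call site, so the choice made from the program argument is
between two already-known calls rather than one call on an unknown
function.

```haskell
import System.Environment (getArgs)

-- Hypothetical scene type: maps a pixel index to a colour value.
type Scene = Int -> Int

sceneA, sceneB :: Scene
sceneA x = x * 2
sceneB x = x + 100

-- Generic renderer, parameterized over the scene.
render :: Scene -> [Int] -> Int
render scene = sum . map scene
{-# INLINE render #-}

-- Each branch applies render to a statically known scene, giving the
-- compiler the chance to produce one specialized body per scene.
renderNamed :: String -> [Int] -> Int
renderNamed "b" = render sceneB
renderNamed _   = render sceneA

main :: IO ()
main = do
  args <- getArgs
  -- Default to scene "a" when no argument is given.
  let name = case args of
        (n : _) -> n
        []      -> "a"
  print (renderNamed name [1 .. 10])
```

The cost is the one described above: two copies of the rendering code
instead of one, which is fine for two scenes but adds up quickly.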