Containers INLINE issue continues

Sun Sep 26 09:16:12 EDT 2010

On 25/09/10 01:17, Milan Straka wrote:
> Hi,
>
> we still need to find consensus on containers INLINE issue
> (sorry about me already deciding and pushing :().
>
> To recapitulate, one part of the containers performance improvements was
> to put INLINE on nearly all Data.Map methods. That has positive impact
> on performance (especially for Map Int, Map Word, Map Char, etc, where
> the comparison is just an instruction; for fold method the speedup works
> for almost any type). The reason for this is that inlining allows
> specialization to happen.
>
> The negative effect is that the code for insert, lookup, etc. is
> repeated on every call site. That is not very reasonable (code bloat,
> I-cache misses).
>
> In the long run we could have something like 'SPECIALISABLE' pragma,
> which would work like c++ template specialization -- every specialization
> of a method would be only once in the resulting binary. That would have
> both speed and reasonably small code bloat.
>
> GHC 7.0 supports INLINABLE -- the method unfolding is present in the .hi
> file and GHC decides whether to inline or not.
>
> In the short run we need to choose some kind of compromise. What are the
> possibilities:
>
> a) we remove all INLINE and will mark every method INLINABLE. This keeps
>    smallest code, but if someone needs great performance, they can
>    persuade GHC-7.0 inliner to inline the methods from containers.
>
> b) we leave INLINE on small amount of small methods where there is a big
>     gain, and mark the rest as INLINABLE.
>
>     This is what is currently present in the repo (what I pushed without
>     consensus).
>
>     This is a compromise, some code bloat, small performance losses. The
>     performance gains are present also on ghc-6.* family (not true with
>     a).

> c) leave all methods marked INLINE. Biggest code bloat (numbers in some
>     of my previous mails), theoretically biggest performance
>     (theoretically I-cache misses could be larger).
>
> I think b) is good in the short run, but it is only my opinion. But a)
> is good idea too and I would also support it.
>
> I am flying to ICFP in an hour. I will investigate a) [what switches to
> give to GHC-7.0 to inline] and post the results, but it will take some
> time.

There's something else we need to resolve.  Earlier I reported that GHC 
itself allocates more with the dons/tibbe containers patches - the issue 
seems to be worse with the current patches, to the extent that two of 
the GHC performance regression tests fail (T1969 and T3294).

in T1969:
   containers-0.3,                allocations: 246,402,492
   containers + dons/tibbe/milan, allocations: 283,370,180 (+15%)

in T3294:
   containers-0.3,                allocations: 832,307,880
   containers + dons/tibbe/milan, allocations: 950,107,000 (+14%)

We'll need to isolate exactly which change(s) causes the problem and 
track down the cause.

Cheers,
	Simon