Summary of containers patches
dons at galois.com
Thu Sep 23 21:17:37 EDT 2010
> Now the timing results. The results are the change in runtime in %.
>  speedup between inline and noinline (negative numbers means slowdown)
>  speedup between inline and inlinable
>  speedup between inline and mix
> (the results are += several %, I did not have time to do it better)
>   
> alter -50.33 -49.17 -47.51
> delete -46.67 -50.93 -48.79
> difference 1.93 0.18 6.33
> insert -44.11 -44.25 -3.09
> insertLookupWithKey empty -14.62 -15.70 1.22
> insertLookupWithKey update -21.54 -19.23 6.20
> insertLookupWithKey' empty -16.79 -23.67 -2.82
> insertLookupWithKey' update -26.62 -26.99 0.47
> insertWith empty -44.42 -48.29 -5.88
> insertWith update -43.28 -45.03 2.50
> insertWith' empty -45.30 -46.66 -3.45
> insertWith' update -38.15 -38.27 3.94
> insertWithKey empty -42.60 -45.58 -6.99
> insertWithKey update -40.13 -36.62 -1.71
> insertWithKey' empty -46.54 -46.53 -4.22
> insertWithKey' update -43.10 -42.16 0.98
> intersection 1.42 -1.75 0.37
> lookup -94.03 -94.62 6.14
> lookupIndex -86.25 -86.92 4.91
> union -0.04 -0.87 2.12
> update -50.64 -55.27 -54.17
> updateLookupWithKey -43.90 -43.31 -49.22
> So, the INLINABLE does not help at all.
> I in fact expected this outcome -- it is not inlining itself that is
> beneficial, it is the _specialization_ (which happens during inlining if
> Please bear in mind that the speedup caused by specialization is
> significant only in the case when the Ord.compare function is trivial
> (if it is an instruction like <=, the result between instruction call
> and a dynamic dictionary call is very big). That means things like Map
> Int, Map Word, Map Int8 and so on. But with these types one usually uses
> IntMap. What I am saying is that the usage of Map which would benefit
> from specialization may not be very common.
Do you have enough time to look on Hackage to see at which types Map is
used most frequently? I suspect Map Int is actually pretty common.
> At this point I suggest the following:
> - as immediate solution, remove all INLINE from the containers package,
> just leave INLINE in lookup*, member* and insert* functions. This
> causes small code bloat and quite big wins on these functions, which
> I consider to be very frequent.
> I could imagine to remove also INLINE for insert* functions, or to add
> INLINE for delete*, update* and alter*.
> - in the long term, find a way to get back the speedup without the code
> bloat. The problem is that _every_ call to a lookup has a different
> code in the binary. We need to share the code of specialized functions
> (like the c++ templates do). I mean the following:
> - before linking, find out which dictionaries are passed to
> Data.Map.lookup method. Specialize Data.Map.lookup for these types
> (only once for each type) and change the calls to lookup with
> specific dictionaries.
> The effect should be as if user said SPECIALIZE, for each method
> only for types it is non-generically being used for.
> - I would not bother with INLINABLE currently.
> Simon, others, do you agree with my short-term solution? If you do,
> I will push the patches (I need to do it before I fly to ICFP, which
> will be in 30 hours).
Well, I still don't understand what penalty the code bloat is exactly.
Making things a lot slower is clearly bad, making disk size a bit bigger
-- when my harddrive is already 8x bigger than 3 years ago, doesn't seem
More information about the Libraries