[Haskell-cafe] sum $ map f xs ... ghc-7.10 performance regression?

Thomas Koster tkoster at gmail.com
Tue Dec 15 06:15:53 UTC 2015


On 15 December 2015 at 09:30, Johannes Waldmann
<johannes.waldmann at htwk-leipzig.de> wrote:
> Is there a difference between "sum $ map f xs" and "sum $! map f xs"?
>
> I would think no, since "$!" just evaluates the argument to WHNF,
> so it would evaluate just the first cons of the list.
>
> This little program
>
> main = print $ sum $ map bitcount [0, 4 .. 2^24-1 ]
>
> bitcount :: Int -> Int
> bitcount x =
>   if x > 0 then let (d,m) = divMod x 2 in bitcount d + m else 0
>
> runs in 1.6 seconds when compiled with -O2
> on ghc-7.6.3, ghc-7.8.4,
>
> but takes 3.2 seconds on ghc-7.10.[1-3]. Why?
>
> when I change the main function (note: $ => $!) to
>
> main = print $ sum $! map bitcount [0, 4 .. 2^24-1 ]
>
> and compile with 7.10.[1-3], then it also runs in 1.6 seconds.

On 15 December 2015 at 11:20, Jonas Scholl <anselm.scholl at tu-harburg.de> wrote:
> The reason for the difference seems
> to manifest itself after the first float-out pass, somehow the
> simplifier rebuilds the expression differently... but right now I do not
> see what exactly is happening there and why. My first though was a
> different set of rules from the list fusion stuff fires, but at least
> the names and order of the rule firings are exactly the same... But the
> more I think about it, the more I think this is a bug.

On 15 December 2015 at 16:21, Phil Ruffwind <rf at rufflewind.com> wrote:
> sum for lists is implemented using foldl rather than foldl' so I
> suspect that's the origin of the issue.  Somehow, ($!) seems to give
> GHC enough of a hint so as to optimize smarter thereby avoiding the
> thunk build-up.  I don't know how this occurs though.

Phil, I think that was true in 7.8, but if I'm reading the haddocks
correctly, Data.List.sum = Data.Foldable.sum in 7.10, and
Data.Foldable.sum uses foldMap/foldr.

I don't have 7.8 handy at the moment, but the 7.8 base probably would
have used the definition now in GHC.OldList, which uses foldl as you
say.

The switch from foldl to foldr might be a factor in the list fusion
issue hinted by Jonas. Since foldl is almost always used incorrectly
by beginners (right?), I wouldn't be surprised if the reason the space
leak is magically eliminated in 7.8 is because of beginner-friendly
rewrite rules and strictness analyses targeted at foldl.

--
Thomas Koster


More information about the Haskell-Cafe mailing list