[GHC] #9339: last is not a good consumer
GHC
ghc-devs at haskell.org
Mon Jul 21 22:00:58 UTC 2014
#9339: last is not a good consumer
-------------------------------------+-------------------------------------
Reporter: dfeuer | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: | Version: 7.8.3
libraries/base | Keywords:
Resolution: | Operating System: Unknown/Multiple
Differential Revisions: | Type of failure: Runtime
Architecture: | performance bug
Unknown/Multiple | Test Case:
Difficulty: Unknown | Blocking:
Blocked By: |
Related Tickets: |
-------------------------------------+-------------------------------------
Comment (by nomeata):
Ok, let’s do this systematically.
My benchmarks: One possibly fusing invocation of `last`:
{{{
main = print $ Last.last $ filter odd $ [1::Int ..100000000]
}}}
and one non-fusing
{{{
f = id
{-# NOINLINE f #-}
main = print $ Last.last $ f $ filter odd $ [1::Int ..100000000]
}}}
I am comparing the existing implementation of `last`, which is
{{{
last [] = errorEmptyList "last"
last (x:xs) = last' x xs
where last' y [] = y
last' _ (y:ys) = last' y ys
}}}
with the simpler
{{{
last = foldl (\_ x -> x) (errorEmptyList "last")
}}}
Just for fun (and because GHC HEAD still compiles), here the numbers with
GHC-7.6:
{{{
LastTestFusing.hs
<<ghc: 3200051208 bytes, 6104 GCs, 36460/44416 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.29 MUT (1.29 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
LastTestNonFusing.hs
<<ghc: 3200051224 bytes, 6104 GCs, 36460/44416 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.31 MUT (1.31 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
NewLastTestFusing.hs
<<ghc: 3200051224 bytes, 6104 GCs, 36460/44416 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.34 MUT (1.34 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
NewLastTestNonFusing.hs
<<ghc: 3200051240 bytes, 6104 GCs, 36460/44416 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.31 MUT (1.31 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
}}}
No significant difference in either of these.
Now the interesting numbers are those with ghc HEAD, and for that I have
to wait for the compilation to finish...
So here they are:
{{{
LastTestFusing.hs
<<ghc: 3200050800 bytes, 6104 GCs, 36372/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.27 MUT (1.27 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
LastTestNonFusing.hs
<<ghc: 3200050816 bytes, 6104 GCs, 36372/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.26 MUT (1.26 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
NewLastTestFusing.hs
<<ghc: 800050800 bytes, 1526 GCs, 36356/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.00 MUT (1.00 elapsed),
0.00 GC (0.00 elapsed) :ghc>>
NewLastTestNonFusing.hs
<<ghc: 3200050832 bytes, 6104 GCs, 36372/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.26 MUT (1.26 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
}}}
We see that the new implementation performs the same in the non-fusing
case, but much better in the fusing case.
Why does it even allocate in the fusing case? Because it needs to box them
for the recursive call:
{{{
Rec {
main_go :: Int# -> Int -> Int
main_go =
\ (x :: Int#) (eta :: Int) ->
case remInt# x 2 of _ {
__DEFAULT ->
case x of wild {
__DEFAULT -> main_go (+# wild 1) (I# wild);
100000000 -> lvl
};
0 ->
case x of wild {
__DEFAULT -> main_go (+# wild 1) eta;
100000000 -> eta
}
}
end Rec }
}}}
Guess that’s unavoidable here.
Also, to make sure that some integer unboxing doesn’t skew the comparison,
here the numbers with `Integer`:
{{{
<<ghc: 5600050800 bytes, 10703 GCs, 36380/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 2.77 MUT (2.77 elapsed),
0.02 GC (0.02 elapsed) :ghc>>
LastTestNonFusing.hs
<<ghc: 5600050816 bytes, 10703 GCs, 36380/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 2.77 MUT (2.77 elapsed),
0.02 GC (0.02 elapsed) :ghc>>
NewLastTestFusing.hs
<<ghc: 3200050816 bytes, 6104 GCs, 36380/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 2.45 MUT (2.45 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
NewLastTestNonFusing.hs
<<ghc: 5600050848 bytes, 10703 GCs, 36388/44312 avg/max bytes residency (2
samples), 1M in use, 0.00 INIT (0.00 elapsed), 2.93 MUT (2.93 elapsed),
0.02 GC (0.02 elapsed) :ghc>>
}}}
This only confirms the previous findings.
TL;DR: Implementing `last` with `foldl` has no negative consequences, but
works nicely when fused. I don’t expect that to happen often, but it’s
still nice to have, and actually simplifies the existing library
implementation. If noone complains, I’ll push this soon.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9339#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list