[GHC] #12241: Surprising constructor accumulation
GHC
ghc-devs at haskell.org
Wed Jun 29 01:49:55 UTC 2016
#12241: Surprising constructor accumulation
-------------------------------------+-------------------------------------
Reporter: dfeuer | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime | Version: 7.10.3
System |
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: Runtime
Unknown/Multiple | performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
`containers` version 0.5.7.1 (and a few earlier versions) uses the
following implementation of `fromList` by Ross Paterson:
{{{#!hs
fromList :: [a] -> Seq a
fromList = Seq . mkTree 1 . map_elem
where
{-# SPECIALIZE mkTree :: Int -> [Elem a] -> FingerTree (Elem a) #-}
{-# SPECIALIZE mkTree :: Int -> [Node a] -> FingerTree (Node a) #-}
mkTree :: (Sized a) => Int -> [a] -> FingerTree a
mkTree !_ [] = EmptyT
mkTree _ [x1] = Single x1
mkTree s [x1, x2] = Deep (2*s) (One x1) EmptyT (One x2)
mkTree s [x1, x2, x3] = Deep (3*s) (One x1) EmptyT (Two x2 x3)
mkTree s (x1:x2:x3:x4:xs) = case getNodes (3*s) x4 xs of
(ns, sf) -> case mkTree (3*s) ns of
!m -> Deep (3*size x1 + size m + size sf) (Three x1 x2 x3) m sf
getNodes :: Int -> a -> [a] -> ([Node a], Digit a)
getNodes !_ x1 [] = ([], One x1)
getNodes _ x1 [x2] = ([], Two x1 x2)
getNodes _ x1 [x2, x3] = ([], Three x1 x2 x3)
getNodes s x1 (x2:x3:x4:xs) = (Node3 s x1 x2 x3:ns, d)
where (ns, d) = getNodes s x4 xs
map_elem :: [a] -> [Elem a]
#if __GLASGOW_HASKELL__ >= 708
map_elem xs = coerce xs
#else
map_elem xs = Data.List.map Elem xs
#endif
{-# INLINE map_elem #-}
}}}
This uses one lazy list per "level" in the tree being constructed. I
believe Paterson (and pretty much everyone else) expected that there would
be `O(log n)` pair constructors and conses live at any given time.
Wadler's technique in
[http://homepages.inf.ed.ac.uk/wadler/papers/leak/leak.ps.gz Fixing some
space leaks with a garbage collector], which the GHC commentary indicates
is used in GHC, should clean up the pairs in `getNodes`'s `d` thunks as
they reach WHNF.
Lennart Spitzner dug into the unimpressive performance of the above code
and using
{{{#!hs
main = evaluate $ S.fromList [(0::Int)..999999]
}}}
produced
[http://heap.ezyang.com/view/72d4d1eb879a2085ffd49d270b03c7a037b4d5c2 this
heap profile]. If I'm reading it right, this suggests that there are lots
of `(,)` and also `(:)` constructors live, more `O(n)` than `O(log n)`.
I had previously found that I could improve performance by building the
intermediate lists strictly, but that violates the generational hypothesis
and leads to a slow-down for very large arguments
([http://heap.ezyang.com/view/2e80598f73d9281f5eadfeb041f5b9aef6e448b0
Spitzner's heap profile]). Spitzner was able to come up with a very clever
(but much trickier) implementation that skirted all these problems
([http://heap.ezyang.com/view/7f1994c0417931360fe7c41bd995c37ebc3fd6c5
profile]) and avoids ever allocating the troublesome pairs.
So the problem is thoroughly bypassed for `containers`, but it seems like
something is not quite right here, and it might bear looking into.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12241>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list