[GHC] #9345: Data.List.inits is extremely slow

GHC ghc-devs at haskell.org
Mon Sep 1 20:22:15 UTC 2014


#9345: Data.List.inits is extremely slow
-------------------------------------+-------------------------------------
              Reporter:  dfeuer      |            Owner:
                  Type:  bug         |           Status:  new
              Priority:  high        |        Milestone:  7.8.4
             Component:              |          Version:  7.8.3
  libraries/base                     |         Keywords:
            Resolution:              |     Architecture:  Unknown/Multiple
      Operating System:              |       Difficulty:  Easy (less than 1
  Unknown/Multiple                   |  hour)
       Type of failure:  Runtime     |       Blocked By:
  performance bug                    |  Related Tickets:
             Test Case:              |
              Blocking:              |
Differential Revisions:              |
-------------------------------------+-------------------------------------

Comment (by dfeuer):

 Just to keep us on the same page, I rewrote `scanl'` like this so it can
 be written back to a simple form.

 {{{#!hs
 {-# NOINLINE [1] scanl' #-}
 scanl'           :: (b -> a -> b) -> b -> [a] -> [b]
 scanl' f q ls    = q `seq` q : (case ls of
                                 []   -> []
                                 x:xs -> scanl' f (f q x) xs)

 {-# RULES
 "scanl'"  [~1] forall f a bs . scanl' f a bs =
   build (\c n -> a `c` foldr (scanlFB' f c) (flipSeqScanl' n) bs a)
 "scanl'List" [1] forall f a bs . foldr (scanlFB' f (:)) (flipSeqScanl' [])
 bs a =
   tailScanl' f a bs
  #-}

 {-# INLINE [0] scanlFB' #-}
 scanlFB' f c = \b g x -> let b' = f x b in b' `seq` b' `c` g b'

 {-# INLINE [0] flipSeqScanl' #-}
 flipSeqScanl' = flip seq

 {-# NOINLINE [1] tailScanl' #-}
 tailScanl' f a bs = a `seq` foldr (scanlFB' f (:)) (flip seq []) bs a
 }}}

 This doesn't seem to affect the problem, however. I am curious if there is
 a way to manually force whatever's getting the wrong arity to get the
 right one. Bertram pointed out that fusing `inits` on the right with an
 expensive producer is ''really'' bad, and I realized that fusing it on the
 right with something that leads to lots of reboxing is also really bad. So
 for now, probably the best thing is just to NOINLINE inits altogether.

 Unrelatedly: I was successful in my quest to write rules to make `concat`
 and `inits` fuse into something that's actually good (in case anyone
 actually uses such a thing). Interestingly the arity/strictness analysis
 only works on it if it's inlined (maybe you can figure out a fix for
 that?):

 {{{#!hs
 -- We don't particularly like the idea of inlining this function, since
 it's
 -- a bit large, but it appears to be necessary to get the analyses to work
 -- right and avoid eating all available memory. Making this INLINE [0] to
 -- be able to rewrite it to a simpler form unfortunately seems to make it
 miss some
 -- analysis that speeds it up by a relatively small but still significant
 amount.
 -- There may be some way around this problem, but I haven't found it yet.
 {-# INLINE concatInitsFB #-}
 concatInitsFB xs cons nil =
   let
     go _ _ []     = nil
     go n k (y:ys)
       | n == k    = go (n+1) 0 xs
       | otherwise = y `cons` go n (k+1) ys
   in go (1::Int) 0 xs

 {-# RULES
 -- We might catch concat before it's rewritten.
 "concatInits1" forall xs . concat (inits xs) = build (concatInitsFB xs)
 -- If not, we might be able to recognize its rewritten form.
 "concatInits2" forall c n xs .
   foldr (\x y -> foldr c y x) n (inits xs) = concatInitsFB xs c n
  #-}
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9345#comment:25>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list