Formal proposal: replace Data.List.inits

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Tue Aug 19 16:46:54 UTC 2014


David Feuer wrote:
> The only question that's not quite a no-brainer is whether the situations
> that slow down the initsR implementation will ever occur in practice.
> I'm not sure of the answer.

Do such cases exist? For reference, we're discussing

  initsR = map reverse . scanl (flip (:)) []
  initsHO = map ($ []) . scanl (\f x -> f . (x:)) id

initsR is slow if for some reason, only the first couple of elements
of the result lists are used in a computation, as happens in

  foldl' (+) 0 . map head . tail . inits

In that case, initsR shows quadratic behavior. However, that is also
true for initsHO, because of the way that the calls to (.) are nested.

For better laziness, one can base 'inits' on 'take':

  initsT xs = [] : zipWith (\l _ -> take l xs) [1..] xs

which with a bit of tuning comes surprisingly close to initsR [*]:

  initsT' xs = [] : go (1 :: Int) xs where
    go !l (_:ls) = take' l xs : go (l+1) ls
    go _  []     = []
    take' 0 _      = []
    take' n (x:xs) = x : take' (n-1) xs

I find it surprising because the function has to allocate a thunk for
every call of take', whereas the 'reverse' calls of initsR can directly
evaluate the (:) constructors as they construct their results.

In any case my preference is for 'initsR', perhaps with a hint in the
documentation that the zipWith/take variant may be faster in some cases.

Cheers,

Bertram

[*] some ad-hoc timings
(sum' = foldl' (+) 0, all inits* functions compiled with -O2):

- using only first elements of result lists:

> sum' $ map head $ tail $ inits [1..10000]
10000
(5.38 secs, 5905331504 bytes)
> sum' $ map head $ tail $ initsR [1..10000]
10000
(0.55 secs, 1203045184 bytes)
> sum' $ map head $ tail $ initsHO [1..10000]
10000
(1.11 secs, 1205462216 bytes)
> sum' $ map head $ tail $ initsT [1..10000]
10000
(0.01 secs, 7226208 bytes)
> sum' $ map head $ tail $ initsT' [1..10000]
10000
(0.01 secs, 8119224 bytes)

- using whole result:

> sum' $ map sum' $ tail $ inits [1..10000]
166716670000
(7.79 secs, 7900276296 bytes)
> sum' $ map sum' $ tail $ initsR [1..10000]
166716670000
(1.35 secs, 2006560272 bytes)
> sum' $ map sum' $ tail $ initsHO [1..10000]
166716670000
(1.93 secs, 2003170216 bytes)
> sum' $ map sum' $ tail $ initsT [1..10000]
166716670000
(2.16 secs, 3603697344 bytes)
> sum' $ map sum' $ tail $ initsT' [1..10000]
166716670000
(1.61 secs, 3603897320 bytes)


More information about the Libraries mailing list