[Haskell-cafe] Lists concatenation being O(n)

Daniel Fischer daniel.is.fischer at googlemail.com
Fri Oct 14 17:47:11 CEST 2011


On Friday 14 October 2011, 16:55:14, Bas van Dijk wrote:
> On 13 October 2011 20:53, Albert Y. C. Lai <trebla at vex.net> wrote:
> > The number of new cons cells created in due course is Θ(length xs).
> 
> I was actually surprised by this because I expected: length(xs++ys) to
> fuse into one efficient loop which doesn't create cons cells at all.
> 
> Unfortunately, I was mistaken since length is defined recursively.
> 
> length :: [a] -> Int
> length l =  len l 0#
>   where
>     len :: [a] -> Int# -> Int
>     len []     a# = I# a#
>     len (_:xs) a# = len xs (a# +# 1#)
> 
> However, if we would define it as:
> 
> length = foldl' (l _ -> l+1) 0
> 
> And implemented foldl' using foldr as described here:
> 
> http://www.haskell.org/pipermail/libraries/2011-October/016895.html
> 
> then fuse = length(xs++ys) where for example xs = replicate 1000000 1
> and ys = replicate 5000 (1::Int) would compile to the following
> totally fused core:
> 
> fuse :: Int
> fuse = case $wxs 1000000 0 of ww_srS {
>          __DEFAULT -> I# ww_srS
>        }
> 
> $wxs :: Int# -> Int# -> Int#
> $wxs = \ (w_srL :: Int#) (ww_srO :: Int#) ->
>     case <=# w_srL 1 of _ {
>       False -> $wxs (-# w_srL 1) (+# ww_srO 1);
>       True  -> $wxs1_rs8 5000 (+# ww_srO 1)
>     }
> 
> $wxs1_rs8 :: Int# -> Int# -> Int#
> $wxs1_rs8 =
>   \ (w_srA :: Int#) (ww_srD :: Int#) ->
>     case <=# w_srA 1 of _ {
>       False -> $wxs1_rs8 (-# w_srA 1) (+# ww_srD 1);
>       True  -> +# ww_srD 1
>     }

Yes, that's wonderful, but it's not so wonderful for types more complicated 
than Int.

Integer is evil enough: With 

fuse = length ([1 .. 50000000] ++ [0 .. 60000000])

Prelude Fuse> fuse
Heap exhausted;
Current maximum heap size is 1258291200 bytes (1200 MB);
use `+RTS -M<size>' to increase it.

The current length has no problems:

Prelude> length ([1 .. 50000000] ++ [0 .. 60000000])
110000001
(2.55 secs, 11609850632 bytes)

Before foldl('), sum, length can be implemented in terms of foldr to get 
fusion, a lot has to be done still.
Currently you'd get an improvement in some cases for a catastrophic 
behaviour in many others.

Cheers,
Daniel



More information about the Haskell-Cafe mailing list