[Haskell-cafe] Performance best practices

Thu Aug 1 09:24:45 UTC 2019

I just realized seqTails and seqInits are unnecessary, seqList is enough.

On 01-08-2019 11:04, Jaro Reinders wrote:
> Replying to myself, you can actually write an evaluation function that
> forces all values in the result of tails and inits in linear time:
> 
>     -- force the result of tails in linear time
>     seqTails (x:xs) = x `deepseq` seqList xs
>     seqTails [] = ()
> 
>     -- force all the values in a list to whnf in linear time
>     -- https://wiki.haskell.org/Weak_head_normal_form
>     seqList (x:xs) = x `seq` seqList xs
>     seqList [] = ()
> 
>     -- force the result of inits in linear time
>     seqInits xs = last xs `deepseq` seqList xs
> 
> Try it in ghci with :sprint
> (https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/ghci.html#ghci-cmd-:sprint):
> 
>     > let x = tails [1..3::Int]
>     > :sprint x
>     x = _
>     > seqTails x
>     ()
>     > :sprint x
>     x = [[1,2,3],[2,3],[3],[]]
> 
>     > let y = inits [1..3::Int]
>     > :sprint y
>     y = _
>     > seqInits y
>     ()
>     > :sprint y
>     y = [[],[1],[1,2],[1,2,3]]
> 
> Using criterion you can see that it is actually linear time:
> 
>     main = defaultMain
>       [ bgroup "inits"
>         [ bench "1000" $ whnf (seqInits . inits) [1..1000 :: Int]
>         , bench "10000" $ whnf (seqInits . inits) [1..10000 :: Int]
>         , bench "100000" $ whnf (seqInits . inits) [1..100000 :: Int]
>         , bench "1000000" $ whnf (seqInits . inits) [1..1000000 :: Int]
>         ]
>       , bgroup "tails"
>         [ bench "1000" $ whnf (seqTails . tails) [1..1000 :: Int]
>         , bench "10000" $ whnf (seqTails . tails) [1..10000 :: Int]
>         , bench "100000" $ whnf (seqTails . tails) [1..100000 :: Int]
>         , bench "1000000" $ whnf (seqTails . tails) [1..1000000 :: Int]
>         ]
>       ]
> 
> 
>     benchmarking inits/1000
>     time                 204.2 μs   (203.2 μs .. 205.4 μs)
>                          1.000 R²   (1.000 R² .. 1.000 R²)
>     mean                 203.4 μs   (202.8 μs .. 204.1 μs)
>     std dev              2.163 μs   (1.755 μs .. 2.664 μs)
> 
>     benchmarking inits/10000
>     time                 3.127 ms   (3.107 ms .. 3.148 ms)
>                          1.000 R²   (0.999 R² .. 1.000 R²)
>     mean                 3.105 ms   (3.088 ms .. 3.118 ms)
>     std dev              45.73 μs   (32.97 μs .. 69.14 μs)
> 
>     benchmarking inits/100000
>     time                 41.05 ms   (39.11 ms .. 42.87 ms)
>                          0.993 R²   (0.988 R² .. 0.998 R²)
>     mean                 41.52 ms   (40.62 ms .. 42.46 ms)
>     std dev              1.912 ms   (1.330 ms .. 2.930 ms)
>     variance introduced by outliers: 12% (moderately inflated)
> 
>     benchmarking inits/1000000
>     time                 423.0 ms   (318.2 ms .. 535.5 ms)
>                          0.991 R²   (0.969 R² .. 1.000 R²)
>     mean                 459.1 ms   (428.8 ms .. 505.2 ms)
>     std dev              44.05 ms   (10.06 ms .. 58.49 ms)
>     variance introduced by outliers: 22% (moderately inflated)
> 
> 
> 
>     benchmarking tails/1000
>     time                 8.811 μs   (8.768 μs .. 8.873 μs)
>                          1.000 R²   (0.999 R² .. 1.000 R²)
>     mean                 8.874 μs   (8.819 μs .. 8.963 μs)
>     std dev              225.7 ns   (168.4 ns .. 325.2 ns)
>     variance introduced by outliers: 28% (moderately inflated)
> 
>     benchmarking tails/10000
>     time                 87.21 μs   (86.85 μs .. 87.79 μs)
>                          1.000 R²   (0.999 R² .. 1.000 R²)
>     mean                 87.42 μs   (87.01 μs .. 87.88 μs)
>     std dev              1.481 μs   (1.132 μs .. 1.953 μs)
>     variance introduced by outliers: 11% (moderately inflated)
> 
>     benchmarking tails/100000
>     time                 886.9 μs   (882.9 μs .. 890.9 μs)
>                          1.000 R²   (1.000 R² .. 1.000 R²)
>     mean                 881.5 μs   (878.1 μs .. 885.7 μs)
>     std dev              12.40 μs   (9.598 μs .. 18.97 μs)
> 
>     benchmarking tails/1000000
>     time                 9.796 ms   (9.757 ms .. 9.840 ms)
>                          1.000 R²   (1.000 R² .. 1.000 R²)
>     mean                 9.817 ms   (9.791 ms .. 9.845 ms)
>     std dev              78.47 μs   (60.63 μs .. 99.31 μs)
> 
> 
> On 01-08-2019 10:25, Jaro Reinders wrote:
>> If you fully evaluate the list produced by tails, then you're still
>> spending O(n^2) time, because that is just the size of the produced
>> list. But constructing the list and the memory taken by the list is
>> O(n), because most of the lists are shared
>> (https://wiki.haskell.org/Sharing).
>>
>> On 01-08-2019 04:45, Todd Wilson wrote:
>>> It seems that, asymptotically, tails is O(n) while inits is O(n^2) in
>>> both time and space (when fully evaluated)