# Is scanl' safe for foldr/build fusion?

Joachim Breitner mail at joachim-breitner.de
Thu Jul 24 14:01:51 UTC 2014

```Hi David,

Am Mittwoch, den 23.07.2014, 16:17 -0400 schrieb David Feuer:
> In a comment on trac #9345, nomeata proposed the implementation of
> Data.List.inits below. He indicates that the performance is very good,
> and wrote it in this fashion so it could fuse. My concern is that
> since seq is used in the argument to build, we need to make sure that
> fusion on the left will be safe. If I did the calculations right, this
> specifically requires that for all f, cons, nil, and bs, and all a≠_|
> _, the following "correct" expression

> e1 = a `cons`
>      foldr cons nil \$
>        foldr (\b g x -> (let b' = f x b in b' `seq` (b' : g b')))
>           (\b -> b `seq` [])
>           bs
>           a
>
> gives the same result as the fused expression
>
> e2 = a `cons`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `cons` g b')))
>            (\b -> b `seq` nil)
>            bs
>            a
>
> I haven't been able to figure out how to prove this or give a
> counterexample, so far, but I have very little experience with such.
>
> myScanl' :: (b -> a -> b) -> b -> [a] -> [b]
> myScanl' f a bs = build \$ \c n ->
>      a `seq` a `c`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>            (\b -> b `seq` n)
>            bs
>            a

I don’t have much experience either, but let’s try.

First of all, let’s simplify it a bit. It is unlikely that the problem
will appear for lists of varying length. Since all finite lists are
one-element-lists¹, let us substitute bs = [b]:

myScanl' f a [b]
==
build \$ \c n ->
a `seq` a `c`
foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
(\b -> b `seq` n)
(b : [])
a
== simplifying foldr
build \$ \c n ->
a `seq` a `c`
(\ x -> let b' = f x b
in b' `seq` (b' `c` (b' `seq` n)))
a
== more β-reduction, redundant seq
build \$ \c n ->
a `seq` a `c`
let b' = f a b
in b' `seq` (b' `c` n)

Now let’s calculate the expression
foldr c n \$ myScanl' f a [b]
without and with fusion.

before_fusion
==
foldr c n \$ myScanl' f a [b]
==
foldr c n \$ a `seq` (a : let b' = f a b in b' `seq` (b' : []))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n \$ (a : let b' = f a b in b' `seq` (b' : []))
== foldr on cons
a `seq` a `c` (foldr c n \$ let b' = f a b in b' `seq` (b' : []))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n \$  b' : []))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))

after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` n)) c n
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))

So they yield the same results, independent of c and n. This includes
the counter example from “Free Theorems in the Presence of seq”, (c =
seq, n = []).

Judging from this I conclude that nothing can go wrong for finite lists.
What about partial lists, i.e. bs = b : ⊥?

myScanl' f a (b : ⊥)
==
build \$ \c n ->
a `seq` a `c`
foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
(\b -> b `seq` n)
(b : ⊥)
a
== simplifying foldr
build \$ \c n ->
a `seq` a `c`
(\ x -> let b' = f x b
in b' `seq` (b' `c` (b' `seq` ⊥)))
a
== more β-reduction, redundant seq
build \$ \c n ->
a `seq` a `c`
let b' = f a b
in b' `seq` (b' `c` ⊥)

before_fusion
==
foldr c n \$ myScanl' f a (b:⊥)
==
foldr c n \$ a `seq` (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n \$ (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr on cons
a `seq` a `c` (foldr c n \$ let b' = f a b in b' `seq` (b' : ⊥))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n \$  b' : ⊥))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))

after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` ⊥)) c n
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))

So nothing fancy happening here either. I conclude that we are safe, and
my gut feeling tell me the reason is likely that we don’t `seq` anything
built from the unknown c and n.

Greetings,
Joachim

¹ citation needed

--
Joachim “nomeata” Breitner
mail at joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part