Is scanl' safe for foldr/build fusion?

Thu Jul 24 14:01:51 UTC 2014

Hi David,

Am Mittwoch, den 23.07.2014, 16:17 -0400 schrieb David Feuer:
> In a comment on trac #9345, nomeata proposed the implementation of
> Data.List.inits below. He indicates that the performance is very good,
> and wrote it in this fashion so it could fuse. My concern is that
> since seq is used in the argument to build, we need to make sure that
> fusion on the left will be safe. If I did the calculations right, this
> specifically requires that for all f, cons, nil, and bs, and all a≠_|
> _, the following "correct" expression

> e1 = a `cons`
>      foldr cons nil $
>        foldr (\b g x -> (let b' = f x b in b' `seq` (b' : g b')))
>           (\b -> b `seq` [])
>           bs
>           a
> 
> gives the same result as the fused expression
> 
> e2 = a `cons`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `cons` g b')))
>            (\b -> b `seq` nil)
>            bs
>            a
> 
> I haven't been able to figure out how to prove this or give a
> counterexample, so far, but I have very little experience with such.
> 
> myScanl' :: (b -> a -> b) -> b -> [a] -> [b]
> myScanl' f a bs = build $ \c n ->
>      a `seq` a `c`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>            (\b -> b `seq` n)
>            bs
>            a

I don’t have much experience either, but let’s try.

First of all, let’s simplify it a bit. It is unlikely that the problem
will appear for lists of varying length. Since all finite lists are
one-element-lists¹, let us substitute bs = [b]:

myScanl' f a [b]
==
build $ \c n -> 
     a `seq` a `c`
     foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
           (\b -> b `seq` n)
           (b : [])
           a
== simplifying foldr
build $ \c n -> 
     a `seq` a `c`
     (\ x -> let b' = f x b
             in b' `seq` (b' `c` (b' `seq` n)))
     a
== more β-reduction, redundant seq
build $ \c n -> 
     a `seq` a `c`
     let b' = f a b
     in b' `seq` (b' `c` n)

Now let’s calculate the expression
        foldr c n $ myScanl' f a [b]
without and with fusion.

before_fusion
==
foldr c n $ myScanl' f a [b]
==
foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : []))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : []))
== foldr on cons
a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : []))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : []))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))

after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` n)) c n 
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))

So they yield the same results, independent of c and n. This includes
the counter example from “Free Theorems in the Presence of seq”, (c =
seq, n = []).

Judging from this I conclude that nothing can go wrong for finite lists.
What about partial lists, i.e. bs = b : ⊥?

myScanl' f a (b : ⊥)
==
build $ \c n -> 
     a `seq` a `c`
     foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
           (\b -> b `seq` n)
           (b : ⊥)
           a
== simplifying foldr
build $ \c n -> 
     a `seq` a `c`
     (\ x -> let b' = f x b
             in b' `seq` (b' `c` (b' `seq` ⊥)))
     a
== more β-reduction, redundant seq
build $ \c n -> 
     a `seq` a `c`
     let b' = f a b
     in b' `seq` (b' `c` ⊥)

before_fusion
==
foldr c n $ myScanl' f a (b:⊥)
==
foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr is strict in its third argument, so it commutes with seq
a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : ⊥))
== foldr on cons
a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : ⊥))
== foldr commutes with let and seq
a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : ⊥))
== foldr on cons and []
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))

after_fusion
==
(\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` ⊥)) c n 
==
a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))

So nothing fancy happening here either. I conclude that we are safe, and
my gut feeling tell me the reason is likely that we don’t `seq` anything
built from the unknown c and n.

Greetings,
Joachim

¹ citation needed

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140724/1a648ad5/attachment.sig>