Is scanl' safe for foldr/build fusion?

Thu Jul 24 14:41:35 UTC 2014

Your approach points the way very clearly toward the proper inductive
argument as well, so I will look into that when I get home from the
dentist. Assuming all goes as planned, you seem to have a winner. One thing
that caught my eye in your benchmark report: it looked to me like initsQ2
was very slightly slower than initsQ on the NF tests. Do you think you
could do some bigger tests there and see if it's real? If so, my *guess* is
that the scanl' is wasting time forcing things that have already been
forced, which we probably can't do anything about but should bear in mind.

David Feuer
On Jul 24, 2014 10:02 AM, "Joachim Breitner" <mail at joachim-breitner.de>
wrote:

> Hi David,
>
> Am Mittwoch, den 23.07.2014, 16:17 -0400 schrieb David Feuer:
> > In a comment on trac #9345, nomeata proposed the implementation of
> > Data.List.inits below. He indicates that the performance is very good,
> > and wrote it in this fashion so it could fuse. My concern is that
> > since seq is used in the argument to build, we need to make sure that
> > fusion on the left will be safe. If I did the calculations right, this
> > specifically requires that for all f, cons, nil, and bs, and all a≠_|
> > _, the following "correct" expression
>
> > e1 = a `cons`
> >      foldr cons nil $
> >        foldr (\b g x -> (let b' = f x b in b' `seq` (b' : g b')))
> >           (\b -> b `seq` [])
> >           bs
> >           a
> >
> > gives the same result as the fused expression
> >
> > e2 = a `cons`
> >      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `cons` g b')))
> >            (\b -> b `seq` nil)
> >            bs
> >            a
> >
> > I haven't been able to figure out how to prove this or give a
> > counterexample, so far, but I have very little experience with such.
> >
> > myScanl' :: (b -> a -> b) -> b -> [a] -> [b]
> > myScanl' f a bs = build $ \c n ->
> >      a `seq` a `c`
> >      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
> >            (\b -> b `seq` n)
> >            bs
> >            a
>
> I don’t have much experience either, but let’s try.
>
> First of all, let’s simplify it a bit. It is unlikely that the problem
> will appear for lists of varying length. Since all finite lists are
> one-element-lists¹, let us substitute bs = [b]:
>
> myScanl' f a [b]
> ==
> build $ \c n ->
>      a `seq` a `c`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>            (\b -> b `seq` n)
>            (b : [])
>            a
> == simplifying foldr
> build $ \c n ->
>      a `seq` a `c`
>      (\ x -> let b' = f x b
>              in b' `seq` (b' `c` (b' `seq` n)))
>      a
> == more β-reduction, redundant seq
> build $ \c n ->
>      a `seq` a `c`
>      let b' = f a b
>      in b' `seq` (b' `c` n)
>
>
> Now let’s calculate the expression
>         foldr c n $ myScanl' f a [b]
> without and with fusion.
>
> before_fusion
> ==
> foldr c n $ myScanl' f a [b]
> ==
> foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : []))
> == foldr is strict in its third argument, so it commutes with seq
> a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : []))
> == foldr on cons
> a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : []))
> == foldr commutes with let and seq
> a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : []))
> == foldr on cons and []
> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))
>
>
> after_fusion
> ==
> (\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` n)) c n
> ==
> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))
>
> So they yield the same results, independent of c and n. This includes
> the counter example from “Free Theorems in the Presence of seq”, (c =
> seq, n = []).
>
> Judging from this I conclude that nothing can go wrong for finite lists.
> What about partial lists, i.e. bs = b : ⊥?
>
> myScanl' f a (b : ⊥)
> ==
> build $ \c n ->
>      a `seq` a `c`
>      foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>            (\b -> b `seq` n)
>            (b : ⊥)
>            a
> == simplifying foldr
> build $ \c n ->
>      a `seq` a `c`
>      (\ x -> let b' = f x b
>              in b' `seq` (b' `c` (b' `seq` ⊥)))
>      a
> == more β-reduction, redundant seq
> build $ \c n ->
>      a `seq` a `c`
>      let b' = f a b
>      in b' `seq` (b' `c` ⊥)
>
>
> before_fusion
> ==
> foldr c n $ myScanl' f a (b:⊥)
> ==
> foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : ⊥))
> == foldr is strict in its third argument, so it commutes with seq
> a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : ⊥))
> == foldr on cons
> a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : ⊥))
> == foldr commutes with let and seq
> a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $  b' : ⊥))
> == foldr on cons and []
> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))
>
>
> after_fusion
> ==
> (\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` ⊥)) c n
> ==
> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))
>
>
> So nothing fancy happening here either. I conclude that we are safe, and
> my gut feeling tell me the reason is likely that we don’t `seq` anything
> built from the unknown c and n.
>
> Greetings,
> Joachim
>
>
> ¹ citation needed
>
> --
> Joachim “nomeata” Breitner
>   mail at joachim-breitner.de • http://www.joachim-breitner.de/
>   Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
>   Debian Developer: nomeata at debian.org
>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140724/ff3d4429/attachment.html>