Is scanl' safe for foldr/build fusion?
David Feuer
david.feuer at gmail.com
Thu Jul 24 14:57:08 UTC 2014
Ah, never mind that last remark. If it allocates less, then even that tiny
slowdown applies only to benchmarks.
On Jul 24, 2014 10:41 AM, "David Feuer" <david.feuer at gmail.com> wrote:
> Your approach points the way very clearly toward the proper inductive
> argument as well, so I will look into that when I get home from the
> dentist. Assuming all goes as planned, you seem to have a winner. One thing
> that caught my eye in your benchmark report: it looked to me like initsQ2
> was very slightly slower than initsQ on the NF tests. Do you think you
> could do some bigger tests there and see if it's real? If so, my *guess* is
> that the scanl' is wasting time forcing things that have already been
> forced, which we probably can't do anything about but should bear in mind.
>
> David Feuer
> On Jul 24, 2014 10:02 AM, "Joachim Breitner" <mail at joachim-breitner.de>
> wrote:
>
>> Hi David,
>>
>> Am Mittwoch, den 23.07.2014, 16:17 -0400 schrieb David Feuer:
>> > In a comment on trac #9345, nomeata proposed the implementation of
>> > Data.List.inits below. He indicates that the performance is very good,
>> > and wrote it in this fashion so it could fuse. My concern is that
>> > since seq is used in the argument to build, we need to make sure that
>> > fusion on the left will be safe. If I did the calculations right, this
>> > specifically requires that for all f, cons, nil, and bs, and all a≠_|
>> > _, the following "correct" expression
>>
>> > e1 = a `cons`
>> > foldr cons nil $
>> > foldr (\b g x -> (let b' = f x b in b' `seq` (b' : g b')))
>> > (\b -> b `seq` [])
>> > bs
>> > a
>> >
>> > gives the same result as the fused expression
>> >
>> > e2 = a `cons`
>> > foldr (\b g x -> (let b' = f x b in b' `seq` (b' `cons` g b')))
>> > (\b -> b `seq` nil)
>> > bs
>> > a
>> >
>> > I haven't been able to figure out how to prove this or give a
>> > counterexample, so far, but I have very little experience with such.
>> >
>> > myScanl' :: (b -> a -> b) -> b -> [a] -> [b]
>> > myScanl' f a bs = build $ \c n ->
>> > a `seq` a `c`
>> > foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>> > (\b -> b `seq` n)
>> > bs
>> > a
>>
>> I don’t have much experience either, but let’s try.
>>
>> First of all, let’s simplify it a bit. It is unlikely that the problem
>> will appear for lists of varying length. Since all finite lists are
>> one-element-lists¹, let us substitute bs = [b]:
>>
>> myScanl' f a [b]
>> ==
>> build $ \c n ->
>> a `seq` a `c`
>> foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>> (\b -> b `seq` n)
>> (b : [])
>> a
>> == simplifying foldr
>> build $ \c n ->
>> a `seq` a `c`
>> (\ x -> let b' = f x b
>> in b' `seq` (b' `c` (b' `seq` n)))
>> a
>> == more β-reduction, redundant seq
>> build $ \c n ->
>> a `seq` a `c`
>> let b' = f a b
>> in b' `seq` (b' `c` n)
>>
>>
>> Now let’s calculate the expression
>> foldr c n $ myScanl' f a [b]
>> without and with fusion.
>>
>> before_fusion
>> ==
>> foldr c n $ myScanl' f a [b]
>> ==
>> foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : []))
>> == foldr is strict in its third argument, so it commutes with seq
>> a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : []))
>> == foldr on cons
>> a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : []))
>> == foldr commutes with let and seq
>> a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $ b' : []))
>> == foldr on cons and []
>> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))
>>
>>
>> after_fusion
>> ==
>> (\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` n)) c n
>> ==
>> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` n))
>>
>> So they yield the same results, independent of c and n. This includes
>> the counter example from “Free Theorems in the Presence of seq”, (c =
>> seq, n = []).
>>
>> Judging from this I conclude that nothing can go wrong for finite lists.
>> What about partial lists, i.e. bs = b : ⊥?
>>
>> myScanl' f a (b : ⊥)
>> ==
>> build $ \c n ->
>> a `seq` a `c`
>> foldr (\b g x -> (let b' = f x b in b' `seq` (b' `c` g b')))
>> (\b -> b `seq` n)
>> (b : ⊥)
>> a
>> == simplifying foldr
>> build $ \c n ->
>> a `seq` a `c`
>> (\ x -> let b' = f x b
>> in b' `seq` (b' `c` (b' `seq` ⊥)))
>> a
>> == more β-reduction, redundant seq
>> build $ \c n ->
>> a `seq` a `c`
>> let b' = f a b
>> in b' `seq` (b' `c` ⊥)
>>
>>
>> before_fusion
>> ==
>> foldr c n $ myScanl' f a (b:⊥)
>> ==
>> foldr c n $ a `seq` (a : let b' = f a b in b' `seq` (b' : ⊥))
>> == foldr is strict in its third argument, so it commutes with seq
>> a `seq` foldr c n $ (a : let b' = f a b in b' `seq` (b' : ⊥))
>> == foldr on cons
>> a `seq` a `c` (foldr c n $ let b' = f a b in b' `seq` (b' : ⊥))
>> == foldr commutes with let and seq
>> a `seq` a `c` (let b' = f a b in b' `seq` (foldr c n $ b' : ⊥))
>> == foldr on cons and []
>> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))
>>
>>
>> after_fusion
>> ==
>> (\c n -> a `seq` a `c` let b' = f a b in b' `seq` (b' `c` ⊥)) c n
>> ==
>> a `seq` a `c` (let b' = f a b in b' `seq` (b' `c` ⊥))
>>
>>
>> So nothing fancy happening here either. I conclude that we are safe, and
>> my gut feeling tell me the reason is likely that we don’t `seq` anything
>> built from the unknown c and n.
>>
>> Greetings,
>> Joachim
>>
>>
>> ¹ citation needed
>>
>> --
>> Joachim “nomeata” Breitner
>> mail at joachim-breitner.de • http://www.joachim-breitner.de/
>> Jabber: nomeata at joachim-breitner.de • GPG-Key: 0xF0FBF51F
>> Debian Developer: nomeata at debian.org
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://www.haskell.org/mailman/listinfo/libraries
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140724/6d77555e/attachment-0001.html>
More information about the Libraries
mailing list