until, laziness, and performance

Sun Mar 27 20:42:55 UTC 2016

Dan, I really appreciate that you responded so quickly and  took the time
to explain this. Unfortunately I still don't understand.

I did think of laziness as a potential source of bad performance but to me
both internal functions were lazy in the second argument. Since I couldn't
explain why both weren't equally slow due to laziness, I dismissed laziness
as a possible source of the problem instead of questioning my understanding
of laziness.

I still don't understand why

as'

i s
     | i >= n    = s
     | otherwise = as' (i + 2) (s + (-1) / i + 1 / (i + 1))

is not lazy in its second arg as it only has to evaluate it if i >= n. This
seems exactly analogous to or, ||, which only needs to evaluate its second
arg if its first arg is False. || is definitely lazy in its second
argument. Can you explain why as' is strict in its second arg, s? Is it due
to the following from,
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Demand ?

For functions, it is assumed, that the result of the function is used
strictly.

I guess  I should have checked my intuition against the compiler strictness
analysis. I modified sum.hs to only have a definition of sumr, commenting
out sumh and tried the various tools.

The user's guide,
https://downloads.haskell.org/~ghc/8.0.1-rc2/docs/html/users_guide/sooner.html#faster-producing-a-program-that-runs-quicker,
recommends:

Look for your function in the interface file, then for the third field in
the pragma; it should say Strictness: ⟨string⟩. The ⟨string⟩ gives the
strictness of the function’s arguments: see the GHC Commentary for a
description of the strictness notation.

It doesn't mention that the interface file is binary and that you need to
use ghc --show-iface, e.g.

ghc --show-iface sum.hi > sum.hir

Doing this seems to give the wrong answer, i.e. the second arg is lazy.

                     $was' :: Double# -> Double# -> Double#
          {- Arity: 2, Strictness: <S,U><*L*,U>, Inline: [0] -}

Is this a bug?

ghc -ddump-stranal -O sum.hs gives the right answer:

as'_s25G [Occ=LoopBreaker] :: Double -> Double -> Double
      [LclId,
       Arity=2,
       CallArity=2,
       Str=DmdType <S(S),1*U(U)><*S*,1*U(U)>m {avM-><S(S),U(U)>},
       Unf=Unf{Src=<vanilla>, TopLvl=False, Value=True, ConLike=True,
       WorkFree=True, Expandable=True, Guidance=IF_ARGS [20 20] 108 0}]

Unfortunately the simplest way to find out doesn't give information on
internal functions. I will file an ER for this:

 ghc -ddump-strsigs -O sum.hs
[1 of 1] Compiling Main             ( sum.hs, sum.o )
==================== Strictness signatures ====================
:Main.main:
Main.$trModule: m
Main.main: x
Main.sumr: <S(S),U(U)>m

I really like your definition of until' as the solution to this. To me it
seems analogous to foldl', in that it provides a way to avoid an
optimization problem that may be subtle to many. I guess it is not general
enough to be added to a library. The following world be general enough but
it would probably be overkill:

until' :: NFData a => (a -> Bool) -> (a -> a) -> a -> a
until' p f = go
  where
    go x | p $!! x   = x
         | otherwise = go (f x)

Thanks again for all your help
George

On Sat, Mar 26, 2016 at 5:40 PM Dan Doel <dan.doel at gmail.com> wrote:

> By the way, in case this helps your mental model, if you modify sumr to be:
>
>     sumr n = snd $ as' 1 0
>       where
>       as' i s
>         | i >= n = (i, s)
>         | otherwise = ...
>
> Then it has the same problem as sumh. Your original as' for sumr is
> strict in s, but this modified one isn't.
>
> This shows another way to fix sumh, too. Create a version of until
> that separates out the part of the state that is only for testing.
> Then the until loop will be strict in the result part of the state,
> and the desired optimizations will happen (in this case):
>
>     until' p step = go
>      where
>      go t r
>        | p t = r
>        | otherwise = uncurry go $ step (t, r)
>
> -- Dan
>
> On Sat, Mar 26, 2016 at 1:50 PM, George Colpitts
> <george.colpitts at gmail.com> wrote:
> > The following higher order function, sumh, seems to be 3 to 14 times
> slower
> > than the equivalent recursive function, sumr:
> >
> > sumh :: Double -> Double
> > sumh n =
> >     snd $ until ((>= n) . fst) as' (1, 0)
> >     where
> >       as' (i,s) =
> >           (i + 2, s + (-1) / i + 1 / (i + 1))
> >
> > sumr :: Double -> Double
> > sumr n =
> >     as' 1 0
> >     where
> >       as' i  s
> >           | i >= n    = s
> >           | otherwise = as' (i + 2) (s + (-1) / i + 1 / (i + 1))
> >
> > This is true in 7.10.3 as well as 8.0.1 so this is not a regression. From
> > the size usage my guess is that this is due to the allocation of tuples
> in
> > sumh. Maybe there is a straightforward way to optimize sumh but I
> couldn't
> > find it. Adding a Strict pragma didn't help nor did use of
> > -funbox-strict-fields -flate-dmd-anal. Have I missed something or should
> I
> > file a bug?
> >
> > Timings from 8.0.1 rc2:
> >
> > ghc --version
> > The Glorious Glasgow Haskell Compilation System, version 8.0.0.20160204
> > bash-3.2$ ghc -O2 -dynamic sum.hs
> > ghc -O2 -dynamic sum.hs
> > [1 of 1] Compiling Main             ( sum.hs, sum.o )
> > Linking sum ...
> > bash-3.2$ ghci
> > Prelude> :load sum
> > Ok, modules loaded: Main.
> > (0.05 secs,)
> > Prelude Main> sumh (10^6)
> > -0.6931466805602525
> > it :: Double
> > (0.14 secs, 40,708,016 bytes)
> > Prelude Main> sumr (10^6)
> > -0.6931466805602525
> > it :: Double
> > (0.01 secs, 92,000 bytes)
> >
> > Thanks
> > George
> >
> > _______________________________________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20160327/6c848335/attachment.html>