Call Arity, oneShot or both

Johan Tibell johan.tibell at
Sat Oct 25 16:57:00 UTC 2014

Thanks for the interesting analysis Joachim. I wouldn't trust nofib's
runtime numbers (but the allocation ones should be good). The tests don't
run long enough and the way we don't handle typical noise issues (e.g.
clock resolution, benchmark length, etc) well enough. I'd love to see some
numbers using Criterion instead.

It might also be worth to check for which programs the Core changed.

On Sat, Oct 25, 2014 at 9:24 AM, Joachim Breitner <mail at>

> Hi,
> some months ago I tried to make foldl a good consumer in the common
> case. The starting point is always to write
>         foldl k a xs = foldr (\v f a -> f (v `k` a)) id xs a
> and then somehow make GHC produce good code with this. I came up with
> two solutions: A more sophisticated compiler analysis (Call Arity), or
> an explicit annotation in the form of
>         foldlB k a xs = foldr (\v f -> oneShot (\a -> f (v `k` a))) id
>         xs a
> where oneShot :: (a -> b) -> (a -> b) is a built-in function,
> semantically the identity, but telling the compiler that it can assume
> that the (\a -> ...) is called at most once.
> Back then, we decided to use Call Arity, on the grounds that it might
> improve other code as well, despite not having a lot of evindence that
> this may happen.
> Then recently David Feuer built on my work by making more functions
> fuse, including functions like scanl that are not built on foldl, but
> benefit from the same analysis. This supports the usefulness of Call
> Arity.
> But he also found cases where Call Arity is not powerful enough and GHC
> would produce bad code. So I wanted to properly compare Call Arity with
> the oneShot approach.
> Based on todays master (0855b249), I disabled Call Arity and changed the
> definitions of foldl, foldl', scanl and scanl' to use oneShot, and ran
> nofib.
> The result are mixed. With the current code, the oneShot machinery does
> not always work as expected:
>         Program           Size    Allocs   Runtime   Elapsed  TotalMem
>             Min          -0.1%     -1.5%     -2.8%     -2.8%     -5.8%
>             Max          +0.4%     +4.7%     +5.8%     +5.6%     +5.4%
>  Geometric Mean          -0.0%     +0.1%     +0.3%     +0.3%     +0.1%
> The biggest loser is calendar, which uses scanl. I am not fully sure
> what went wrong here: Either the one-shot annotation on the lambda’s
> variable got lost somewhere in the pipeline, or despite it being there,
> the normal arity analysis did not use it.
> But there is also a winner, fft2, with -1.5% allocations. Here Call
> Arity was not good enough, but oneShot did the jobs.
> There is also the option of combining both. Then we do not get the
> regression, but still the improvement for fft2:
>             Min          -0.1%     -1.5%     -3.9%     -3.8%     -5.8%
>             Max          +0.2%     +0.1%     +6.4%     +6.3%    +13.1%
>  Geometric Mean          -0.0%     -0.0%     +0.0%     +0.0%     +0.1%
> The oneShot code is on the branch wip/oneShot. The changes are clearly
> not ready to be merged. In particular, there is the question of how to
> best keep the oneShot annotation in the unfoldings: The isOneShotLambda
> flag is currently not stored in the interface. I work around this by
> making sure that the oneShot function is never inlined in unfoldings,
> but maybe it would be better to serialize the isOneShotLambda flag in
> interfaces, which might have other good effects?
> If we want as much performance as possible, we should simply include
> both approaches. But there might be other things to consider... so not
> sure what the best thing to do is.
> Greetings,
> Joachim
> --
> Joachim “nomeata” Breitner
>   mail at joachim-breitner.de
>   Jabber: nomeata at  • GPG-Key: 0xF0FBF51F
>   Debian Developer: nomeata at
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the ghc-devs mailing list