Would it be a good or bad idea to make dropWhile try to fuse?

David Feuer david.feuer at gmail.com
Sat Jul 26 16:05:54 UTC 2014


I am not convinced that Criterion is the appropriate tool for most of
this work. I believe we need to look primarily at allocation numbers
when looking at these in isolation, and only look at benchmarks
comparing the performance of realistic programs using them. The
trouble is that allocation and nursery GC are both very fast (normally
that's a good thing), so I believe the primary benefit we hope to get
from fusion is better L1 cache utilization. We will only see that in
benchmarks when enough other computation is going on in the vicinity
of the list pipeline. It may be possible to write benchmarks to
investigate that, but I certainly don't know how. That said, I do
believe your idea of some sort of automated/semi-automated test suite
is a good one. I will try to learn a bit more about GHC profiling and
see if I can come up with at least a first draft of a rough test
suite.

On Sat, Jul 26, 2014 at 7:16 AM, Joachim Breitner
<mail at joachim-breitner.de> wrote:
> Hi David,
>
>
> Am Freitag, den 25.07.2014, 17:32 -0400 schrieb David Feuer:
>> I think I'm going to leave that one alone for now and focus on the
>> ones that are more obviously good ideas: scanr, takeWhile, and (now
>> that we have appropriate optimizations) scanl and (hopefully, if no
>> one objects) scanl', as well as the semi-related inits. I'd also like
>> to revisit length, which is currently a horrible mess of rules, and
>> see if the new foldl' allows it to be written efficiently without so
>> much trouble.
>
> thanks for that, by the way. I guess people generally expect more
> functions to be fusable than they are at the moment.
>
> I would suggest to attack the problem more systemematically. Can you
> create a comprehensive benchmark for list functions? Something along the
> lines of
>  * for every list function, create one or a few micro benchmarks.
>  * put them into one suite, using criterion to measure them. Make it
>    possible to easily test different implementations (e.g.
>    have lists `lengthImpls = [("Foo", Foo.length), ... ]` so that
>    testing another implementations amounts to writing Foo.hs and
>    importing that.
>  * additionally, put them into one benchmark not using criterion (but
>    rather executing all of them on a reasonable size setting) so that
>    we can add it to nofib.
>  * the, based on that, we can make founded decisions.
>
> The point of the nofib benchmark is to make sure we don’t regress when
> we change something in the compiler or the libraries. It will notify us
> that something went amiss, and then we can use the criterion-based
> testsuite to figure out what exactly broke.
>
>
> It would also be a benchmark for other fusion techniques to be measured
> against. If it turns out to be useful and if you are in academics you
> can probably talk about the benchmark suite at some workshop and get
> citations from every following list-fusion paper ;-)
>
>
> The microbenchmarks should test both fusable and unfusable uses of the
> functions. You can easily turn one into the other using
>         don'tFuse = id
>         {-# NOINLINE don'tFuse #-}
> somewhere in the pipeline. For functions that both consume and produce,
> we’d want to test all four combinations.
>
> Finally, we need to put an eye on binary sizes: If fusion does not
> happen we want to use a compiled version of the function, and not
> re-create it in every use site. nofib tracks module sizes, so if the
> nofib version of your benchmark is split into modules in a sensible way,
> these module size reports will be worth watching.
>
>
> Is that something you would like to work on, and take the lead? If you
> put your code in a git repo somewhere I’ll review and contribute, of
> course.
>
> Greetings,
> Joachim
>
>
>
> --
> Joachim “nomeata” Breitner
>   mail at joachim-breitner.dehttp://www.joachim-breitner.de/
>   Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
>   Debian Developer: nomeata at debian.org
>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>


More information about the Libraries mailing list