Low-level array performance
dan.doel at gmail.com
Wed Jun 18 17:00:22 EDT 2008
On Wednesday 18 June 2008, Daniel Fischer wrote:
> On Tuesday, 17 June 2008 22:37, Dan Doel wrote:
> > I'll attach new, hopefully bug-free versions of the benchmark to this
> > message.
> With -O2 -fvia-C -optc-O3, the difference is small (less than 1%), but
> today, ByteArr is faster more often.
Hmm, well, I'm a bit flummoxed. I still get Addr# outperforming MBA# by
perhaps 10% - 15%, even with -fvia-C -optc-O3 (and before the slight speedup
below). Perhaps gcc's optimizer isn't doing as good a job for me for some
reason.
In any case, I've entered a bug for this on the GHC trac:
It contains a Ptr benchmark that performs slightly faster on very small arrays
(under, say, 40 elements). I noticed that such runs were taking more time than
those with larger arrays and correspondingly fewer iterations, so I eliminated
the replicateM_ in favor of an explicit loop. That gains a little time on the
small arrays, but not enough to match the performance on the larger arrays, so
I guess there are yet more factors. :) In any case, it brings the Ptr code
closer to being the same code as ByteArr.
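To illustrate the kind of change meant by replacing replicateM_ with an
explicit loop, here is a minimal sketch. It is not the code from the benchmark
or the bug report; the names timesReplicate and timesLoop are made up for this
example, and an IORef counter stands in for the real array work.

```haskell
import Control.Monad (replicateM_)
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Hypothetical helper: repeat an action n times via the library combinator.
timesReplicate :: Int -> IO () -> IO ()
timesReplicate = replicateM_

-- Hypothetical helper: repeat an action n times via a hand-written
-- counting loop, the style the benchmark was switched to.
timesLoop :: Int -> IO () -> IO ()
timesLoop n act = go 0
  where
    go i
      | i >= n    = return ()
      | otherwise = act >> go (i + 1)

main :: IO ()
main = do
  -- The IORef counter is only a stand-in for the per-iteration array work.
  r1 <- newIORef (0 :: Int)
  timesReplicate 1000 (modifyIORef' r1 (+ 1))
  r2 <- newIORef (0 :: Int)
  timesLoop 1000 (modifyIORef' r2 (+ 1))
  a <- readIORef r1
  b <- readIORef r2
  -- Both variants should have run the action the same number of times.
  print (a == b)
```

Both helpers run the action the same number of times; any difference in speed
would come from how GHC compiles each form, which is the sort of thing the
benchmark is trying to expose.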
The bug is filed against the native code generator, since it shows up more
clearly there. I haven't gotten to looking at C-- or assembly yet, but
hopefully I will in the near future. I'll try to do further followup on the
bug report, since that's probably easier for the developers to keep track of.