Low-level array performance
Daniel Fischer
daniel.is.fischer at web.de
Tue Jun 17 13:10:45 EDT 2008
On Tuesday, 17 June 2008 18:32, Dan Doel wrote:
> On Tuesday 17 June 2008, Simon Marlow wrote:
> > So I tried your examples and the Addr# version looks slower than the MBA#
> > version:
>
> Hmm...
>
> > I tried with 6.8.2 and 6.8.3, using -O2 in both cases. I tried the Ptr
> > version with and without -fvia-C -optc-O2, no difference.
>
> I had forgotten about the via-c in the pragma when I sent it, but I've
> tested it both via-c and with the new backend (and triple checked since
> your message), and I always come away with the Ptr version being faster.
> -fvia-c doesn't seem to affect the speed of the Addr# version much, while
> it improves the speed of the MBA# version. However, even with the improved
> speed, Addr# seems to edge it out here.
>
> With the new backend, I get the results I sent in my initial mail. The
> ByteArray version takes 11 - 12 seconds to reverse a size 10 array 250
> million times, whereas the Addr# version takes around 7 seconds.
>
I've experimented a bit and found that Ptr is faster for small arrays (only
very slightly so if compiled with -fvia-C -optc-O3), but ByteArr performs
much better for larger arrays:
dafis at linux:~/Documents/haskell/move> ./PtrC +RTS -sstderr -RTS 20 10000000
./PtrC 20 10000000 +RTS -sstderr
Done.
481,596,836 bytes allocated in the heap
257,665,360 bytes copied during GC (scavenged)
171,919,440 bytes copied during GC (not scavenged)
117,149,696 bytes maximum residency (8 sample(s))
919 collections in generation 0 ( 3.44s)
8 collections in generation 1 ( 24.99s)
226 Mb total memory in use
INIT time 0.00s ( 0.00s elapsed)
MUT time 8.16s ( 9.06s elapsed)
GC time 28.43s ( 30.11s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 36.59s ( 39.16s elapsed)
%GC time 77.7% (76.9% elapsed)
Alloc rate 59,019,220 bytes per MUT second
Productivity 22.3% of total user, 20.8% of total elapsed
dafis at linux:~/Documents/haskell/move> ./ByteArrC +RTS -sstderr -RTS 20 10000000
./ByteArrC 20 10000000 +RTS -sstderr
Done.
40,041,976 bytes allocated in the heap
1,272 bytes copied during GC (scavenged)
0 bytes copied during GC (not scavenged)
16,384 bytes maximum residency (1 sample(s))
2 collections in generation 0 ( 0.00s)
1 collections in generation 1 ( 0.00s)
40 Mb total memory in use
INIT time 0.00s ( 0.02s elapsed)
MUT time 5.03s ( 5.32s elapsed)
GC time 0.00s ( 0.01s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 5.03s ( 5.35s elapsed)
%GC time 0.0% (0.3% elapsed)
Alloc rate 7,960,631 bytes per MUT second
Productivity 100.0% of total user, 94.0% of total elapsed
Using GHC 6.8.2
The GC time for the Addr# (Ptr) version is frightening: 28.43s out of 36.59s
total, against 0.00s for the ByteArr version.
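
For readers without the test programs at hand, here is a minimal sketch of the two styles being compared: in-place reversal through a raw pointer and through a MutableByteArray#. It is not the code that was benchmarked in this thread. The names reverseWith, MBA, newMBA, readMBA, writeMBA, viaPtr and viaMBA are made up for illustration, the shared higher-order loop is there for brevity rather than speed, and nothing here is meant to reproduce the timings or GC figures above.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Control.Monad (forM_, replicateM_)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Storable (peekElemOff, pokeElemOff, sizeOf)
import GHC.Exts (Int (I#), MutableByteArray#, RealWorld,
                 newByteArray#, readIntArray#, writeIntArray#)
import GHC.IO (IO (..))

-- Hypothetical helper: reverse n elements in place by swapping from both
-- ends, given a reader and a writer for the underlying storage.
reverseWith :: (Int -> IO Int) -> (Int -> Int -> IO ()) -> Int -> IO ()
reverseWith rd wr n = go 0 (n - 1)
  where
    go i j
      | i >= j = return ()
      | otherwise = do
          x <- rd i
          y <- rd j
          wr i y
          wr j x
          go (i + 1) (j - 1)

-- Hypothetical boxed wrapper so the unlifted array can be passed around.
data MBA = MBA (MutableByteArray# RealWorld)

-- Allocate room for n Ints; newByteArray# takes a size in bytes.
newMBA :: Int -> IO MBA
newMBA n = case n * sizeOf (0 :: Int) of
  I# bytes -> IO (\s -> case newByteArray# bytes s of
                          (# s', arr #) -> (# s', MBA arr #))

-- Read the i-th Int (index counted in Int-sized elements).
readMBA :: MBA -> Int -> IO Int
readMBA (MBA arr) (I# i) = IO (\s ->
  case readIntArray# arr i s of
    (# s', x #) -> (# s', I# x #))

-- Write the i-th Int.
writeMBA :: MBA -> Int -> Int -> IO ()
writeMBA (MBA arr) (I# i) (I# x) = IO (\s ->
  case writeIntArray# arr i x s of
    s' -> (# s', () #))

-- Reverse xs k times through a raw pointer and return the result.
viaPtr :: Int -> [Int] -> IO [Int]
viaPtr k xs = allocaArray n $ \p -> do
  pokeArray p xs
  replicateM_ k (reverseWith (peekElemOff p) (pokeElemOff p) n)
  peekArray n p
  where
    n = length xs

-- Reverse xs k times inside a MutableByteArray# and return the result.
viaMBA :: Int -> [Int] -> IO [Int]
viaMBA k xs = do
  arr <- newMBA n
  forM_ (zip [0 ..] xs) (\(i, x) -> writeMBA arr i x)
  replicateM_ k (reverseWith (readMBA arr) (writeMBA arr) n)
  mapM (readMBA arr) [0 .. n - 1]
  where
    n = length xs

main :: IO ()
main = do
  viaPtr 3 [1 .. 10] >>= print   -- [10,9,8,7,6,5,4,3,2,1]
  viaMBA 3 [1 .. 10] >>= print   -- [10,9,8,7,6,5,4,3,2,1]
```

Compiled with -O2 and run with +RTS -sstderr, a program along these lines prints the same kind of allocation and GC summary as the two runs quoted above, so the two variants can be compared on one's own machine.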