[Haskell-cafe] Array copy performance
duncan.coutts at worc.ox.ac.uk
Fri Feb 2 09:46:07 EST 2007
On Fri, 2007-02-02 at 17:02 +0300, Bulat Ziganshin wrote:
> Hello Chris,
> Friday, February 2, 2007, 4:44:37 PM, you wrote:
> > If I have two identical STUArrays (same type and bounds) then what is the most
> > efficient way to overwrite the data in the destination with the data in the
> > source? Does this work for STArrays?
> > Is there a way to avoid the long loop?
> >> forM_ (range b) $ \index ->
> >> readArray source index >>= writeArray destination index
> the topic of efficient looping over arrays is briefly covered in
I should note that it is possible to generate good loops, though it's
not easy yet from high level code.
The binary package has a memory throughput benchmark which compares C
and Haskell byte/word read/write loops:
All the Haskell versions compile down to good loops, though the loop
body is bigger than in the C case (take a look at the assembly output
for the -fasm route). However that's enough for the Haskell versions to
get a significant fraction of the memory throughput of the C versions:
C memory throughput benchmarks:
500MB of bytes written in 0.468s, at: 1068.3MB/s
500MB of bytes read in 0.380s, at: 1315.7MB/s
500MB of words written in 0.376s, at: 1329.7MB/s
500MB of words read in 0.192s, at: 2604.0MB/s
Haskell memory throughput benchmarks:
500MB of bytes written in 1.560s, at: 320.5MB/s
500MB of bytes read in 2.192s, at: 228.1MB/s
500MB of words written in 0.340s, at: 1470.5MB/s
500MB of words read in 0.344s, at: 1453.4MB/s
As you can see, the extra size of the loop body hurts us more in the
byte size cases than in the word size one, where we're getting much
closer to saturating the memory bandwidth of the box. It's a 64bit box
so the words are 8 bytes.
I'm not quite sure what is going on in the "words written" case. Perhaps
someone can see why the C one is loosing out to the Haskell version.
More information about the Haskell-Cafe