[Haskell-cafe] blas bindings, why are they so much slower than C?
Anatoly Yakovenko
aeyakovenko at gmail.com
Wed Jun 18 00:00:26 EDT 2008
Here is the C:

    #include <cblas.h>
    #include <stdlib.h>

    int main() {
        int size = 1024;
        int ii = 0;
        double* v1 = malloc(sizeof(double) * (size));
        double* v2 = malloc(sizeof(double) * (size));
        for(ii = 0; ii < size*size; ++ii) {
            /* arguments: n, x, incx, y, incy */
            double _dd = cblas_ddot(0, v1, size, v2, size);
        }
        free(v1);
        free(v2);
    }
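
For reference when reading the call above: in the CBLAS interface, cblas_ddot takes (n, x, incx, y, incy), where n is the number of elements and incx/incy are the strides between consecutive elements. The following is a minimal plain-C sketch of that contract. ref_ddot and last_index_touched are hypothetical helpers written for illustration, not library routines, and the sketch assumes positive strides only. It suggests that a call with n = 0 returns 0.0 without reading either vector, and that for two contiguous 1024-element vectors the matching arguments would be n = size with strides of 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference dot product mirroring the argument order of
 * cblas_ddot: n elements, x with stride incx, y with stride incy.
 * Assumption: strides are positive (CBLAS also accepts negative ones,
 * which this sketch does not handle). */
double ref_ddot(int n, const double *x, int incx,
                const double *y, int incy)
{
    assert(incx > 0 && incy > 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        /* element i lives at offset i * stride */
        sum += x[(size_t)i * (size_t)incx] * y[(size_t)i * (size_t)incy];
    }
    return sum;
}

/* Highest index touched in a vector for a given n and stride;
 * returns -1 when n <= 0 because no element is read at all. */
long last_index_touched(int n, int inc)
{
    assert(inc > 0);
    return n <= 0 ? -1 : (long)(n - 1) * inc;
}
```

With double v[4] = {0, 1, 2, 3}, ref_ddot(4, v, 1, v, 1) sums the squares and ref_ddot(0, v, 4, v, 4) does no work at all. last_index_touched(1024, 1024) comes out far past the end of a 1024-element buffer, so a stride of size only makes sense if the vector is actually laid out that way in memory.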
This is the Haskell:

    module Main where

    import Data.Vector.Dense.IO

    main = do
        let size = 1024
        v1::IOVector Int Double <- newListVector size [0..]
        v2::IOVector Int Double <- newListVector size [0..]
        mapM_ (\ ii -> do v1 `getDot` v2) [0..size*size]

    $ time ./testdot
    real    0m0.017s
    user    0m0.010s
    sys     0m0.010s

    $ time ./htestdot
    real    0m4.692s
    user    0m4.670s
    sys     0m0.030s

So that is roughly a 276x difference in wall-clock time (4.692 s vs 0.017 s).
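
A quick sanity check on these timings, as a rough sketch: if every call really computed a 1024-element dot product, 1024*1024 calls would amount to about 2^31 floating-point operations, assuming one multiply and one add per element. The helper names below are hypothetical and the flop count is that assumption, not a measurement. (The Haskell loop over [0..size*size] is inclusive and so runs one extra iteration, which is negligible here.)

```c
#include <assert.h>

/* Floating-point operations for `calls` dot products of length n,
 * assuming one multiply and one add per element. */
double dot_flops(double calls, double n)
{
    assert(calls >= 0.0 && n >= 0.0);
    return 2.0 * calls * n;
}

/* Implied throughput in GFLOP/s for a measured wall-clock time. */
double implied_gflops(double flops, double seconds)
{
    assert(seconds > 0.0);
    return flops / seconds / 1e9;
}

/* Ratio of two wall-clock times. */
double slowdown(double slow_seconds, double fast_seconds)
{
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

Plugging in the numbers from the runs above, dot_flops(1024.0 * 1024.0, 1024.0) is 2147483648, which implies roughly 126 GFLOP/s for the 0.017 s run and roughly 0.46 GFLOP/s for the 4.692 s run. The first figure seems implausibly high for a single core, so it may be worth checking that the two programs are doing the same amount of work per call.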
htestdot.prof is no help:

    Tue Jun 17 20:46 2008 Time and Allocation Profiling Report  (Final)

       htestdot +RTS -p -RTS

    total time  =        3.92 secs   (196 ticks @ 20 ms)
    total alloc = 419,653,032 bytes  (excludes profiling overheads)

    COST CENTRE   MODULE       %time %alloc

    main          Main          88.3   83.0
    CAF           Main          11.7   17.0

                                             individual     inherited
    COST CENTRE   MODULE       no. entries  %time %alloc   %time %alloc

    MAIN          MAIN           1       0    0.0    0.0   100.0  100.0
     CAF          Main         216       7   11.7   17.0   100.0  100.0
      main        Main         222       1   88.3   83.0    88.3   83.0
     CAF          GHC.Float    187       1    0.0    0.0     0.0    0.0
     CAF          GHC.Handle   168       3    0.0    0.0     0.0    0.0