[Haskell-cafe] blas bindings, why are they so much slower than C?
David Roundy
droundy at darcs.net
Wed Jun 18 13:06:17 EDT 2008
On Wed, Jun 18, 2008 at 06:03:42PM +0100, Jules Bean wrote:
> Anatoly Yakovenko wrote:
> >>>#include <cblas.h>
> >>>#include <stdlib.h>
> >>>
> >>>int main() {
> >>> int size = 1024;
> >>> int ii = 0;
> >>> double* v1 = malloc(sizeof(double) * (size));
> >>> double* v2 = malloc(sizeof(double) * (size));
> >>> for(ii = 0; ii < size*size; ++ii) {
> >>> double _dd = cblas_ddot(0, v1, size, v2, size);
> >>> }
> >>> free(v1);
> >>> free(v2);
> >>>}
> >>Your C compiler sees that you're not using the result of cblas_ddot,
> >>so it doesn't even bother to call it. That loop never gets run. All
> >>your program does at runtime is call malloc and free twice, which is
> >>very fast :-)
> >
> >C doesn't work like that :).
>
> C compilers can do what they like ;)
>
> GCC in particular is pretty good at removing dead code, including entire
> loops. However it shouldn't eliminate the call to cblas_ddot unless it
> thinks cblas_ddot has no side effects at all, which would be surprising
> unless it's inlined somehow.
Or unless it's been annotated as pure, which it should be.
David
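
For readers unfamiliar with the annotation mentioned above, here is a minimal sketch of how a function can be marked pure for GCC or Clang. The name my_ddot and both helper functions are hypothetical stand-ins invented for illustration; this is not the real CBLAS prototype, and whether a given cblas.h carries such an attribute depends on the implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a dot-product routine (NOT the real
   cblas_ddot prototype). The GCC/Clang `pure` attribute promises that
   the function has no side effects and that its result depends only
   on its arguments and on memory it reads. */
__attribute__((pure))
double my_ddot(size_t n, const double *x, size_t incx,
               const double *y, size_t incy);

double my_ddot(size_t n, const double *x, size_t incx,
               const double *y, size_t incy)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i * incx] * y[i * incy];
    return acc;
}

/* The result is discarded on every iteration. Because my_ddot is
   declared pure, the compiler is allowed (but not required) to drop
   the calls, and then the whole loop, as dead code. */
void discard_results(const double *x, const double *y,
                     size_t n, size_t reps)
{
    for (size_t r = 0; r < reps; ++r)
        (void)my_ddot(n, x, 1, y, 1);
}

/* The result feeds into the return value, so the work cannot simply
   be thrown away; a benchmark should look like this one. */
double use_results(const double *x, const double *y,
                   size_t n, size_t reps)
{
    double total = 0.0;
    for (size_t r = 0; r < reps; ++r)
        total += my_ddot(n, x, 1, y, 1);
    return total;
}
```

The attribute matters most when only the declaration is visible to the caller, as with a function in a separately compiled library. One way to see what the optimizer actually did is to compile with something like gcc -O2 -S and check whether a call to my_ddot survives in discard_results.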