[Haskell-cafe] idea for avoiding temporaries
Claus Reinke
claus.reinke at talk21.com
Sun Mar 11 20:35:43 EDT 2007
Hi Bulat,
> obviously, arrays version should create no temporary cells.
that's why the memory traffic surprised me. i knew there had to be something wrong.
>the problems was mainly due to 2 factors:
> 1) readArray m (i,j)
yes, indeed. since we are dealing in bulk operations, we might as well take advantage
of that, so dropping the repeated bounds-checks inside the loops makes a lot of sense.
moral/note to self: bulk array operations are your friend (i knew that!-), but you need
to use that when defining them (unsafeRead et al are only for library writers, but library
writers ought to use them; and i was writing a small inlined library)
> 2) 'op' in 'l' which was passed as real closure and was not inlined
> due to weakness of ghc optimizer
it irked me having to write the same loop twice, but i didn't consider the consequences.
an INLINE pragma on l almost seems sufficient to regain the loss, so i prefer that; but
writing out the loop twice is still a tiny tick faster..
> we should help strictness analyzer by marking all the variables used in tight loops as strict.
ah, there were a few surprises in that one. i thought i had considered possible strictness
problems, but obviously, i missed a few relevant possibilities. annotating everything is the
safe option, and clearly documents the intentions, but i cannot help thinking about which
annotations could be omitted:
- modArray: a and i are both used anyway
- i index in loop is definitely checked (but j isn't, and some others weren't, either; so
better safe than sorry)
- some of the parameters need not be annotated in this example, but should be if one
wanted to reuse the code elsewhere
- the one i really don't get yet is the one on the accumulators (!s in l, in dotA/matA);
i thought i had covered that by using $! in applying the loop, but annotating the
formal loop parameter, apart from being nicer, also seems faster..
moral/note to self: don't try to be clever, try to be clear..; strictness in formal
parameters is better than strictness in actual parameters; bang-patterns are good;-)
>after that is done, we got 1000 times less temporary data allocated and 5x faster
>execution. now it's a bit faster than strict lists
is this now anywhere near what the number-crunchers use, when they use Haskell?-)
Bulat, thanks for looking into it and for isolating the issues so quickly!-)
claus
More information about the Haskell-Cafe
mailing list