[Haskell-cafe] idea for avoiding temporaries

Claus Reinke claus.reinke at talk21.com
Sun Mar 11 20:35:43 EDT 2007

Hi Bulat,

> obviously, arrays version should create no temporary cells. 

that's why the memory traffic surprised me. i knew there had to be something wrong.

>the problems was mainly due to 2 factors:
> 1) readArray m (i,j)

yes, indeed. since we are dealing in bulk operations, we might as well take advantage 
of that, so dropping the repeated bounds-checks inside the loops makes a lot of sense.

moral/note to self: bulk array operations are your friend (i knew that!-), but you need 
to use that when defining them (unsafeRead et al are only for library writers, but library 
writers ought to use them; and i was writing a small inlined library)

> 2) 'op' in 'l' which was passed as real closure and was not inlined
> due to weakness of ghc optimizer

it irked me having to write the same loop twice, but i didn't consider the consequences.
an INLINE pragma on l almost seems sufficient to regain the loss, so i prefer that; but 
writing out the loop twice is still a tiny tick faster..

> we should help strictness analyzer by marking all the variables used in tight loops as strict. 

ah, there were a few surprises in that one. i thought i had considered possible strictness
problems, but obviously, i missed a few relevant possibilities. annotating everything is the
safe option, and clearly documents the intentions, but i cannot help thinking about which 
annotations could be omitted:

- modArray: a and i are both used anyway
- i index in loop is definitely checked (but j isn't, and some others weren't, either; so 
   better safe than sorry)
- some of the parameters need not be annotated in this example, but should be if one
  wanted to reuse the code elsewhere 

- the one i really don't get yet is the one on the accumulators (!s in l, in dotA/matA);
  i thought i had covered that by using $! in applying the loop, but annotating the
  formal loop parameter, apart from being nicer, also seems faster..

moral/note to self: don't try to be clever, try to be clear..; strictness in formal 
parameters is better than strictness in actual parameters; bang-patterns are good;-)

>after that is done, we got 1000 times less temporary data allocated and 5x faster 
>execution. now it's a bit faster than strict lists

is this now anywhere near what the number-crunchers use, when they use Haskell?-)

Bulat, thanks for looking into it and for isolating the issues so quickly!-)

More information about the Haskell-Cafe mailing list