Loop unrolling + fusion ?
Claus Reinke
claus.reinke at talk21.com
Sat Feb 28 14:42:57 EST 2009
> import Data.Array.Vector
> import Data.Bits
> main = print . productU . mapU (*2) . mapU (`shiftL` 2) $
>          replicateU (100000000 :: Int) (5::Int)
>
> and turns it into a loop like this:
>
> $wfold :: Int# -> Int# -> Int#
> $wfold =
>   \ (ww_sWX :: Int#) (ww1_sX1 :: Int#) ->
>     case ww1_sX1 of wild_B1 {
>       __DEFAULT ->
>         $wfold (*# ww_sWX 40) (+# wild_B1 1);
>       100000000 -> ww_sWX
>     }
..
> So now, since we've gone to such effort to produce a tiny loop like this,
> can't we unroll it just a little?
> Anyone think of a way to apply Claus' TH unroller, or somehow convince GCC
> it is worth unrolling this guy, so we get the win of both aggressive high level
> fusion, and aggressive low level loop optimisations?
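For reference, a minimal plain-Haskell sketch of what unrolling that worker by
two could look like at source level. The names foldRolled and foldUnrolled2 and
the small trip count limit are made up for illustration (the real loop runs to
100000000, where the Int product overflows anyway). This is hand-written
source, not what GHC or GCC would emit.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical trip count, kept small and even so the product fits in an
-- Int and the unrolled loop never steps over the exit test.
limit :: Int
limit = 10

-- Boxed analogue of the $wfold worker: one multiply and one increment
-- per iteration.
foldRolled :: Int -> Int -> Int
foldRolled !acc !i
  | i == limit = acc
  | otherwise  = foldRolled (acc * 40) (i + 1)

-- The same loop with the body copied twice: half as many exit tests,
-- two dependent multiplies per iteration.
foldUnrolled2 :: Int -> Int -> Int
foldUnrolled2 !acc !i
  | i == limit = acc
  | otherwise  = foldUnrolled2 ((acc * 40) * 40) (i + 2)

main :: IO ()
main = print (foldRolled 1 0 == foldUnrolled2 1 0)  -- prints True
```

A real unroller would also need an epilogue for trip counts that are not a
multiple of the unroll factor; the sketch sidesteps that by assuming an even
limit.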
I'm not sure this is what you're after (it has been too long since I read
assembler ;-), but it sounds as if you want to unroll the source of that fold,
which seems to be a local definition in foldS. Since unrolling is not always a
good idea, it would also be nice to have a way to control or initiate it from
outside the uvector package (perhaps a RULE to redirect the call from foldS to
a foldSN; foldS is hidden and gets inlined away, but something like that). If
that works, you'd then run into the issue of wanting to
rearrange the *# and *# by variable and constant.
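A rough sketch of the kind of RULE I mean, using stand-ins rather than the real
uvector code: foldS and foldS2 below are hypothetical list-based functions
defined locally, and the NOINLINE [1] pragma is an assumption about how one
might keep the call from being inlined away before the rule can match.

```haskell
module Main (main) where

-- Hypothetical stand-in for uvector's hidden foldS, written over lists.
foldS :: (b -> a -> b) -> b -> [a] -> b
foldS f = go
  where
    go acc []     = acc
    go acc (x:xs) = acc `seq` go (f acc x) xs
{-# NOINLINE [1] foldS #-}  -- keep the call visible until the rule has had a chance to fire

-- Hypothetical foldS2: the same fold with its body unrolled twice.
foldS2 :: (b -> a -> b) -> b -> [a] -> b
foldS2 f = go
  where
    go acc (x:y:rest) = acc `seq` go (f (f acc x) y) rest
    go acc [x]        = f acc x
    go acc []         = acc

-- Redirect calls to foldS to the unrolled version. GHC only applies
-- rewrite rules when optimising; without -O the call is left alone.
{-# RULES "foldS/unroll2" forall f z xs. foldS f z xs = foldS2 f z xs #-}

main :: IO ()
main = do
  let xs = replicate 10 (40 :: Int)
  print (foldS (*) 1 xs == foldS2 (*) 1 xs)  -- prints True
```

Compiled with -O, -ddump-rule-firings should show whether the rule fired. With
f = (*) and a constant element, the unrolled body becomes (acc * 40) * 40, and
that is presumably where one would want it regrouped as acc * (40 * 40).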
Claus