Loop unrolling + fusion ?

Claus Reinke claus.reinke at talk21.com
Fri Mar 6 19:07:34 EST 2009

> My preferred spec would be roughly
> {-# NOINLINE f #-}
>    as now
> {-# INLINE f #-} 
>    works as now, which is for non-recursive f only (might in future
>    be taken as go-ahead for analysis-based recursion unfolding)
> {-# INLINE f PEEL n #-}
>    inline calls *into* recursive f (called loop peeling for loops)
> {-# INLINE f UNROLL m #-}
>    inline recursive calls to f *inside* f (called loop unrolling for loops)
> {-# INLINE f PEEL n UNROLL m #-}
>    combine the previous two
>    The numeric parameters are to be interpreted as if each call to
>    f was annotated with both PEEL and UNROLL limits, to be
>    decreased as appropriate for every PEEL or UNROLL action.

hmm, "appropriate" is one of those words that shouldn't occur in specs,
not even rough ones, so let's flesh this out a bit, by abstract example.

let f = ..f.. in f{n,m} -PEEL-> let f = ..f.. in ..f{n-1,m}..

let f = ..f{n,m}.. in .. -UNROLL-> let f = ..|..f{n,m-1}..|.. in ..

In words: the call being peeled/unrolled disappears, being replaced by
a copy of the definition, in which the decremented counts are applied to 
the calls of the same function created by unfolding. Is that specific enough?

>    Peeling and unrolling stop when the respective count annotation
>    has reached 0. Note that mutual recursion is the domain of PEEL,
>    while UNROLL only applies to direct recursion.
>    {-# INLINE f PEEL n #-}, for n>0, corresponds to worker/
>    wrapper transforms (previously done manually) + inline wrapper,
>    and should therefore also be taken as a hint for the compiler to 
>    try the static argument transformation for f (the "worker").
>    Non-supporting implementations should treat these as INLINE
>    pragmas (same warning/ignore or automatic unfold behaviour).
> Since we are talking about a refinement of the INLINE pragma, we
> also need to look at that pragma's existing subtleties:-(
> - no functions inlined into f: should be subject to override by
>    INLINE pragmas (even for the non-recursive case?)
> - no float-in/float-out/cse: ??
> - no worker/wrapper transform in strictness analyser: we do get the 
>    same effect from INLINE PEEL, so this should be okay, right?
> - loop breakers: PEEL/UNROLL have their own limits, creating
>    an intrinsic loop breaker when the counters run out

Loop breakers are still needed, in spite of the explicit limits. Consider

let {odd x = ..even{1,0}..; even x = ..odd{1,0}..} in odd{1,0} n

Peeling odd gives a call to even, peeling of which gives a fresh, not
decremented, call to odd! Unless one makes a copy of the whole
mutual recursion, with the odd calls adjusted. This might be easier 
to handle in your "unfolding as a separate core2core pass" scenario, 
where the pass might keep track of unfoldings already done (instead 
of trying to encode that information locally, in annotations).

Other issues?

More information about the Glasgow-haskell-users mailing list