Fusing loops by specializing on functions with SpecConstr?

Wed Apr 1 08:21:21 UTC 2020

>
> Looking at the optimized core, it’s true that the conversion of Maybe to
> Either and back again gets eliminated, which is wonderful! But what’s less
> wonderful is the value passed around through `s`:
>
>     mapMaybeSF
>       = \ (@ a) (@ b) (f :: SF a b) ->
>           case f of { SF @ s f2 s2 ->
>           SF
>             (\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s),
> ())), ((), ((), ())))), ((), ()))))) ->
>

That is indeed true. But note that as long as you manage to inline
`mapMaybeSF`, the final `runSF` will only allocate once on the "edge" of
each iteration, all intermediate allocations will have been fused away. But
the allocation of these non-sense records seems unfortunate.

Optimisation-wise, I see two problems here:

   1. `mapMaybeSF` is already too huge to inline without INLINE. That is
   because its lambda isn't floated out to the top-level, which is because of
   the existential @s (that shouldn't be a problem), but also its mention of
   f2. The fact that f2 occurs free rather than as an argument makes the
   simplifier specialise `mapMaybeSF` for it, so if it were floated out
   (thereby necessarily lambda-lifted) to top-level, then we'd lose the
   ability to specialise without SpecConstr (which currently only applies to
   recursive functions anyway).
   2. The lambda isn't let-bound (which is probably a consequence of the
   previous point), so it isn't strictness analysed and we have no W/W split.
   If we had, I imagine we would have a worker of type `s -> ...` here. W/W is
   unnecessary if we manage to inline the function anyway, but I'm pretty
   certain we won't inline for larger programs (like `mapMaybeSF` already), in
   which case every failure to inline leaves behind such a residue of records.

So this already seems quite brittle. Maybe a very targeted optimisation
that gets rid of the boring ((), _) wrappers could be worthwhile, given
that a potential caller is never able to construct such a thing themselves.
But that very much hinges on being able to prove that in fact every such
((), _) constructed in the function itself terminates.

There are a few ways I can think of in which we as the programmer could
have been smarter, though:

   - Simply by specialising `SF` for the `()` case:

   data SF a b where
     SFState :: !(a -> s -> Step s b) -> !s -> SF a b
     SFNoState :: !(a -> Step () b) -> SF a b

   And then implementing every action 2^n times, where n is the number of
   `SF` arguments. That undoubtly leads to even more code bloat.
   - An alternative that I'm a little uncertain would play out would be

   data SMaybe a = SNothing | SJust !a
   data SF a b where
     SF :: !(SMaybe (s :~: ()) ->  !(a -> s -> Step s b) -> !s -> SF a b

   and try match on the proof everywhere needed to justify e.g. in `(.)`
   only storing e.g. s1 instead of (s1, s2). Basically do some type algebra in
   the implementation.
   - An even simpler thing would be to somehow use `Void#` (which should
   have been named `Unit#`), but I think that doesn't work due to runtime rep
   polymorphism restrictions.

I think there is lots that can be done to tune this idea.

Am Mi., 1. Apr. 2020 um 01:16 Uhr schrieb Alexis King <lexi.lambda at gmail.com
>:

> > On Mar 31, 2020, at 17:05, Sebastian Graf <sgraf1337 at gmail.com> wrote:
> >
> > Yeah, SPEC is quite unreliable, because IIRC at some point it's either
> consumed or irrelevant. But none of the combinators you mentioned should
> rely on SpecConstr! They are all non-recursive, so the Simplifier will take
> care of "specialisation". And it works just fine, I just tried it
>
> Ah! You are right, I did not read carefully enough and misinterpreted.
> That approach is clever, indeed. I had tried something similar with a CPS
> encoding, but the piece I was missing was using the existential to tie the
> final knot.
>
> I have tried it out on some of my experiments. It’s definitely a
> significant improvement, but it isn’t perfect. Here’s a small example:
>
>     mapMaybeSF :: SF a b -> SF (Maybe a) (Maybe b)
>     mapMaybeSF f = proc v -> case v of
>       Just x -> do
>         y <- f -< x
>         returnA -< Just y
>       Nothing -> returnA -< Nothing
>
> Looking at the optimized core, it’s true that the conversion of Maybe to
> Either and back again gets eliminated, which is wonderful! But what’s less
> wonderful is the value passed around through `s`:
>
>     mapMaybeSF
>       = \ (@ a) (@ b) (f :: SF a b) ->
>           case f of { SF @ s f2 s2 ->
>           SF
>             (\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s),
> ())), ((), ((), ())))), ((), ()))))) ->
>
> Yikes! GHC has no obvious way to clean this type up, so it will just grow
> indefinitely, and we end up doing a dozen pattern-matches in the body
> followed by another dozen allocations, just wrapping and unwrapping tuples.
>
> Getting rid of that seems probably a lot more tractable than fusing the
> recursive loops, but I’m still not immediately certain how to do it. GHC
> would have to somehow deduce that `s` is existentially-bound, so it can
> rewrite something like
>
>     SF (\a ((), x) -> ... Yield ((), y) b ...) ((), s)
>
> to
>
>     SF (\a x -> ... Yield y b) s
>
> by parametricity. Is that an unreasonable ask? I don’t know!
>
> Another subtlety I considered involves recursive arrows, where I currently
> depend on laziness in (|||). Here’s one example:
>
>     mapSF :: SF a b -> SF [a] [b]
>     mapSF f = proc xs -> case xs of
>       x:xs -> do
>         y <- f -< x
>         ys <- mapSF f -< xs
>         returnA -< (y:ys)
>       [] -> returnA -< []
>
> Currently, GHC will just compile this to `mapSF f = mapSF f` under your
> implementation, since (|||) and (>>>) are both strict. However, I think
> this is not totally intractable—we can easily introduce an explicit `lazy`
> combinator to rein in strictness:
>
>     lazy :: SF a b -> SF a b
>     lazy sf0 = SF g (Unit sf0) where
>       g a (Unit sf1) = case runSF sf1 a of
>         (b, sf2) -> Yield (Unit sf2) b
>
> And now we can write `lazy (mapSF f)` at the point of the recursive call
> to avoid the infinite loop. This defeats some optimizations, of course, but
> `mapSF` is fundamentally recursive, so there’s only so much we can really
> expect.
>
> So perhaps my needs here are less ambitious, after all! Getting rid of all
> those redundant tuples is my next question, but that’s rather unrelated
> from what we’ve been talking about so far.
>
> Alexis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200401/12d5cd0e/attachment.html>