Status of Stream Fusion?

Sebastian Graf sgraf1337 at gmail.com
Mon Nov 14 13:03:01 UTC 2022


> I believe the reason that this is easier than higher order matching in
general because it is restricted to applications of unification variables
to locally bound variables.

Indeed, it is easier to confine oneself to the pattern fragment. I think
that's entirely the point of pattern unification: Meta variables may appear
in the head of an application, but then the arguments may only be distinct
bound variables. E.g., `α x x` would be disallowed.

Perhaps it is possible to tweak RULEs just enough to match

  forall f next. concatMap (λx → Stream next (f x)) = concatMap' next f

(Now that I re-read your RULE, the explicit app `f x` makes sense compared
to the absence of an app in `next` to indicate it must not mention `x`. I
completely overread it before.)

If `f x` should match `x*2+x` with `f := \x -> 2*x+x` then that means RULE
matching modulo beta reductions, for a start.
Do we also want eta? I think we do, because we are already matching modulo
eta since https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6222 (and are
very careful not to increase arity of target term in doing so).

There are many other troubling issues with pattern unification stemming
from meta variables on both sides of the equation, but perhaps they don't
come up because the target term does not contain meta variables. Solving a
meta variable simply does `eqExpr` with solutions from other use sites, as
today. It doesn't matter much that we are `eq`ing lambdas, because that's
pretty simple to do.
I can't think of an obvious example where higher-order pattern *matching*
(urgh, what an unfortunate intersection of meaning between "higher-order
pattern unification" and "pattern matching") would be much more
complicated, but I'm no expert in the field.

I didn't know the paper Simon cited, but note that Miller's higher-order
pattern unification (which has become bog standard in dependently-typed
langs) identifies a useful subset that is also efficient to check. It's
surprising that the authors do not compare the pattern fragment to the
identified fragment of their implementation (although they cite it in one
place as [16]).

Anyway, unless someone comes up with a tricky counter-example for a RULE
that would be complicated to check, you should probably just give it a try.

Cheers,
Sebastian


Am Mo., 14. Nov. 2022 um 12:32 Uhr schrieb J. Reinders <
jaro.reinders at gmail.com>:

> Thank you both for the quick responses.
>
> > Can you say precisely what you mean by "using stream fusion instead of
> foldr/build fusion in base"?   For example, do you have a prototype library
> that demonstrates what you intend, all except concatMap?
>
>
> I believe the stream-fusion library [1] and in particular the
> Data.List.Stream module [2] implements a version of Data.List from base
> with stream fusion instead of foldr/build fusion. It is pretty old at this
> point, so it may not completely match up with the current Data.List module
> any more and it doesn’t use skip-less stream fusion yet.
>
> I now also see that #915 [3] tracks the exact issue of replacing
> foldr/build fusion with stream fusion in base, but is closed because it
> required more research at the time. And in that thread, Sebastian already
> asked the same question as me 5 years ago [4]:
>
> > At least this could get rid of the concatMap roadblock, are there any
> others I'm not aware of?
>
> [1] https://hackage.haskell.org/package/stream-fusion
> [2]
> https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
> [3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
> [4] https://gitlab.haskell.org/ghc/ghc/-/issues/915#note_141373
>
> > But what about
> >
> > concatMap (\x. Stream next (x*2 +x))
> >
> > Then you want matching to succeed, with the substitution
> > f :->  (\p. p*2 +p)
> >
> > This is called "higher order matching" and is pretty tricky.
>
>
> First of all, I’d like to clarify I’ll write ‘unification variable’, but
> that might be the wrong term. What I mean is a variable that is bound by a
> ‘forall’ in a rewrite rule. So in the example program, I’ll call ‘f’ and
> ’next’ unification variables and ‘x’ a local variable.
>
> Higher order matching sounds like it makes the problem too general
> (although I’ll admit I haven’t looked into it fully). For this matching I
> feel like the core is to change the free variable ‘x’ in the expression
> ‘(x*2 +x)’ into a bound variable to get ‘\p -> p*2 +p’. That part sounds
> very easy. The only problem that remains is when to actually perform this
> change. I would suggest that this should happen when a unification variable
> ‘f’ is applied to a locally bound variable ‘x’. The local variable ‘x’
> should only be allowed to occur in unification variables that are applied
> to it.
>
> And indeed this seems to be what Sebastian suggests:
>
> > perhaps you could prevent that by saying that `x` may not occur freely
> in `next` or `f`, but the paper explicitly *wants* `x` to occur in `next`
>
>
> The paper explicitly says that ‘x’ should *not* occur in ’next’ and ‘f’
> (except indirectly because ‘f’ is applied to ‘x’), so that doesn’t seem
> like a problem to me.
>
> In a way I’m suggesting that function application could be used in rewrite
> rules to indicate which local variables may scope over which unification
> variables. If a unification variable is applied to a local variable then
> the local variable may occur in that unification variable. It should not
> only match on a literal application.
>
> I believe the reason that this is easier than higher order matching in
> general because it is restricted to applications of unification variables
> to locally bound variables. No other forms are required for the concatMap
> rule.
>
> There can be some discussion about whether this syntax is really desirable
> because it changes the way locally bound variables are treated in rewrite
> rules, but I personally think the current behavior is not very reliable
> anyway. And I don’t think many people are using rewrite rules that contain
> local variables. But maybe it would be better to introduce special syntax
> for indicating scoping in rewrite rules. Duncan Coutts also uses this
> alternative notation:
>
>     λx → Stream (⟨next⟩[]) (⟨e⟩[x])
>
> But I’d guess it would be difficult to extend GHC’s parser to support that
> syntax. Re-using function application seems preferable from that
> perspective. Another alternative that I considered was this syntax:
>
>     concatMap (\x -> Stream next e) = concatMap' next (\x -> e)
>
> Now the matching local variable name on the left and the right could be
> used to infer that they are really the same and that ‘x’ should only be
> allowed to scope over ‘e’. GHC could just take the intersection of the
> scopes on the left and the right to deduce where the local variable ‘x’
> should be allowed to occur (in this case that intersection is just ‘e’).
> But this syntax seems a bit more magical to me, so it isn’t my first
> choice. This also leads to ambiguities if two locally bound variables on
> the same side share the same name, but that could simply be rejected.
>
> Cheers,
>
> Jaro
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20221114/a194bc24/attachment.html>


More information about the ghc-devs mailing list