Status of Stream Fusion?

Sebastian Graf sgraf1337 at gmail.com
Mon Nov 14 13:30:29 UTC 2022


Yes, if you confine yourself to Miller's "pattern fragment" (Meta variables
may appear in the head of an application, but then the arguments may only
be *distinct* bound variables) then you might be fine.

Thanks for giving it a try!


Am Mo., 14. Nov. 2022 um 14:27 Uhr schrieb J. Reinders <
jaro.reinders at gmail.com>:

> I think higher order pattern unification is different from higher order
> matching because the unification means both sides are allowed to be
> patterns that may contain meta variables, while matching works on only one
> pattern and one concrete term.
>
> The example pattern ‘forall a. someFunction (\x -> a x x)’ would indeed be
> a potential problem, but indeed I think disallowing that is fine because it
> is never necessary to write such patterns in practice.
>
> The paper linked by Simon contains another example: ‘forall f x. f x’
> matched to ‘0’, which is also a problem because it has infinitely many
> possible solutions, but I would consider this out of scope because it
> involves an application of two unification variables. For the practical
> examples I’ve seen up to now, we only need to handle the application of one
> unification variable with one locally bound variable.
>
> I’m pretty optimistic now that it is possible to implement an algorithm
> that covers the practical use cases (also the Simon’s ‘mapFB' example falls
> into this category) while being much simpler than existing algorithms for
> both higher order unification and higher order matching.
>
> I’ll do some experiments with the rule matcher.
>
> Cheers,
> Jaro
>
> > On 14 Nov 2022, at 14:03, Sebastian Graf <sgraf1337 at gmail.com> wrote:
> >
> > > I believe the reason that this is easier than higher order matching in
> general because it is restricted to applications of unification variables
> to locally bound variables.
> >
> > Indeed, it is easier to confine oneself to the pattern fragment. I think
> that's entirely the point of pattern unification: Meta variables may appear
> in the head of an application, but then the arguments may only be distinct
> bound variables. E.g., `α x x` would be disallowed.
> >
> > Perhaps it is possible to tweak RULEs just enough to match
> >
> >   forall f next. concatMap (λx → Stream next (f x)) = concatMap' next f
> >
> > (Now that I re-read your RULE, the explicit app `f x` makes sense
> compared to the absence of an app in `next` to indicate it must not mention
> `x`. I completely overread it before.)
> >
> > If `f x` should match `x*2+x` with `f := \x -> 2*x+x` then that means
> RULE matching modulo beta reductions, for a start.
> > Do we also want eta? I think we do, because we are already matching
> modulo eta since https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6222
> (and are very careful not to increase arity of target term in doing so).
> >
> > There are many other troubling issues with pattern unification stemming
> from meta variables on both sides of the equation, but perhaps they don't
> come up because the target term does not contain meta variables. Solving a
> meta variable simply does `eqExpr` with solutions from other use sites, as
> today. It doesn't matter much that we are `eq`ing lambdas, because that's
> pretty simple to do.
> > I can't think of an obvious example where higher-order pattern
> *matching* (urgh, what an unfortunate intersection of meaning between
> "higher-order pattern unification" and "pattern matching") would be much
> more complicated, but I'm no expert in the field.
> >
> > I didn't know the paper Simon cited, but note that Miller's higher-order
> pattern unification (which has become bog standard in dependently-typed
> langs) identifies a useful subset that is also efficient to check. It's
> surprising that the authors do not compare the pattern fragment to the
> identified fragment of their implementation (although they cite it in one
> place as [16]).
> >
> > Anyway, unless someone comes up with a tricky counter-example for a RULE
> that would be complicated to check, you should probably just give it a try.
> >
> > Cheers,
> > Sebastian
> >
> >
> > Am Mo., 14. Nov. 2022 um 12:32 Uhr schrieb J. Reinders <
> jaro.reinders at gmail.com>:
> > Thank you both for the quick responses.
> >
> > > Can you say precisely what you mean by "using stream fusion instead of
> foldr/build fusion in base"?   For example, do you have a prototype library
> that demonstrates what you intend, all except concatMap?
> >
> >
> > I believe the stream-fusion library [1] and in particular the
> Data.List.Stream module [2] implements a version of Data.List from base
> with stream fusion instead of foldr/build fusion. It is pretty old at this
> point, so it may not completely match up with the current Data.List module
> any more and it doesn’t use skip-less stream fusion yet.
> >
> > I now also see that #915 [3] tracks the exact issue of replacing
> foldr/build fusion with stream fusion in base, but is closed because it
> required more research at the time. And in that thread, Sebastian already
> asked the same question as me 5 years ago [4]:
> >
> > > At least this could get rid of the concatMap roadblock, are there any
> others I'm not aware of?
> >
> > [1] https://hackage.haskell.org/package/stream-fusion
> > [2]
> https://hackage.haskell.org/package/stream-fusion-0.1.2.5/docs/Data-List-Stream.html
> > [3] https://gitlab.haskell.org/ghc/ghc/-/issues/915
> > [4] https://gitlab.haskell.org/ghc/ghc/-/issues/915#note_141373
> >
> > > But what about
> > >
> > > concatMap (\x. Stream next (x*2 +x))
> > >
> > > Then you want matching to succeed, with the substitution
> > > f :->  (\p. p*2 +p)
> > >
> > > This is called "higher order matching" and is pretty tricky.
> >
> >
> > First of all, I’d like to clarify I’ll write ‘unification variable’, but
> that might be the wrong term. What I mean is a variable that is bound by a
> ‘forall’ in a rewrite rule. So in the example program, I’ll call ‘f’ and
> ’next’ unification variables and ‘x’ a local variable.
> >
> > Higher order matching sounds like it makes the problem too general
> (although I’ll admit I haven’t looked into it fully). For this matching I
> feel like the core is to change the free variable ‘x’ in the expression
> ‘(x*2 +x)’ into a bound variable to get ‘\p -> p*2 +p’. That part sounds
> very easy. The only problem that remains is when to actually perform this
> change. I would suggest that this should happen when a unification variable
> ‘f’ is applied to a locally bound variable ‘x’. The local variable ‘x’
> should only be allowed to occur in unification variables that are applied
> to it.
> >
> > And indeed this seems to be what Sebastian suggests:
> >
> > > perhaps you could prevent that by saying that `x` may not occur freely
> in `next` or `f`, but the paper explicitly *wants* `x` to occur in `next`
> >
> >
> > The paper explicitly says that ‘x’ should *not* occur in ’next’ and ‘f’
> (except indirectly because ‘f’ is applied to ‘x’), so that doesn’t seem
> like a problem to me.
> >
> > In a way I’m suggesting that function application could be used in
> rewrite rules to indicate which local variables may scope over which
> unification variables. If a unification variable is applied to a local
> variable then the local variable may occur in that unification variable. It
> should not only match on a literal application.
> >
> > I believe the reason that this is easier than higher order matching in
> general because it is restricted to applications of unification variables
> to locally bound variables. No other forms are required for the concatMap
> rule.
> >
> > There can be some discussion about whether this syntax is really
> desirable because it changes the way locally bound variables are treated in
> rewrite rules, but I personally think the current behavior is not very
> reliable anyway. And I don’t think many people are using rewrite rules that
> contain local variables. But maybe it would be better to introduce special
> syntax for indicating scoping in rewrite rules. Duncan Coutts also uses
> this alternative notation:
> >
> >     λx → Stream (⟨next⟩[]) (⟨e⟩[x])
> >
> > But I’d guess it would be difficult to extend GHC’s parser to support
> that syntax. Re-using function application seems preferable from that
> perspective. Another alternative that I considered was this syntax:
> >
> >     concatMap (\x -> Stream next e) = concatMap' next (\x -> e)
> >
> > Now the matching local variable name on the left and the right could be
> used to infer that they are really the same and that ‘x’ should only be
> allowed to scope over ‘e’. GHC could just take the intersection of the
> scopes on the left and the right to deduce where the local variable ‘x’
> should be allowed to occur (in this case that intersection is just ‘e’).
> But this syntax seems a bit more magical to me, so it isn’t my first
> choice. This also leads to ambiguities if two locally bound variables on
> the same side share the same name, but that could simply be rejected.
> >
> > Cheers,
> >
> > Jaro
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20221114/316be014/attachment.html>


More information about the ghc-devs mailing list