<div dir="ltr"><div>Hi Alexis,</div><div><br></div><div>that's a very interesting example you have there!</div><div><br></div><div>So far, what we referred to as Nested CPR concerned unboxing for returned nested *records*, e.g., the `annotation` field in your example. That's what I try to exploit in <a href="https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1866">!1866</a>, which after a rebase that I'll hopefully be doing this week, some more sleuthing and then documenting what I did will finally make it into master.</div><div><br></div><div>CPR'ing the Lambda, i.e., what is returned for `parser`, on the other hand, is a surprising new opportunity for what Nested CPR could do beyond unboxing records! And it's pretty simple, too: Because it's a function, we don't care about subtleties such as whether all callers actually evaluate the pair that deep (actually, that's wrong, as I realise below). I think it's entirely within the reach of !1866 today. So we could transform (provided that `(,) <$> a <*> b` inlines `<$>` and `<*>` and then will actually have the CPR property) <br></div><div><br></div><div> AnnotatedParser ann1 f <+> AnnotatedParser ann2 g = AnnotatedParser<br>
{ annotation = Seq ann1 ann2<br>
, parser = \s1 -><br>
let !(a, s2) = f s1<br>
!(b, s3) = g s2<br>
in ((,) <$> a <*> b, s3)<br>
}</div><div><br></div><div>to</div><div><br></div><div> $w<+> :: Annotation<br>
-> (String -> (Maybe a, String))<br>
-> Annotation<br>
-> (String -> (Maybe b, String))<br>
-> (# Annotation, String -> (# Maybe (a, b), String #) #)<br>
$w<+> ann1 f ann2 g =<br>
(# Seq ann1 ann2<br>
, \s1 -> case (\s1 -> let !(a, s2) = f s1<br>
!(b, s3) = g s2<br>
in ((,) <$> a <*> b) s1 of (p, q) -> (#p, q#), s3) #)</div><div><br></div><div> <+> :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)<br>
<+> (AnnotatedParser ann1 f) (AnnotatedParser ann2 g) =<br>
case $w<+> ann1 f ann2 g of<br>
(# a, b #) -> AnnotatedParser (\s1 -> case a s1 of (# p, q#) -> (p, q)) b<br>
{-# INLINE <+> #-}</div><div><br></div><div>Actually writing out the transformation tells me that this isn't always a win: We now have to allocate a lambda in the wrapper. That is only a win if that lambda cancels away at call sites! So we have to make sure that all call sites of the wrapper actually call the `parser`, so that the lambda simplifies away. If it doesn't, we have a situation akin to reboxing. So I was wrong above when I said "we don't care about subtleties such as whether all callers actually evaluate the pair that deep": We very much need to know whether all call sites call the lambda. Luckily, I implemented <a href="https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4493">just that</a> for exploitation by Nested CPR! That's the reason why I need to rebase !1866 now. I'll ḱeep you posted.<br></div><div><br></div><div>---<br></div><div><br></div><div>You might wonder why CPR today doesn't care for lambdas. Well, they only make sense in nested scenarios (otherwise the function wasn't eta-expanded that far, for good reasons) and CPR currently doesn't bother unboxing records nestedly, which is what <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/18174">#18174</a> discusses and what !1866 tries to fix.<br></div><div><br></div><div>Cheers,<br></div><div>Sebastian<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Am Di., 15. Dez. 2020 um 06:52 Uhr schrieb Alexis King <<a href="mailto:lexi.lambda@gmail.com">lexi.lambda@gmail.com</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>
<br>
I spent some time today looking into the performance of a program<br>
involving a parser type that looks something like this:<br>
<br>
data AnnotatedParser a = AnnotatedParser<br>
{ annotation :: Annotation<br>
, parser :: String -> (Maybe a, String)<br>
}<br>
<br>
The `Annotation` records metadata about the structure of an<br>
`AnnotatedParser` that can be accessed statically (that is, without<br>
having to run the parser on some input). `AnnotatedParser`s are built<br>
from various primitive constructors and composed using various<br>
combinators. These combinators end up looking something like this:<br>
<br>
(<+>) :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)<br>
AnnotatedParser ann1 f <+> AnnotatedParser ann2 g = AnnotatedParser<br>
{ annotation = Seq ann1 ann2<br>
, parser = \s1 -><br>
let !(a, s2) = f s1<br>
!(b, s3) = g s2<br>
in ((,) <$> a <*> b, s3)<br>
}<br>
<br>
Use of these combinators leads to the construction and subsequent case<br>
analysis of numerous `AnnotatedParser` closures. Happily, constructed<br>
product result[1] analysis kicks in and rewrites such combinators to cut<br>
down on the needless boxing, leading to worker/wrapper splits like this:<br>
<br>
$w<+> :: Annotation<br>
-> (String -> (Maybe a, String))<br>
-> Annotation<br>
-> (String -> (Maybe b, String))<br>
-> (# Annotation, String -> (Maybe (a, b), String) #)<br>
$w<+> ann1 f ann2 g =<br>
(# Seq ann1 ann2<br>
, \s1 -> let !(a, s2) = f s1<br>
!(b, s3) = g s2<br>
in ((,) <$> a <*> b, s3) #)<br>
<br>
<+> :: AnnotatedParser a -> AnnotatedParser b -> AnnotatedParser (a, b)<br>
<+> (AnnotatedParser ann1 f) (AnnotatedParser ann2 g) =<br>
case $w<+> ann1 f ann2 g of<br>
(# a, b #) -> AnnotatedParser a b<br>
{-# INLINE <+> #-}<br>
<br>
This is great, and it cuts down on allocation significantly, but there<br>
is still something unsatisfying about it: the `parser` function inside<br>
the record is not affected by CPR! This is a shame, because<br>
essentially all use sites immediately deconstruct the pair, making it<br>
a prime candidate for unboxing. Ideally, we’d like to get this, instead:<br>
<br>
$w<+> :: Annotation<br>
-> (String -> (Maybe a, String))<br>
-> Annotation<br>
-> (String -> (Maybe b, String))<br>
-> (# Annotation, String -> (# Maybe (a, b), String #) #)<br>
$w<+> ann1 f ann2 g =<br>
(# Seq ann1 ann2<br>
, \s1 -> let !(a, s2) = f s1<br>
!(b, s3) = g s2<br>
in (# (,) <$> a <*> b, s3 #) #)<br>
<br>
In practice, little combinators like `$w<+>` are marked INLINE, so `f`<br>
and `g` are usually known rather than unknown calls. This nested CPR<br>
transformation would allow the tuple construction to fuse with the<br>
tuple deconstruction, eliminating quite a lot of unnecessary<br>
boxing/unboxing.<br>
<br>
Unfortunately, it seems as though GHC’s implementation of CPR is<br>
entirely first-order: although function arguments are given rich<br>
demand signatures, results are only described one level deep. But as<br>
the above example hopefully illustrates, that’s leaving significant<br>
optimization opportunities on the table! Hence, my questions:<br>
<br>
1. Has this notion of “nested CPR” been explored at all before?<br>
2. Does such an extension to CPR sound worth its weight?<br>
<br>
I peeked a little at GHC.Types.Cpr and GHC.Core.Opt.CprAnal, and it<br>
seems quite manageable to me, but I haven’t actually looked into an<br>
implementation attempt just yet. I’m mostly interested in whether<br>
others have thought about something like this and/or run into similar<br>
issues in the past, or if this is really an unusual construction.<br>
<br>
Thanks,<br>
Alexis<br>
<br>
[1]: <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/cpr.pdf" rel="noreferrer" target="_blank">https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/cpr.pdf</a><br>
<br>
_______________________________________________<br>
ghc-devs mailing list<br>
<a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
</blockquote></div>