[GHC] #14461: Reuse free variable lists through nested closures

Fri Dec 1 13:19:01 UTC 2017

#14461: Reuse free variable lists through nested closures
-------------------------------------+-------------------------------------
        Reporter:  tdammers          |                Owner:  alexbiehl
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  7258              |  Differential Rev(s):
       Wiki Page:  NestedClosures    |
-------------------------------------+-------------------------------------

Comment (by simonpj):

 > Now, when forcing `outer` we push an update frame which overwrites
 `outer`s closure to be an indirection to the resulting value hereby
 turning the free variables effectively into garbage.

 Ah!  That's an excellent point, one that I had totally neglected.

 Possibilities:

 * Step 1. Don't do this for thunks.  We have no data about what the
 troublesome cases are, but the one case we do know about has functions.

 * Step 2.  Still don't do the this for thunks, but
 {{{
 let k1 = [fv1, ..., fvn] \ []
          let k2 = [fv1, ..., fvn] \ [] -> ...
          in ...
 in ...

 ===>

 let t = (fv1, ..., fvn) in
 let k1 = [t{fv1, ..., fvn}] \ [] ->
          let k2 = [t{fv1, ..., fvn}]] \ [] -> ...
          in ...
 in ...
 }}}
   NB: here I am allowing the `t{fv1,..,fvn}` as a way to access the free
 vars of a data constructor, as well as a function closure.  I don't think
 this should be hard to arrange.  And it might be useful anyway.

   Now the shared bit is a separate tuple, which won't get updated.

 One thing that would help would be some '''data''' on how common this kind
 of thing is.
 Even deciding what to measure is non-obvious.   For example:

 * For each STG let-binding
 {{{
      let x = [fv1..fvn] \ argsx -> ex
 }}}
   How often is it the case that there is an enclosing binding
 {{{
      let y = [gv1,...gvm] \ argsy -> ey
 }}}
   where the `[gv1,..gvm]` is a subset of `[fv1,..fvn]`?  And what is the
 distribution of savings?

 By "enclosing" here, in the RHS of a let, I include the let-binding
 itself, ''even if it is non-recursive''. All the examples above have this
 form.

 Some data along these lines would be extremely helpful in guiding what we
 do.

 Another variant: currently we insist that `[gv1,..gvm]` is a '''subset'''
 of `[fv1,..fvn]`, to avoid space leaks.  But with some RTS support we
 could do better: the closure for `x` could point to the closure for `y`,
 but ''also'' have a bitmap (in `x`'s info table) to say which of `y`'s
 free vars are actually used by `x`.  That would open up the possibility
 for much more flexible sharing, at the cost of more bitmaps etc.  E.g.
 {{{
 x = [fv1,...fv100] \ [a] ->
     let y = [fv1,...fv99, a] \ [] -> ...
     in ...
 }}}
 Here `y` needs all but one of `x`'s free vars.  Data on how often this
 happens would be would be jolly useful.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14461#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler