[GHC] #7258: Compiling DynFlags is jolly slow
GHC
ghc-devs at haskell.org
Wed Nov 8 13:01:43 UTC 2017
#7258: Compiling DynFlags is jolly slow
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: simonpj
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.1
Resolution: | Keywords: deriving-perf
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
Hmm. The `Cmm` for `g10` is actually a lot bigger than that for `f10`, and
looked to me as if it too would scale quadratically. Are `f10` and `g10`
accurate reflections of the "scale well" and "scale badly" versions of
Read?
I can see why things might scale badly, if we have a lot of
{{{
p1 >>= \x ->
p2 >>= \y ->
p3 >>= \z ->
return (K x y z)
}}}
And `f10` has that structure, albeit with data constructors rather than
functions. BUT: if that's it, why doesn't `(>>=)` inline, so we get a
case expression instead??
So if `f10` is reasonably accurate, I can see two ways forward:
1 Do something in the back end to generate less code.
2 Generate different code. I'm still intrigued (and a bit disbelieving)
why a different exposition of Read generates code that scales better.
Concerning (1) the kind of thing I had in mind was more like this
{{{
f10 =
A (\i0 ->
A (\i1 ->
A (\i2 ->
A (\i3 ->
A (\i4 -> let t = (i0,i1,i2,i3,i4) in
A (\i5 ->
A (\i6 ->
A (\i7 ->
A (\i8 ->
A (\i9 ->
case t of (i0,i1,i2,i3,i4) ->
N (i0, i1, i2, i3, i4, i5, i6, i7, i8, i9)
}}}
Now no function closure gets more than 5 free variables. I imagine one
could do this as a STG-to-STG transformation if we decided that this was
really the core issue.
A tantalising point is this. We have something like this:
{{{
let f4 = [i0,i1,i2,i3] \i4 ->
let f5 = [i0,i1,i2,i3, i4] \i5 -> ...blah...
}}}
The list before the `\` are the free vars of the closure.
Inside the code for `f4` we have `f4`'s function closure in hand, with
`i0...` in it. Rather than capturing `i0, i1...` as the free vars of f5,
we could just store a pointer to `f4`'s function closure (it's a kind of
tuple, after all) and fetch its free vars from there. It's as if we
wanted to say
{{{
let f4 = [i0,i1,i2,i3] \i4 ->
let f5 = [f4, i4] \i5 -> ...blah...
}}}
with `f4` and `i4` being the free vars of `f5`'s closure. This would be a
deeper change to the code generator, but it could make a lot of sense in
cases where ''all'' of `f4`'s free variables are also free in `f5`. (If
not all where, you might get a space leak by closing over `f4`.)
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/7258#comment:88>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list